Tweet activity

December 2022

Your Tweets earned 830.1K impressions over this 31-day period

[Chart: daily impressions, Dec 4–Dec 25 (y-axis 20.0K–80.0K)]
Your Tweets
During this 31-day period, you earned 26.8K impressions per day.
  • Impressions · Engagements · Engagement rate (listed in that order under each Tweet below)
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 30 They don't have access to either in full generality: GPT is limited to scrapeable datasets like PMC or Arxiv, HTML-native ones which are free, etc.; but none have access to raw PDFs from anywhere. (eg Meta's Galactica just uses the *abstracts* of the '40m' papers or whatever)
      9,080
      140
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 30 no one tell the logicians that all syntactic related things are worthless 'word games', they're unstable enough as it is
      96
      3
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 30 Surely he didn't invent Bayesian meta-reinforcement learning (which is all in-context learning is: solving the POMDP by conditioning on a history or sufficient statistic).
      211
      15
      7.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 29 Not sure. There's so many Asian-Americans/immigrants here, and then you have Korea etc. I assume I'm not seeing many Chinese, for obvious reasons, but beyond that...
      180
      8
      4.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 Hawaii's definitely for couples/newlyweds & pair-bonding... Shocked xenobiologists a million years hence: "𝘏𝘰𝘮𝘰 𝘭𝘶𝘥𝘦𝘯𝘴 returns to island breeding grounds, no matter distance, to spawn; researchers believe warm shallow waters shelter them from as-yet unknown predators."
      8,140
      118
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 (Oh, I know plenty of people online who talk about it, I just didn't know any 𝘳𝘦𝘢𝘭 people. So it comes as a shock each time.)
      453
      9
      2.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 It's the donation ads. It's that bad. I even hit the dismissal buttons twice without the ads going away, merely reloading the page & losing their place, at which point I was too ashamed to keep trying and they just kept reading with a third of the screen WMF spam.
      170
      7
      4.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 Yes, I'll just tell them to install uBlock on their phones and learn its CSS dialect... (I'm still working on explaining 'hotspots'.)
      212
      10
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 It's an emerging trend. A few papers here and there before, and now getting serious with PIXEL and MAE archs etc. Chinese is a pretty obvious place to use pixel encoding because you clearly can't simply write down Unicode points or number of lines to reflect the *image*.
      82
      4
      4.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 Not sure I'd say that it was *capitalism* which entered the Wikimedia Foundation and seems to be behind the lust for a perpetuity.
      298
      16
      5.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 Watching relatives try to look stuff up today (even tapping didn't make the banner ads go away): When did Wikipedia go from being the least ad-infested website I use on a regular basis—to the most?
      12,131
      419
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 28 Maybe. I recall a lot of skepticism about the 'PCCs' being transformed into, and you have to admit, that's a bizarre looking funnel plot just vertically in terms of power/error: where's all the *medium* power studies...?
      5,391
      63
      1.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 27 "Wait, this algorithm is *self* supervised, not *un*supervised?! Why didn't you say so - this changes everything! ! ! !" --no one, ever
      435
      9
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 27 No, I just think the fad of 'self-supervised' is very dumb, in part because it licensed a new generation of pedants 'well akshully'ing a neologism - no, sorry, self-supervised = unsupervised, and no one has ever shown a meaningful distinction I should care about.
      540
      14
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 27 I will eat my hat if GPT-4 is not primarily or even 100% trained using prediction of tokens - you know, just like GPT-1 (the RNN), GPT-2, or GPT-3 (or Gato, or...).
      83
      6
      7.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 27 Predicting tokens has always been in 'unsupervised learning', attempts to rebrand it to 'self-supervised learning' notwithstanding, and no one claims it is supervised learning / labels in the usual sense! His statement is correct.
      379
      8
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 26 I'm kinda bemused: he's probably correct, you know, it'd probably be unsupervised (just like GPT-3), which is why it's so versatile, and even if you don't like *that* basic truth of GPT training, we already have like a dozen papers showing GPT-3 can self-improve without labels!
      8,731
      141
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 26 Moore makes a good case we need more imperialist wars of colonization, given that malaria and a bunch of other things are still around in poor countries... 🤔
      7,273
      159
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 26 Most uses of the Outside View rely on the events, not commentary/predictions, if only because it is (as you well know) so difficult to compile predictions and typically explicit predictions don't even exist, often for reasons like not knowing 'industrial revolutions' exist.
      485
      12
      2.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 26 I'd have a lot of doubts about whether that was really any good. It's not like GPT-3 suddenly starts flawlessly rhyming if you just spatter the page with whitespace. So, since ByT5 exists and is a clean test and is known to do well on exactly what we think BPEs would impede...
      72
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 25 They dropped the prices drastically and you can use the smaller one as well, so I'm not sure how bad the cost is now. And how valuable is one's time? If you know how to use the alternatives already, and can run them with zero yak-shaving ever, great... But most can't.
      77
      8
      10.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 25 Sesame certainly isn't bizarre. (Sesame-seed bagels are my favorite.) Adding it to hot dog/hamburger buns, brioche, cookies, French toast sticks, crackers etc (full list unknown), none of which had sesame or are intended to taste like sesame, on the other hand...
      61
      2
      3.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 25 Going down to rawer data to avoid feature problems is not 'better' modeling (that would be, like, adding IPA or using better tokenizations like unigram). This is 100% a scaling win. And also it *does* show scaling makes the spelling work, for specific words. Also a win.
      479
      17
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 25 “The past is never dead. It’s not even past.” At the USS 𝘈𝘳𝘪𝘻𝘰𝘯𝘢, people wearing 80th anniversary t-shirts—suddenly, oil slicks left right center, pulsing up from below in expanding rainbow shimmering circles. Beautiful, in its own way.
      8,387
      126
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 23 I'd definitely be concerned about all the downstream users of food who know X doesn't contain anything bizarre like sesame - because why would it, why would people just go around adding sesame to random things, that'd be insane...
      4,536
      44
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 How men past their prime in low-status fields have always had to compete for attention, one assumes: money. 😉 (ie. spend moar on turkers)
      547
      20
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 I think it's interesting for the safety/sociology/methodological aspect it has increasingly taken on since I highlighted in 2020 what I thought was merely an amusing minor bug soon to be fixed...
      400
      22
      5.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 (I use `rename` or dired-mode, yes, for renaming files, nbd; what burns my Frosties cereal is all the *references* to file names... Mass search and replace, nginx redirects, or worse, oy vey.)
      654
      9
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 'Use the Doomsday argument: in an increasing series you randomly sample from, you observe the median point half the time, the 90th decile 10% of the time, and so on. Therefore, you should expect twice as much on average, and add a 0 if the final ID is >4xx.'
      4,712
      40
      0.8%
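The Doomsday heuristic in the tweet above is easy to sanity-check by simulation; a minimal Python sketch (function and variable names are mine, purely illustrative), assuming IDs are assigned 1..N and you observe one uniformly at random:

```python
import random

def doomsday_estimate(observed_id: int) -> int:
    # If IDs run 1..N and you see one uniformly at random, the observed
    # ID averages N/2, so doubling it gives a median-unbiased guess at N.
    return 2 * observed_id

# Calibration check: for a true maximum of 10,000, the doubled guess
# should land at or above the truth about half the time.
random.seed(0)
N = 10_000
trials = 100_000
over = sum(doomsday_estimate(random.randint(1, N)) >= N for _ in range(trials))
print(f"guess >= true max in {over / trials:.1%} of trials")
```

Listening for whether the guess lands high or low about equally often is exactly the "median point half the time" property the tweet invokes.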
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 How do you decide how many leading zeros to prefix in filenames or schema when you're not sure and don't want to rename from 1234.txt to 01234.txt? Bad answers only:
      13,476
      277
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 And that's been something I've been criticizing since July 2020: "no, your BPE feature-engineering is bad. Stop that. You are erasing information from the raw text inputs, and loading up on bias you don't even notice in exchange for short-term variance. Just scale char more!"
      294
      23
      7.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 No, this is an L for feature engineering & win for scale. BPE -> char is *removing* feature engineering and relying on scale to fix it. The whole justification of byte-pair encoding was that we could hand-engineer a better representation than characters which was more compressed
      4,490
      114
      2.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 22 And just like when it was made, that post continues to be useless because it benchmarked the old OA embeddings on the *one task OA said it was bad at and you shouldn't use it for*: sentences. In the last evaluation, the actually relevant one, it does fine.
      62
      8
      12.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 You worded it poorly by not specifying that you were attacking the part no one would think you were attacking because it is such a dumb trivial point. And ChatGPT is a terrible way because that doesn't even gauge *ChatGPT* capabilities - as proven by jailbreaks!
      69
      4
      5.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 That's nice. I don't see how that is relevant to my point, which is that the NYT has gone on record as having obtained memos and sources repeating the same 'code red' use and meaning as the supposedly false notes that even Googlers were mocking for being blatantly wrong.
      69
      2
      2.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 'ChatGPT' is a bad way to gauge what GPT models understand, as the jailbreaks of greater functionality alone demonstrate. And if you worded it badly, that's not my fault. (What is there that GPT 'fully' understands?)
      27
      1
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 There have already been many demonstrations going back to at least August 2020 of using GPT to control browsers (I'm not even including WebGPT or groups like Adept). It wouldn't be hard to add images to HTML inputs either, look at CM-3.
      42
      1
      2.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 Or makeup, or a lot of things... I don't think we have a good word for it yet (neither 'professionalism' nor 'sprezzatura' capture it), so the phrase will have to do for now.
      97
      5
      5.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 I continue to be amused how the single most useful feature was... the little bash script for uploading files or temporary files. Saves me so much time: it's done and ready and put the URL in copy-paste before the Dropbox/GDrive/Mega pages would even load.
      9,433
      165
      1.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 Doesn't work. Think about it more behavioral-genetically: there's lots of overlapping traits there. You might as well suggest trying to distinguish by looking at novelists who had/hadn't lawyers for parents... That's why Galton started with papal adoptions etc and moved to twins.
      5,167
      48
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 The phrase 'good design is invisible' has been coming to mind a lot as I watch all the Twitter clones (& also the new Twitter management). How hard could it be to clone and improve on Twitter? Apparently 𝘱𝘳𝘦𝘵𝘵𝘺 𝘥𝘢𝘮𝘯 𝘩𝘢𝘳𝘥 (at least for the sort of people who do it).
      5,329
      86
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 Yes, and people listen to baseball announcers too, doesn't mean baseball isn't the most boring sportsball of all... The fact that streamers and lets-plays can shuffle so seamlessly between games (which are often being heavily patched) shows that it's parasociality, not gaming.
      116
      7
      6.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 Like dreams, it's hard to write about games without being boring: especially if it took 130h, it tends to be a 'you had to be there' experiential thing... You should only bother if you can deliver enough spice to make yourself into the main character again.
      5,930
      87
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 The one developed by the guy who couldn't remove a JS popup from Twitter after 4 weeks and quit, saying it was too hard? I guess it goes to show web dev 𝘪𝘴 in fact harder than AI or rocket science.
      180
      27
      15.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 Hm, for a proof this complex, I think it'd be good to have a fully machine-checked formalized proof. Then the mathematical community could have real confidence that it's correct, as opposed to just having error-prone humans skimming it & saying it looks right. Wait...
      4,370
      48
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 (Similar issue with the text discriminant analyses you sometimes see, eg 'male vs female words'. If 1 guy says 'bitch' once and 0 girls say it, then it'll be the top hit for 'male language' as it perfectly predicts a male writer; but that doesn't mean the forum is hugely sexist.)
      63
      3
      4.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 Even without sampling error (what does that mean if you have a census/complete count? what's the 'random error'?), it'd still be hard to interpret because it is omitting base rates in favor of odds/RR. A tiny subreddit of a dozen people can have a total overlap easily.
      115
      7
      6.1%
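The base-rate failure described in the two tweets above can be made concrete with a toy word ranking; a sketch with made-up counts (none of these numbers are from the tweets):

```python
def smoothed_odds_ratio(a_uses, a_total, b_uses, b_total):
    # Haldane-style +0.5 smoothing so a zero count doesn't divide by zero.
    a = (a_uses + 0.5) / (a_total + 1)
    b = (b_uses + 0.5) / (b_total + 1)
    return a / b

# A word one man used once and no woman ever did "perfectly predicts"
# a male writer and tops the ranking...
rare = smoothed_odds_ratio(1, 1000, 0, 1000)
# ...outscoring a word with a large, real usage gap (300 vs 150 uses):
common = smoothed_odds_ratio(300, 1000, 150, 1000)
print(rare, common)
```

The one-off word ranks above the genuinely differentiating one, which is the tweet's point: odds ratios without base rates reward noise.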
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 Not a great analogy because car engineers know extremely well how cars work! If car engineers knew cars the way DL researchers know how Transformers/self-attention work, they'd sit around and ask 'do cars need... 𝘸𝘩𝘦𝘦𝘭𝘴?', take them off, and find it works almost as well.
      380
      60
      15.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 21 There's more to life than a logic gate, you have to be able to set it up right. *Can* you set up a recurrent circuit with the immune system after you've used up said immune system with their circuit construction?
      689
      15
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 20 Never thought I'd find myself defending Taleb like this but (a) Taleb's aphorisms and gimmicks are genuinely more funny & (b) at least he's never murdered anyone and his defrauding, pump-and-dump, conman, & tax evasion activities have been limited to overpriced $29.95 hardcovers.
      2,456
      116
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 20 DeepMind never went anywhere, and kept building on AG with things like Gato and PoG. They'll have their 'Stable Diffusion' moment, if you will, as everything merges at scale and we start to enjoy the cherry on the top of the cake.
      859
      41
      4.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 20 Nothing comes to mind offhand either. I think it would probably perform worse because the performance characteristics of generating a bunch of prefix tokens before training reals are going to be bad. Anytime your GPUs don't go brrr as much as possible, Transformers suffer on net.
      74
      1
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 20 Selective breeding might be a better example. No one in the world has even 1% of the picture of how 'corn' works and was evolved from ancestral maize the size of a thumbnail - it just does. (An occasional gene or biochemical pathway, but likewise for Transformer heads/circuits.)
      467
      36
      7.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 20 Another example: my college recently did some renovations, and changed a bunch of parking spaces to this weird 'QR code registration required' thing. No idea why, surely some excuse about parking efficiency. So... no one uses the parking spaces, and that makes parking even worse!
      337
      15
      4.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 20 I'm not sure about word-count problems being BPE. Spaces are consistently tokenized as a specific BPE, and punctuation is usually separate too, so it *seems* like learning '6 punctuation/whitespace-separated chunks' ought to be learnable easily long before, so that may be something else.
      158
      7
      4.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 20 Still not understanding the point. Aella's claim is that "players in the NBA are tall". He's the one going "no they aren't", and you're the one going "he's right, look, a tall person who isn't in the NBA!" ('All knights are men-at-arms, but not all men-at-arms are knights...')
      60
      9
      15.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 19 I'm not trying to do it myself, I was just hoping someone already had so I could add it as a confirmed example. If no one has, I'm just going to add it to the speculative section of /Turing-complete. Stochastic petri nets sounds like a good direction.
      64
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 19 If the polls aren't making decisions, then there's no problem in holding them accountable for making bad decisions, as they aren't making them. (Musk, of course, has his own personal mechanisms of accountability for mismanaging Twitter. Around 30 billion of them, I understand.)
      172
      8
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 19 when it comes to twitter polls about running twitter, i feel that they are definitely being held accountable to the consequences...
      5,940
      60
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 19 It would be impressive if one hadn't (like OP, apparently) paid any attention to math AI beyond AlphaZero; then it's a shocking demonstration of sufficiently advanced prediction = math, and if nothing else, a powerful generative prior for ATP. If one had, then a nothingburger.
      103
      13
      12.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 19 I'd be irritated too if I'd been talking clearly to someone and they ignored me. If I recall correctly, this was in a pool context and so I had had to take off my hearing aids. (They're much more water-resistant these days but not then.) But they didn't know that.
      293
      9
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 19 Hm... if it's a solar minimum leading to more background cosmic rays, that sounds like good news. I'm assuming that solar flares produce a very small amount of cosmic rays so that was being backed out to very large solar flares, but just some cosmic ray increase doesn't seem bad.
      5,804
      41
      0.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 18 How does its foxos compare with the others? They're nice, but I'm not sure one couldn't get them out of eg NAI - I've seen a lot of great landscapes from it.
      3,788
      24
      0.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 18 A rare miss by Alexander. If the history of computers shows anything, it is that no one has any idea what can or cannot be done with 'an army of clerks' aside from making wrong claims about what cannot be done. One can no more prescribe that in advance than high modernists could.
      4,382
      97
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 18 'How did such a successful entrepreneur melt down like this??? I just don't get it!!!' Gosh, it sure would be interesting if there were some mental disorder which is correlated with risk-taking, creativity, entrepreneurship, and the occasional bout of disastrous decisions. 🤔
      568
      37
      6.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 18 I continue to be astonished how, literally just weeks after the most famous rapper in the world melted down & became an ex-billionaire, people apparently still have no idea what 'bipolar disorder' (only one of the most common & harmful mental disorders, surpassing SCZ) looks like
      531
      32
      6.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 18 The idea is to bias the noise towards the final true image, to provide more training signal. This bends the overall 'trajectory' consistently towards the target. It also builds in distillation, but continuously: eventually you are regressing straight from pure noise->target.
      7,171
      28
      0.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 18 Idea for a training trick a bit inspired by progressive distillation: I call it 'Brownian bridge sampling', to bias the random walk. For each training pair, pixel-wise weighted-average the noised image with the original final image. Start at 0 weight, and anneal to 1 by the end.
      4,825
      25
      0.5%
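The two tweets above describe the trick verbally; here is a minimal numpy sketch of it (function and variable names are mine, not from any implementation):

```python
import numpy as np

def bridge_input(x0, noised, step, total_steps):
    """Sketch of the 'Brownian bridge sampling' idea: pixel-wise
    weighted-average the diffusion-noised input with the clean target
    x0, with the weight annealed from 0 (start of training) to 1 (end),
    so late in training you regress straight from noise toward target."""
    w = step / total_steps
    return (1 - w) * noised + w * x0

rng = np.random.default_rng(0)
x0 = rng.random((4, 4))               # toy 'clean image'
noised = rng.standard_normal((4, 4))  # toy diffusion-noised version
early = bridge_input(x0, noised, step=0, total_steps=1000)    # pure noised input
late = bridge_input(x0, noised, step=1000, total_steps=1000)  # collapses to x0
```

At step 0 the model sees the ordinary noised input; by the final step the blend has annealed all the way to the clean target, which is the built-in-distillation behavior the tweet describes.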
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 18 I want to like this but my eyes instantly go to all the cat artifacts. (Eyes look like blinded by cataracts; left whiskers missing entirely and muzzle smoothed; too many and too large paws.)
      5,205
      42
      0.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Cowen's review was a good deal more positive than I expected (as was Brody's), and has mostly swayed me into thinking I should go see it. (In a theater in 3D, specifically, not wait to torrent it if I ever bothered to see it.)
      915
      20
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 I don't know about totally indifferent because I tend to forget those, but as far as ones that amuse me: THEM [yelling]: "What are you, deaf‽" ME: "Yes." THEM: "Oh."
      9,337
      166
      1.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 I don't think they've been trialed in combination yet; at least, I don't have any examples in my notes. (If that doesn't work out, oh well, plenty of others to experiment with. Only takes 1 good combo, after all.)
      163
      9
      5.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 So, I don't consider it improbable at all to hit a possible 40% (fairly arbitrary as that benchmark is given how bad bariatric is otherwise) from long-term use of refined high-dose multi-drug combos spiced up with additional new drug candidates or combinations in the next decade.
      74
      1
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 (Nor do I see any particular evidence from the combo-therapy trials or multi-arm trials that there is pernicious interaction where the first drug has all the benefits and the rest are redundant. So it looks like there's plenty more button-pushing to be done in this area.)
      92
      3
      3.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 So, since you can increase the dose and combine them effectively, if it takes 2 years for a low dose of a single one to show full plateau, and the shorter study curves usually show no plateauing, these are all underestimates by quite a bit.
      58
      2
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Your tirzepatide % is coming from trials running not longer than 1 year, IIRC, while STEP5 shows that a 2.4mg dose of semaglutide takes 2 years to plateau average weight loss. These all show dose-response, so you can increase the semaglutide, and it would take longer just itself.
      62
      3
      4.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 All too easy to imagine how that wouldn't happen. Plus, why does fusion have to ever happen? There are many energy sources like fission which may just be better long-term.
      326
      15
      4.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Yes, I already addressed that with 'short-term'. You need a reason to think that the weight loss *stops* there at the reported trial followup like '20 weeks' etc, which is unlikely.
      69
      13
      18.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Yeah, this is kind of a weird scenario: we already know bariatric surgery causes major changes in weight (that's the point) and that (or other surgery effects) cause major changes to the gut microbiome. So the surprising result would be if the transplant effect wasn't affected.
      224
      3
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Why do you think that? (And you are allowed to have beliefs which are not 0% or 100% certainty, even before phase 3 results are reported, you know...)
      100
      6
      6.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Semaglutide is not a cure for obesity, sure, just overweight. Semaglutide+tirzepatide or the triple-agonist formulations are much closer to that, especially if maintained long-term (all the 15% etc weight loss numbers are short-term). And far more desirable than bariatric...
      288
      10
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Yes, but they need to be rolled out widely (supply bottleneck + pharmacological calvinism are barriers), made non-injectable (the good oral ones are still coming), and the better combo-therapies approved & rolled out (some are still in the pipeline, not even phase II yet).
      273
      16
      5.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 We're not going to get the second two, but you left out 'cure for obesity' which probably would be even better QALY-wise than 'cure for cancer'.
      10,102
      299
      3.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Probably subsumed under 'reaction norms'. Personally, if I wanted to further behavioral-geneticize the Buddha, I'd go with 'emergenesis'.
      371
      6
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 If it's just guessing, and it stopped training last year, how does it 'guess' the current date correctly each time, hmm? 🤔
      380
      46
      12.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Strictly speaking, he wasn't an experiment, just a social intervention, and like most such early childhood/adulthood intervention programs, wound up showing the weakness of shared-environment.
      7,824
      74
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 (That is, there's lots of anecdotes of the form, 'I believed X but couldn't prove it for 20 years until one afternoon for a lark I tried to prove ~X and whups.' This would be nonsense for a DL system: they would've forked into two, for X and ~X, at the start and solved it then!)
      156
      5
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 17 Yeah, and we still haven't tapped into their full power... For example, there's no good way to properly randomize models as samples from the full Bayesian posterior, so you can't generate 50 'hypothetical researchers' to explore an idea independently & non-redundantly.
      74
      7
      9.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 16 (If you don't need self-attention at all for it, and just any memory or history/context is enough when trained on diverse data distributions at scale, we can stop asking 'does arch X do Y' because these capabilities are extremely convergent and it's simply a question of when.)
      374
      29
      7.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 16 At this point, the case for Bayesian meta-reinforcement learning emerging in self-supervised learning Transformers trained on natural data, not just RNNs, seems pretty much done. The next question is: does this get elicited in fully-connected archs like MLP-mixers as well?
      5,107
      54
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 16 So, within its comfort zone, it can improvise amazing doggerel and avoids the tell-tale errors of the more unrestricted models, but if you push it out of that or expect any other kind of verse, it still doesn't work. Still needs character-level or phonetics-aware modeling.
      105
      6
      5.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 16 After trying to break out of the niche in ChatGPT, my conclusion is that 003/Chat have not learned phonetics/rhyme but appear to have memorized a bunch more pairs and then the RL tuning has 'mode collapsed' onto the narrow learned niche of high-confidence rhyming verse.
      109
      11
      10.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 16 Many examples now of things like rhyming words that are spelled similarly but pronounced differently, being unable to rhyme or write verse outside a narrow niche of short-line quatrains/couplets, rhyming even when told explicitly not to, not rhyming specified things...
      62
      2
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 16 The context window doubling for the embedding endpoint and the 'c100k' BPE tokenization in the new tiktoken library make this an especially important point to be clear about now: there are a lot of tokenization changes going on!
      67
      3
      4.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 16 But does that mean it benefits from newlines being left in, or it just doesn't fail as catastrophically as before?
      226
      8
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 Rhyming is great, but you can still get great stuff without it. I was playing with davinci (to compare w/003) and got this metal bit (whole thing is completion, I was actually starting w/'Complete "This Last Pain", by William Empson:'): [mrw people criticize my writing style]
      89
      4
      4.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 No, they don't work. Like the first one, you can easily trace the outside and see that it has no path into the interior, much less the whole thing be connected. Not that you would expect diffusion on pixels to guarantee connectivity anyway.
      411
      6
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 Yes. There may be some clever tree construction or mathemagic which allows m-of-n arbitrary recovery, but I don't immediately see it, and the 'pick 1 random FEC packet from a random past block' is at least a concrete proof-of-principle - it's very slow but it clearly would work.
      60
      1
      1.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 You obviously can't just feed the entire blockchain in: doesn't scale, and there's no way to 'add a new block', it's all-or-nothing AFAIK. So it's not truly 'broadcast'. That's why each block is separate and merely includes a packet from a previous block's fountain encoding.
      75
      2
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 Looks kinda expensive, homepage does a bad job of selling one on any advantages, excludes NSFW (?). Mm. Someone should do a NovelAI comparison.
      4,609
      79
      1.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 Efficiency-wise... bandwidth, not great, because you keep re-encoding the full history, but you can prune away all the FEC packets you don't need (or want, if you can know you don't want the block it's for), so you at least don't have to store duplicates. In-progress < full.
      3,271
      10
      0.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 Also censor-resistant: can't easily block current block because the fountain code just keeps spitting out new FEC packets which *eventually* recover the block, and then it contains another historical FEC packet (which eventually recovers a block with another FEC packet...).
      3,597
      14
      0.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 You can speed up listener reconstruction by including more historical FEC packets per block. Advantages: don't need to recompute any FEC or store additional FEC blocks per block, constant overhead & copying one packet from a historical block into current block.
      1,567
      10
      0.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 Weird idea: a broadcast-only blockchain using FEC broadcasting packets. It can transmit only the current block, but eventually allow reconstructing the entire blockchain by including 1 FEC packet from an earlier block. Listen long enough, and you get all.
      6,116
      158
      2.6%
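The broadcast-chain thread above can be sketched as a toy proof-of-principle in Python (class and function names are mine; a real design would embed fountain-code/FEC packets, whereas copying one whole past block per block is the degenerate 'very slow but clearly works' case from the thread):

```python
import random

class BroadcastChain:
    """Each broadcast block carries one 'repair packet' recovering a
    uniformly random earlier block, so a listener who joins late can
    eventually reconstruct the entire history."""

    def __init__(self, seed=0):
        self.blocks = []
        self.rng = random.Random(seed)

    def broadcast(self, payload):
        repair = None
        if self.blocks:  # pick one random historical block to re-carry
            i = self.rng.randrange(len(self.blocks))
            repair = (i, self.blocks[i])
        self.blocks.append(payload)
        return (len(self.blocks) - 1, payload, repair)

def listen(transmissions):
    # A late joiner hears only current blocks, but the embedded repair
    # packets gradually fill in the history.
    recovered = {}
    for height, payload, repair in transmissions:
        recovered[height] = payload
        if repair is not None:
            recovered[repair[0]] = repair[1]
    return recovered

chain = BroadcastChain()
sent = [chain.broadcast(f"block-{i}") for i in range(200)]
heard = listen(sent[100:])  # join late, at height 100, then keep listening
print(f"recovered {len(heard)} of 200 blocks")
```

Listening from height 100 onward recovers strictly more than the 100 blocks heard directly, and listening long enough recovers everything, coupon-collector style.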
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 People worry about AIs giving you dangerous meth recipes, but they should worry more about them giving you dangerous math recipes.
      26
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 (Worth noting you can know a priori the effect size from environment changes like this must be small because the total test-retest reliability of cognitive tests or standardized tests is so high, despite making few to no efforts to control any of the environment effects.)
      2,592
      40
      1.5%
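The a-priori bound gestured at in the tweet above can be written down directly; a back-of-the-envelope sketch (the r = 0.9 figure is a typical reported test-retest reliability, not a number from the tweet):

```python
from math import sqrt

def occasion_effect_bound(test_retest_r: float) -> float:
    # With test-retest reliability r, at most (1 - r) of the score
    # variance is occasion-specific (transient environment + noise);
    # sqrt(1 - r) therefore bounds, in SD units, the combined shift
    # from *all* occasion-level environmental factors together.
    return sqrt(1 - test_retest_r)

# Cognitive batteries commonly report test-retest r near 0.9, so every
# uncontrolled occasion effect put together is bounded by about 0.32 SD:
bound = occasion_effect_bound(0.9)
print(f"{bound:.2f} SD")
```

Since that ceiling must be shared across all transient influences at once, any single environmental tweak of the kind in question has to be small.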
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 Does this go haywire when newlines or Unicode is present like the old embeddings? It's not mentioned either way.
      4,983
      69
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 You should keep at it, it works! People were showing good results for 'just generate an "image"' in GANs back in like 2019 with StyleGAN.
      294
      13
      4.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 "A man who has to be punctually at a certain place at 5 o'clock has the whole afternoon from 1 to 5 ruined for him already." --Lin Yutang
      75
      18
      24.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 15 I don't have the Photoshop skills for it but I've always thought that Cowen+Ducreux would be an amazing meme. The only question is what... "Disregard papers / acquire ethnic food"?
      296
      4
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 Yes, but that's a trick probably already used so may be in the baseline already and the question switches to 'scaled up model which can't easily be distilled down to realtime on current hardware'.
      113
      1
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 (I mean, retrieval image-gen models are a thing. Retrieving images from a live website would be easy to add. You could make a model which acts like this, it would even be useful for a number of purposes! It's just not how any major model works, is all...)
      134
      24
      17.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 Interesting demonstration of the folk psychology around how generative models work, tho. You can understand it: why *couldn't* SD be literally going out and downloading the front page of ArtStation to 'copy'? ML models can't *understand* anything, they're just 'search engines'...
      848
      39
      4.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 Reminds me of the last time I was in SF. I left a Xeon Phi on the front seat of my Zipcar, and while I was out getting a latte and sourdough, some homeless crazy broke in and left another dozen: 😭 [photo]
      7,603
      250
      3.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 Seems unlikely. Those other scholarly search engines already exist and seem to be very lucrative, so GS has failed there. The best profile I know of gives the impression it's, frighteningly, the passion project of a Googler with their equivalent of tenure:
      63
      6
      9.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 I worry about the backfiring of this quantitative mindset. YOU [to them]: "Wow, look how low your peak rate was! That cardio HIIT is really paying off!" YOU [to self]: "Look how low it is—what's wrong with me? Am I doing something wrong? Am I not as sexy as I used to be?"
      2,013
      29
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 If 'Abelian groups' are named after Norwegian mathematicians named 'Abel', it logically follows that 'non-Abelian groups' are named after all the non-Norwegian mathematicians not named 'Abel'. (The non-Norwegian Abels presumably need to step up their game.)
      2,582
      14
      0.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 Pretty much all the NMT studies are consistent with a common 'neuralese' of the embeddings/vector-space, aren't they?
      49
      6
      12.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 14 It wouldn't radically extend lifespan, any more than it would in humans, because of competing hazards + Gompertz. Be interesting to see if they did any good at all when applied to an M-prize winning mouse approach, however.
      3,153
      48
      1.5%
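The competing-hazards + Gompertz point can be illustrated numerically: when intrinsic mortality rises exponentially, removing one flat extrinsic cause of death buys only a year or two. The parameters below are made up for illustration, not fitted to real human or mouse data.

```python
import math

def life_expectancy(hazard, dt=0.05, t_max=130.0):
    """e0 = integral of S(t) dt, stepping the survival curve S'(t) = -h(t)*S(t)."""
    s, e0, t = 1.0, 0.0, 0.0
    while t < t_max:
        e0 += s * dt
        s *= math.exp(-hazard(t) * dt)
        t += dt
    return e0

# Illustrative parameters: exponentially rising Gompertz intrinsic
# mortality, plus a flat extrinsic hazard standing in for the single
# competing cause of death being 'cured'.
A, B, C = 1e-4, 0.09, 5e-4
gompertz = lambda t: A * math.exp(B * t)
with_cause = lambda t: gompertz(t) + C
gain = life_expectancy(gompertz) - life_expectancy(with_cause)
```

The `gain` comes out at roughly a year: the exponential term dominates, so eliminating the extra hazard entirely barely moves life expectancy.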
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 They can buy a few years: the programmers & GPU clusters to handle Stable Diffusion-like popularity are big targets. (I remember Napster - it was so convenient and easy to use for 56k. Replacements like eDonkey or Kazaa took years and home broadband before they got near it.)
      1,409
      58
      4.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 The most amazing thing is that it sounds like he's a cryptominer. What a brilliantly sociopathic way to scam free electricity for your rigs.
      939
      26
      2.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 Yeah, I've never heard of that kind of unwinding. The FTX creditors will just assume ownership of the Anthropic holding. Isn't that how it's worked with other large frauds like 1MDB?
      166
      18
      10.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 "At last", you think as the water approaches, "now no one can say I am exaggerating the problem and should just reinstall."
      174
      10
      5.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 Yeah, I assume the scaling exponents are bad, if only because they are generally all using hybrid systems optimized for now rather than bitter-lesson long-terms, but I'm curious how much better they could be as-is out of the box, essentially.
      2,302
      59
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 This is something I've wondered for a long time about self-driving cars: to what extent are they held back by on-board GPU? You can't measure the effect of narrowminded R&D, of course, but presumably Waymo et al have internal "compute no object" scale-ups to benchmark some of it.
      4,053
      97
      2.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 Does it actually matter? Can you, say, name three examples of whether major scaling results by Google Brain, especially ones with non-public models, proved to be seriously exaggerated and what looked like a major advance proved not to be?
      88
      7
      8.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 I'm sure! But R&D has to start with the model that *doesn't* run on his laptop. (We're too stupid to do otherwise, which is why there's always a hardware overhang.)
      1,880
      56
      3.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 I think yes, historically all forecasts not explicitly including them tend to be of the form 'unless some major tail event occurs like all-out nuclear war, pandemic, or AGI'. Otherwise they would all have to have 99% CIs like AD 2025-2500.
      2,040
      27
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 Why? Books about programming will literally have commented programs and often include extensive question-and-answer sections. Indeed, Stack Overflow killed an entire genre of technical writing of the 'X cookbook' form.
      444
      12
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 I'm sorry you're an "impoverished goatherder on an old laptop running off solar panels" but the implications and importance of something like GPT-3 or Imagen have little to do with exactly how many A100s it takes to run it conveniently.
      1,098
      69
      6.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 ("I didn't see any samples from BigGAN-JFT300m, so that model must not exist. I didn't see any hand samples from Imagen, so they must not exist" etc. Further application of narrow windows + systematic lagging bias left as exercise for the reader.)
      1,121
      18
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 13 The more striking part is that the era of 6 fingers etc was actually ~2017–2021 (BigGAN→Imagen roughly), but people thought it was 2021–2023 (DALL·E 2→SD→?). No matter how many times you chant "the future is already here, just unevenly distributed", people won't discuss SOTA.
      8,279
      281
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 12 "...And finally, at long last, he realized that the bluehorse of happiness was with him the entire time. The End."
      297
      6
      2.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 12 This is almost certainly true of sufficiently large image models too. The obvious way to fix the endless proliferation of finetunes/forks: simply have a small cluster training 24/7 on 99% lightly-moderated user-submitted uploads / 1% LAION, & release checkpoints daily.
      1,155
      87
      7.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 12 No it won't. By the time you can extract a specific brain, you'll have been meta-learning a distribution of capable brains eons before by using weak earlier data as constraint: Same way GPT is useful long before it is any *specific* person it's trained on.
      3,136
      122
      3.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 12 The ruinous powers have ever been generous with their votaries, until they claim their price. That the lies are necessary shows that it cannot bear inquiry, and is always the mark of heresy - with which compromise is death. 𝘗𝘶𝘳𝘨𝘦 𝘵𝘩𝘦 𝘚𝘢𝘯𝘵𝘢 𝘤𝘶𝘭𝘵𝘪𝘴𝘵 𝘴𝘤𝘶𝘮!
      1,896
      24
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 12 Indeed, it is by their logic. Which is why you do them a favor by pinning down their logic *now*, so that when AI is created, they (or at least, everyone else) might actually learn something from it.
      328
      12
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 11 Kinda weird. Why would inserting padding help? The BPEs of the letters would be the same regardless, as long as there's at least one whitespace in between, I'd assume. Are you sure this works reliably and you're not just hitting the RNG until it works?
      577
      1
      0.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 11 Er, the causality is pretty obviously at least partially the other direction, there's nothing odd about that! You'll notice the absence of eating monkey 'bush meat', or Koreans abandoning practices like beating dogs to death to make soup tastier. Let's not be disingenuous here.
      153
      6
      3.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 11 (I mean, just consider how close in history you have to be to AI to even express the 𝘪𝘥𝘦𝘢 of 'AI' to begin with! (Sorry no, Hephaestus or the Mechanical Turk don't count.))
      165
      11
      6.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 11 If you condition on observer-moments who are sophisticated enough adults to reflect 'I was born in time to see AI', then you have to remove all of the deceased children, uneducated, etc. Just the dead children alone drops that to more like 50b, so >20%.
      425
      30
      7.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 11 Maybe another two dozen articles and op-eds in the NYT & Chronicle would have helped them not be 'hacked' and prepare for 'a societal trust collapse, at scale'.
      2,242
      32
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 10 "… - All supplementary data files have been uploaded: [Y] - All listed authors have approved the final draft: [Y/n]" (・・;)
      735
      25
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 10 Yep. Feynman also has a famous version of this, with his talk on the mice maze-running experiment. (FWIW, we've never been able to track down a source for Feynman's mouse anecdote. It probably just isn't published, but... an unfortunate 🤔 caveat to an excellent point.)
      161
      12
      7.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 10 The real question: how well does the extracted prompt work for Jasper-like results and how well does it predict Jasper outputs (esp as a surrogate for more attacks)?
      414
      25
      6.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 10 (I feel like that sometimes whenever I pull together old notes, tweets, IRC comments etc that I've forgotten - that sticking the author 'Gwern' on it is misleading, and it really ought to look more like 'Gwern~2009~, Gwern~2018~, Gwern~2022~ et al'.)
      1,168
      42
      3.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 10 Not just his wife or editing - the whole runup. Lucas had a circle of people to bounce off of from his indie days, and the 'scenius' helped create the trilogy (but not others). Ever read the earliest published drafts? Dire. (_Secret History of Star Wars_ is a good source).
      320
      16
      5.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 10 It does, however, seem like an excellent explanation for all questions of the form "why didn't anyone X for Y": if you suck at every kind of "X for Y", then you probably aren't going to do much of X or indeed, even think it possible to get more Y.
      462
      53
      11.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 10 (You think we have any idea what the choices are that go into making a Chihuahua different from a St Bernard, or the ancestral maize into today's sweet-corn? Or that Evolution itself has any idea what a 'choice' is?)
      119
      8
      6.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 (_nota bene_: 'generate every possible image by leave-n-out keywords to show the user to ablate it' is the sort of thing that fast GAN sampling makes trivial, but in the culture of poverty of diffusion image generation, sounds unthinkably extravagant & slow so no one does it.)
      131
      15
      11.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 There may be something to that. Where the text embedding just gets overloaded, or perhaps averages out on too many different latent dimensions to extreme mediocrity. Should image gen tools build in ablations automatically, and try to guide you to the smallest possible prompt?
      2,070
      39
      1.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 (However, I also think that breeding dogs for intelligence is both useless and probably very harmful to them, and no one who likes dogs & cares about their welfare should want it if they think about it for more than a few seconds.)
      184
      16
      8.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 Yes, genomic selection is pretty much always more effective. But think like 2x, not 20x. (And of course, comes with its own challenges.) Breeding dogs for intelligence fast is very possible, I'm just saying his specific numbers are garbage even granting the assumptions.
      230
      21
      9.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 Oh, it totally knows what it means. It's just that it can't quite do it. It's part of the bizarre pattern of strengths & weaknesses which has everyone completely confused. It'll do something perfectly which BPEs should make impossible... and then fail at an easier thing next line
      203
      21
      10.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 If anyone is wondering, most of the math in this is wrong because he omits heritability completely (low, in hard-to-measure dog behavioral traits), so his reinvention of truncation selection is wrong. He should have read Lynch & Walsh.
      3,705
      162
      4.4%
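The omission being criticized can be made concrete with the breeder's equation from Lynch & Walsh: response R = h² · i · σ_P under truncation selection. A minimal sketch with illustrative numbers (the h² = 0.2 and σ_P = 15 values are assumptions for demonstration, not estimates for any real dog trait):

```python
from statistics import NormalDist

def response(h2, sigma_p, top_frac):
    """One-generation response to truncation selection via the breeder's
    equation R = h^2 * i * sigma_P, with selection intensity i = pdf(z)/p
    (the mean phenotypic deviation of the selected upper tail)."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - top_frac)   # truncation threshold, in SD units
    i = nd.pdf(z) / top_frac       # mean of the selected tail
    return h2 * i * sigma_p

# Breed from the top 10% on an IQ-scaled trait (sigma_P = 15):
low_h2 = response(h2=0.2, sigma_p=15, top_frac=0.10)  # plausible low heritability
naive  = response(h2=1.0, sigma_p=15, top_frac=0.10)  # the h^2-omitted version
```

Dropping heritability inflates the predicted per-generation gain by a factor of 1/h² (here 5×), which is the error in question.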
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 Yeah, ChatGPT handles the basic rhymes much better, but it's still fragile. You can get it to now write a perfectly rhyming completion... But it'll insist on ignoring the specified rhyme scheme and keep trying to make it very regular couplets or quatrains. eg:
      185
      6
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 Just to point out one salient detail, Scott's ACX is literally his job, while Cold Takes is not really Karnofsky's job. And when it was, when he was running GiveWell, he spent a lot more time on communication in the Yahoo email groups, LW, the GW blog & its open threads, etc.
      157
      7
      4.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 imagine the hubris of ignoring all the second-order effects and thinking that you can know anything about the long-term effects being net good after exponential amplification of consequences like pop growth, and simply talking about saving lives and qalys from a terrible disease
      249
      13
      5.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 9 The Tay claim remains wrong, and the BERT one seems suspicious too. There is no evidence that it was BERT, and Google 'featured snippets' existed (and were famously getting things wrong) well before BERT. Are they just speculating here?
      4,652
      95
      2.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 8 For Lensa, of course. VCs can't 'invest in Lensa-type apps'. That's not a thing. You can only invest in specific companies. They can explode all they like, but if they don't *keep exploding* then they're garbage as investments. Lifestyle or small businesses.
      84
      7
      8.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 8 I, uh, am not sure this comparison works given the Westermarck effect and the active policing of incest laws to extremely strong public support (and incidentally, adults cannot do what they wish in [checks WP] >48 of 50 US states).
      6,663
      97
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 8 Agreed that people underestimate UX, but... this may also be the peak. Haven't several of these already come and gone for SD alone? How many of these '$X/day' have shown legs? Can you name any really big ones still around that started based on, say, the BigGAN G release in 2019?
      5,084
      132
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 8 A curious contrast: 'The Syllabus' does popins *to the side*, fixed at the top. (No persistency.) Has the obvious failure mode of popping in at the top-right for distant links... Fitts, yes, but still feels awkward. I don't like it.
      906
      31
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 8 I always read it as in part selection: we see the BG as ice-queen-bitch-goddesses because the novels focus on scenes like 'Harkonnen assassins are trying to rape me before disposing of my corpse' ie. all the moments in time where the rubber hits the road & shit is srs bsns.
      58
      1
      1.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 8 'create another form, which takes net revenue for a small business incorporated as llc in California, and calculates server-side the federal, and state taxes to be paid. Account for all tax brackets' smh it doesn't even understand taxes are 𝘸𝘪𝘵𝘩𝘪𝘯 each bracket. 🤦‍♂️ useless
      807
      37
      4.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 My observation is that while a lot of people are OK, a lot more people don't seem to update at all. They aren't going 'my goodness, of course GPT-3 models have been getting better every day, obviously, but this is even better than I had been inferring' but 'it got better??!?!??'
      85
      4
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 If people were updating the latter, they should be disappointed as often as surprised. But it should *never* be a surprise that AI systems were slightly better today than yesterday, and will be slightly better tomorrow. (The disagreement is how big 'slightly' is and will sum to.)
      54
      8
      14.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 You're conflating things here. You should be rationally expecting the current, latent, unobserved abilities to be slightly greater each day due to continuous inputs like researchers+data+FLOPS. Whether you adjust long-term forecasts up or down at each revelation is different.
      55
      2
      3.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 Why would it not be continuous? DL systems are certainly not discrete. There's no global clock ticking where every system worldwide gains 0.1% ImageNet accuracy when the week rolls over at 12:01AM Monday. The GPUs are always going brrrr.
      55
      8
      14.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 (Sure, they grow fast, but the disease and parasite rates indicate that there's considerable suffering and low QALYs!)
      287
      3
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 (Poisson clumping applies not just to news themselves, but to publicity and infrastructure, like free web interfaces. And you get clumping *even with* independence, so imagine clumping due to dependencies/correlation of foregoing. Hence: bursts of panic, then stretches of complacency.)
      115
      6
      5.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 I think that's part of it: even if you aren't publishing a paper, there's still pressure to get things released so you can talk about them, recruit with them, or just get them out the door before everyone takes a full week off to travel+recover. Then there's Poisson clumping too
      200
      10
      5.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 People *should* have been, last month and the month before etc, smoothly incorporating each day the knowledge that DL systems were slightly smarter than yesterday and never getting worse... but we aren't good at that. So updates proceed 'by creeps and jerks', to borrow a phrase.
      507
      42
      8.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 We're in the quarterly freakout where a conference triggers a bunch of bottled-up progress + "today's 10,000" and people who convinced themselves that everything is normal are reminded that's not true. Similar to the cluster with Gato + DALL-E 2 + Chinchilla + Minerva earlier.
      5,509
      195
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 I joked "and she ghosted 20 men to the machine's 10, lawd lawd lawd / then she ghosted 20 to 10! / she laid down her Tinder app, typing 'goodbye' / "anything you can do I can do better" she sighed / "I can swipe anything better..." / and logged off forever." It's better 😢 4⁄4:
      1,402
      101
      7.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 The way I put it is Kurzweil was sorta right about stuff involving software/data/information-processing or 'bits' (beyond just the AI projections based on compute - which still make me *so* mad), but then badly wrong anywhere it came to hardware or biology ('atoms').
      511
      36
      7.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 Another example: freeing one from the tyranny of either exhausting hours organizing objects or useless computer-legible orders, by instead minimizing distance between adjacent embeddings: Perfect for 'auto-sorting' nameless notes etc.
      10,956
      100
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 Note that if the overall embedding doesn't work, you can simply tune it based on the embedding of a specific point such as a query/keyword. Embed it, weighted multiply all the others by it (or something), then TSP an order.
      255
      7
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 (This is an idea I've been mulling for organizing my text snippets/notes/annotations as well, where title/date/URL don't sort well: tSNE the OA API embeddings I already have, and then use a TSP or greedy heuristic to put them into a quasi-logical order by semantic similarity.)
      11,238
      50
      0.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 The fallow assumption sounds dubious because ChatGPT is clearly sharing GPU resources with the regular API & playground; that's why the playground is constantly erroring out right now, due to ChatGPT load.
      466
      23
      4.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 This will wind up inevitably producing some abrupt transitions between clusters, but that tells you where the natural categories are, and you can easily drag-and-drop the cluster of files into folders & redo the trick inside each directory.
      225
      4
      1.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 Categories may be too rigid for 'loose association' and still don't have internal structure. My suggestion: tSNE (preserves local geometry) CLIP down to 2D (for interpretability) then find a shortest path connecting all images. Now you can `ls` them in "loose-association order".
      519
      21
      4.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 SD2.1 might also be worth a try. The newer CLIP embeddings should have a better relationship/entity understanding so you don't get the bottom two.
      121
      1
      0.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 Yes, InstructGPT/ChatGPT exemplify the advantage: they're not really smarter than the baseline - and worse in many respects - but they're a lot easier to *use* for what we *want*. Data collection like inner-monologue or active learning is also going to 'agentfy' very soon, IMO.
      397
      20
      5.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 7 This has got to be one of the least credible epigenetic or whatever results I've ever seen. No one would ever come up with that prediction a priori.
      261
      17
      6.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 6 Possibly a resolution issue too. I couldn't get it to look like a dog when I stared at the thumbnail, and had to fullsize it to make it switch; then going back to thumbnail, I can't make it stay 'dog' easily or consistently.
      32
      1
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 6 Yes. We do that for JS/CSS/HTML already, but the templating system is brandnew (added as part of the speed optimizations + big rewrite to fix persistent bugs) and so didn't have any versioning in it. Now soon it will...
      52
      2
      3.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 6 It's a cache problem (hardest problems in CS etc): my mobile Chrome was storing an outdated template file fetched via XHR (and unaffected by refresh) which failed to fetch a *new* template now necessary to show live popins. 🤦‍♂️ Had to look up how to really flush a mobile cache...
      7,393
      34
      0.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 6 I can't believe we're implementing a JS console to debug problem (popins failing) which exists 𝘰𝘯𝘭𝘺 on Chrome smartphone, & not on any other browser or mobile simulator, because mobile Chrome won't let you use a console any way other than remote debug (which is broken).
      6,990
      100
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 6 (Hm, does this distinction really hold considering that all LMs are being trained on datasets which have been heavily filtered and further up/downweighted, and increasingly naturalistically populated by people talking about good/bad LMs prompts/completions?)
      1,156
      19
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 5 I think it has more to do with housing regulations and flophouses costing a few bucks a day being outlawed in all cities, regardless of whether that did any good or hurt the poor.
      623
      41
      6.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 5 Please don't! Or at least, not online - I've been working on a Qanon/language model short story which ends that way and it'll be boring if everyone is going around quoting the ending but with language models before I can finish. 😭
      297
      24
      8.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 5 Why not? The NSFW+esthetic filter's collateral damage caused what looked like severe anomalies in DALL-E-2/SD-1, so I would expect that if they ramped filtering *way* up for SD2, you would see much greater damage from the wholesale deletion of modes and loss of diversity.
      105
      7
      6.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 5 Dynamic evaluation was only for short contexts because that's about all RNNs of that era could handle. Given the much greater sample-efficiency and history of the best Transformers, why not try dynamic evaluation and see if it can help over hundreds of thousands of tokens?
      733
      27
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 5 It's like AI Dungeon 2 Dragon back in August 2020, when it worked. 😢 (But everything changed when the fire nation attacked.)
      170
      7
      4.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 4 It has better odds of working than all the other non-cheaty suggestions I see here, I think. Not claiming it has *great* odds, but what sort of probability would you expect given just 1 generic cell to target?
      562
      21
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 4 (But canonically, Buddha *was* a chad, wasn't he? He would've become a badass world-king if he hadn't conquered Mara instead.)
      2,944
      45
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 4 I would hit an orexin neuron in the hypothalamus, or any neuron in the suprachiasmatic nucleus. It's the smallest set of neurons I know of, which have a generic objective anatomical description, which could seriously sabotage him. Even tiny bits of damage cause narcolepsy etc.
      13,963
      419
      3.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 4 Lossy = lossless compression from the algorithmic information theory/compression perspective since a lossy one can just be automatically converted into lossless via arithmetic encoding etc, so nothing would be gained by a 'lossy' Hutter Prize.
      42
      4
      9.5%
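The lossy-to-lossless conversion has a simple shape: run the lossy codec, then also store the entropy-coded residuals. A minimal sketch, assuming a toy quantizer as the "lossy" stage and using zlib as a stand-in for the arithmetic coder mentioned in the tweet:

```python
import json, zlib

def encode(xs, q=8):
    """Lossy stage (quantise to multiples of q) made lossless by also
    storing the compressed residuals, which are small and low-entropy."""
    approx = [q * round(x / q) for x in xs]
    residual = [x - a for x, a in zip(xs, approx)]
    return approx, zlib.compress(json.dumps(residual).encode())

def decode(approx, packed):
    """Exact reconstruction: lossy approximation plus decoded residuals."""
    residual = json.loads(zlib.decompress(packed))
    return [a + r for a, r in zip(approx, residual)]
```

Since the residual channel restores every bit, the combined codec is lossless, and the better the lossy predictor, the cheaper the residuals — which is why a separate 'lossy' prize would add nothing.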
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 4 Hutter Prize is de facto this (and thus, about as relevant to actual AI as demoscene programming tricks are to regular programming).
      2,896
      31
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 4 These sorts of results usually survive blurring/downscaling, and that wouldn't test the color hypothesis more than a lot of other possible signals.
      216
      12
      5.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 (BTW, if anyone has a transcript or screenshot of one of the 'computer' sessions from AI Dungeon 2 in Aug 2020 or so, from /aidg/ or /vg/, please send it to me. I didn't realize how important those hacks would become for both security & monologue and didn't carefully save them. 😢)
      3,204

      227
      7.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 I have to say, this is all very nostalgic. After all, the original inner-monologue/chain-of-thought AI Dungeon 2 work on 4chan/Twitter back in August 2020, although I think the threads may not have been archived, often centered around sitting down to "a computer" and working...
      1,132
      49
      4.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 Unfortunately, it's hard to get a decent IRC convo going, it degenerates pretty rapidly into repetition/agreement/milquetoast stuff. (The RL tuning again, presumably.) The cat discussion is particularly risible.
      1,746
      168
      9.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 To be clear, it'd probably cost $100/month anyway, but to push it to the really useful uses with full prompts as likely required to pack in factual knowledge & customization would make it cost more like $10k/month. You could burn through a crazy number of tokens stuffing in stuff
      76
      2
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 Missed a beat there, then, it should be persistent per user! I would've made it a hash (bit) based on... IP address, I think, unless you wanted to get really crazy with supercookies/persistent tracking just for the sake of a joke.
      364
      11
      3.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 Not if you want to give people a chance to use it for a while (always a fatal flaw with various personal tools) or have a market outside the 1% of the 1% or extend it to all the other sources of text context/data you need to march down the long tail of accuracy/use-cases...
      93
      1
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 (At least with the OA API, the 'finetune' won't easily pick up a single prompt of knowledge, will cost several times more to sample, and will be obsolete within minutes as the user accepts an invitation and the email assistant knowledge is updated. So, no bueno.)
      3,424
      33
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 Let's say you want an email assistant. You can fit a lot of facts about your plans/schedules into a prompt... but if you run a full prompt on all emails + token-at-a-time decoding, this would cost you like $100s/month/person on OA API! But you can't finetune it either.
      3,965
      50
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 Most obvious solution is to cache the hidden state/activations of the prompt. This can be done by exploiting the Transformer/RNN isomorphism: then you can simply run the RNN once, save hidden state, and invoke it thereafter. This would let you update it as the user updates it.
      2,262
      40
      1.8%
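The caching idea in this thread is what is now called KV caching: project the fixed prompt into attention keys/values once, store them, and only process new tokens per call. A minimal sketch with a toy single-head attention — the random matrices, sizes, and function names are illustrative assumptions, not any real model or API:

```python
import numpy as np

def attend(q, K, V):
    """Single-head scaled dot-product attention for one query vector."""
    s = K @ q / np.sqrt(len(q))
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(0)
d = 8
Wk, Wv = rng.normal(size=(d, d)), rng.normal(size=(d, d))
prompt = rng.normal(size=(20, d))              # the fixed instruction prompt
K_cache, V_cache = prompt @ Wk, prompt @ Wv    # computed once, then stored

def decode_cached(q, new_x):
    """Reuse the cached prompt keys/values; only project the new tokens."""
    K = np.vstack([K_cache, new_x @ Wk])
    V = np.vstack([V_cache, new_x @ Wv])
    return attend(q, K, V)

def decode_full(q, new_x):
    """Reprocess the whole prompt on every call (what the API bills for)."""
    X = np.vstack([prompt, new_x])
    return attend(q, X @ Wk, X @ Wv)
```

Both paths give identical outputs, so per-request cost shrinks from O(prompt + new tokens) to O(new tokens) of projection work — in a real Transformer this is done per layer, and updating the prompt only invalidates the cached suffix from the edit point onward.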
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 A missing piece from all LM APIs/SaaS AFAIK: really lightweight prompt caching. As LMs start following long detailed instructions, now we really do have usecases for >1024BPE prompts. But cost remains astronomical to reprocess a fixed prompt every time. 'Finetuning' doesn't help.
      1,723
      76
      4.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 3 Believe this anecdote or not, I reported the same kind of spam all the time with Twitter Classique™ with little discernible effect. But there's one thing that doesn't change, whether it's old or new Twitter... 🙄
      480
      63
      13.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 2 And I get like a fifth of what I did in the pre-Musk era, but you don't see me broadcasting that to all my followers. Perhaps it's hard to generalize from self-selected anecdotes.
      3,616
      128
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 2 The rhyming remains really weird. Long flawless sequences of what look like rare rhymes, and then it'll completely flub it and not rhyme at all like in your example.
      505
      17
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 2 〜when the central government requires writers to encourage virtue and chastise vice in their fiction〜
      2,901
      47
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 2 All of which would be useful to know but is of course not reported in the essentially non-existent publications... One of these days I'd like to wander up to the NIH archives since they apparently have all his papers. I don't expect a smoking gun of fraud, but you never know.
      90
      12
      13.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 2 So, you say inbreeding would produce an initially healthy population but then a temporary population decline which one could then cherrypick and publish about how sickly and sterile the population became in 'mouse utopia' while omitting to publish any followups...? 🤔
      57
      15
      26.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 2 Sounds in line with prior results. (Doesn't shed much light on the Egyptian domestication hypothesis, though, from skimming. Lots of Egyptian ancestry... but you'd expect that either way.)
      337
      6
      1.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 2 They're not really a step forward, though. You could do all that with davinci in July 2020 with a little prompting or luck (see my page or original tweets), and LaMDA apparently does it zero-shot right out of the box last year.
      1,346
      78
      5.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 1 ("In the end, perhaps the most 𝘩𝘶𝘮𝘢𝘯 part of the AI was its ability to 𝘪𝘮𝘢𝘨𝘪𝘯𝘦 a different world—" "You mean a better world?" "What? Oh my goodness no.")
      153
      16
      10.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 1 (Presumably, if they had been surveyed about "by what year will machines be superhuman in math, not just Putnam competition style questions", given their pessimism, it would then have been many more years beyond Putnam/2050...)
      85
      6
      7.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 1 If you have 'superhuman math', broadly defined, that would seem to require at least reaching this Putnam competition goal, a fortiori, which is why I highlighted it, as I'm not aware of any 'superhuman math' expert forecasts so this is the best we have free of hindsight/goalposts
      78
      6
      7.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Nov 30 Well, I joked that since transcludes just transclude raw HTML, we could just link the final HTML snippet in `/metadata/.../foo.html#ID` and it'd work, right? After thinking about it, I realized this was perfect. So, another transclude type & voila. Transcludes are flexible.
      7,541
      18
      0.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Nov 30 V obvious implementation just using existing link ID + transclude for page←page, or page←annotation, or annotation←page... but what about annotation↔annotation? You can't "link" an annotation, the point is that you link the URL and the annotation is a transparent wrapper!
      5,272
      29
      0.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Nov 30 Another transclude feature (inspired by IEEE's HTML papers, of all sites; even they can do something right, turns out): if we have backlinks & cross-page transcludes, why not do 𝘣𝘰𝘵𝘩 to show the reverse citation context? Harder than it should've been, but that's live now:
      2,630
      46
      1.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Nov 30 Implemented now. It's not quite off-screen rendering in the GUI/game-engine sense, since you can't swap in pixels in any meaningful sense, but close enough. Popups should have much less tail latency - feels much snappier.
      5,912
      24
      0.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Nov 30 Finite automatons are perfectly interesting and respectable Turing machines to train on predicting. Lots of my favorite Turing machines are finite automatons. (I suppose technically that includes all the TMs I've ever actually run, too.)
      74
      1
      1.4%
Engagements
Showing 31 days with daily frequency
  • Engagement rate: 2.6% over the period (Dec 31: 2.0%)
  • Link clicks: 6.3K total; on average, 204 link clicks per day (Dec 31: 79)
  • Retweets without comments: 1 total; on average, 0 per day (Dec 31: 0)
  • Likes: 5.5K total; on average, 177 per day (Dec 31: 86)
  • Replies: 403 total; on average, 13 per day (Dec 31: 7)