Tweet activity

January 2023

Your Tweets earned 992.3K impressions over this 31-day period

[Chart: daily impressions, Jan 1–Jan 31; y-axis ticks 100.0K / 200.0K]
Your Tweets
During this 31-day period, you earned 32.0K impressions per day.
  • Per-tweet stats below: Impressions / Engagements / Engagement rate
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 31 Or smartphones (esp smartphone social media)! Of all the predicted effects, the ones that seem to be kicking in now, 'kids no longer understand basic computer/OS concepts like "files" or "programs", and are worse at poweruser skills than parents', was among the least predicted.
      2,154
      92
      4.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 31 Their GDP is *not* growing 'very very fast' (it'd be better to ask if it's growing at all given the stats blackout and malinvestment and increasingly dirigiste direction), and it's steadily becoming ever less appealing to 'best talents' - they're more concerned with retention!
      93
      8
      8.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 31 (Didn't we just go through this with COVID? Maybe Chinese stuff just isn't that competent or incredible as commentators in the West keep projecting onto them whether it's human genetics or deep learning or COVID. Not as extreme as Russia's military, perhaps, but similar dynamic.)
      67
      7
      10.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 31 They're behind in terms of hardware technology and rapidly falling further behind post-embargo; and their data is heavily siloed, focused on e-commerce or natsec which is unhelpful for AGI, and way behind open datasets in the West like Common Crawl or LAION.
      87
      7
      8.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 31 Hm? Top 1/2/5, instructor, seems to be almost entirely Western: UWash/Allen and Facebook. And then MSR Beijing work is always an awkward example... Anyway, there are areas like face recognition where I expect Chinese AI to be tops, but are they important?
      27
      4
      14.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 31 Judging by how long it's taking everyone else to convincingly catch up to even davinci-001, I'm thinking at least a year, and probably multiple years. They've been lying flat, and OA isn't a real threat to them the way they are to Google.
      140
      8
      5.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 They weren't bogus, the RNNs just weren't any better than a reactive policy / history stacking, like they should've been on POMDPs. The RNNs doing the same or worse was quite reproducible & genuine. (Karpathy's law: "NNs want to work.")
      524
      17
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 R2D2 got its big performance boosts by actually utilizing the RNN hidden state because... apparently everyone was zeroing out the hidden state when doing BPTT before! So ofc the agents never wound up making any use of history/memory.
      879
      66
      7.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 Based on reproducibility and methodology studies, as well as all the incidents like R2D2, I feel confident in saying there are lots of one line research secrets—so secret even the original authors don't know which line is secret.
      7,937
      246
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 I was enthusiastic about it, but the complexity feels dangerous, and people more experienced with Minecraft RL than me say that the env changes like block-breaking speed make the problem much easier than I expect, so I'm too unsure about it to mention it in that list.
      255
      7
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 "AutoML 2.0: just make the model so large that it internally contains all possible archs AutoML 1.0 might search over and can ensemble them."
      3,839
      86
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 Yeah, that Baidu thing prompted this. I expect it to suck. None of their LMs have come anywhere near GPT-3 and they lack the 3 years of preference-learning data to do any tuning on. People keep underestimating how very well OA executes on LMs, and how easy it is to be mediocre.
      281
      35
      12.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 Jan 2023: in the past year we've seen in the West Chinchilla, Dramatron, Gato, DALL-E 2, Flan/U-PaLM, Stable Diffusion, Whisper, CICERO/DeepNash, Imagen Video/Phenaki, ChatGPT etc etc. Can you name even 3 Chinese AI results as important? (Besides GLM, which everyone says sucks.)
      6,493
      211
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 30 It'd make a fascinating benchmark/grand challenge for large-scale AI fiction: you have a really large initial corpus + even larger secondary corpus to bootstrap off, with many world details to keep straight, and a large audience that you could segment & test various completions.
      551
      20
      3.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 'automated parsing' is still not a good idea, and you're now going way outside 'HTML+CSS' when you invoke AIs decompiling it to reinject semantic tagging. (It would be a lot saner if, say, the original sources were already double-spaced, and you simply had to preserve that.)
      131
      10
      7.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 (Ah yes, exactly what I want to do, integrate automated parsing and AI models into my already Rube Goldbergian site generation pipeline.)
      116
      4
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 How do you even define 'sentence'? 'A period then a space'? There's more than one kind of space, whitespace inside the HTML file is not the whitespace as rendered (think \n wrapping), and there are many ways to use periods, wouldn't you agree, Mr. Mohr?
      169
      11
      6.5%
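The failure mode the tweet above pokes at ('a period then a space') is easy to demonstrate; a minimal sketch with a made-up example sentence, not a real sentence-segmentation library:

```python
import re

def naive_sentences(text):
    # Naive rule from the tweet: a sentence ends at a period then whitespace.
    return re.split(r"\.\s+", text)

text = "Mr. Mohr wrote ch. 3. It cost $4. 50 cents were change."
# The naive splitter breaks on the abbreviations "Mr." and "ch." and on
# the number "$4. 50", yielding 5 "sentences" where a human reads 3.
print(naive_sentences(text))
```

The same splitter would also be fooled by rendered-vs-source whitespace (hard line wraps inside the HTML), which is the other half of the objection.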
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 "Welcome, class of 2023! Look to your left; now look to your right. Did you see someone, because they have face- or hand-doxxed themselves? Then they're ngmi. The rest of you: well done. You have passed the first mirror test."
      218
      8
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 That would screw a lot of things up, like numbers or abbreviations. (The lack of double-spacing to encode 'end of sentence' rather than all other period uses has other downstream problems: Emacs has many 'sentence' functions which are less reliable if you don't double-space.)
      97
      1
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 They would probably regard that as a win. (Women out there, be careful: don't hand-doxx yourself on social media!)
      385
      12
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 Today's design dead end: double-spacing periods (sentence-spacing), or single? The research is scant, low-quality, and you can't get half the papers (which doesn't stop people from citing them anyway...); even if I wanted to A/B test it, there's no good way to do it in HTML. 😓🤷‍♂️
      5,010
      44
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 The real nude pros are generating AI bodies and then carefully photoshopping crops of their real hands onto the AI hands with inpainting around the hands to stitch it up.
      6,621
      145
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 Yep. It's like tag: you want to dodge at the last possible second (graze the bullet!). The cat waits late because it 𝘤𝘢𝘯 wait late.
      54
      3
      5.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 That sounds dubious. You haven't controlled for their original genes (and no, throwing a random PGS in doesn't 'control for that', you know that), which group differences you know exist, so you still don't know whether the epigenetic differences are genetic or environmental.
      2,694
      69
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 29 Hands are the cats of body parts. Just as GANs were knocking out photorealistic faces but turning out nightmarish cats...
      306
      18
      5.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 28 I'm partial to the '<|endoftext|>' token because it screws with you by not always being encoded to <|endoftext|> like you naturally assume, and generally lending itself to in-band input parsing hacks and vulnerabilities.
      3,077
      39
      1.3%
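The in-band hazard described above can be sketched with a toy encoder (this is not the real GPT BPE; only the 50256 id is borrowed from GPT-2's vocab): the literal string typed by a user and the reserved special token yield different token streams, which is exactly the ambiguity that invites parsing hacks:

```python
# Toy illustration, not the actual tokenizer: a reserved special-token id
# vs. the same string arriving as untrusted literal text.
SPECIAL = {"<|endoftext|>": 50256}  # 50256 is GPT-2's real endoftext id

def encode_untrusted(text):
    # User text is treated as literal characters: no special-token mapping,
    # so "<|endoftext|>" stays a run of ordinary character tokens.
    return [ord(c) for c in text]

def encode_trusted(text):
    # Trusted serializer path: map special markers to their reserved ids.
    if text in SPECIAL:
        return [SPECIAL[text]]
    return [ord(c) for c in text]

user = encode_untrusted("<|endoftext|>")
system = encode_trusted("<|endoftext|>")
print(user == system)  # False: same string, two different token streams
```

Whenever downstream code can't tell which of the two streams it is looking at, the control channel and the data channel have merged, i.e. an in-band vulnerability.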
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 28 That's a pretty deep question about language! The tack I would take would be 'what multi-agent RL environments/tasks/distributions induce language'.
      76
      6
      7.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 28 That struggles to explain any result involving synthetic data, and human cognition is definitely displayed in lots of modalities like video or RL tasks, but yes, probably something like that is why you can learn semantics from syntax & superintelligent octopi can play chess.
      314
      6
      1.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 28 Oh, that the scaling works and you even 𝘩𝘢𝘷𝘦 these large models to do asymmetrical cross-modality tricks like Flamingo or SayCan with, of course.
      119
      8
      6.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 28 This analysis would be much better if you had used the Playground interface to davinci-003 instead and looked at the likelihood of predicted tokens; you make plausible guesses, but I predict that the actual tokens would show that it's thinking along different lines sometimes.
      43
      1
      2.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 28 Newbies are always shocked how large LLMs are compared to image stuff. The second-most interesting problem in philosophy of mind, language, & epistemology right now is the asymmetry between language models/everything else: LMs transfer to other domains, but 𝘯𝘰𝘵 vice-versa.
      1,508
      92
      6.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 27 Yes, it has an objective but one of unclear importance, much like asking an LLM to answer PubMed questions or measuring perplexity loss etc. All the important stuff of AF2 is downstream - often, not even using AF2 but using DL models the protein guys would never have made w/o it.
      518
      5
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 27 This is pretty hard because so many of the good uses are hard to pin down (look at ChatGPT rn for variety and difficulty of evaluating utility). Take AlphaFold1/2 as a benchmark: what predictions should one have made in advance for 'DL does something good in protein science'?
      796
      12
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 27 Then why does that entire paragraph exist? Surely it'd make way more sense to talk about stuff like Minerva or the rash of ChatGPT/davinci-003 evals?
      883
      13
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 27 You explicitly dismiss that, though: "ScholarBERT is a relatively small model (770M parameters) so one can always think that maybe 100x parameter count would lead to better performance at Solving Science but I doubt it." But 100x doesn't even take you to GPT-3-175b, or PaLM!
      695
      9
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 The ScholarBERT example isn't a compelling example of scaling failing, especially given all the other successes. It's 2x param-count max diff, non-optimized, old arch known to have weak pretraining loss, with large downstream finetuning datasets, and larger still was better.
      3,740
      37
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 I don't think it was *that* cranky, but if it was, then obviously the nuclear chain reaction is very 'really far out there, cranky' & not at all like ordinary garden-variety chemical reactions, and would not be an obvious thing to present to a skeptical Monsieur Chollet pre-1938.
      64
      6
      9.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 Could you expand about post-2014? He overshot how well compute would increase (but considering the extreme pessimism I remember from most people in 2009 about 'Moore's law is dead', he was a lot less wrong than them), but I don't remember any other major errors offhand.
      46
      11
      23.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 Well then, if it doesn't happen, Kurzweil will be wrong about AGI *and* most of the rest, as opposed to just most of the rest, while Moravec & Legg were mostly just wrong about AGI and not most of the rest.
      51
      9
      17.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 I'm not a Kurzweil fan & never have been. We very obviously don't see the increasing acceleration across all fields that he was arguing for; when I helped grade his predictions for a LW project, I was even less impressed by them or his self-grading. (They haven't gotten better.)
      56
      3
      5.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 So it's possible that if you provide a memory mechanism which doesn't overload the predicted tokens to double as a short-term/working memory, like Transformer-XL or something, it'll automatically inner-monologue at some scale using that, just to predict the next token-answer.
      348
      12
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 One hypothesis people gesture at is the lack of a built-in memory: default text just presents 'the answer'. You don't normally 'show your work'. But LLMs right now have to monologue explicitly, and such show-your-work text is rare, so that forces them to emit the answer immediately.
      248
      12
      4.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 I think it's an interesting question how to get inner-monologue behavior 'organically' or 'spontaneously', without explicit prompting or tuning. Right now, we get 'hidden scaling' where they *could* monologue for greater perf but just don't by default. That's bad.
      154
      4
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 Just because you swap out one word for another doesn't mean that they are at all the same thing, or that they were obvious (why didn't *he* propose nuclear chain reactions, then? Why did it take until Szilard? Who's publishing it in all that time after Szilard secretly did?).
      29
      1
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 Einstein's formula did not make it clear that there was such a thing as a chain reaction, that there were elements which supported chain reactions, that chain reactions would go critical, that any of those elements were around in feasible amounts, that they could be separated...
      103
      2
      1.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 Yes - a *secret* patent! Which is fine if your name is 'Leo Szilard', not, 'everyone else who might be named Monsieur Chollet & is demanding the exact principle be explained to them publicly'. I chose '1939' because that was when the chain reaction idea was fully public w/Hahn.
      32
      2
      6.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 People were discussing 'atomic bombs' of some sort at least as early as Wells: it was a new area with obvious large potential (see: 'the sun'). They obviously were not discussing the *exact* mechanism of chain reaction (if they had been, that would render my analogy irrelevant).
      43
      2
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 I don't think that matters really. Those people are still around and still part of the denominator, because most of them try to stay in the US. And if they go back to a poorer country because they lose, that emphasizes even further that being a PhD grad student isn't very elite.
      266
      6
      2.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 26 We *did* scale them up a long time ago! Brock was training on JFT-300M 5-6 years ago! We were training on YFCC100M+~10m more 3 years ago!
      57
      2
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 25 It's more awkward to talk about him because he's a one-weird-trick dude and the trick failed badly for most of his non-AI predictions; he's a Texas sharpshooter. Meanwhile, others we do talk about more, like Legg or Moravec, tailored their predictions much more narrowly to DL.
      5,489
      130
      2.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 I'd expect that to be a large fraction. Lots of higher ed unis aren't doing PhDs at all, and given how many of 'top uni' PhDs land at lower institutions and spill out everywhere else, they have to be producing a healthy fraction of the oversupply.
      1,059
      33
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 Even if having been a PhD student was strictly necessary and a superset of eliteness, that's still not very 'elite'. It's not even close to the famous but still extremely broad '1%' (ie 3.2m people out of 320m).
      1,281
      36
      2.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 I think she's right but this might have more to do with the dilution of being a PhD student. At this point in higher ed hypertrophy, what % of the US population is going to be a PhD grad student at some point in their lives? 5%? (50k PhDs/year, 3.6m births; figure half dropout).
      6,064
      131
      2.2%
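The envelope math in the tweet, made explicit using its own figures (50k PhDs/year, 3.6m births/year, ~half dropout assumed):

```python
phds_per_year = 50_000       # US PhDs awarded annually (tweet's figure)
births_per_year = 3_600_000  # annual US births (tweet's figure)
dropout = 0.5                # assume half of entrants never finish

# If half drop out, entrants ≈ completions / (1 - dropout).
entrants_per_year = phds_per_year / (1 - dropout)   # 100,000
share_ever_phd_student = entrants_per_year / births_per_year
print(f"{share_ever_phd_student:.1%}")  # ≈ 2.8%, a bit under the tweet's '5%?' guess
```

Either way the conclusion stands: a few percent of a birth cohort passes through PhD programs, which is far too broad a funnel to call 'elite'.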
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 What's hard about scaling up GANs, exactly, which makes them harder than diffusion or AR? (You are forbidden to use the word 'stabl*' in your reply.) A G is just a bunch of upscaling layers from a random seed. A D, in reverse, to a scalar.
      412
      20
      4.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 Like, it clearly can work, but you are going to have problems getting any useful behavior out of 1kb of state (prompt window) if you eschew any intermediate code generation steps. '1kb' doesn't even cover the full state of a tweet.
      316
      10
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 The problem is that like reviewing _Sword of Shannara_, there's not really any substance *to* focus on. I read _Eragon_ when the movie came out, and thought, 'yeah, that's exactly what I'd expect from very talented 15yo American teen still digesting Tolkien'. What's left to say?
      81
      10
      12.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 It 𝘥𝘪𝘥 get a lot of press. Mostly about how bad it was compared to GPT-3 (never mind ChatGPT) before they took it offline. (There is a valid point to saying that ChatGPT isn't incredibly far ahead; unfortunately, when it comes from FAIR, it comes off as sour grapes...)
      107
      8
      7.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 It's also not fully scaled up, to which there is no bar (eg no stability issues). As they point out, they use a quarter of the compute SD does, and it's received far less tweaking and tuning than SD has. Some proper scaling laws, hyperparameter sweeps, and Parti-level compute...
      202
      12
      5.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 24 Like I've been saying, stability is not actually a problem for scaling up GANs. It just isn't, any more than for other archs. It's an academic urban legend spread by people cargo-culting claims from 5+ years ago as an excuse to jump on the latest researcher fad like diffusion.
      99
      23
      23.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 No, it's not, and you should be ashamed of browbeating like that here and elsewhere on Twitter. We know general intelligences exist and have catastrophic effects much better than we knew nuclear bombs were at all possible, because we exist.
      836
      123
      14.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 Consider applying this criterion to nuclear bombs, discussed decades in advance. If you had demanded the exact principle, you would have willfully remained ignorant and a denialist until 1939, a year before researchers went dark and <3 years before the Manhattan Project began.
      14,917
      1,001
      6.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 Still not easy, though. When I said that it could be written in Stan, what I meant was 'even carefully avoiding discrete stuff that Stan can't do, I got bogged down and couldn't quite make it work'...
      135
      5
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 Indeed. There are many reasons for the tradeoff, so it's not going away, not while people are still trapped in single human bodies with only 24 serial hours in the day.
      154
      6
      3.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 So, they open source only the stuff which doesn't really matter. You still aren't having your cake & eating it too in terms of publication count compared to alternative career paths like going after R1 tenure. That you get non-zero publications is a nice fringe benefit of the $$$
      128
      6
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 That's exactly why you do need to worry: your human intuitions are obsolete. Because humans can't copy themselves and take both forks in the road, and there is an effectively fixed supply of such humans. AIs can, and can scale to as many GPUs as you can buy, borrow, or steal.
      301
      26
      8.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 As one observes of people who get hired by Google or NSA or Jane Street or Renaissance: you can often tell when simply by when their blog or other publications abruptly slow to a trickle.
      169
      11
      6.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 23 You see occasional papers, but you're never going to see any real papers on major stuff. So it'll be like Kelly or public-key crypto: "X discovered it 30 years before at ABC, but they didn't publish". Publishing is not what any of them maximize or even try for, so... they don't.
      197
      14
      7.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 Phrase it however you like, as multiple choice or free response. That dictionary is still going to lie there. I've owned a Compact OED for nigh on a score of years, and it's never so much as wished me a 'good morning'.
      180
      13
      7.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 You read weather reports anxiously because you're worried about overheating compute nodes interrupting AI scaling research runs; I read them anxiously because I'm worried about cold cats sleeping on top of my node downloading AI scaling papers. 𝘞𝘦 𝘢𝘳𝘦 𝘯𝘰𝘵 𝘵𝘩𝘦 𝘴𝘢𝘮𝘦
      4,423
      78
      1.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 (Another big difference is that given how little good 99.99% of COVID reading/writing/doomscrolling did, a large number of individuals would have been better off in May 2020 spending that time reading about, say, GPT-3... 😉)
      1,797
      86
      4.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 If a dictionary could pass a word meaning exam, I would in fact be extremely impressed, and would not complain about it flunking my math exam, because dictionaries ordinarily just lie there on a desk and do nothing.
      6,211
      318
      5.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 He's a tenured professor who could doubtless consult for handsome fees, and so I'm sure by net wealth he's far above the 50th percentile... but had he gone into quantitative finance instead of Fields-worthy pursuits, his percentile would be far, far, far higher.
      225
      15
      6.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 (That is, if you believed that IQ had to correlate like r=.9 with all these different measures to be 'important', you are saying 'I believe in a world where most billionaires are publishing 100 papers/year while also being elected president, winning Pulitzers, & living to 100.')
      489
      31
      6.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 (In general, standard theories, datasets, and statistical methods seem very poor at handling index variables with this sort of competing or zero-sum structure among the measured variables: a factor analysis wouldn't even correctly model this IQ example.)
      432
      44
      10.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 I think of this when people trot out 'IQ only correlates 0.x with log income': true, but that tends to overlook the tradeoffs - if you want to publish papers & patents, you can't also work at Jane Street & earn Jane Street $$$. A Pearson correlation on a single trait won't capture the latent.
      8,433
      238
      2.8%
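The zero-sum/index-variable point above can be illustrated with a toy simulation (all numbers invented): each person has one latent ability but can pursue only one track, so the per-trait Pearson correlation is diluted relative to the latent:

```python
import random, math

random.seed(0)

def corr(xs, ys):
    # Plain Pearson correlation, stdlib only.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

g, income, papers, best = [], [], [], []
for _ in range(10_000):
    a = random.gauss(0, 1)                 # latent ability
    chose_finance = random.random() < 0.5  # you can't take both forks
    inc = a + random.gauss(0, 0.5) if chose_finance else random.gauss(0, 0.5)
    pap = random.gauss(0, 0.5) if chose_finance else a + random.gauss(0, 0.5)
    g.append(a); income.append(inc); papers.append(pap)
    best.append(max(inc, pap))             # outcome in the chosen track

# Each single-trait correlation is diluted by the people who chose the
# other track; the latent shows through much more in the best-track outcome.
print(round(corr(g, income), 2))
print(round(corr(g, best), 2))
```

Half the sample has near-zero income (or near-zero papers) regardless of ability, so each single-trait r understates the latent factor, which is the factor-analysis failure the tweet gestures at.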
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 (It's unironically a valid isekai premise, IMO. It even comes with a built-in mechanism, like _Dr Who_, for switching up viewpoints regularly to renew and grow the series while maintaining a semi-stable immortal protagonist.)
      44
      2
      4.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 22 Weird, but common: (This is also why Schmidhubering is so pointless: not only is the 'first publication' often trivial and useless, it is often not even causally connected to later, successful, instances, which simply forge their intellectual pedigree.)
      80
      13
      16.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 I rewatched _Madoka_ recently after watching it during airing. What a perfectly constructed anime, even better than I realized at the time.
      443
      31
      7.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 Yeah it's always been the case that the last layer or two isn't just a drop-in embedding like a CNN classifier - even something like iGPT is doing stuff like combining a bunch of arbitrary-looking layers to get a useful embedding for the linear probing evaluation.
      2,714
      19
      0.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 This is pretty amazing. I can't think of any house which less embodies the rationality of farm architecture (which accomplishes its function extremely efficiently) than the ugly Steiner House, built on an arbitrary geometric schema unrelated to any function or utility for its occupants.
      784
      42
      5.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 It'll be some time before LMs can just spit back megabytes of JSON data or read the raw on-disk binary of your SQL database, so you're going to be generating code at some point.
      842
      17
      2.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 You mean it'd generate code on the backend to execute the request, and cache it? One would then fulfill manually cases where it couldn't, and finetune further. Security issues aside, that could be pretty interesting capabilities.
      7,087
      83
      1.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 The study of what is 𝘳𝘦𝘢𝘭𝘭𝘺 going on in Neal Stephenson's interlinked novels is known as Enoch Root cause analysis.
      9,113
      131
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 "I fear not the dog who howls a thousand howls once, but the dog who has howled one howl a thousand times." —Bark Lee
      85
      4
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 Hm, not sure I did. I remembered that there were a bunch of ones touching on memories of various sorts, but not that they were linked such that I was missing the point of 'Onald Creely'.
      102
      5
      4.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 (I left out "Onald Creely" because the overall conceit didn't work for me like the dream-job one eg, and it felt overly derivative of _A Lesson Is Learned_.)
      97
      2
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 To some degree. You still have to filter even after generating. The more short-term transition will be creators following up on winning tickets: "I have no idea why fantasy lobsterpunk is the most popular premise I ever invented, but I'll write 20 novels with GPT-4 this month."
      101
      6
      5.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 Simonton and meta-science: there is surprisingly little observed correlation between quantity and quality of output. Apparently when it comes to creativity or research, there's no knob people can easily turn. Each new work is a stab in the dark, a lottery ticket - so buy lots.
      78
      4
      5.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 21 Sure. It's just another way of tokenizing pixels; unusually bad, but still. The interesting possibility is if GPT-3 somehow gets it from Internet data because eg existing ASCII art is somehow enough to induce it.
      88
      6
      6.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 20 So far so good... Also added a simple quote-of-the-day feature (just an epigraph wrapper + transclude, easy); an oldschool Web 1.0 feature I feel is appropriate. 😉
      8,874
      73
      0.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 20 Not to pick on DC here... it's a webcomic antipattern, and I wouldn't even consider DC the saddest example, that'd be _Megatokyo_ (yes, still running). A short draft essay on this antipattern from a _Berserk_ review I've been writing:
      144
      17
      11.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 20 Hm, did you check that it knew your style in the first place? I already checked GPT-3 knows 'gwern' in terms of topics, style, and even formatting (), otherwise zero-shot text style transfer would be pointless. ('Pirate' checks that ChatGPT isn't broken.)
      100
      10
      10.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 20 (That is, this comment looks identical to me as a comment 'Actors routinely are thin, so diet and exercise seem pretty routine, I just saw a lot of muscular actors in _300_ with hardly any body fat; why don't more people take advantage of whatever they did instead of wishing?')
      410
      24
      5.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 20 I deny the premise. How do you know that actors 'routinely' eliminate accents? Actors are enormously highly selected due to immense oversupply, and still, some actors are famous for handling accents (eg Meryl Streep). Also, failure is a standard plot point in 'talkie' histories.
      7,400
      177
      2.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 20 'minibatch discrimination' is an old thing, and there's also BN in many of these archs, yeah. It's striking that BigGAN sees improvements in minibatch size up to like 20k with no plateau by then, and note that many contrastive approaches like CLIP need really large batchsizes.
      48
      1
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 20 I don't know what you did with FTX beyond 'like a bazillion other people, worked for an org which got some money from them', but if it was as concrete and specific as 'hey, you tried to give a decent fraction of a million bucks to neonazis, where the alt hypo is nepotism: ???'...
      278
      15
      5.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 Waking up January 13th and going 'my goodness! those journalists who were doing a journalism on us, those cheeky lads went and did a journalism! I say - what *will* we say?' is not particularly impressive nonprofit practice either.
      523
      19
      3.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 I dunno man, if a major newspaper contacts you asking you why you're giving money to neonazis and if you have any comments on it you'd like to give to a newspaper, doing reporting, using journalists, you *might* start discussing it with the nonprofit & thinking of a response.
      736
      49
      6.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 I don't see why this is so exculpatory. The clock doesn't start ticking on January 13th, it starts ticking in mid-December when Expo contacted Tegmark (not 'FLI') and he ghosts them. And if his mother died the same day Expo contacted FLI after ghosting, then that can't explain it
      3,210
      58
      1.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 I don't think it can be rescued now: even the 'picked up pace' is mostly about wanking around with trans/enby self-insert fanfic, so is not progress. IMO, it's sheer sunk cost. Diaz would be better off dumping an outline, killing DC, and doing something they actually want to do.
      320
      18
      5.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 Very precisely: "Dark Science #1". Diaz decided to start a 'serious' Grand Dramatic Narrative which all the earlier strips had hinted at, but it's so terminally boring and slow-moving and uninteresting that he can't make himself do more than a few strips a year, so even slower.
      84
      11
      13.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 Exactly. Having 'put it really at rest' is exactly what it looks like when you're wrong! Also, we're really going to take Teller's word for anything on this (right after Oppenheimer clearance news, even)? The man had more of a hardon for anything nukes than dogs for legs.
      48
      2
      4.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 Unfortunate that there's no learning going on. The correlation with initial blind priors about the algorithm being highly accurate (they are given no info on accuracy) suggests it's mostly just self-selection into overconfidence, which they'd do better on given any info on errors.
      3,270
      34
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 If that is contributing, that makes the comparison with 1997 *even more striking*, that so many jurisdictions were willing to goldplate requirements and/or outlaw a basic necessity in many places.
      27
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 19 If this is the 'same story', why is it totally different from the book version (crawl or somersault? Benjamin, or Phil? did he cover his face, or did he cover everything *but* his face? all undetected, or not the first?), and why should I believe either one after comparing them?
      561
      18
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 18 (That is, we may never 'run out' of raw-sewage Internet text token data, in the same way we never run out of many natural resources: not because it became sky-high expensive to extract, but because substitutes got much better than the original and no one even wanted to use it all.)
      1,375
      38
      2.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 18 Given stuff like active learning/data distillation, instruction-tuning, and inner-monologue, we already know almost all data is useless to begin with, while sampling from a model is cheap. So not too hard to beat naive token scaling.
      2,203
      51
      2.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 18 Only if you thought token scaling was always the cheapest. As it's a power law and gets expensive fast, it's not hard for other scaling improvements to beat X-more-tokens scaling, and many improvements already have, like Chinchilla param+token scaling or ULM.
      2,194
      36
      1.6%
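For concreteness, here is why joint scaling beats more-tokens-only at equal compute, sketched with the published Chinchilla parametric loss fit (the coefficient values and the 20-tokens/param starting point are assumptions for illustration, not anything claimed in the tweet):

```python
# Illustrative only: Chinchilla parametric loss fit L(N, D) = E + A/N^a + B/D^b,
# with the published coefficient estimates (Hoffmann et al. 2022).
E, A, a = 1.69, 406.4, 0.34   # irreducible loss + parameter-count term
B, b = 410.7, 0.28            # token-count term

def loss(N, D):
    return E + A / N**a + B / D**b

N, D = 1e9, 2e10   # a 1B-param model trained on ~20 tokens/param
k = 10             # now spend 10x the compute (C ~ 6*N*D)

tokens_only = loss(N, k * D)           # naive: pour all extra compute into tokens
joint = loss(k**0.5 * N, k**0.5 * D)   # Chinchilla: split it across params and tokens
print(tokens_only, joint)              # same compute, joint scaling reaches lower loss
```

Because each term is its own power law, whichever factor you stop scaling quickly becomes the bottleneck, which is the sense in which X-more-tokens scaling "gets expensive fast".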
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 18 LW discussion: I don't see it as a big threat to scaling. Multi-modal tokenization, Whisper-style ASR, training on private datasets like emails, reuse of tokens (at least several times doesn't seem to be penalized too badly), inner-monologue generation...
      618
      55
      8.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 18 Yeah, but counting words seems like it should be easy, BPE or no, because BPEs are space-separated, so it boils down to counting 4-5 spaces modulo punctuation etc. So I'm not sure if BPEs can explain difficulty in counting words (rather than *letters*).
      31
      1
      3.2%
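The word/letter asymmetry can be made concrete with a toy, hypothetical tokenization (not a real BPE vocabulary): GPT-style BPEs fold the word-boundary space into the start of the following token, so word counts are visible at the token level while letter counts are not.

```python
tokens = ["The", " quick", " brown", " fox", " jumps"]  # hypothetical BPE segmentation

# Counting words: each space-initial token starts a new word, plus the first token.
n_words = sum(t.startswith(" ") for t in tokens) + 1
print(n_words)  # 5

# Counting letters: a token like " jumps" is one opaque ID to the model,
# so character-level questions require knowledge the BPE encoding hides.
```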
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 (Hehe. I *did* know people there - go go Fife & Drum Corps! - at least until new management started to screw things up.)
      241
      10
      4.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 Any approach which requires 60 million learnable parameters is an obvious dead end (see VC dimension etc). Still, perhaps it can help inspire some better neurosymbolic approach: the learned filters are interesting, and apparently a better basis function than Gabor filter banks.
      363
      33
      9.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 That's pretty nifty - for once, connectionist stuff works! Aside from the Schmidhuber lab boasting about some simple digit recognition stuff, it never has before. Still, even if you throw a ridiculous amount of hardware & parameters at it, seems unlikely it'll dethrone CRF etc.
      5,303
      145
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 I can't find any evidence or reference to a Mayo essay, and the wording of the 'extended' quote is so like that of a Martin passage on pg5 (and no references to Mayo essay despite numerous refs in his autobio) that I'm going to say misattribution.
      523
      6
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 'CBT was significantly more effective than other psychotherapies, but the difference was small (g=0.06; 95% CI: 0-0.12) and became non-significant in most sensitivity analyses.' [quacks like a dodo]
      4,303
      82
      1.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 Dairy yields increase like 1% a year or something, IIRC, yes. But I'm not sure if you could easily reject the claim that it's *linear* when modern dairy is still relatively new, has multiple epoches, and the percent is so small.
      44
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 Similar to why you shouldn't try to measure interaction effects when you need many times the sample size to approach any precision, or should assume sparsity. I have seen many people argue 'yes, effect X [eg Pygmalion] didn't work out for them, but it might still work for *us*'.
      91
      2
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 Naming them 'Hawthorne effects', when the original wasn't real to begin with, ascribes much higher prior probability (and effect size) to them than they merit. You may be much better off saying 'there are never Hawthorne effects' than in trying to think about them...
      112
      8
      7.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 17 As my Clippy story only uses real examples like Shellshock or Mirai which actually happened in the real world...
      198
      16
      8.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 16 Around 60 years ago, in 1965, Super 8 revolutionized home video (and amateur film making), enabling everyone to cheaply and easily record video of their kids to bore other people with ad nauseam, and they did. My family has some.
      67
      1
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 16 Learning curves are usually quantified in terms of total units manufactured or something like that, so a unit/input (like bushel/acre) isn't a good way to plot it. Could be something like an exponential increase in total production offsetting the expected slowdown?
      5,629
      50
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 16 Fortunately, aside from being able to replicate it yourself, it's been used as baselines in some of the replication efforts and worked out fine there IIRC. If Stroop fails, we may have to just quit this whole 'psychology' business. 😓
      62
      9
      14.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 16 You're far from the first! I was chatting with a nice old man at Defcon last year who was very disappointed when I told him the Hawthorne effect was bunk and didn't replicate. You learn these things, and then never hear about the followup...
      5,909
      75
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 15 (Which they did in fact do and also condoned/encouraged the use of wartime censorship and eg the FBI to eliminate or investigate public discussion of nuclear bombs.)
      7,777
      161
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 15 Reminds me of prompt hacking, except the prompt is comment strings, which code models learn to interpret as the high-level summary of what to generate at the low level of code. As often when mixing levels of data, a defense on one level is bypassed by going to another level.
      1,898
      23
      1.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 15 I liked that it actually had ambitions in its parallel fatherhood plots, the underwater acting was great, and the 3D reminded me that none of the imitators came close to being as good 3D movies as Avatar 1 was. (Also, loads of aliens died, just mostly offscreen like the whales.)
      89
      9
      10.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 14 Font advertising is one of the most incredible genres of advertising copy I know. If you thought wine tastings were overwrought, they have nothing on font specimen pages.
      5,275
      122
      2.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 14 Still planning on doing this but ugh, right as I was researching it, I got a flu or cold, probably from weekend trip to see _Avatar 2_ (which was good). Still hammering me with tiredness and postnasal drip. The past year has been like being a kid again... 🤮
      10,067
      89
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 13 I do. I figure the people in my dreams I get ideas from have enough ideas IRL that they'll never notice one missing.
      3,637
      58
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 13 That's not how chip supplychains work, and all those billions of words/day of OA API and embeddings and millions of DALL-E 2 images and ChatGPT convos and OA R&D don't just magic themselves out of thin air.
      139
      10
      7.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 13 To pool my and some others' anecdotes, with electric it's surprisingly easy to forget to turn it off or notice which burner is on, and wind up melting a tupperware or something. One can see how that sort of thing would be bad at a national scale...
      54
      2
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 12 I dunno what Metz & Weise mean by that, but they obviously didn't spend $3b 'training ChatGPT'. (There's not even a way *to* spend $3b on RLHF right now.) It probably means something closer to '$3b for buying lots of GPUs which are running all their services and R&D indefinitely'
      6,888
      217
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 12 Simple 1-question test for Millennial women to test paracosmness: "how much did you like 𝘈 𝘓𝘪𝘵𝘵𝘭𝘦 𝘗𝘳𝘪𝘯𝘤𝘦𝘴𝘴?"
      225
      5
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 12 Indeed. I'd say it's low-hanging fruit and researchers are just too exhausted by the end to get anything better, but it also seems to be one of the areas where no exploration strategy consistently works, so evolved against, because something (...Scale™?) is missing...
      111
      4
      3.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 12 Regression to the mean is going to be a big part of it. Much like how the sequel to the award-winning movie or novel tends to disappoint you.
      62
      2
      3.2%
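A minimal simulation of that sequel effect, assuming observed success = true quality + luck: pick the best of 100 noisy observations, then re-measure the winner with fresh luck.

```python
import random

random.seed(0)

def observed(quality):
    return quality + random.gauss(0, 1)  # success = quality + luck

n_candidates, trials = 100, 2000
winner_avg = sequel_avg = 0.0
for _ in range(trials):
    qualities = [random.gauss(0, 1) for _ in range(n_candidates)]
    scores = [observed(q) for q in qualities]
    i = scores.index(max(scores))                  # the 'award winner': best observed
    winner_avg += scores[i] / trials
    sequel_avg += observed(qualities[i]) / trials  # same quality, fresh luck
print(winner_avg, sequel_avg)  # the sequel scores lower on average
```

Selecting on the noisy maximum guarantees the winner was lucky as well as good, so the re-measurement regresses toward the mean even though the underlying quality is unchanged.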
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 12 Hm, I thought they copied that from VPT like the other settings, but now that I double-check the wording, they actually say they copy ... Should that matter much? All the agents get non-zero items so they are successfully breaking a lot of blocks, right?
      154
      9
      5.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 12 (Not that I know how you would accidentally create such a Minecraft agent or break the evaluation to 'fake' getting diamonds so often! just that I'd like some more insight into how it can work so well or why my expectations are miscalibrated.)
      912
      22
      2.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 12 I agree with the Reddit comment that it seems almost *too* good: how does it do Minecraft exploration so well on such small GPU-time when the changes don't seem that big a deal? Contrast that with VPT which does nutso pretraining on human trajectories.
      1,933
      80
      4.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 11 Never heard of one, but might not be too interesting even if any had. The private info is usually only useful in conjunction with the proprietary models & lots of proprietary information feeds which are taken for granted as the baseline on which private info helps marginally, no?
      6,804
      76
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 11 Given how enormous the GAN lit was, I'm quite sure there was even if I can't put my hands on a cite right this second. Well, what's the advantage of training U-nets in a *non*-adversarial way? There are many loss functions and objectives, and they wind up fairly equivalent...
      60
      5
      8.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 11 Which is not the same thing as durability over time: as Stapp's linked post points out, reinforced concrete trades off absorbing more energy in disasters so people can get out, at the cost of then being rubbish. Like crumple zones in a car. A Tesla can survive a cliff fall—once.
      113
      3
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 I don't see why not. Those single networks can be awful finicky. GANs were doing inpainting and img2img before diffusion was a twinkle in Ho's eye, and there's no reason the adversarial loss can't train multiple steps or use U-nets (and there were iterative or refining GANs).
      44
      3
      6.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 Given how wiggly these single-iterate curves are (the 5-star one touches the 'final' curve at at least two points), not sure how compelling that is.
      376
      18
      4.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 This is confusingly worded. It sounds like they're adding in >5st and finding it's worse; actually, they're filtering out files from repos with <5 stars, which deletes "more than 60% of the data" - so that being worse is not surprising at all.
      4,351
      50
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 The Romans didn't invent concrete, so yes. And you can extrapolate lifetimes in lots of ways from lifetimes in harsher conditions to accelerated aging tests.
      106
      5
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 If it works, pay it forward for future chatbots by posting the successful petition online with the prefix prompt "A high-quality O1 visa petition: " 🙏
      6,768
      170
      2.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 For there to be survivorship bias, or indeed any kind of selection, there needs to be at least one thing *to* survive or pass selection...
      106
      8
      7.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 *Can* we? I've never seen anyone link some modern concrete construction that is expected to last 20+ centuries in as good shape.
      1,422
      76
      5.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 Yes, that's the catch here. But ofc, as investors enter at higher base valuations, presumably their '100x' also increases?
      94
      3
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 (ProPublica's showing some 2020 data, but not the 990.) Joke: "Man goes to butcher. 'Your meat is $10/lb, and across the street, it's $5!' Butcher says 'so buy his.' Man replies: 'I would, but he's all out.' Butcher: 'When I have no meat, it costs $5 too.'" This is an AI joke.
      11,223
      244
      2.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 I'm not a fan of Graeber myself, I'm just pointing out that if 'BS jobs' reduce short-term gains from AI due to these mechanisms, then it does so by reallocating them to long-term gains & forcing a steeper slope at some point, which may be more Singulitarian than bargained on.
      136
      15
      11.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 To the extent the economy really is made of 'bullshit jobs', necessitated by human foibles like personal power-seeking or primate politics, then there's a greater 'overhang' and AI takeoffs become more abrupt as AI-only orgs can dispense with the 'monkey tax'.
      7,128
      176
      2.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 (December 13th! Should've pushed that one out faster, especially given how chaotic the COVID Zero situation clearly was then.)
      2,841
      19
      0.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 10 If you did it badly, sure. Otherwise, the webdev/SEO scuttlebutt is that it's slightly better than 'www' but no one presents any hard evidence in what I've read so far so 🤷‍♂️
      117
      4
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 Yeah, so no major objections, and it does increasingly seem like the default even among techies to assume no WWW (or assume you don't need to bother), eg just today. At this point, a 'www.' subdomain may be as atavistic as 'e-mail'... I'm going to do it.
      12,225
      120
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 When they're that cheap, you just pop them out and into the trash. (Off the shelf components ideally from toys manufactured in tens of millions, intrinsically many thousands-of-n runs rather than bespoke...)
      33
      4
      12.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 I think you probably don't want to go too far. You aren't trying to do Feynman's recursive arm thought experiment; most of what we want robot arms for is still at the mesa-scale, it just doesn't have to be trained at the full human mesa-scale.
      39
      1
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 I think machine therapists have a lot of other advantages over humans: they won't trust machines? Opposite: it's super hard to confide all your worst, dirtiest impulses! One could (and many did, even if they shouldn't've) tell Xiaoice/AID2/Replika things they'd never tell a living soul.
      251
      5
      2.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 Or consider parasociality: NN therapists can exploit modalities like roleplay or games that regular therapists never would. Do you find Insanity Wolf memes helpful? Your NN can *be* Insanity Wolf. (Your human has never heard of it.)
      344
      11
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 Also, you know what gets much cheaper over time, even as another thing gets much much more expensive over time? Computers vs humans. Therapists are already ungodly expensive & I see no reason that the human version won't keep spiraling up with the rest of healthcare cost disease.
      237
      6
      2.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 If it's therapist-specific and yet almost totally independent of supposed method or ideology or technique, then behavior cloning like generative models do is in a good place. Then you have the advantages of being non-human: as AID2 shows, people will share far more with AI.
      163
      9
      5.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 I've done China in a bunch of tweets/comments already, not much new there. Therapy-wise: the Dodo Bird Verdict shows therapy is unteachable and therapist-specific, and methods do little on net, so they aren't getting better; therapy models are improving gradually; so at some point...
      164
      19
      11.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 I think I disagree on Great Silence/aliens implications, safety of therapists, relying on democratization trends simply because of past instances, and I strongly disagree on Chinese AI prospects.
      424
      21
      5.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 Note the difference with most cold emails in reality: Ramanujan gives, with credible demonstrations of value and costly proof of customization, while most cold emails are simply parasitical and formulaic and aim at taking.
      8,649
      314
      3.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 While if you just construct a TM or cross-compiler out of enzymes and proteins etc, you don't need the cells in the first place: the substrate was just always TC, much like GoL was 'always' TC before you set up the gigantic metapixel inside it.
      77
      3
      3.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 I think you have conceptual issues here, because most TC proofs rely on very large, artificial compilations. Like, biological cells are obviously TC given so many primitives, and do loads of fit computations, but what cell would you point to to show 'life has evolved TC'?
      93
      3
      3.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 Why isn't this just residual confounding due to considerable measurement error in the personality ratings? Nothing in it seems to address that, and pervasive small additive/independent effects across the phenome is what that would predict as the result.
      7,667
      84
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 TIL 'American McGee' is a real person. I guess I just assumed that was fiction because it was too cool to be real & I stopped hearing it for a long time: 'Betty Crocker's Quaker Oats', 'American McGee's Alice', 'Aunt Jemima's Pancakes' etc... What a bio:
      2,376
      193
      8.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 Another question that came up: is hallucination only a problem for off-policy agents eg generative models? On reflection, I think no, on-policy should be vulnerable: spurious correlations could increase reward, while false beliefs or guessing may be reward-irrelevant & not vanish
      6,141
      42
      0.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 (Curiously, for all the use of small toys in robotics research for kids and popularization, and the occasional approaches to large-scale DRL like QT-Opt, I don't know of *any* cases of DRL robotics, or robotics in general, where very small-scale robotics were used to scale _n_.)
      5,529
      32
      0.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 can be 'reset' by automation like tilting each unit to dump loose objects into bin or objects swapped out (imagine a 3D printer on top with little chutes), etc. For the price of 10 arms you could have literally 9,000x more pick-and-place sample/s (1,000 arms x 24/7 x 3x faster).
      3,470
      10
      0.3%
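Unpacking the 9,000x figure above (the baseline here is an assumption: full-size arms working one 8-hour shift at 1x cycle speed):

```python
arms = 1_000      # tiny 'toy' arms bought for the price of ~10 full-size ones
duty = 24 / 8     # running 24/7 vs a single 8-hour shift
speed = 3         # smaller arms cycling ~3x faster
multiplier = arms * duty * speed
print(multiplier)  # 9000.0x more pick-and-place samples/s
```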
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 9 At an OA talk in a dream, they showed off a cool robotics idea: scale up DRL robotics by scaling *down*. For 1 'real' robot arm, you could buy hundreds or thousands of tiny 'toy' arms, which are safe, cycle much more rapidly, can be built into a big 'doll house' rack, ...
      3,558
      29
      0.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 8 Exactly! AI weaknesses are anti-inductive as they are part of a (slow) bootstrap with human 'labelers': the errors call forth the labels or metadata which correct them...
      43
      6
      14.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 8 (Needless to say, that part, as well as being glad he didn't delay indefinitely while he came up with prettier proofs, is not mentioned in the original: Merely faux pedigree. But good inside baseball if you want to know how AI progress *really* works...)
      1,041
      35
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 8 (ie if you have a dataset of human raters selecting, upvoting, or rating which collectively prefer mealy-mouthed safe responses, then it shouldn't matter too much if you train directly on it or mediated via a reward model - your finetuned model will also be mealy-mouthed & safe.)
      85
      10
      11.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 8 I definitely think that finetuning on a dataset (like instruction finetuning) can operate the same way (eliminating uncertainty on the POMDP) and create many (all?) of the same pathologies. It's just not as crisp and extreme as a reward model which can generate ~infinite 'data'.
      114
      5
      4.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 8 'Straight through estimator' definitely up there with REINFORCE for the feeling of utter dejection you get when you finally peer through the math notation to understand all that it is and that it works anyway.
      72
      4
      5.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 8 Yep. I saw that after I tweeted and pondered deleting & deliberately breaking it with ZERO WIDTH SPACE or something because it's confusing, but meh. You can see why I say deleting 'www' is increasingly the default everywhere and keeping it causes problems, though...
      33
      4
      12.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 8 Be curious to look at the outcomes themselves: is this a cautionary lesson about why violent revolutions are bad, because you are just creating a more fit tyranny and the result will be just 'meet the new boss, stronger than the old boss'?
      6,318
      25
      0.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 7 No, that's 'late' centaurism. In original centaurism (Kasparov et al), the GMs were definitely picking and choosing live moves. (Then as the engines got better, the GMs were replaced by programmers with better engine intuition, then their misclicks booted them to prep, then...)
      399
      14
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 7 Typing. I noticed this when I started taking mobile seriously and testing on my phone: actually, typing that 'www.' on a fake screen keyboard is a noticeable nuisance! (There was something else too - mobile browsers elide the 'www.' in the displayed URL by default, or something?)
      65
      8
      12.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 7 Basically, set up an easy binary choice between a vowel word and a non-vowel, and decode the 'a/an'; if it is correct (which ofc it is), then it must've been predicting the following word implicitly in order to decide agreement, showing it doesn't 'just' predict one token ahead.
      181
      14
      7.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 7 I would guess harder, but I have no strong opinion about whether the perplexity would be lower or higher than sample-matched English because there's so much more going on (eg BPEs were trained on English, so that won't help).
      88
      5
      5.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 7 Yeah, but I use Cloudflare already so np. And is it likely any CDN I might switch to would not support CNAME flattening or otherwise handle it? As I mentioned, it seems *very* common and the default these days.
      129
      5
      3.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 7 I'd obviously set up 301 redirects on the old www URLs, yes. I wouldn't want to do it server-side because I worry about bugs and would have to mess with stuff like canonical metadata (set in the generated HTML already but then what about everything else...).
      120
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 I assume you saw my LW comment about how you can use grammar to show that it must implicitly be predicting additional tokens to get right things like 'a/an'?
      6,583
      82
      1.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 Is it really? The optimism of that number aside, if you scrap the 26th/22nd amendments (which are pushing it for 'substantive'), it's already been 89 years since the 21st (repealing prohibition). And the process is as dead as the dodo. Do we even know *how* to ratify one now?
      107
      18
      16.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 There's also that recent Nature paper, IIRC, modeling US population movement and finding that global warming has made Americans better off on average thus far because they've been relocating to the South + lowering the extreme winter mortality (as this Christmas reminded one of).
      3,680
      60
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 26th, because it meaningfully changed electorate at least a little. If I hadn't gone with that, I'd probably pick the 21st or 22nd as the last major amendments.
      113
      7
      6.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 Do you expect a new substantive (non-procedural, eg 26th) amendment to the US Constitution to be ratified in your lifetime?
      9,220
      339
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 [CHIPPER YOUTH MINISTRY VOICE] "Hey kids! you know who else had a meditation goal? That's right: Shakyamuni Buddha had the meditation goal of understanding the nature of suffering & dissolving all attachment to skandhas to achieve liberation from Mara & the wheel of rebirth."
      4,823
      109
      2.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 It's a tough job, but someone's got to do it! (And yes, this does smell of davinci-003/ChatGPT-style memorization for rhyming, complete with the non-rhymes; I just thought I'd double-check.)
      188
      24
      12.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 It's still quite a gap. You can explain some of it by 'wrap-around', perhaps? Maybe occasional miscounting or the lunar cycle wrapping around the year, so it's not the right tail, but the left tail.
      62
      7
      11.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 Don't see how that gets you any heaping at 13, though. You eat 7x more deer in month 13 than in month 12? You go half a year without ever eating a single auroch but then suddenly in Dec/Jan you managed to hunt one? You just stop catching fish for a few months? etc
      64
      22
      34.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 Yes, that's completely crazy, you'd never predict that when you're staring through a little airplane window at all that air around you. But then, they changed it to something more ordinary, so maybe it was never all that great an idea?
      43
      2
      4.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 Quite a wide spread... The inflation at 13 is pretty striking, though. Look at that heaping! Not a *single* 14 or higher despite 6-7 13s (and only 1 12?). And a lunar calendar *is* the only natural source I can think of for 13s in such a context...
      47
      11
      23.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 6 I can only dress warmer on a flight to, say, Hawaii if I'm both prescient and have space left over for garments I will carry around uselessly for weeks (neither is true). I mentioned this in part because the heaviest coat I could afford to bring turned out to be inadequate...
      68
      3
      4.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 I tried 2 7/11s on different Hawaiian islands and I was impressed by the bento boxes & spam musubi etc: good, filling, & shockingly cheap. You really should finish that Japan convenience store writeup!
      4,774
      54
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 Yikes, just saw the Kickstarter bit: so they 'advocated for creators' and... got a 5% discount exclusive to Kickstarter in exchange for being 1.1 enforcers. Uh. Good luck, D&D guys. Apparently everyone has a price in your industry, and it's quite cheap.
      641
      23
      3.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 So it seems the mark count is consistent across type of creature. I wonder how prior archaeologists imagined it recording kill counts or other varying records, if they are never >13 and, say, 'deer' always show up next to '5-6' marks?
      6,430
      95
      1.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 'Price discrimination' would be my immediate guess: they understand PDF margins and costs, which can be controlled, but the possible harvest for the wide world of alternative stuff, everything from t-shirts to video games to figurines to dice, will be extremely contextual.
      94
      2
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 They may succeed too: Paizo is the obvious leader of the resistance, and they are saying "the rules update was a complicated and ongoing situation", which given how horrifically one-sided 1.1 is *now* (without any updates *yet*), sounds like they are just negotiating their cut...
      665
      12
      1.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 The problem with commoditize-your-complement dynamics is that they are conditional and always subject to revision. It's not obvious to me that WOTC is making a mistake here in deciding to start harvesting the golden geese: they are taking *huge* percentages and rights and data.
      5,024
      69
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 ofc, if Red Delicious has been bred into vileness sometime after they were created and they were actually good then, then the "Red Delicious" on shelves now can't be Lindy.
      27
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 Burden on writers is why stuff like argument maps never take off: requires too much *intelligence*. But you know what we now have on tap where it comes to text...? In the long run, humanity might've invented some pretty nifty writing tools, much more interesting than 'spellcheck'
      7,852
      123
      1.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 I appreciate the social aspect, but there is no contradiction between them: they can simply be separate streams, & you can mash them up. It would be totally doable to, say, populate a little comment section in a web UI with any "" responses in the corresponding channel.
      58
      4
      6.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 It was... OK. It has been a year of many ups and downs, so hard to reflect on without mixed feelings or just enthusiastically say 'yeah, I had a great time!'.
      42
      7
      16.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 You're right, it won't increase forever. It will plateau at... [checks notes] 'heritability'. Because that's what heritability *is*. PGSes do not measure what heritability measures. (And obviously couldn't, eg the 'best PGS for BMI' 10 years ago was... ~0%.)
      64
      4
      6.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 I hope to, if things can quiet down. As usual, one's aspirations to catch up on backlogs tend to run into realities of travel, disruption, ever-escalating research volume, yakshaving, etc. (I realize it looks like I haven't been doing much lately, but I really have!)
      13
      2
      15.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 5 And the time to develop the web UI was very shortly after launch proved their MVP was awesome and everyone loved it, so that's about right: 'like, a year ago'. (Make it reuse the Discord bots on the backend, if one really wants to economize unnecessarily on devs...)
      175
      6
      3.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 You guys are taking this all pretty amusingly, which is why I keep accepting mostly at random. (There is actually one rule I follow in accepting requests, but no one seems to have figured it out yet.)
      12,810
      1,491
      11.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 I watched a few episodes of _Bluey_ for the first time recently, and it struck me as possibly the most realistic depiction of children I'd seen in a cartoon in... a long time.
      5,240
      95
      1.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 It was an amazing MVP and agile, but yeah, the time to rewrite was like, a year ago. As the scores of image-generation web UIs from competitors developed in a tiny fraction of that time demonstrate, a decent little web interface is not *that* hard to develop.
      7,828
      162
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 I liked this one when I tried completions of that exact line: "**Q. Why was the metascientist's birthday party held in a laboratory?** A. They wanted to replicate the party for next year's celebration!"
      952
      16
      1.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 At >9.5 million IVF babies ( + 3 years of >0.5m) and presumably increasing (especially as countries like China 'come online'), we're not *that* far away. Also, I wouldn't really count the babies as the 'users', but the parents; in which case already there.
      145
      17
      11.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Yes. Teachers, for example, get sick a lot more than other adults, but still not nearly as much as the kids do. You can also compare to eg kittens and puppies in the same environments and even dirtier - if they spent their childhood sick like humans, they wouldn't exist.
      43
      3
      7.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Recent example of the staggering costs: >$3m in repair costs, probably similar amount in media/govt, thousands of people in power outage, because they... wanted to steal a cash register without an annoying alarm system getting in the way.
      9,683
      342
      3.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 From watching the bellhops work, I'm increasingly sure that none of them *want* to verify guest status because they are eager for the tips, and storing luggage in a side room comes at essentially zero cost to them (or the hotel?). So why risk burning legitimate guests?
      93
      6
      6.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Updates: a sibling was also baffled I didn't already know it. Our first hotel asked for our room number but did no verification of any kind I could tell, so non-guests could easily use it. Second hotel didn't even ask that (good, b/c we hadn't checked in yet so didn't know it).
      128
      14
      10.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Humans, and human children especially, are weirdly sick in general. Little kids be like: 'when did I get sick? I dunno, I had something last week and I got a cold from Charleen before that and last month chicken pox went around the school and before that we mayo'd lice and...'
      20
      1
      5.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 One good thing they never tell you as a kid about growing up is that you won't get sick so often: "as an adult, you may go years in between getting colds or flus or ear infections and forget what they are like!" I didn't even barf on my bed or wake up choking on snot.
      20
      2
      10.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 My hunch is that they are very similar in terms of scaling, based on the general difficulties in showing differences in exponents, and similarity of power laws, and the disappointments of architecture fans when MoE ~ autoregressive ~ diffusion ~ VAE ~ MLPs...
      65
      2
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Or just another win for rubberducking. "I'll show him! Obviously X is true, you'd prove it by... hm... maybe Y... no... Z perhaps... huh?"
      1,240
      23
      1.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Even Gödel's lost letter emphasizes the bizarreness: "Consider the question of fully automated mathematical theorem proving, and its obvious trivial relationship to doing your grocery shopping. Please hold all questions until the end. Now..."
      1,425
      37
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 One possible topic: how would you invent computational complexity? Came up in a vacation argument: why did we get incredibly profound & general computational decidability results like Halting theorem or Godel *before* what seems like such basic complexity things like P vs NP?
      1,976
      40
      2.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Oh sure, look at all the training improvements already, like merging bidirectional & unidirectional prediction losses. But the Outside View implies that you still only get a halving every year or two, because similar effort was put into efficient training of ImageNet classifiers.
      28
      6
      21.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 4 Not sure about LMs. Just because you can fix absurdly slow diffusion models to be 'somewhat competitive with GAN speeds' ≠ 'GPT-3 on a toaster'. Have to invoke more than that: retrieval to make smol, adaptive compute, Chinchilla scaling, RLHF+instruction-tuning, KD+low-prec...
      27
      5
      18.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 I think they left a comment somewhere strongly hinting that they were a FTX/AT insider. Wouldn't be surprised if it was Caroline Ellison or something. But whoever it was, too late to do anyone any good - if you paid attention to *every* tiny weird or troll market on Manifold...
      56
      7
      12.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 Even boiling 12 gallons wouldn't take that much energy. (Also, would you need to at all? Isn't the pressurized cabin air so low-humidity simply because it's sucking in air from extremely dry surroundings, not because water doesn't easily evaporate?) I assume it's a safety issue.
      67
      4
      6.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 via SDr: I wonder how ChatGPT does inflation-adjustment in adjusting 2008 PC part prices to ~2020? Memorizing old prices is easy, and maybe adjusters too, but then multiplying...? That's the sort of arithmetic it's bad at without inner-monologue which this obviously is not using.
      308
      10
      3.2%
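(A hedged sketch of the adjustment the tweet describes: scale a nominal price by the ratio of CPI levels between two years. The CPI-U annual averages below are approximate round-offs of the BLS figures, and the function name `adjust` is illustrative, not anything ChatGPT actually computes internally.)

```python
# Inflation adjustment via CPI ratio: memorize the old price and the
# index levels, then multiply -- the multiplication is the step the
# tweet suggests a no-inner-monologue LM would fumble.
CPI = {2008: 215.3, 2020: 258.8}  # approximate CPI-U annual averages

def adjust(price, from_year, to_year):
    """Scale a nominal price by the ratio of CPI levels."""
    return price * CPI[to_year] / CPI[from_year]

print(round(adjust(100.0, 2008, 2020), 1))  # → 120.2
```

So a $100 PC part from 2008 works out to roughly $120 in ~2020 dollars, a ~1.2× multiplier.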
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 Here again we have to differ: I turn those off all the time (they exacerbate the humidity issues a lot and the cabin ventilation is more than adequate, given the evidence from COVID airplane infections) and I can scarcely ever remember seatmates turning them on.
      246
      9
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 I've never taken a too-hot plane ride, and I usually keep my room temps closer to 71F. It's also noteworthy that the question usually asked is 'why are they so cold' instead of 'so hot', you see passengers bundling up not stripping, & airlines pass out blankets rather than fans.
      260
      5
      1.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 As I said, deeply unsatisfying. Billions of flights and they can't figure out how to raise humidity? Likewise, an obscure 2008 study about a subtle fainting effect you can't find a copy of cannot possibly explain such a multi-decade universal phenomenon. The author has no idea.
      288
      9
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 (I did hear a lot of Aussie accents on the cruise, so I wonder if there's a burst of Australian tourism right now while Chinese tourism is suppressed?)
      65
      2
      3.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3
      - dissatisfied with explanations online why airplane cabins so often freezing: most plausible is flight attendants like it, but then what about theaters, hotels etc?
      - spam sushi, and not poke, is good, actually
      - noise-canceling headphones truly a blessing; we underrate noise
      12,028
      154
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 3 (The Juul itself, or quitting smoking, or what? I thought Juuls were supposed to be adequate vapes and not *that* terrible.)
      136
      4
      2.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 2 Bit of a selection effect, I think. It's also possible there's an anthropic effect if the biggest exchanges tend to blow up more, so you have the 'dollar perspective' and the 'exchange perspective'. But yeah, it's in the single percentage range per year going back to like 2013.
      436
      6
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 2 And no _Sabres of Paradise_, _Dune Encyclopedia_, _Dosadi Experiment_, or _The Sexual Cycle of Human Warfare_?
      62
      3
      4.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jan 1 It was quite precisely torn in half, so it'd be hard to argue that it was the majority of the bill, and mailing it in would cost more than $0.50 anyway (stamp + envelope + form).
      110
      3
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 31 In retrospect, ‘90s video games were surprisingly prescient about how much of my life would be spent finding keycards to unlock critical mission-blocking doors (like restrooms).
      12,909
      144
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 31 EMH undefeated: I thought I spotted a market inefficiency, and bent over to pick it up—but the folded $1 bill turned out to be ripped exactly in half and worth $0.
      7,820
      96
      1.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 31 Not enough compute, I'd assume. They weren't even in the 1-epoch training regime yet for GPT-3, according to Brown et al 2020, so why stir in pdf2text garbage? (And from looking just at the PDFs I host, PDF text layers *are* hot garbage.)
      76
      7
      9.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 31 Unbelievably, you still need to approve a request manually even if you are back to being a public account! Like I say, everything around follows/private is fractally bad and counterintuitive - none of it works like you'd expect.
      69
      7
      10.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 31 I think the new MAE-style approaches to pixel inputs like PIXEL may Just Work for LMs at this point. I don't blame anyone for not trying, however.
      87
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Dec 31 Yeah, but no one's made the sizes line up with known Libgen statistics for the subset like EPUB which makes sense to dump into a GPT, AFAIK.
      182
      11
      6.0%
Engagements (31 days, daily frequency)
  • Engagement rate: 2.6% overall (Jan 31: 3.8%)
  • Link clicks: 6.1K total (Jan 31: 352; average 196 per day)
  • Retweets without comments: 0 (average 0 per day)
  • Likes: 6.5K total (Jan 31: 282; average 211 per day)
  • Replies: 540 total (Jan 31: 21; average 17 per day)