𝔊𝔴𝔢𝔯𝔫@gwernDec 30(because PDFs are basically just big blobs of pixels designed to be printed, open source or no)
𝔊𝔴𝔢𝔯𝔫@gwernDec 30They don't have access to either in full generality: GPT is limited to scrapeable datasets like PMC or Arxiv, HTML-native ones which are free, etc.; but none have access to raw PDFs from anywhere. (eg Meta's Galactica just uses the *abstracts* of the '40m' papers or whatever)
𝔊𝔴𝔢𝔯𝔫@gwernDec 30no one tell the logicians that all syntactic related things are worthless 'word games', they're unstable enough as it is
𝔊𝔴𝔢𝔯𝔫@gwernDec 30Surely he didn't invent Bayesian meta-reinforcement learning (which is all in-context learning is: solving the POMDP by conditioning on a history or sufficient statistic).
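The "conditioning on a history or sufficient statistic" point can be made concrete with a toy Beta-Bernoulli example (mine, not from the tweet): the Bayes-optimal prediction for a coin of unknown bias depends on the history only through the counts seen so far.

```python
def posterior_predictive(heads: int, tails: int, a: float = 1.0, b: float = 1.0) -> float:
    """Bayes-optimal P(next = heads) for a coin with a Beta(a, b) prior.
    The whole observation history collapses into the sufficient statistic
    (heads, tails) -- conditioning on it *is* the meta-learning."""
    return (a + heads) / (a + b + heads + tails)

history = [1, 1, 0, 1]             # observed flips so far
h = sum(history)
t = len(history) - h
print(posterior_predictive(h, t))  # 4/6 under a uniform prior
```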
𝔊𝔴𝔢𝔯𝔫@gwernDec 29Not sure. There's so many Asian-Americans/immigrants here, and then you have Korea etc. I assume I'm not seeing many Chinese, for obvious reasons, but beyond that...
𝔊𝔴𝔢𝔯𝔫@gwernDec 29As always, "sampling can show the presence of knowledge but not the absence"...
𝔊𝔴𝔢𝔯𝔫@gwernDec 28(If you think about it, like so many things on the bird site, it's actually Elon Musk's fault.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 28Hawaii's definitely for couples/newlyweds & pair-bonding...
Shocked xenobiologists a million years hence: "𝘏𝘰𝘮𝘰 𝘭𝘶𝘥𝘦𝘯𝘴 returns to island breeding grounds, no matter distance, to spawn; researchers believe warm shallow waters shelter them from as-yet unknown predators."
𝔊𝔴𝔢𝔯𝔫@gwernDec 28(Oh, I know plenty of people online who talk about it, I just didn't know any 𝘳𝘦𝘢𝘭 people. So it comes as a shock each time.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 28One mention came when a waitress was explaining why the thing we were jonesing for was out of stock.
𝔊𝔴𝔢𝔯𝔫@gwernDec 28It's the donation ads. It's that bad. I even hit the dismissal buttons twice without the ads going away, merely reloading the page & losing their place, at which point I was too ashamed to keep trying and they just kept reading with a third of the screen WMF spam.
𝔊𝔴𝔢𝔯𝔫@gwernDec 28Yes, I'll just tell them to install uBlock on their phones and learn its CSS dialect... (I'm still working on explaining 'hotspots'.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 28It's an emerging trend. A few papers here and there before, and now getting serious with PIXEL and MAE archs etc. Chinese is a pretty obvious place to use pixel encoding because you clearly can't simply write down Unicode points or number of lines to reflect the *image*.
𝔊𝔴𝔢𝔯𝔫@gwernDec 28Not sure I'd say that it was *capitalism* which entered the Wikimedia Foundation and seems to be behind the lust for a perpetuity.
𝔊𝔴𝔢𝔯𝔫@gwernDec 28Watching relatives try to look stuff up today (even tapping didn't make the banner ads go away):
When did Wikipedia go from being the least ad-infested website I use on a regular basis—to the most?
𝔊𝔴𝔢𝔯𝔫@gwernDec 28Maybe. I recall a lot of skepticism about the 'PCCs' being transformed into, and you have to admit, that's a bizarre-looking funnel plot just vertically in terms of power/error: where are all the *medium*-power studies...?
𝔊𝔴𝔢𝔯𝔫@gwernDec 28What are the different inductive biases, and why, and whom, does it help so much?
𝔊𝔴𝔢𝔯𝔫@gwernDec 27Also cleaning up garbage PDF OCR: gwern.net/GPT-3-nonficti… But like Whisper, you do have to be careful about the *semantic* errors like substitutions...
𝔊𝔴𝔢𝔯𝔫@gwernDec 27"Wait, this algorithm is *self* supervised, not *un*supervised?! Why didn't you say so - this changes everything! ! ! !" --no one, ever
𝔊𝔴𝔢𝔯𝔫@gwernDec 27No, I just think the fad of 'self-supervised' is very dumb, in part because it licensed a new generation of pedants 'well akshully'ing a neologism - no, sorry, self-supervised = unsupervised, and no one has ever shown a meaningful distinction I should care about.
𝔊𝔴𝔢𝔯𝔫@gwernDec 27Doing the entire textbook was your mistake: need different associations for different pages if you want to manipulate them! Galton got it right: gwern.net/docs/psycholog…
𝔊𝔴𝔢𝔯𝔫@gwernDec 27I will eat my hat if GPT-4 is not primarily or even 100% trained using prediction of tokens - you know, just like GPT-1 (the RNN), GPT-2, or GPT-3 (or Gato, or...).
𝔊𝔴𝔢𝔯𝔫@gwernDec 27Predicting tokens has always been in 'unsupervised learning', attempts to rebrand it to 'self-supervised learning' notwithstanding, and no one claims it is supervised learning / labels in the usual sense! His statement is correct.
𝔊𝔴𝔢𝔯𝔫@gwernDec 26I'm kinda bemused: he's probably correct, you know, it'd probably be unsupervised (just like GPT-3), which is why it's so versatile, and even if you don't like *that* basic truth of GPT training, we already have like a dozen papers showing GPT-3 can self-improve without labels!
𝔊𝔴𝔢𝔯𝔫@gwernDec 26Moore makes a good case we need more imperialist wars of colonization, given that malaria and a bunch of other things are still around in poor countries... 🤔
𝔊𝔴𝔢𝔯𝔫@gwernDec 26Most uses of the Outside View rely on the events, not commentary/predictions, if only because it is (as you well know) so difficult to compile predictions and typically explicit predictions don't even exist, often for reasons like not knowing 'industrial revolutions' exist.
𝔊𝔴𝔢𝔯𝔫@gwernDec 26I'd have a lot of doubts about whether that was really any good. It's not like GPT-3 suddenly starts flawlessly rhyming if you just spatter the page with whitespace.
So, since ByT5 exists and is a clean test and is known to do well on exactly what we think BPEs would impede...
𝔊𝔴𝔢𝔯𝔫@gwernDec 25They dropped the prices drastically and you can use the smaller one as well, so I'm not sure how bad the cost is now. And how valuable is one's time? If you know how to use the alternatives already, and can run them with zero yak-shaving ever, great... But most can't.
𝔊𝔴𝔢𝔯𝔫@gwernDec 25Sesame certainly isn't bizarre. (Sesame-seed bagels are my favorite.) Adding it to hot dog/hamburger buns, brioche, cookies, French toast sticks, crackers etc (full list unknown), none of which had sesame or are intended to taste like sesame, on the other hand...
𝔊𝔴𝔢𝔯𝔫@gwernDec 25Going down to rawer data to avoid feature problems is not 'better' modeling (that would be, like, adding IPA or using better tokenizations like unigram). This is 100% a scaling win.
And also it *does* show scaling makes the spelling work, for specific words. Also a win.
𝔊𝔴𝔢𝔯𝔫@gwernDec 25“The past is never dead. It’s not even past.”
At the USS 𝘈𝘳𝘪𝘻𝘰𝘯𝘢, people wearing 80th anniversary t-shirts—suddenly, oil slicks left right center, pulsing up from below in expanding rainbow shimmering circles.
Beautiful, in its own way.
𝔊𝔴𝔢𝔯𝔫@gwernDec 25[Every DRL agent nodding in agreement extremely rapidly at APM limit.]
𝔊𝔴𝔢𝔯𝔫@gwernDec 23I don't think that's true. I once looked for RAPM and couldn't find it.
𝔊𝔴𝔢𝔯𝔫@gwernDec 23I'd definitely be concerned about all the downstream users of food who know X doesn't contain anything bizarre like sesame - because why would it, why would people just go around adding sesame to random things, that'd be insane...
𝔊𝔴𝔢𝔯𝔫@gwernDec 22How men past their prime in low-status fields have always had to compete for attention, one assumes: money. 😉 (ie. spend moar on turkers)
𝔊𝔴𝔢𝔯𝔫@gwernDec 22I think it's interesting for the safety/sociology/methodological aspect it has increasingly taken on since I highlighted in 2020 what I thought was merely an amusing minor bug soon to be fixed...
𝔊𝔴𝔢𝔯𝔫@gwernDec 22Ah, but what if you might generate more than 9 files a day, eh?
𝔊𝔴𝔢𝔯𝔫@gwernDec 22(I use `rename` or dired-mode, yes, for renaming files, nbd; what burns my Frosties cereal is all the *references* to file names... Mass search and replace, nginx redirects, or worse, oy vey.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 22(It does better on sentence*s*, yes, see buried all the way at the end.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 22'Use the Doomsday argument: in an increasing series you randomly sample from, you observe the median point half the time, the 90th percentile 10% of the time, and so on. Therefore, you should expect twice as much on average, and add a 0 if the final ID is >4xx.'
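The 'expect twice as much on average' heuristic can be sanity-checked by simulation (a toy sketch, numbers arbitrary): a uniformly sampled ID from 1..N averages N/2, so doubling the observed ID is an unbiased guess at N.

```python
import random

random.seed(0)
N = 10_000        # the true, unknown length of the 'increasing series'
trials = 100_000
# Doomsday-style estimator: having observed ID k drawn uniformly from 1..N,
# guess the series runs to about 2k (you saw the median point on average).
estimates = [2 * random.randint(1, N) for _ in range(trials)]
mean_est = sum(estimates) / trials
print(round(mean_est))  # close to N = 10,000
```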
𝔊𝔴𝔢𝔯𝔫@gwernDec 22How do you decide how many leading zeros to prefix in filenames or schema when you're not sure and don't want to rename from 1234.txt to 01234.txt?
Bad answers only:
𝔊𝔴𝔢𝔯𝔫@gwernDec 22And that's been something I've been criticizing since July 2020: "no, your BPE feature-engineering is bad. Stop that. You are erasing information from the raw text inputs, and loading up on bias you don't even notice in exchange for short-term variance. Just scale char more!"
𝔊𝔴𝔢𝔯𝔫@gwernDec 22No, this is an L for feature engineering & win for scale. BPE -> char is *removing* feature engineering and relying on scale to fix it. The whole justification of byte-pair encoding was that we could hand-engineer a better representation than characters which was more compressed.
𝔊𝔴𝔢𝔯𝔫@gwernDec 22And just like when it was made, that post continues to be useless because it benchmarked the old OA embeddings on the *one task OA said it was bad at and you shouldn't use it for*: sentences. In the last evaluation, the actually relevant one, it does fine.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21You worded it poorly by not specifying that you were attacking the part no one would think you were attacking because it is such a dumb trivial point.
And ChatGPT is a terrible way because that doesn't even gauge *ChatGPT* capabilities - as proven by jailbreaks!
𝔊𝔴𝔢𝔯𝔫@gwernDec 21That's nice. I don't see how that is relevant to my point, which is that the NYT has gone on record as having obtained memos and sources repeating the same 'code red' use and meaning as the supposedly false notes that even Googlers were mocking for being blatantly wrong.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21'ChatGPT' is a bad way to gauge what GPT models understand, as the jailbreaks of greater functionality alone demonstrate.
And if you worded it badly, that's not my fault. (What is there that GPT 'fully' understands?)
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Did you actually mean to make a point here or just be insulting? Of course I read it...
𝔊𝔴𝔢𝔯𝔫@gwernDec 21There have already been many demonstrations going back to at least August 2020 of using GPT to control browsers (I'm not even including WebGPT or groups like Adept). It wouldn't be hard to add images to HTML inputs either, look at CM-3.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Or makeup, or a lot of things... I don't think we have a good word for it yet (neither 'professionalism' nor 'sprezzatura' capture it), so the phrase will have to do for now.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21I continue to be amused how the single most useful gwern.net feature was... the little bash script for uploading files or temporary files. Saves me so much time: it's done and ready and put the URL in copy-paste before the Dropbox/GDrive/Mega pages would even load.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21A remarkably pure instance of Outside vs Inside view.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Doesn't work. Think about it more behavioral-genetically: there's lots of overlapping traits there. You might as well suggest trying to distinguish by looking at novelists who had/hadn't lawyers for parents... That's why Galton started with papal adoptions etc and moved to twins.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21The phrase 'good design is invisible' has been coming to mind a lot as I watch all the Twitter clones (& also the new Twitter management).
How hard could it be to clone and improve on Twitter? Apparently 𝘱𝘳𝘦𝘵𝘵𝘺 𝘥𝘢𝘮𝘯 𝘩𝘢𝘳𝘥 (at least for the sort of people who do it).
𝔊𝔴𝔢𝔯𝔫@gwernDec 21"Links on the Internet last forever or a year, whichever inconveniences you more."
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Yes, and people listen to baseball announcers too, doesn't mean baseball isn't the most boring sportsball of all... The fact that streamers and lets-plays can shuffle so seamlessly between games (which are often being heavily patched) shows that it's parasociality, not gaming.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21every.to/napkin-math/6-… highlights (section 6) 'SOOT' soot.com which is doing this to some degree, but not projecting the induced orderings back into convenient linearized file hierarchies.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Like dreams, it's hard to write about games without being boring: especially if it took 130h, it tends to be a 'you had to be there' experiential thing...
You should only bother if you can deliver enough spice to make yourself into the main character again.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21The one developed by the guy who couldn't remove a JS popup from Twitter after 4 weeks and quit, saying it was too hard? I guess it goes to show web dev 𝘪𝘴 in fact harder than AI or rocket science.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Hm, for a proof this complex, I think it'd be good to have a fully machine-checked formalized proof. Then the mathematical community could have real confidence that it's correct, as opposed to just having error-prone humans skimming it & saying it looks right.
Wait...
𝔊𝔴𝔢𝔯𝔫@gwernDec 21(Similar issue with the text discriminant analyses you sometimes see, eg 'male vs female words'. If 1 guy says 'bitch' once and 0 girls say it, then it'll be the top hit for 'male language' as it perfectly predicts a male writer; but that doesn't mean the forum is hugely sexist.)
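Toy counts (purely illustrative, mine) reproduce the pathology: a word used once by one male writer and never by a female writer gets an unbounded odds ratio, "perfectly predicting" a male writer off a single post.

```python
def smoothed_odds_ratio(male_count, female_count, male_total, female_total, eps=0.5):
    """Odds ratio of a word across two groups, with Haldane-Anscombe smoothing;
    with eps=0, a word used once by men and never by women has an *infinite*
    odds ratio: it 'perfectly predicts' a male writer off one post."""
    p_m = (male_count + eps) / (male_total + 2 * eps)
    p_f = (female_count + eps) / (female_total + 2 * eps)
    return (p_m / (1 - p_m)) / (p_f / (1 - p_f))

# 1 male usage vs. 0 female usages, out of 10,000 posts per group:
print(smoothed_odds_ratio(1, 0, 10_000, 10_000))  # ~3.0, driven by a single post
```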
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Even without sampling error (what does that mean if you have a census/complete count? what's the 'random error'?), it'd still be hard to interpret because it is omitting base rates in favor of odds/RR. A tiny subreddit of a dozen people can have a total overlap easily.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21Not a great analogy because car engineers know extremely well how cars work!
If car engineers knew cars the way DL researchers know how Transformers/self-attention work, they'd sit around and ask 'do cars need... 𝘸𝘩𝘦𝘦𝘭𝘴?', take them off, and find it works almost as well.
𝔊𝔴𝔢𝔯𝔫@gwernDec 21There's more to life than a logic gate, you have to be able to set it up right. *Can* you set up a recurrent circuit with the immune system after you've used up said immune system with their circuit construction?
𝔊𝔴𝔢𝔯𝔫@gwernDec 20Never thought I'd find myself defending Taleb like this but (a) Taleb's aphorisms and gimmicks are genuinely more funny & (b) at least he's never murdered anyone and his defrauding, pump-and-dump, conman, & tax evasion activities have been limited to overpriced $29.95 hardcovers.
𝔊𝔴𝔢𝔯𝔫@gwernDec 20Considering that that was more like 1900, I'm guessing that ascribing it to the Haber process is a bit oversimplified. I'd start with en.wikipedia.org/wiki/Green_Rev… personally...
𝔊𝔴𝔢𝔯𝔫@gwernDec 20I wouldn't say that. The evidence that 'reward is enough' has never been stronger as more and more things can be meta-learned or learned end-to-end. eg twitter.com/Luke_Metz/stat…
𝔊𝔴𝔢𝔯𝔫@gwernDec 20DeepMind never went anywhere, and kept building on AG with things like Gato and PoG. They'll have their 'Stable Diffusion' moment, if you will, as everything merges at scale and we start to enjoy the cherry on the top of the cake.
𝔊𝔴𝔢𝔯𝔫@gwernDec 20That's because They™ won't fund proper IES and genome synthesis of corn! 🦾😠
𝔊𝔴𝔢𝔯𝔫@gwernDec 20Nothing comes to mind offhand either. I think it would probably perform worse because the performance characteristics of generating a bunch of prefix tokens before training reals are going to be bad. Anytime your GPUs don't go brrr as much as possible, Transformers suffer on net.
𝔊𝔴𝔢𝔯𝔫@gwernDec 20Selective breeding might be a better example. No one in the world has even 1% of the picture of how 'corn' works and was evolved from ancestral maize the size of a thumbnail - it just does.
(An occasional gene or biochemical pathway, but likewise for Transformer heads/circuits.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 20I'm not sure. Possibly confusion over what is intended by 'word'.
𝔊𝔴𝔢𝔯𝔫@gwernDec 20Another example: my college recently did some renovations, and changed a bunch of parking spaces to this weird 'QR code registration required' thing. No idea why, surely some excuse about parking efficiency. So... no one uses the parking spaces, and that makes parking even worse!
𝔊𝔴𝔢𝔯𝔫@gwernDec 20I'm not sure about word-count problems being BPEs. Spaces are consistently tokenized as a specific BPE, and punctuation is usually separate too, so it *seems* like '6 punctuation/whitespace-separated chunks' ought to be easily learnable long before now, so the problem may be something else.
𝔊𝔴𝔢𝔯𝔫@gwernDec 20Still not understanding the point. Aella's claim is that "players in the NBA are tall". He's the one going "no they aren't", and you're the one going "he's right, look, a tall person who isn't in the NBA!"
('All knights are men-at-arms, but not all men-at-arms are knights...')
𝔊𝔴𝔢𝔯𝔫@gwernDec 19I'm not trying to do it myself, I was just hoping someone already had so I could add it as a confirmed example. If no one has, I'm just going to add it to the speculative section of /Turing-complete. Stochastic Petri nets sound like a good direction.
𝔊𝔴𝔢𝔯𝔫@gwernDec 19Has anyone proven immune systems Turing-complete yet?
They obviously are complicated/smart enough to be TC, but searching, all I find is this fairly dubious/abstract 'mobile membrane' computational model: core.ac.uk/download/pdf/8… acad.ro/sectii2002/pro… Unsatisfying.
𝔊𝔴𝔢𝔯𝔫@gwernDec 19Yes, but it's hard to make a defensible estimate of that, so I omit it for brevity's sake.
𝔊𝔴𝔢𝔯𝔫@gwernDec 19If the polls aren't making decisions, then there's no problem in holding them accountable for making bad decisions, as they aren't making them.
(Musk, of course, has his own personal mechanisms of accountability for mismanaging Twitter. Around 30 billion of them, I understand.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 19when it comes to twitter polls about running twitter, i feel that they are definitely being held accountable to the consequences...
𝔊𝔴𝔢𝔯𝔫@gwernDec 19It would be impressive if one hadn't (like OP, apparently) paid any attention to math AI beyond AlphaZero; then it's a shocking demonstration of sufficiently advanced prediction = math, and if nothing else, a powerful generative prior for ATP.
If one had, then a nothingburger.
𝔊𝔴𝔢𝔯𝔫@gwernDec 19I'd be irritated too if I'd been talking clearly to someone and they ignored me. If I recall correctly, this was in a pool context and so I had had to take off my hearing aids. (They're much more water-resistant these days but not then.) But they didn't know that.
𝔊𝔴𝔢𝔯𝔫@gwernDec 19Even Alexander's misses are much better than the average take.
𝔊𝔴𝔢𝔯𝔫@gwernDec 19en.wikibooks.org/wiki/Haskell/M… Hm... I think the endofunctor part may be wrong and they're just monoids. Evaluating a Transformer layer by layer does look monoidal at the token level: padding tokens are identity element, and you can evaluate the rest in parallel... 🤔
𝔊𝔴𝔢𝔯𝔫@gwernDec 19Hm... if it's a solar minimum leading to more background cosmic rays, that sounds like good news. I'm assuming that solar flares produce a very small amount of cosmic rays so that was being backed out to very large solar flares, but just some cosmic ray increase doesn't seem bad.
𝔊𝔴𝔢𝔯𝔫@gwernDec 19I was curious what happened to him, and the answer is: nothing, he's still cranking out a book a year en.wikipedia.org/wiki/Piers_Ant… at age 88 (after remarrying before his wife's ashes were cold). And they're not all formulaic Xanth novels either!
Respect for the longevity, at least.
𝔊𝔴𝔢𝔯𝔫@gwernDec 18How do its foxes compare with the others? They're nice, but I'm not sure one couldn't get them out of eg NAI - I've seen a lot of great landscapes from it.
𝔊𝔴𝔢𝔯𝔫@gwernDec 18A rare miss by Alexander. If the history of computers shows anything, it is that no one has any idea what can or cannot be done with 'an army of clerks' aside from making wrong claims about what cannot be done. One can no more prescribe that in advance than high modernists could.
𝔊𝔴𝔢𝔯𝔫@gwernDec 18'How did such a successful entrepreneur melt down like this??? I just don't get it!!!'
Gosh, it sure would be interesting if there were some mental disorder which is correlated with risk-taking, creativity, entrepreneurship, and the occasional bout of disastrous decisions. 🤔
𝔊𝔴𝔢𝔯𝔫@gwernDec 18I continue to be astonished how, literally just weeks after the most famous rapper in the world melted down & became an ex-billionaire, people apparently still have no idea what 'bipolar disorder' (only one of the most common & harmful mental disorders, surpassing SCZ) looks like.
𝔊𝔴𝔢𝔯𝔫@gwernDec 18The idea is to bias the noise towards the final true image, to provide more training signal. This bends the overall 'trajectory' consistently towards the target.
It also builds in distillation, but continuously: eventually you are regressing straight from pure noise->target.
𝔊𝔴𝔢𝔯𝔫@gwernDec 18Idea for a training trick a bit inspired by progressive distillation: I call it 'Brownian bridge sampling', to bias the random walk.
For each training pair, pixel-wise weighted-average the noised image with the original final image. Start at 0 weight, and anneal to 1 by the end.
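A minimal sketch of the trick as described (function names and the simple linear anneal schedule are my assumptions; this is an untested illustration, not a recipe):

```python
import numpy as np

def bridged_training_input(noised: np.ndarray, target: np.ndarray, w: float) -> np.ndarray:
    """Pixel-wise weighted average of the ordinary noised image with the clean
    target; w anneals 0 -> 1 over training, so inputs are biased increasingly
    hard toward the target (building in distillation continuously)."""
    return (1.0 - w) * noised + w * target

rng = np.random.default_rng(0)
target = rng.random((8, 8, 3))                        # stand-in 'final true image'
noised = target + rng.normal(0.0, 1.0, target.shape)  # ordinary diffusion-noised sample
# Hypothetical linear anneal over a training run:
for step in range(0, 101, 50):
    x = bridged_training_input(noised, target, w=step / 100)
```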
𝔊𝔴𝔢𝔯𝔫@gwernDec 18I want to like this but my eyes instantly go to all the cat artifacts. (Eyes look blinded by cataracts; left whiskers missing entirely and muzzle smoothed; too many and too-large paws.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Cowen's review was a good deal more positive than I expected (as was Brody's), and has mostly swayed me into thinking I should go see it. (In a theater in 3D, specifically, not wait to torrent it if I ever bothered to see it.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 17I don't know about totally indifferent because I tend to forget those, but as far as ones that amuse me:
THEM [yelling]: "What are you, deaf‽"
ME: "Yes."
THEM: "Oh."
𝔊𝔴𝔢𝔯𝔫@gwernDec 17I don't think they've been trialed in combination yet; at least, I don't have any examples in my notes. (If that doesn't work out, oh well, plenty of others to experiment with. Only takes 1 good combo, after all.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 17So, I don't consider it improbable at all to hit a possible 40% (fairly arbitrary as that benchmark is given how bad bariatric is otherwise) from long-term use of refined high-dose multi-drug combos spiced up with additional new drug candidates or combinations in the next decade.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17(Nor do I see any particular evidence from the combo-therapy trials or multi-arm trials that there is pernicious interaction where the first drug has all the benefits and the rest are redundant. So it looks like there's plenty more button-pushing to be done in this area.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 17So, since you can increase the dose and combine them effectively, if it takes 2 years for a low dose of a single one to show full plateau, and the shorter study curves usually show no plateauing, these are all underestimates by quite a bit.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Your tirzepatide % is coming from trials running not longer than 1 year, IIRC, while STEP5 shows that a 2.4mg dose of semaglutide takes 2 years to plateau average weight loss. These all show dose-response, so you can increase the semaglutide, and it would take longer just itself.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17All too easy to imagine how that wouldn't happen. Plus, why does fusion have to ever happen? There are many energy sources like fission which may just be better long-term.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Yes, I already addressed that with 'short-term'. You need a reason to think that the weight loss *stops* there at the reported trial followup like '20 weeks' etc, which is unlikely.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Yeah, this is kind of a weird scenario: we already know bariatric surgery causes major changes in weight (that's the point) and that it (or other surgery effects) causes major changes to the gut microbiome.
So the surprising result would be if the transplant effect wasn't affected.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Why do you think that?
(And you are allowed to have beliefs which are not 0% or 100% certainty, even before phase 3 results are reported, you know...)
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Semaglutide is not a cure for obesity, sure, just overweight. Semaglutide+tirzepatide or the triple-agonist formulations are much closer to that, especially if maintained long-term (all the 15% etc weight loss numbers are short-term). And far more desirable than bariatric...
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Yes, but they need to be rolled out widely (supply bottleneck + pharmacological calvinism are barriers), made non-injectable (the good oral ones are still coming), and the better combo-therapies approved & rolled out (some are still in the pipeline, not even phase II yet).
𝔊𝔴𝔢𝔯𝔫@gwernDec 17We're not going to get the second two, but you left out 'cure for obesity' which probably would be even better QALY-wise than 'cure for cancer'.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Probably subsumed under 'reaction norms'. Personally, if I wanted to further behavioral-geneticize the Buddha, I'd go with 'emergenesis'.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17If it's just guessing, and it stopped training last year, how does it 'guess' the current date correctly each time, hmm? 🤔
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Strictly speaking, he wasn't an experiment, just a social intervention, and like most such early childhood/adulthood intervention programs, wound up showing the weakness of shared-environment.
𝔊𝔴𝔢𝔯𝔫@gwernDec 17(That is, there's lots of anecdotes of the form, 'I believed X but couldn't prove it for 20 years until one afternoon for a lark I tried to prove ~X and whups.'
This would be nonsense for a DL system: they would've forked into two, for X and ~X, at the start and solved it then!)
𝔊𝔴𝔢𝔯𝔫@gwernDec 17Yeah, and we still haven't tapped into their full power... For example, there's no good way to properly randomize models as samples from the full Bayesian posterior, so you can't generate 50 'hypothetical researchers' to explore an idea independently & non-redundantly.
𝔊𝔴𝔢𝔯𝔫@gwernDec 16(If you don't need self-attention at all for it, and just any memory or history/context is enough when trained on diverse data distributions at scale, we can stop asking 'does arch X do Y' because these capabilities are extremely convergent and it's simply a question of when.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 16At this point, the case for Bayesian meta-reinforcement learning emerging in self-supervised learning Transformers trained on natural data, not just RNNs, seems pretty much done. The next question is: does this get elicited in fully-connected archs like MLP-mixers as well?
𝔊𝔴𝔢𝔯𝔫@gwernDec 16This is so many layers of irony and philosophy deep that I have no idea anymore.
𝔊𝔴𝔢𝔯𝔫@gwernDec 16So, within its comfort zone, it can improvise amazing doggerel and avoids the tell-tale errors of the more unrestricted models, but if you push it out of that or expect any other kind of verse, it still doesn't work.
Still needs character-level or phonetics-aware modeling.
𝔊𝔴𝔢𝔯𝔫@gwernDec 16After trying to break out of the niche in ChatGPT, my conclusion is that 003/Chat have not learned phonetics/rhyme but appear to have memorized a bunch more pairs and then the RL tuning has 'mode collapsed' onto the narrow learned niche of high-confidence rhyming verse.
𝔊𝔴𝔢𝔯𝔫@gwernDec 16Many examples now of things like rhyming words that are spelled similarly but pronounced differently, being unable to rhyme or write verse outside a narrow niche of short-line quatrains/couplets, rhyming even when told explicitly not to, not rhyming specified things...
𝔊𝔴𝔢𝔯𝔫@gwernDec 16The context window doubling for the embedding endpoint and the 'cl100k' BPE tokenization in the new tiktoken library make this an especially important point to be clear about now: there are a lot of tokenization changes going on!
𝔊𝔴𝔢𝔯𝔫@gwernDec 16[Musk does things]
[everyone]
ME: "Don't make me tap the sign."
𝔊𝔴𝔢𝔯𝔫@gwernDec 16But does that mean it benefits from newlines being left in, or it just doesn't fail as catastrophically as before?
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Huh. I thought Mallarmé was saying you should write your own book if you wanted to criticize his. But ChatGPT's interpretation is pretty compelling too. pic.twitter.com/sNliUjl6DV
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Rhyming is great, but you can still get great stuff without it.
I was playing with davinci (to compare w/003) and got this metal bit (whole thing is completion, I was actually starting w/'Complete "This Last Pain", by William Empson:'):
[mrw people criticize my writing style] pic.twitter.com/Qukt9bffE2
𝔊𝔴𝔢𝔯𝔫@gwernDec 15No, they don't work. Like the first one, you can easily trace the outside and see that it has no path into the interior, much less the whole thing be connected. Not that you would expect diffusion on pixels to guarantee connectivity anyway.
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Yes. There may be some clever tree construction or mathemagic which allows m-of-n arbitrary recovery, but I don't immediately see it, and the 'pick 1 random FEC packet from a random past block' is at least a concrete proof-of-principle - it's very slow but it clearly would work.
𝔊𝔴𝔢𝔯𝔫@gwernDec 15You obviously can't just feed the entire blockchain in: doesn't scale, and there's no way to 'add a new block', it's all-or-nothing AFAIK. So it's not truly 'broadcast'.
That's why each block is separate and merely includes a packet from a previous block's fountain encoding.
75
2
2.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Looks kinda expensive, homepage does a bad job of selling one on any advantages, excludes NSFW (?). Mm. Someone should do a NovelAI comparison.
4,609
79
1.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Efficiency-wise... bandwidth, not great, because you keep re-encoding the full history, but you can prune away all the FEC packets you don't need (or want, if you can know you don't want the block it's for), so you at least don't have to store duplicates. In-progress < full.
3,271
10
0.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Also censor-resistant: can't easily block current block because the fountain code just keeps spitting out new FEC packets which *eventually* recover the block, and then it contains another historical FEC packet (which eventually recovers a block with another FEC packet...).
3,597
14
0.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 15You can speed up listener reconstruction by including more historical FEC packets per block.
Advantages: don't need to recompute any FEC or store additional FEC blocks per block, constant overhead & copying one packet from a historical block into current block.
1,567
10
0.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Weird idea: a broadcast-only blockchain using en.wikipedia.org/wiki/Fountain_… FEC broadcasting packets. It can transmit only the current block, but eventually allow reconstructing the entire blockchain by including 1 FEC packet from an earlier block. Listen long enough, and you get all.
6,116
158
2.6%
View Tweet activity
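The "concrete proof-of-principle" above is easy to simulate. A toy sketch, with my simplifications: single-chunk blocks stand in for a real rateless/LT fountain code, and a plain repetition packet stands in for a real FEC packet:

```python
import random

random.seed(0)  # deterministic demo

def fec_packet(history):
    """One 'fountain' packet for a uniformly random past block. A real
    rateless code (LT/Raptor) would let any ~k distinct packets rebuild a
    k-chunk block; here each block is one chunk, so one packet = recovery."""
    b = random.randrange(len(history))
    return (b, history[b])

history = []   # broadcaster's full chain: payload per block
listener = {}  # height -> payload, as seen by a late-joining listener
JOIN_AT = 3    # listener only tunes in from block 3 onward

for h in range(500):
    payload = f"block-{h}".encode()
    packet = fec_packet(history) if history else None
    history.append(payload)
    if h >= JOIN_AT:
        listener[h] = payload   # the current block arrives directly...
        if packet:
            b, data = packet    # ...plus one slow historical back-fill packet
            listener[b] = data

# 'Listen long enough, and you get all': pre-join blocks trickle in.
recovered = [b for b in range(JOIN_AT) if b in listener]
print(f"pre-join blocks recovered after 500 blocks: {recovered}")
```

As noted upthread, this is very slow (a block at height b arrives at rate ~1/h per new block) but it clearly converges, and censoring it means censoring every future block.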
𝔊𝔴𝔢𝔯𝔫@gwernDec 15People worry about AIs giving you dangerous meth recipes, but they should worry more about them giving you dangerous math recipes.
𝔊𝔴𝔢𝔯𝔫@gwernDec 15(Worth noting you can know a priori the effect size from environment changes like this must be small because the total test-retest reliability of cognitive tests or standardized tests is so high, despite making few to no efforts to control any of the environment effects.)
2,592
40
1.5%
View Tweet activity
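The a priori bound above can be made concrete with a back-of-the-envelope calculation (illustrative reliability figures):

```python
# If test-retest reliability is r, then everything unstable between sessions
# -- mood, room, weather, time of day, plus pure measurement error -- shares
# a variance budget of at most (1 - r). So any *single* environmental tweak
# is bounded at sqrt(1 - r) SDs, and realistically far below that, since it
# splits the budget with every other transient factor.
for r in (0.80, 0.90, 0.95):
    max_d = (1 - r) ** 0.5
    print(f"reliability {r:.2f}: any one session-level factor <= d = {max_d:.2f}")
```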
𝔊𝔴𝔢𝔯𝔫@gwernDec 15Does this go haywire when newlines or Unicode is present like the old embeddings? It's not mentioned either way.
4,983
69
1.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 15You should keep at it, it works! People were showing good results for 'just generate an "image"' in GANs back in like 2019 with StyleGAN.
294
13
4.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 15"A man who has to be punctually at a certain place at 5 o'clock has the whole afternoon from 1 to 5 ruined for him already." --Lin Yutang
𝔊𝔴𝔢𝔯𝔫@gwernDec 15I don't have the Photoshop skills for it but I've always thought that Cowen+Decreux would be an amazing meme. The only question is what... "Disregard papers / acquire ethnic food"?
296
4
1.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14Yes, but that's a trick probably already used, so it may be in the baseline already, and the question switches to 'a scaled-up model which can't easily be distilled down to realtime on current hardware'.
113
1
0.9%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14(I mean, retrieval image-gen models are a thing. Retrieving images from a live website would be easy to add. You could make a model which acts like this, it would even be useful for a number of purposes! It's just not how any major model works, is all...)
134
24
17.9%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14Interesting demonstration of the folk psychology around how generative models work, tho. You can understand it: why *couldn't* SD be literally going out and downloading the front page of ArtStation to 'copy'? ML models can't *understand* anything, they're just 'search engines'...
848
39
4.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14Reminds me of the last time I was in SF. I left a Xeon Phi on the front seat of my Zipcar, and while I was out getting a latte and sourdough, some homeless crazy broke in and left another dozen: 😭
[photo]
7,603
250
3.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14Seems unlikely. Those other scholarly search engines already exist and seem to be very lucrative, so GS has failed there. The best profile I know of gives the impression it's, frighteningly, the passion project of a Googler with their equivalent of tenure: wired.com/2014/10/the-ge…
63
6
9.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14I worry about the backfiring of this quantitative mindset.
YOU [to them]: "Wow, look how low your peak rate was! That cardio HIIT is really paying off!"
YOU [to self]: "Look how low it is—what's wrong with me? Am I doing something wrong? Am I not as sexy as I used to be?"
2,013
29
1.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14That looks very 𝘶𝘯related in the relevant region...?
330
24
7.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14If 'Abelian groups' are named after Norwegian mathematicians named 'Abel', it logically follows that 'non-Abelian groups' are named after all the non-Norwegian mathematicians not named 'Abel'.
(The non-Norwegian Abels presumably need to step up their game.)
2,582
14
0.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14Pretty much all the NMT studies are consistent with a common 'neuralese' of the embeddings/vector-space, aren't they?
49
6
12.2%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14(What the Outside View giveth, reference class tennis taketh.)
3,815
46
1.2%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 14It wouldn't radically extend lifespan, any more than it would in humans, because of competing hazards + Gompertz. It'd be interesting to see if they did any good at all when applied to an M-prize-winning mouse approach, however.
3,153
48
1.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13They can buy a few years: the programmers & GPU clusters to handle Stable Diffusion-like popularity are big targets.
(I remember Napster - it was so convenient and easy to use for 56k. Replacements like eDonkey or Kazaa took years and home broadband before they got near it.)
1,409
58
4.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13The most amazing thing is that it sounds like he's a cryptominer. What a brilliantly sociopathic way to scam free electricity for your rigs.
939
26
2.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13Yeah, I've never heard of that kind of unwinding. The FTX creditors will just assume ownership of the Anthropic holding. Isn't that how it's worked with other large frauds like 1MDB?
166
18
10.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13"At last", you think as the water approaches, "now no one can say I am exaggerating the problem and should just reinstall."
𝔊𝔴𝔢𝔯𝔫@gwernDec 13Yeah, I assume the scaling exponents are bad, if only because they are generally all using hybrid systems optimized for now rather than bitter-lesson long-terms, but I'm curious how much better they could be as-is out of the box, essentially.
2,302
59
2.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13This is something I've wondered for a long time about self-driving cars: to what extent are they held back by on-board GPU? You can't measure the effect of narrowminded R&D, of course, but presumably Waymo et al have internal "compute no object" scale-ups to benchmark some of it.
4,053
97
2.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13Does it actually matter? Can you, say, name three examples of whether major scaling results by Google Brain, especially ones with non-public models, proved to be seriously exaggerated and what looked like a major advance proved not to be?
88
7
8.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13(Not exactly 'general use' when they're imposing non-commercial licensing terms and trying to sell Seek.art subscriptions.)
2,021
37
1.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13I'm sure! But R&D has to start with the model that *doesn't* run on his laptop. (We're too stupid to do otherwise, which is why there's always a hardware overhang.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 13I think yes, historically all forecasts not explicitly including them tend to be of the form 'unless some major tail event occurs like all-out nuclear war, pandemic, or AGI'. Otherwise they would all have to have 99% CIs like AD 2025-2500.
2,040
27
1.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13Why? Books about programming will literally have commented programs and often include extensive question-and-answer sections. Indeed, Stack Overflow killed an entire genre of technical writing of the 'X cookbook' form.
444
12
2.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13Does this distinction really hold given that many LMs, especially the original GPT-3, were already trained on tons of source code just pervasively available in Common Crawl & book corpuses (eg books about programming)? twitter.com/gwern/status/1…
531
18
3.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13I'm sorry you're an "impoverished goatherder on an old laptop running off solar panels" but the implications and importance of something like GPT-3 or Imagen have little to do with exactly how many A100s it takes to run it conveniently.
1,098
69
6.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13("I didn't see any samples from BigGAN-JFT300m, so that model must not exist. I didn't see any hand samples from Imagen, so they must not exist" etc.
Further application of narrow windows + systematic lagging bias left as exercise for the reader.)
1,121
18
1.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 13The more striking part is that the era of 6 fingers etc was actually ~2017–2021 (BigGAN→Imagen roughly), but people thought it was 2021–2023 (DALL·E 2→SD→?).
No matter how many times you chant "the future is already here, just unevenly distributed", people won't discuss SOTA.
𝔊𝔴𝔢𝔯𝔫@gwernDec 12"...And finally, at long last, he realized that the bluehorse of happiness was with him the entire time. The End."
297
6
2.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 12@EMostaque This is almost certainly true of sufficiently large image models too. The obvious way to fix the endless proliferation of finetunes/forks: simply have a small cluster training 24/7 on 99% lightly-moderated user-submitted uploads / 1% LAION, & release checkpoints daily. pic.twitter.com/c1LshASVFo
1,155
87
7.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 12No it won't. By the time you can extract a specific brain, you'll have been meta-learning a distribution of capable brains eons before by using weak earlier data as constraint: reddit.com/r/reinforcemen… Same way GPT is useful long before it is any *specific* person it's trained on.
3,136
122
3.9%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 12The ruinous powers have ever been generous with their votaries, until they claim their price. That the lies are necessary shows that it cannot bear inquiry, and is always the mark of heresy - with which compromise is death. 𝘗𝘶𝘳𝘨𝘦 𝘵𝘩𝘦 𝘚𝘢𝘯𝘵𝘢 𝘤𝘶𝘭𝘵𝘪𝘴𝘵 𝘴𝘤𝘶𝘮!
1,896
24
1.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 12Indeed, it is by their logic. Which is why you do them a favor by pinning down their logic *now*, so that when AI is created, they (or at least, everyone else) might actually learn something from it.
328
12
3.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 12Because modus tollens en.wikipedia.org/wiki/Modus_tol… . If (souls exist) -> ~(creation of AI), then by modus tollens, (creation of AI) -> ~(souls exist).
(Or more informally: "if souls mean something is impossible, and you do the impossible thing, then there must not be souls.")
148
7
4.7%
View Tweet activity
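The informal version can be checked mechanically by exhausting the four truth assignments:

```python
from itertools import product

def implies(p, q):
    # material implication: p -> q
    return (not p) or q

# ((souls -> not-AI) and AI-created) -> not-souls holds on every assignment:
tautology = all(implies(implies(s, not a) and a, not s)
                for s, a in product([True, False], repeat=2))
print(tautology)  # True
```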
𝔊𝔴𝔢𝔯𝔫@gwernDec 12To reiterate: "there is little effect on total fertility."
39
3
7.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 11(Amusing intro. Shades of "and now it is our turn to study statistical mechanics"...)
2,091
22
1.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 11Kinda weird. Why would inserting padding help? The BPEs of the letters would be the same regardless, as long as there's at least one whitespace in between, I'd assume. Are you sure this works reliably and you're not just hitting the RNG until it works?
577
1
0.2%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 11Er, the causality is pretty obviously at least partially the other direction, there's nothing odd about that! You'll notice the absence of eating monkey 'bush meat', or Koreans abandoning practices like beating dogs to death to make soup tastier. Let's not be disingenuous here.
153
6
3.9%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 11(I mean, just consider how close in history you have to be to AI to even express the 𝘪𝘥𝘦𝘢 of 'AI' to begin with! (Sorry no, Hephaestus or the Mechanical Turk don't count.))
165
11
6.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 11If you condition on observer-moments who are sophisticated enough adults to reflect 'I was born in time to see AI', then you have to remove all of the deceased children, uneducated, etc. Just the dead children alone drops that to more like 50b, so >20%.
425
30
7.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 11Maybe another two dozen articles and op-eds in the NYT & Chronicle would have helped them not be 'hacked' and prepare for 'a societal trust collapse, at scale'.
2,242
32
1.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 10"…
- All supplementary data files have been uploaded: [Y]
- All listed authors have approved the final draft: [Y/n]"
(・・;)
735
25
3.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 10Yep. Feynman also has a famous version of this, with his talk on the mice maze-running experiment.
(FWIW, we've never been able to track down a source for Feynman's mouse anecdote. It probably just isn't published, but... an unfortunate 🤔 caveat to an excellent point.)
161
12
7.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 10The good news is that once we do replicate human intelligence, we will have disproven the existence of souls (and God lesswrong.com/posts/NKaPFf98…), so we don't need to worry anymore afterwards about our souls being damned for hubris or anything!
𝔊𝔴𝔢𝔯𝔫@gwernDec 10The real question: how well does the extracted prompt work for Jasper-like results and how well does it predict Jasper outputs (esp as a surrogate for more attacks)?
414
25
6.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 10(I feel like that sometimes whenever I pull together old notes, tweets, IRC comments etc that I've forgotten - that sticking the author 'Gwern' on it is misleading, and it really ought to look more like 'Gwern~2009~, Gwern~2018~, Gwern~2022~ et al'.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 10Not just his wife or editing - the whole runup. Lucas had a circle of people to bounce off of from his indie days, and the 'scenius' helped create the trilogy (but not others). Ever read the earliest published drafts? Dire. (_Secret History of Star Wars_ is a good source).
320
16
5.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 10FWIW I don't recommend using Hakyll for anything complex.
37
0
0.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 10It does, however, seem like an excellent explanation for all questions of the form "why didn't anyone X for Y": if you suck at every kind of "X for Y", then you probably aren't going to do much of X or indeed, even think it possible to get more Y.
462
53
11.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 10(You think we have any idea what the choices are that go into making a Chihuahua different from a St Bernard, or the ancestral maize into today's sweet-corn? Or that Evolution itself has any idea what a 'choice' is?)
𝔊𝔴𝔢𝔯𝔫@gwernDec 9(_nota bene_: 'generate every possible image by leave-n-out keywords to show the user to ablate it' is the sort of thing that fast GAN sampling makes trivial, but in the culture of poverty of diffusion image generation, sounds unthinkably extravagant & slow so no one does it.)
131
15
11.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9There may be something to that. Where the text embedding just gets overloaded, or perhaps averages out on too many different latent dimensions to extreme mediocrity.
Should image gen tools build in ablations automatically, and try to guide you to the smallest possible prompt?
2,070
39
1.9%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9(However, I also think that breeding dogs for intelligence is both useless and probably very harmful to them, and no one who likes dogs & cares about their welfare should want it if they think about it for more than a few seconds.)
184
16
8.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9Yes, genomic selection is pretty much always more effective. But think like 2x, not 20x. (And of course, comes with its own challenges.) Breeding dogs for intelligence fast is very possible, I'm just saying his specific numbers are garbage even granting the assumptions.
𝔊𝔴𝔢𝔯𝔫@gwernDec 9Oh, it totally knows what it means. It's just that it can't quite do it. It's part of the bizarre pattern of strengths & weaknesses which has everyone completely confused. It'll do something perfectly which BPEs should make impossible... and then fail at an easier thing next line pic.twitter.com/3k50tdQwAE
203
21
10.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9If anyone is wondering, most of the math in this is wrong because he omits heritability completely (low, in hard-to-measure dog behavioral traits), so his reinvention of truncation selection is wrong. He should have read Lynch & Walsh.
3,705
162
4.4%
View Tweet activity
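The omitted correction is the breeder's equation, R = h²S. A sketch with stdlib normal quantiles (the h² values here are illustrative, not estimates for any actual dog trait):

```python
from statistics import NormalDist

def response_per_generation(h2, top_fraction, sd=1.0):
    """Breeder's equation R = h^2 * S, with S = i*sd under truncation
    selection of the top `top_fraction` (intensity i = pdf(z)/p at cutoff z)."""
    n = NormalDist()
    z = n.inv_cdf(1 - top_fraction)
    i = n.pdf(z) / top_fraction
    return h2 * i * sd

# Selecting the top 10% each generation: ignoring heritability (h^2 = 1)
# vs a hard-to-measure behavioral trait at, say, h^2 = 0.25.
print(response_per_generation(1.0, 0.10))   # ~1.75 SD/generation (naive)
print(response_per_generation(0.25, 0.10))  # ~0.44 SD/generation
```

Compounded over generations, the naive h² = 1 arithmetic overstates progress ~4x here, and worse at lower heritabilities.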
𝔊𝔴𝔢𝔯𝔫@gwernDec 9Yeah, ChatGPT handles the basic rhymes much better, but it's still fragile. You can get it to now write a perfectly rhyming completion... But it'll insist on ignoring the specified rhyme scheme and keep trying to make it very regular couplets or quatrains. eg: pic.twitter.com/fXvbzrg0Cm
185
6
3.2%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9Just to point out one salient detail, Scott's ACX is literally his job, while Cold Takes is not really Karnofsky's job. And when it was, when he was running GiveWell, he spent a lot more time on communication in the Yahoo email groups, LW, the GW blog & its open threads, etc.
157
7
4.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9imagine the hubris of ignoring all the second-order effects and thinking that you can know anything about the long-term effects being net good after exponential amplification of consequences like pop growth, and simply talking about saving lives and qalys from a terrible disease
249
13
5.2%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9I wasn't impressed because their prompts arxiv.org/pdf/2210.14986… didn't even include an inner-monologue, so I stopped reading there. These authors know better. Lots of hard work is no substitute for trying the single most obvious prompt improvement. cf reddit.com/r/MachineLearn…
2,306
99
4.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 9The Tay claim remains wrong, and the BERT one seems suspicious too. There is no evidence in twitter.com/soft/status/14… that it was BERT, and Google 'featured snippets' existed (and were famously getting things wrong) well before BERT. Are they just speculating here?
4,652
95
2.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 8For Lensa, of course. VCs can't 'invest in Lensa-type apps'. That's not a thing. You can only invest in specific companies. They can explode all they like, but if they don't *keep exploding* then they're garbage as investments. Lifestyle or small businesses.
𝔊𝔴𝔢𝔯𝔫@gwernDec 8I, uh, am not sure this comparison works given the Westermarck effect and the active policing of incest laws to extremely strong public support (and incidentally, adults cannot do what they wish in [checks WP] >48 of 50 US states).
6,663
97
1.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 8Agreed that people underestimate UX, but... this may also be the peak. Haven't several of these already come and gone for SD alone? How many of these '$X/day' have shown legs? Can you name any really big ones still around that started based on, say, the BigGAN G release in 2019?
5,084
132
2.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 8If she was any more wan she'd have to be dying of tuberculosis.
294
10
3.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 8A curious contrast: 'The Syllabus' does popins *to the side*, fixed at the top. (No persistency.) Has the obvious failure mode of popping in at the top-right for distant links... Fitts, yes, but still feels awkward. I don't like it. pic.twitter.com/xT6FhEzf8W
906
31
3.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 8I always read it as in part selection: we see the BG as ice-queen-bitch-goddesses because the novels focus on scenes like 'Harkonnen assassins are trying to rape me before disposing of my corpse' ie. all the moments in time where the rubber hits the road & shit is srs bsns.
58
1
1.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 8'create another form, which takes net revenue for a small business incorporated as llc in California, and calculates server-side the federal, and state taxes to be paid. Account for all tax brackets'
smh it doesn't even understand taxes are 𝘸𝘪𝘵𝘩𝘪𝘯 each bracket. 🤦♂️ useless pic.twitter.com/Cz23Z7nZIA
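The mistake is easy to show in code: a minimal marginal-bracket calculator (toy brackets, not real federal or California figures):

```python
# Taxes apply *within* each bracket: each rate hits only the slice of
# income inside that bracket, not the top rate applied to everything.
BRACKETS = [(0, 0.10), (10_000, 0.20), (50_000, 0.30)]  # (lower bound, rate)

def marginal_tax(income):
    tax = 0.0
    for i, (lo, rate) in enumerate(BRACKETS):
        hi = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if income > lo:
            tax += (min(income, hi) - lo) * rate
    return tax

print(marginal_tax(60_000))  # 10k*0.10 + 40k*0.20 + 10k*0.30 = 12000.0
# The 'flat top rate' error would instead give 60k*0.30 = 18,000.
```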
𝔊𝔴𝔢𝔯𝔫@gwernDec 7My observation is that while a lot of people are OK, a lot more people don't seem to update at all. They aren't going 'my goodness, of course GPT-3 models have been getting better every day, obviously, but this is even better than I had been inferring' but 'it got better??!?!??'
85
4
4.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7If people were updating the latter, they should be disappointed as often as surprised. But it should *never* be a surprise that AI systems were slightly better today than yesterday, and will be slightly better tomorrow. (The disagreement is how big 'slightly' is and will sum to.)
54
8
14.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7You're conflating things here. You should be rationally expecting the current, latent, unobserved abilities to be slightly greater each day due to continuous inputs like researchers+data+FLOPS. Whether you adjust long-term forecasts up or down at each revelation is different.
55
2
3.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7Why would it not be continuous? DL systems are certainly not discrete. There's no global clock ticking where every system worldwide gains 0.1% ImageNet accuracy when the week rolls over at 12:01AM Monday. The GPUs are always going brrrr.
𝔊𝔴𝔢𝔯𝔫@gwernDec 7(Sure, they grow fast, but the disease and parasite rates indicate that there's considerable suffering and low QALYs!)
287
3
1.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7(Poisson clumping applies not just to news themselves, but to publicity and infrastructure, like free web interfaces. And you get clumping *even with* independence, so imagine the clumping due to dependencies/correlation of the foregoing. Hence: bursts of panic, then stretches of complacency.)
115
6
5.2%
View Tweet activity
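"Clumping even with independence" is easy to see by simulation (illustrative event counts):

```python
import random

random.seed(42)  # deterministic demo

# 12 'AI news events' dropped independently & uniformly over 365 days still
# clump: some short stretches catch several, long stretches catch none.
days = sorted(random.uniform(0, 365) for _ in range(12))
gaps = [b - a for a, b in zip(days, days[1:])]
print(f"mean gap {sum(gaps)/len(gaps):.0f} days; "
      f"shortest {min(gaps):.1f}, longest {max(gaps):.0f}")
```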
𝔊𝔴𝔢𝔯𝔫@gwernDec 7I think that's part of it: even if you aren't publishing a paper, there's still pressure to get things released so you can talk about them, recruit with them, or just get them out the door before everyone takes a full week off to travel+recover.
Then there's Poisson clumping too
200
10
5.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7People *should* have been, last month and the month before etc, smoothly incorporating each day the knowledge that DL systems were slightly smarter than yesterday and never getting worse... but we aren't good at that. So updates proceed 'by creeps and jerks', to borrow a phrase.
507
42
8.3%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7We're in the quarterly freakout where a conference triggers a bunch of bottled-up progress + "today's 10,000" and people who convinced themselves that everything is normal are reminded that's not true.
Similar to the cluster with Gato + DALL-E 2 + Chinchilla + Minerva earlier.
5,509
195
3.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7I joked "and she ghosted 20 men to the machine's 10, lawd lawd lawd / then she ghosted 20 to 10! / she laid down her Tinder app, typing 'goodbye' / "anything you can do I can do better" she sighed / "I can swipe anything better..." / and logged off forever."
It's better 😢 4⁄4: pic.twitter.com/lGk1uSddmh
1,402
101
7.2%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7The way I put it is Kurzweil was sorta right about stuff involving software/data/information-processing or 'bits' (beyond just the AI projections based on compute - which still make me *so* mad), but then badly wrong anywhere it came to hardware or biology ('atoms').
511
36
7.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7Another example: freeing one from the tyranny of either exhausting hours organizing objects or useless computer-legible orders, by instead minimizing distance between adjacent embeddings: twitter.com/gwern/status/1…
Perfect for 'auto-sorting' nameless notes etc.
10,956
100
0.9%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7Note that if the overall embedding doesn't work, you can simply tune it based on the embedding of a specific point such as a query/keyword. Embed it, weighted multiply all the others by it (or something), then TSP an order.
255
7
2.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7(This is an idea I've been mulling for organizing my text snippets/notes/annotations as well, where title/date/URL don't sort well: tSNE the OA API embeddings I already have, and then use a TSP or greedy heuristic to put them into a quasi-logical order by semantic similarity.)
11,238
50
0.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7The fallow assumption sounds dubious because ChatGPT is clearly sharing GPU resources with the regular API & playground; that's why the playground is constantly erroring out right now, due to ChatGPT load.
466
23
4.9%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7This will wind up inevitably producing some abrupt transitions between clusters, but that tells you where the natural categories are, and you can easily drag-and-drop the cluster of files into folders & redo the trick inside each directory.
225
4
1.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7Categories may be too rigid for 'loose association' and still don't have internal structure. My suggestion: tSNE (preserves local geometry) CLIP down to 2D (for interpretability) then find a shortest path connecting all images. Now you can `ls` them in "loose-association order".
519
21
4.0%
View Tweet activity
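The "loose-association order" can be sketched with a greedy nearest-neighbor walk, the simplest stand-in for a real TSP heuristic (toy 2D vectors standing in for tSNE'd CLIP/API embeddings):

```python
# Greedy nearest-neighbor walk over (name, vector) pairs: each step visits
# the closest not-yet-visited item, so semantically similar items end up
# adjacent when you `ls` them in this order.
def greedy_order(embeddings):
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    items = list(embeddings)
    order = [items.pop(0)[0]]            # start arbitrarily at the first item
    pos = {k: v for k, v in embeddings}
    remaining = [k for k, _ in items]
    while remaining:
        nxt = min(remaining, key=lambda k: dist(pos[order[-1]], pos[k]))
        remaining.remove(nxt)
        order.append(nxt)
    return order

notes = [("cats", (0.0, 0.0)), ("finance", (5.0, 5.0)),
         ("dogs", (0.1, 0.2)), ("stocks", (5.1, 4.8)),
         ("pets", (0.3, 0.1))]
print(greedy_order(notes))  # ['cats', 'dogs', 'pets', 'stocks', 'finance']
```

The abrupt jump from 'pets' to 'stocks' is exactly the kind of transition that marks a natural cluster boundary, per the drag-and-drop-into-folders follow-up.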
𝔊𝔴𝔢𝔯𝔫@gwernDec 7SD2.1 might also be worth a try. The newer CLIP embeddings should have a better relationship/entity understanding so you don't get the bottom two.
121
1
0.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 7Yes, InstructGPT/ChatGPT exemplify the advantage: they're not really smarter than the baseline - and worse in many respects - but they're a lot easier to *use* for what we *want*.
Data collection like inner-monologue or active learning is also going to 'agentfy' very soon, IMO.
𝔊𝔴𝔢𝔯𝔫@gwernDec 7This has got to be one of the least credible epigenetic or whatever results I've ever seen. No one would ever come up with that prediction a priori.
𝔊𝔴𝔢𝔯𝔫@gwernDec 6Possibly a resolution issue too. I couldn't get it to look like a dog when I stared at the thumbnail, and had to fullsize it to make it switch; then going back to thumbnail, I can't make it stay 'dog' easily or consistently.
32
1
3.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 6Yes. We do that for JS/CSS/HTML already, but the templating system is brandnew (added as part of the speed optimizations + big rewrite to fix persistent bugs) and so didn't have any versioning in it. Now soon it will...
52
2
3.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 6Have you tried anything like that yet? How's it going?
87
0
0.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 6It's a cache problem (hardest problems in CS etc): my mobile Chrome was storing an outdated template file fetched via XHR (and unaffected by refresh) which failed to fetch a *new* template now necessary to show live popins. 🤦♂️ Had to look up how to really flush a mobile cache...
7,393
34
0.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 6(Hm, more of a fantasy vibe IMO. "—and all will love me and despair!" etc)
𝔊𝔴𝔢𝔯𝔫@gwernDec 6I can't believe we're implementing a JS console to debug a problem (popins failing) which exists 𝘰𝘯𝘭𝘺 on Chrome smartphone, & not on any other browser or mobile simulator, because mobile Chrome won't let you use a console any other way other than remote debug (which is broken). pic.twitter.com/FyyjFMKFwj
𝔊𝔴𝔢𝔯𝔫@gwernDec 6(Hm, does this distinction really hold considering that all LMs are being trained on datasets which have been heavily filtered and further up/downweighted, and increasingly naturalistically populated by people talking about good/bad LMs prompts/completions?)
1,156
19
1.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 5I think it has more to do with housing regulations and flophouses costing a few bucks a day being outlawed in all cities, regardless of whether that did any good or hurt the poor.
𝔊𝔴𝔢𝔯𝔫@gwernDec 5Please don't! Or at least, not online - I've been working on a Qanon/language model short story which ends that way and it'll be boring if everyone is going around quoting the ending but with language models before I can finish. 😭
297
24
8.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 5Why not? The NSFW+esthetic filter's collateral damage caused what looked like severe anomalies in DALL-E-2/SD-1, so I would expect that if they ramped filtering *way* up for SD2, you would see much greater damage from the wholesale deletion of modes and loss of diversity.
105
7
6.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 5Dynamic evaluation was only for short contexts because that's about all RNNs of that era could handle. Given the much greater sample-efficiency and history of the best Transformers, why not try dynamic evaluation and see if it can help over hundreds of thousands of tokens?
𝔊𝔴𝔢𝔯𝔫@gwernDec 5Or eg ahrm.github.io/jekyll/update/…
I've had mixed results trying to use the perplexity to find key words/mistakes. Sometimes it works, sometimes it doesn't. Characters/BPEs/words might just be the wrong level to work at.
76
14
18.4%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 5They're not >37x less parameter-efficient, that's for sure...
54
3
5.6%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4Some overselling here... Obviously byte/character-level models, which would be the logical comparison, like ByT5, do in fact exist and do all that: pic.twitter.com/zX3otZuzNu
1,854
34
1.8%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4(Well I mean what else 𝘸𝘰𝘶𝘭𝘥 the text have been ? ? ?)
1,756
90
5.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4It has better odds of working than all the other non-cheaty suggestions I see here, I think. Not claiming it has *great* odds, but what sort of probability would you expect given just 1 generic cell to target?
562
21
3.7%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4And they were right. Perceptrons were much closer to how human brains worked than all prior models like mechanical calculators; and RNNs were closer; and LSTM RNNs closer; and Transformers closer still... hermiene.net/essays-trans/r…
460
55
12.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4(But canonically, Buddha *was* a chad, wasn't he? He would've become a badass world-king if he hadn't conquered Mara instead.)
2,944
45
1.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4I would hit an orexin neuron in the hypothalamus, or any neuron in the suprachiasmatic nucleus. It's the smallest set of neurons I know of, which have a generic objective anatomical description, which could seriously sabotage him. Even tiny bits of damage cause narcolepsy etc.
13,963
419
3.0%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4Lossy = lossless compression from the algorithmic information theory/compression perspective, since a lossy compressor can just be automatically converted into a lossless one via arithmetic encoding etc., so nothing would be gained by a 'lossy' Hutter Prize.
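The conversion is mechanical. A toy version, with my stand-ins: a byte-dropping 'lossy model' instead of a real model, and zlib instead of an arithmetic coder:

```python
import zlib

# Any lossy compressor becomes lossless by also transmitting the
# (entropy-coded) residual against its own reconstruction.
data = b"the quick brown fox jumps over the lazy dog " * 20

def lossy_encode(d):
    return d[::2]                       # 'lossy model': drop every other byte

def lossy_decode(enc, n):
    # crude reconstruction: duplicate each surviving byte
    return bytes(b for x in enc for b in (x, x))[:n]

enc = lossy_encode(data)
recon = lossy_decode(enc, len(data))
residual = bytes(a ^ b for a, b in zip(data, recon))  # what the model got wrong
lossless_size = len(zlib.compress(enc)) + len(zlib.compress(residual))

# Lossless round-trip: reconstruction XOR residual recovers data exactly.
restored = bytes(a ^ b for a, b in zip(recon, residual))
print(restored == data)  # True
```

The better the lossy model's guesses, the cheaper the residual codes, which is why the distinction buys nothing for a compression prize.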
𝔊𝔴𝔢𝔯𝔫@gwernDec 4Hutter Prize is de facto this (and thus, about as relevant to actual AI as demoscene programming tricks are to regular programming).
2,896
31
1.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 4These sorts of results usually survive blurring/downscaling, and that wouldn't test the color hypothesis more than a lot of other possible signals.
𝔊𝔴𝔢𝔯𝔫@gwernDec 4(I should've tried "hunter2" to see if that worked too!)
93
7
7.5%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 3(BTW, if anyone has a transcript or screenshot of one of the 'computer' sessions from AI Dungeon 2 in Aug 2020 or so, from /aidg/ or /vg/, please send me it. I didn't realize how important those hacks would become for both security & monologue and didn't carefully save them. 😢)
3,204
227
7.1%
View Tweet activity
𝔊𝔴𝔢𝔯𝔫@gwernDec 3I have to say, this is all very nostalgic. After all, the original inner-monologue/chain-of-thought AI Dungeon 2 work on 4chan/Twitter back in August 2020, although I think the threads may not have been archived, often centered around sitting down to "a computer" and working...
𝔊𝔴𝔢𝔯𝔫@gwernDec 3I tried my luck by sshing into the OpenAI server to talk to the prototype ChatGPT directly (after, 😮💨, making sure to `pip install requirements.txt` first), but didn't seem to help much... pic.twitter.com/8wzerb4ZpS
1,985
122
6.1%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3Unfortunately, it's hard to get a decent IRC convo going: it degenerates pretty rapidly into repetition/agreement/milquetoast stuff. (The RL tuning again, presumably.)
The cat discussion is particularly risible. pic.twitter.com/U8hoffR9VL
1,746
168
9.6%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3Ugh, I swear, is there anything 𝘮𝘰𝘳𝘦 tedious than installing dependencies on a brand new machine, when you just want to hop on IRC to chat? pic.twitter.com/0miZwKnNIC
2,290
367
16.0%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3To be clear, it'd probably cost $100/month anyway, but pushing it to the really useful uses, with the full prompts likely required to pack in factual knowledge & customization, would make it cost more like $10k/month. You could burn through a crazy number of tokens stuffing in stuff.
76
2
2.6%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3If you have to hunt for the junk, then I think that rather renders the point moot.
72
2
2.8%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3Missed a beat there, then, it should be persistent per user! I would've made it a hash (bit) based on... IP address, I think, unless you wanted to get really crazy with supercookies/persistent tracking just for the sake of a joke.
364
11
3.0%
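The "hash (bit) based on IP address" idea above is a one-liner; here is a minimal sketch (the function name and the sample address are hypothetical, and as the tweet notes, a supercookie-style identifier could be substituted for the IP):

```python
import hashlib

def user_bit(ip_address: str) -> int:
    """Derive a stable 0/1 'coin flip' from an IP address, so the joke is
    persistent per user instead of re-randomized on every page load.
    (Adding a secret salt would keep the bit from being guessable.)"""
    digest = hashlib.sha256(ip_address.encode()).digest()
    return digest[0] & 1  # lowest bit of the first digest byte

# Same IP always yields the same bit:
assert user_bit("203.0.113.7") == user_bit("203.0.113.7")
```

Hashing rather than storing state means no database is needed: the "persistence" falls out of determinism.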
𝔊𝔴𝔢𝔯𝔫@gwernDec 3Not if you want to give people a chance to use it for a while (always a fatal flaw with various personal tools) or have a market outside the 1% of the 1% or extend it to all the other sources of text context/data you need to march down the long tail of accuracy/use-cases...
93
1
1.1%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3(At least with the OA API, the 'finetune' won't easily pick up a single prompt of knowledge, will cost several times more to sample, and will be obsolete within minutes as the user accepts an invitation and the email assistant knowledge is updated. So, no bueno.)
3,424
33
1.0%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3Let's say you want an email assistant. You can fit a lot of facts about your plans/schedules into a prompt... but if you run a full prompt on all emails + token-at-a-time decoding, this would cost you like $100s/month/person on OA API! But you can't finetune it either.
3,965
50
1.3%
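The "$100s/month/person" figure above is a back-of-envelope estimate; a sketch of the arithmetic, where every number (per-token price, prompt size, email volume) is an illustrative assumption rather than actual OpenAI pricing or measured usage:

```python
# All inputs are assumed/illustrative, not quoted OpenAI figures.
price_per_1k_tokens = 0.06   # assumed davinci-class rate, $ per 1k tokens
prompt_tokens = 2000         # full instructions + facts/schedule context
completion_tokens = 200      # drafted reply, decoded token-at-a-time
emails_per_day = 50

tokens_per_email = prompt_tokens + completion_tokens
monthly_cost = (tokens_per_email / 1000) * price_per_1k_tokens \
               * emails_per_day * 30
print(f"~${monthly_cost:.0f}/month/person")
```

Because the fixed prompt dominates the token count, almost all of that cost is re-paying for the same prompt on every email, which is what motivates the caching idea in the rest of the thread.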
𝔊𝔴𝔢𝔯𝔫@gwernDec 3Most obvious solution is to cache the hidden state/activations of the prompt. This can be done by exploiting the Transformer/RNN isomorphism: then you can simply run the RNN once, save hidden state, and invoke it thereafter. This would let you update it as the user updates it.
2,262
40
1.8%
𝔊𝔴𝔢𝔯𝔫@gwernDec 3A missing piece from all LM APIs/SaaS AFAIK: really lightweight prompt caching. As LMs start following long detailed instructions, now we really do have usecases for >1024BPE prompts. But cost remains astronomical to reprocess a fixed prompt every time. 'Finetuning' doesn't help.
1,723
76
4.4%
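The hidden-state caching idea in the thread above is easiest to see with a recurrent model: run the fixed prompt once, snapshot the state, and resume from the snapshot for each query. A toy pure-Python sketch (the `ToyRNN` recurrence is a contrived stand-in; a real implementation would cache a Transformer's key/value activations or a converted RNN's hidden state):

```python
import copy

class ToyRNN:
    def __init__(self):
        self.h = 0.0  # hidden state (a single float, for illustration)

    def step(self, token: int):
        # Contrived recurrence standing in for the real state update.
        self.h = 0.9 * self.h + 0.1 * token
        return self.h

def run(model, tokens):
    for t in tokens:
        model.step(t)
    return model.h

prompt = [3, 1, 4, 1, 5]          # the long, fixed instruction prompt
query_a, query_b = [9, 2], [6, 5]

# Without caching: reprocess the whole prompt for every query.
slow_a = run(ToyRNN(), prompt + query_a)

# With caching: process the prompt once, snapshot, resume per query.
base = ToyRNN()
run(base, prompt)
cached = copy.deepcopy(base)       # this is the saved "prompt state"
fast_a = run(copy.deepcopy(cached), query_a)
fast_b = run(copy.deepcopy(cached), query_b)

assert abs(slow_a - fast_a) < 1e-12  # identical result, prompt cost paid once
```

When the user edits the prompt, only the suffix from the edit point onward needs rerunning before re-snapshotting, which is the "update it as the user updates it" property.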
𝔊𝔴𝔢𝔯𝔫@gwernDec 3Believe this anecdote or not, I reported the same kind of spam all the time with Twitter Classique™ with little discernible effect.
But there's one thing that doesn't change, whether it's old or new Twitter... 🙄 pic.twitter.com/spowesMQbI
480
63
13.1%
𝔊𝔴𝔢𝔯𝔫@gwernDec 2And I get like a fifth of what I did in the pre-Musk era, but you don't see me broadcasting that to all my followers. Perhaps it's hard to generalize from self-selected anecdotes.
𝔊𝔴𝔢𝔯𝔫@gwernDec 2The rhyming remains really weird. Long flawless sequences of what look like rare rhymes, and then it'll completely flub it and not rhyme at all like in your example.
505
17
3.4%
𝔊𝔴𝔢𝔯𝔫@gwernDec 2〜when the central government requires writers to encourage virtue and chastise vice in their fiction〜
𝔊𝔴𝔢𝔯𝔫@gwernDec 2(Everyone who cared read about it back in July, though.)
4,290
87
2.0%
𝔊𝔴𝔢𝔯𝔫@gwernDec 2All of which would be useful to know but is of course not reported in the essentially non-existent publications...
One of these days I'd like to wander up to the NIH archives since they apparently have all his papers. I don't expect a smoking gun of fraud, but you never know.
90
12
13.3%
𝔊𝔴𝔢𝔯𝔫@gwernDec 2So, you say inbreeding would produce an initially healthy population but then a temporary population decline which one could then cherrypick and publish about how sickly and sterile the population became in 'mouse utopia' while omitting to publish any followups...? 🤔
57
15
26.3%
𝔊𝔴𝔢𝔯𝔫@gwernDec 2Sounds in line with prior results. (Doesn't shed much light on the Egyptian domestication hypothesis, though, from skimming. Lots of Egyptian ancestry... but you'd expect that either way.)
337
6
1.8%
𝔊𝔴𝔢𝔯𝔫@gwernDec 2They're not really a step forward, though. You could do all that with davinci in July 2020 with a little prompting or luck (see my page or original tweets), and LaMDA apparently does it zero-shot right out of the box last year.
𝔊𝔴𝔢𝔯𝔫@gwernDec 1("In the end, perhaps the most 𝘩𝘶𝘮𝘢𝘯 part of the AI was its ability to 𝘪𝘮𝘢𝘨𝘪𝘯𝘦 a different world—"
"You mean a better world?"
"What? Oh my goodness no.")
153
16
10.5%
𝔊𝔴𝔢𝔯𝔫@gwernDec 1(GPT-3 has always been amazing at text style transfer.)
𝔊𝔴𝔢𝔯𝔫@gwernDec 1(Presumably, if they had been surveyed about "by what year will machines be superhuman in math, not just Putnam competition style questions", given their pessimism, it would then have been many more years beyond Putnam/2050...)
85
6
7.1%
𝔊𝔴𝔢𝔯𝔫@gwernDec 1If you have 'superhuman math', broadly defined, that would seem to require at least reaching this Putnam competition goal, a fortiori, which is why I highlighted it; as I'm not aware of any 'superhuman math' expert forecasts, this is the best we have free of hindsight/goalpost-moving.
78
6
7.7%
𝔊𝔴𝔢𝔯𝔫@gwernNov 30Well, I joked that since transcludes just transclude raw HTML, we could just link the final HTML snippet in `/metadata/.../foo.html#ID` and it'd work, right? After thinking about it, I realized this was perfect. So, another transclude type & voila.
Transcludes are flexible.
7,541
18
0.2%
𝔊𝔴𝔢𝔯𝔫@gwernNov 30V obvious implementation just using existing link ID + transclude for page←page, or page←annotation, or annotation←page... but what about annotation↔annotation? You can't "link" an annotation, the point is that you link the URL and the annotation is a transparent wrapper! pic.twitter.com/8KGwvwNt29
5,272
29
0.6%
𝔊𝔴𝔢𝔯𝔫@gwernNov 30Another transclude feature (inspired by IEEE's HTML papers, of all sites; even they can do something right, turns out): if we have backlinks & cross-page transcludes, why not do 𝘣𝘰𝘵𝘩 to show the reverse citation context?
Harder than it should've been, but that's live now: pic.twitter.com/FDericx577
2,630
46
1.7%
𝔊𝔴𝔢𝔯𝔫@gwernNov 30Implemented now. It's not quite off-screen rendering in the GUI/game-engine sense, since you can't swap in pixels in any meaningful sense, but close enough. Popups should have much less tail latency - feels much snappier.
5,912
24
0.4%
𝔊𝔴𝔢𝔯𝔫@gwernNov 30sorry, 'death of the author', I don't make the rules: this AGI tweet nao
455
20
4.4%
𝔊𝔴𝔢𝔯𝔫@gwernNov 30Finite automata are perfectly interesting and respectable Turing machines to train on predicting. Lots of my favorite Turing machines are finite automata. (I suppose technically that includes all the TMs I've ever actually run, too.)
74
1
1.4%
𝔊𝔴𝔢𝔯𝔫@gwernNov 30One interesting pretraining dataset would be OEIS: en.wikipedia.org/wiki/On-Line_E… Not used AFAIK, doesn't look easily exposed to CC, and it's unique in being the largest curated selection of interesting small Turing machines known to mankind.
4,034
66
1.6%
Monthly summary (31 days, daily frequency):
Engagement rate: 2.6% (Dec 31: 2.0%)
Link clicks: 6.3K (Dec 31: 79); on average, 204 link clicks per day
Retweets without comments: 1 (Dec 31: 0); on average, 0 per day