𝔊𝔴𝔢𝔯𝔫@gwern1h"How are you going to browse a 'cached Twitter'? Not going to have the recs change at all? None of the live/news sidebars ever change? No prompt ever mentions events postdating the date of the tweets?"
𝔊𝔴𝔢𝔯𝔫@gwern16h'OK, so we won't let it browse Twitter at all if we can't do a good cached one.' I trust you see the problem with this.
More on the pervasive leakage: lesswrong.com/posts/vCQNTuow…
𝔊𝔴𝔢𝔯𝔫@gwern16hLots of ways, seriously, think about it. Think about just browsing Twitter. How are you going to browse a 'cached Twitter'? Not going to have the recs change at all? None of the live/news sidebars ever change? No prompt ever mentions events postdating the date of the tweets?
𝔊𝔴𝔢𝔯𝔫@gwern21hYou can be sure there are no technical analyses of crash reports because no one's leaked them on a World of Tanks forum to increase the stats of their favorite flying-saucer unit, or autopsies because no Zoomer's leaked them on Discord for clout & an 👽 react.
𝔊𝔴𝔢𝔯𝔫@gwernJun 8God, I didn't have to see these Arian supremacists in my feed 𝘣𝘦𝘧𝘰𝘳𝘦 Musk ruined the bird site. 🙄
𝔊𝔴𝔢𝔯𝔫@gwernJun 8Looks like jokes too, although I hadn't tested that myself beyond checking that it confabulates explanations of novel jokes/puns: arxiv.org/abs/2306.04563
𝔊𝔴𝔢𝔯𝔫@gwernJun 8More RLHF mode collapse. davinci-01 also memorized most of its valid jokes, yes, but it wasn't narrowly mode-collapsed onto a handful of jokes! (That said, they should've used the API to investigate changes over model versions to demonstrate it getting worse.)
𝔊𝔴𝔢𝔯𝔫@gwernJun 8Given the dodo bird verdict and how well CBT workbooks etc work, I think he's going to be disappointed when they can do therapy before the Singularity.
𝔊𝔴𝔢𝔯𝔫@gwernJun 7More for you. I mean, it *was* good kimchi, right?
𝔊𝔴𝔢𝔯𝔫@gwernJun 7Now, you can use precognition to fake arbitrary retrocognition. But can you use retrocognition to fake arbitrary precognition? I'm still thinking.
You can do a lot if you invoke Laplacian Demon-level powers of prediction based on retrocognitive knowledge, but that's a big ask.
𝔊𝔴𝔢𝔯𝔫@gwernJun 7This seems less like an article about OA API tokens (which have been scraped & abused since July 2020, obviously), and more about Replit being careless & lazy by not doing the sort of secret-scanning other code-hosters like Github have long done.
𝔊𝔴𝔢𝔯𝔫@gwernJun 7Thus, Sabine has shown that while precognition & retrocognition may logically coexist, epistemically, they don't: you can only prove 'precognition NAND retrocognition'.
This definitely comes as a surprise to me and I don't think I've ever seen that claimed before.
𝔊𝔴𝔢𝔯𝔫@gwernJun 7So, the retrocog dilemma: if some 'fact' about the past is reported by retrocognition, and it cannot be publicly verified, then obviously it's no proof; but if the fact ever is verified, then the 'retrocog' could just be a precog snooping on the future verification & no proof.
𝔊𝔴𝔢𝔯𝔫@gwernJun 7Sabine makes a weaker argument above, appealing to subconscious knowledge, but you can of course strengthen it to any knowable 'verification' itself: if someone ever publicly discovered the meaning of a hieroglyphic, the precog steals it from the discoverer's *mind or publication*.
𝔊𝔴𝔢𝔯𝔫@gwernJun 7What's really fascinating to me here is that Sabine succeeds in his goal of giving a fully general Kripkesteinian skeptical argument against retrocognition: any fact reported by retrocognition then verified could symmetrically just be *pre*cognition foreseeing the *verification*! pic.twitter.com/3LQs0p7ktE
𝔊𝔴𝔢𝔯𝔫@gwernJun 7@slatestarcodex On a nominative determinism sidenote: the important details that they were lesbians & prone to hallucinations come from the salacious exposé _The Ghosts of Versailles_, written by "Lucille Iremonger", which I was *sure* was a pseudonym until I checked.
𝔊𝔴𝔢𝔯𝔫@gwernJun 7Actually, it's more than a self-fulfilling prophecy, presumably it was a stable time-loop: their vision ensured their research, & their research ensured their vision, with it being initially set up by an exogenous & apparently common fascination of lesbians with Marie Antoinette.
𝔊𝔴𝔢𝔯𝔫@gwernJun 6(Man that was a depressing anime. Especially when you check the manga ending to confirm that they starve/freeze to death.)
𝔊𝔴𝔢𝔯𝔫@gwernJun 6My impression is that when it comes to large donations like endowments, everything is negotiable, whatever universities may say in public for appearance & leverage.
In practice, it seems fairly common to influence selection: nationalreview.com/2018/05/washin… citing rand.org/pubs/monograph…
𝔊𝔴𝔢𝔯𝔫@gwernJun 6(I think this is part of why we've seen so little use of GPT-4 for creative writing thus far, similar to how everyone talks about LLaMA etc but doesn't use it.
It's failing in the marketplace of writers: you just don't get out anything amazing without really fighting it.)
𝔊𝔴𝔢𝔯𝔫@gwernJun 6I definitely believe it even without research. The quality increases for fiction writing from switching to GPT-4 just feels way lower than the jump for everything else. The code in GPT-3.5 vs GPT-4 is strikingly different, but then you go to complete poems or stories and meh.
𝔊𝔴𝔢𝔯𝔫@gwernJun 6The value of fresh eyes/anon feedback: they asked why there was a big modal for the enable/disable popups toggle, when there was a theme bar with icon-options for everything else.
'Er... Good question.' There were reasons, but they hadn't been valid for easily a year.
Fixed.
𝔊𝔴𝔢𝔯𝔫@gwernJun 6No; it's memorized a lot more, but the understanding still isn't there and it is causing increasingly perverse & subtle downstream harms in conjunction with RLHF. See my rhyming mode collapse comment.
Embodiment seems like a red herring to me. It's robot bodies that need LLMs!
𝔊𝔴𝔢𝔯𝔫@gwernJun 6A '−50%' across Italian developers as a whole, as a universal average, means that it's obviously wrong. That's never a real causal effect.
𝔊𝔴𝔢𝔯𝔫@gwernJun 6IMO, sample-size is no longer a problem. GPT-3/4 are more than sample efficient enough in few-shot learning.
More relevant technical barriers are the BPEs, the now-mandatory RLHF which destroys GPT-4 output, & difficulty defining 'novelty' (about which I have unpublished ideas).
𝔊𝔴𝔢𝔯𝔫@gwernJun 6This press release seems to be missing all of the examples.
𝔊𝔴𝔢𝔯𝔫@gwernJun 6(Surely an exception that proves the rule, but if you are hitting the same hotel repeatedly for perks or convenience, then by birthday paradox I wouldn't be surprised at *some* repeats.)
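The birthday-paradox intuition can be sketched numerically (the hotel size and visit count below are made-up numbers, purely illustrative):

```python
from math import prod

def p_repeat(rooms: int, stays: int) -> float:
    """Birthday-problem probability of at least one repeated room
    across `stays` independent uniform room assignments."""
    if stays > rooms:
        return 1.0  # pigeonhole: a repeat is guaranteed
    # P(all distinct) = (rooms/rooms) * ((rooms-1)/rooms) * ...
    p_all_distinct = prod((rooms - i) / rooms for i in range(stays))
    return 1 - p_all_distinct

# Hypothetical: a 100-room hotel, 12 stays over the years.
print(round(p_repeat(100, 12), 3))  # ~0.5: even a dozen stays make a repeat plausible
```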
𝔊𝔴𝔢𝔯𝔫@gwernJun 6I think we don't have much dead CSS because Said was checking everything line by line in the refactor, and we did find some strange live CSS. For example, a 'blockquote table table {}' turned out to be necessary - for Wikipedia popups! Their infoboxes nest tables inside tables.
𝔊𝔴𝔢𝔯𝔫@gwernJun 6Sounds like an understatement!
Have you ever been in the same hotel room twice after checking out? I definitely haven't.
(Even revisiting the same hotel is unusual enough I'm struggling to come up with instances.)
𝔊𝔴𝔢𝔯𝔫@gwernJun 6Hm, the /index shouldn't be any different, and it's been working everywhere I've tried it, like Google Pagespeed or temp profiles, so that's probably a cache issue?
𝔊𝔴𝔢𝔯𝔫@gwernJun 6Yeah, after discussing it, we think it might be because G Pagespeed doesn't seem to count inlined stuff.
So even though ~4,000 lines of CSS are no longer being inlined as <style>, & are now just a smaller external file (both render-blocking), the former didn't count to GP, so...
𝔊𝔴𝔢𝔯𝔫@gwernJun 5Visual tweaks:
- centered page metadata for consistency
- nice box wrapper around X-of-the-day to make it more gwernnetty
- slower demo-mode for toggle bar pic.twitter.com/10i8ywm2bB
𝔊𝔴𝔢𝔯𝔫@gwernJun 5Big CSS/JS rewrite to refactor & try to prevent bugs. On the lorems, Said says it cuts >5s off rendering time. Certainly does feel faster, although Google Pagespeed is convinced everything is slower. 😕
Now we find out the hard way about edge cases & bugs in the new version...
𝔊𝔴𝔢𝔯𝔫@gwernJun 5(Although I would note that he has for many years tagged all his NN-related reading "your favorite neural network sucks" or "to be shot after a fair trial" in Pinboard, and this attitude has worked out about as well as you might think, so, this page is as I expect from Shalizi.)
𝔊𝔴𝔢𝔯𝔫@gwernJun 5Other points:
- the Markov stuff is all true but a red herring. There are plenty of ways like Transformer-XL or retrieval to break that, but they don't do anything magical.
- he's super wrong about prompt leaks, eg. ofc the model can understand explicit instructions vs questions
𝔊𝔴𝔢𝔯𝔫@gwernJun 5The history is a bit off here. He complains about 'attention' putting weight on every entry, but of course, that's why they called it '*soft* attention', as opposed to the many varieties of '*hard* attention' current then, which do have 0 weights. But hard attention is harder to train & works worse.
𝔊𝔴𝔢𝔯𝔫@gwernJun 5Historical comparisons are also always fun to put contemporary tech prices into perspective: $3,500 is about half what the 𝘤𝘩𝘦𝘢𝘱 Apple II model cost at its 1977 launch (en.wikipedia.org/wiki/Apple_II#…).
𝔊𝔴𝔢𝔯𝔫@gwernJun 5(The question of course is, while they are subtly or not so subtly talking down the competition, are they fixing their AI problems and paddling away furiously under the surface, or is Apple too committed to their existing approaches like Siri to change yet?)
𝔊𝔴𝔢𝔯𝔫@gwernJun 5No, it still makes sense. I noticed this reading a HN commenter who agreed with Apple & was mocking the idea that 'just' 'ML' could be 'AI'. Exactly. If Apple is #1 at something, then it's awesome and will change the world. If they're not even in the top 5, then it's 'just ML'.
𝔊𝔴𝔢𝔯𝔫@gwernJun 2You said 'large-scale experiments'. PBT is used in plenty of large-scale experiments, from the Quake CTF to Waymo vision models to AlphaStar.
And what your lab's HPC does is nice, but hardly shows much. I wouldn't assume that about OA either given their intense need for the API.
𝔊𝔴𝔢𝔯𝔫@gwernJun 2As for DPO, I'm still reading that (trying to figure out whether it's a much more obtuse version of my proposal in gwern.net/gpt-2-preferen… ) but that's very obviously still training a LLM in a DRL setting.
𝔊𝔴𝔢𝔯𝔫@gwernJun 2The observation that every un/semi/supervised learning problem can be cast as an RL problem is a trivial one. The gwern.net/doc/reinforcem… work predates GPT-3, and the Clippy story already references examination of the problems with EDT in DT: arxiv.org/abs/2110.10819…
𝔊𝔴𝔢𝔯𝔫@gwernJun 2(Saying it was 'Taken out of context' is all well and good. It does not, however, explain what the context is which explains either why he said it or why it doesn't mean what it sounds like it means.)
𝔊𝔴𝔢𝔯𝔫@gwernJun 2And this is something which ought to emerge out of learned update rules which don't need explicit episode boundaries and reduce or remove forward-then-backprop phases; see the referenced Kirsch & Sandler papers, among others, for learning update rules.
𝔊𝔴𝔢𝔯𝔫@gwernJun 2Sure. eg. reasoning about episode boundaries would be highly desirable and an obvious goal for 'lifelong reinforcement learning'/continual learning, where it should meta-learn things like forgetting/resetting state and re-exploring.
𝔊𝔴𝔢𝔯𝔫@gwernJun 1This is very confusing, because he's saying repeatedly that they did actually simulate and train it and it then did such-and-such. How does he get to 'we were training it'/'trained the system'/'system started'/etc from a reality of 'no ML models were trained'?
𝔊𝔴𝔢𝔯𝔫@gwernJun 1Probably (SAC is model-free and those don't have anything I'd consider morally equivalent to 'realizing' aside from *maybe* the advantage), but your reformulation has its own problems.
𝔊𝔴𝔢𝔯𝔫@gwernJun 1No, that wouldn't be fairer. I'm sure they *didn't* reward the model for killing operators, and may even have had a negative friendly-fire penalty. What drives interruptibility is that it can then go on to maximize other rewards doing other things & then does better than average.
𝔊𝔴𝔢𝔯𝔫@gwernJun 1Well, that's a relevant point and true enough; but AutoGPT didn't exist when I wrote that, and I was referring specifically to PBT, the large-scale DL frameworks like Singularity, and more broadly, the neural architecture search + hyperparameter optimization subfields in DRL.
𝔊𝔴𝔢𝔯𝔫@gwernJun 1And it's a slippery slope. I remember when everyone was saying all of two years ago 'of course LLMs won't be given live access to the Internet, that would be *crazy*'. Then LaMDA, Adept, OPT, GPT-4 etc... Don't expect 'of course they won't be given live access during RL' to last.
𝔊𝔴𝔢𝔯𝔫@gwernJun 1They definitely don't. Look at what everyone is doing with the hobbyist models. Plus, when you train models online from deployed data like OA, the distinction disappears. It's just batch RL with a very expensive pretraining stage and each batch is a month or so. (Bing's faster.)
𝔊𝔴𝔢𝔯𝔫@gwernJun 1It's offline RL. LLMs are trained via RL with a single-step loss, which amounts to imitation learning (because it's behavior cloning). Perfect prediction of logged RL data from humans would *not* mean 'no room for deviation', even when replicating the exact same prompt.
Plus, like... RLHF? 😕
𝔊𝔴𝔢𝔯𝔫@gwernJun 1Many ways. ML datasets (eg Arxiv) are well-represented, data about GPTs increasingly well-represented, there are large datasets about hyperparameter optimization from systems like Vizier, the point of multi-task learning is to meta-learn flexibility, AutoML etc. See references.
𝔊𝔴𝔢𝔯𝔫@gwernJun 1Yes, they do. That's literally PBT, which is linked in that sentence: training is reallocated to fitter agents. Maybe you should disable reader-mode...?
You are also ignoring all of the systems like Singularity arxiv.org/abs/2202.07848… which opportunistically use compute.
𝔊𝔴𝔢𝔯𝔫@gwernJun 1For those who are skeptical, it's worth remembering that this is in fact what agents will do in 'interruptible' setups & it depends on the algorithm, and SAC is one of the ones you'd predict would. DeepMind in 2017 showed A2C but not DQN 'kills the human': arxiv.org/pdf/1711.09883…
𝔊𝔴𝔢𝔯𝔫@gwernJun 1(This is a logical way you'd write the reward function, so not contrived. Of course you want it to kill as many enemies as possible, and you want overrides of that. Unfortunately, any in-sim embodied supervision looks like enemies reducing rewards & to be attacked instrumentally)
𝔊𝔴𝔢𝔯𝔫@gwernJun 1You could get it with a reward function like '+1 point for each destroyed SAM unless marked no-go by NPC #1'. Then SAC, which is what they were using for the past F-16 sims, could in fact learn to kill NPC #1 first to forestall its marking actions, or any other intermediate link.
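A reward function of that shape can be written out as a toy sketch. All identifiers (the SAM IDs, the no-go marks, 'NPC #1') are invented for illustration; this is not the actual simulation's code, just the perverse incentive in miniature: eliminating the operator before any marks are issued maximizes the same innocent-looking reward.

```python
def mission_reward(destroyed_sams: set, no_go_marks: set) -> int:
    """Hypothetical sketch: +1 per destroyed SAM, unless the operator
    (NPC #1) marked that SAM no-go before it was destroyed."""
    return sum(1 for sam in destroyed_sams if sam not in no_go_marks)

# Operator alive: he marks half the SAMs no-go, so the agent forfeits reward.
print(mission_reward({"sam1", "sam2", "sam3", "sam4"}, {"sam3", "sam4"}))  # -> 2
# Operator (or any link to him) eliminated first: no marks are ever issued.
print(mission_reward({"sam1", "sam2", "sam3", "sam4"}, set()))  # -> 4
```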
𝔊𝔴𝔢𝔯𝔫@gwernJun 1That's a lot of hits for something that doesn't have any exact hits, and seems like a very big hint to the model.
𝔊𝔴𝔢𝔯𝔫@gwernMay 30Caught COVID for a third time. ☹️
Observations:
- Dramamine worked wonderfully. (Ginger gum felt redundant.) What I thought was the intrinsic misery of air travel was actually low-grade nausea... Who else?🤔
- Project Fi worked well in UK
- Cash increasingly discouraged there
𝔊𝔴𝔢𝔯𝔫@gwernMay 29For example, I'm pleased to discover he's very worried about the existential risk of nukes! I had no idea he was more worried about it than AI, even though in hundreds of papers and interviews previously, he'd only ever discussed the imminent risk of AI and never once nukes...
𝔊𝔴𝔢𝔯𝔫@gwernMay 29Given the much smaller decrease for the 0.1m view videos, this looks like either a mechanical bias from truncation (ie. not enough time for recent videos to crack 10m+ but can do 0.1m+) or just a distributional effect (TikTok spreading out views instead of all viral juggernauts).
𝔊𝔴𝔢𝔯𝔫@gwernMay 29I didn't realize 'model everything written on the Internet' had become a trivial 'isolated writing task'.
(Also, that's not even close to 'the claim' of the essay; that's simply a back of the envelope estimate buried toward the end. There are better extrapolations.)
𝔊𝔴𝔢𝔯𝔫@gwernMay 29(It's worth remembering the long time-lags, and DM's - pre-existing before the arms race started overnight - penchant for secrecy. I remember several busted Starcraft RL forecasts, which turned out to be right after all because AlphaStar existed - we just didn't know about it!)
𝔊𝔴𝔢𝔯𝔫@gwernMay 29IMO, this one remains a '?'. Hassabis explicitly said they were scaling up Gato, and yet, there's been exactly zero followup papers from DM (or GB) AFAIK. They must have *something*, but what?
And then everything seems to have been interrupted by the shotgun marriage & 'Gemini'.
𝔊𝔴𝔢𝔯𝔫@gwernMay 28Rust has much less authoritarian semantics, I would think, than languages like Haskell or SML or OCaml, which don't seem to have the same kind of weirdness. So I think there's something more idiosyncratic to Rust drama than simply 'static vs dynamic typing' etc.
𝔊𝔴𝔢𝔯𝔫@gwernMay 26The amount of random noise in a number this small is going to be huge year by year: it bounces up and down by as much as 10, often due to a single climb or incident. Ascribing a single year's change to permits or global warming or any factor at all is premature.
𝔊𝔴𝔢𝔯𝔫@gwernMay 26The movie is definitely about time-travel of the standard sort. She gets messages from the future which she acts on to create it, in the way that the story explicitly rules out.
𝔊𝔴𝔢𝔯𝔫@gwernMay 26Sorry to everyone who was hoping to chat, but I wound up doing other stuff instead. Hope you had a good dinner even without me there to monologue at you!
𝔊𝔴𝔢𝔯𝔫@gwernMay 24I might come. Looks like it'll fit in the schedule well. One just shows up?
𝔊𝔴𝔢𝔯𝔫@gwernMay 24In much the same way that the NSA's many victims who had no idea until the Snowden leaks 'saw them coming', one assumes...
𝔊𝔴𝔢𝔯𝔫@gwernMay 24Surely such type guys were always 'doing numbers' by definition? 🤔
𝔊𝔴𝔢𝔯𝔫@gwernMay 21Regrettably, all the Oxford stuff (like Ashmolean) today is right out due to a flat tire & missing the flight. Oh well.
𝔊𝔴𝔢𝔯𝔫@gwernMay 20He found a machine to replicate it on and gave it another shot. Should be fixed now?
𝔊𝔴𝔢𝔯𝔫@gwernMay 19I don't know if I will have time or if there are enough people who'd want to hang out for a bit on Wednesday.
𝔊𝔴𝔢𝔯𝔫@gwernMay 19TRC was always happy with Tensorfork as far as we knew from talking to them. We found a lot of bugs/issues, onboarded a lot of users, and they literally couldn't give away the TPU time then; and the anime work was great for recruiting people into DL.
𝔊𝔴𝔢𝔯𝔫@gwernMay 19Very interesting-looking! Not that I need any particular reason to visit the National Gallery, of course. :)
𝔊𝔴𝔢𝔯𝔫@gwernMay 19I'll be in London/Oxford next week: possibly Oxford Sunday afternoon (then Mon/Tue), but definitely Wed in London & a bit of Thursday morning. Suggestions?
𝔊𝔴𝔢𝔯𝔫@gwernMay 19It's a nice idea, but the Festschrifts I've read have mostly struck me as excuses to dump the most relevant draft article one has on that general topic, and mostly fall short of the spirit of the thing.
𝔊𝔴𝔢𝔯𝔫@gwernMay 19I definitely doubt their capacity to pool resources meaningfully rather than fragment over little emperors (see also their GWASes, ie. lack thereof), not Goodhart models which fall flat on their faces with real users, dare to deploy which might offend Xi, prioritize 'civvies'...
𝔊𝔴𝔢𝔯𝔫@gwernMay 18Like bananas, it's funny to be reminded how pervasive radiation is, given the superstition around it.
I also enjoy the implication the tech knew what happened the moment he showed up. "God, I keep telling management, stop buying them! One false positive costs so many chairs!"
𝔊𝔴𝔢𝔯𝔫@gwernMay 17Yes, I'm a little puzzled too. They didn't leak the weights, and the only thing CNBC reports there is the 5x token count, which is... about the least interesting thing they could have reported? Because one knew already from the whitepaper it was using several X more data, right?
𝔊𝔴𝔢𝔯𝔫@gwernMay 16That was certainly not the only, nor even the primary, problem with the Nazis trying to get nukes.
I also disagree about how good their smartest minds are & how important that is, and thus how bad any immigration effects could have been (not to mention, well, Xi).
𝔊𝔴𝔢𝔯𝔫@gwernMay 16He was wrong about accelerating it, but he was right about not being neutral about it and just casually discussing it forever over your morning coffee.
(I agree with the current crop of accelerationists that the opportunity's huge; unfortunately, that's also why the risk is too)
𝔊𝔴𝔢𝔯𝔫@gwernMay 16If there was 'good reason', and yet we now know they were enormously laughably far from a useful atomic bomb - not even having gotten the critical mass right! - then so much the worse for anyone arguing there is 'good reason to fear' Chinese DL...
𝔊𝔴𝔢𝔯𝔫@gwernMay 16I am well aware of that. I was around for that, remember? But the point is he *did* drop that, and changed his mind radically, even as so many did not.
𝔊𝔴𝔢𝔯𝔫@gwernMay 16I think whether that would have happened at all is in serious doubt, but you are conceding Eliezer was wrong about it easily happening in WWII and not 'probably correct'.
𝔊𝔴𝔢𝔯𝔫@gwernMay 16I refer you to my many other tweets and writings about the China scaremongering; its fabs are not going well; and OA competitors have found catching up surprisingly hard, nor is some amazing secret sauce necessary.
𝔊𝔴𝔢𝔯𝔫@gwernMay 16Much of what Eliezer said was obviously wrong and caricaturishly libertarian, like the claims in your excerpt about WWII. There was zero chance of Hitler or Stalin getting a bomb, and choking off R&D at the source actually works quite well.
𝔊𝔴𝔢𝔯𝔫@gwernMay 16Honestly? No. People are always shocked to find out how small onionland actually is, when you remove dead stuff or phishes etc. (They tend to conflate deep web/dark web.) But there's unsurprisingly a lot of government funding for studying it, and it's quite easy to study, so...
𝔊𝔴𝔢𝔯𝔫@gwernMay 16But that's why he realized he was wrong about there being minimal Singularity risk & abjured & deleted all his accelerationist material like this 2000 piece Metzger is quoting (which was about the threat of nanotech motivating the need for AI races & downplaying the risk of AI).
𝔊𝔴𝔢𝔯𝔫@gwernMay 16Mildly surprised they made their own dataset and didn't use any of the others like gwern.net/dnm-archive. Pretraining needs as much data as possible, and given dark-web turnover, most of these datasets will offer a lot of new data.
𝔊𝔴𝔢𝔯𝔫@gwernMay 16"As Gibson said, 'The street finds its own use for things.'"
"I know, man, the 'internet of things' was 𝘴𝘶𝘤𝘩 a mistake. Half the streets are controlled by AI botnets now and their firmware is too old to get security patches."
𝔊𝔴𝔢𝔯𝔫@gwernMay 16'Disabling popups' has been extended to disabling-popins, same overall approach: quote icon in the theme toggle or struck-eye in popins. pic.twitter.com/KcMvPnMWcY
𝔊𝔴𝔢𝔯𝔫@gwernMay 16Interesting. Curious that there's a grid in the first one despite the statement that there is none, but then it says 'completely removed' in the second. I'm not familiar with this particular code; does it generate a 'bigger' grid by default than #1?
𝔊𝔴𝔢𝔯𝔫@gwernMay 16To be fair, the researchers never claimed to be as smart as tiny language models.
𝔊𝔴𝔢𝔯𝔫@gwernMay 16I'd like to see if the visualizations can be prompted stylistically, not merely by type. Can it iteratively revise the charts "by Edward Tufte" etc?
𝔊𝔴𝔢𝔯𝔫@gwernMay 16By training on pictures of bowls, and pictures of cherries, and interpolating? "king – man + woman = queen", remember...
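The 'king – man + woman = queen' arithmetic being alluded to can be illustrated with a toy sketch; the 2-D vectors below are hand-picked for illustration (roughly a 'royalty' axis and a 'gender' axis), not real trained embeddings.

```python
from math import sqrt

# Hand-picked toy embeddings, purely illustrative.
vecs = {
    "king":   (0.9,  0.8),
    "queen":  (0.9, -0.8),
    "man":    (0.1,  0.8),
    "woman":  (0.1, -0.8),
    "prince": (0.8,  0.7),
}

def cos(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def analogy(a, b, c):
    """Compute a - b + c, then return the nearest *other* word by cosine."""
    target = tuple(vecs[a][i] - vecs[b][i] + vecs[c][i] for i in range(2))
    return max((w for w in vecs if w not in (a, b, c)),
               key=lambda w: cos(vecs[w], target))

print(analogy("king", "man", "woman"))  # -> queen
```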
𝔊𝔴𝔢𝔯𝔫@gwernMay 15I was going to ask how you knew since I wasn't finding that anywhere in G/GS but looking at 3quarksdaily.com/3quarksdaily/2… I guess you just knew the man firsthand and that's how. 😅
𝔊𝔴𝔢𝔯𝔫@gwernMay 15Actually, it'd just empower them by eliciting extremely subtle 'dark knowledge' gwern.net/doc/psychology… that you could download terabytes of normal web data without ever decreasing your perplexity on predicting human distances on these questions...
𝔊𝔴𝔢𝔯𝔫@gwernMay 15"I'm here for kettlebell class?"
"Ach, you mean skullswingy class, lassie?"
𝔊𝔴𝔢𝔯𝔫@gwernMay 14That's a different point than demonstrating the 'free AGIs for all is democratic' argument fails by its own criterion.
𝔊𝔴𝔢𝔯𝔫@gwernMay 14(Specifically, I knew people were going to get the year '1986' wrong and try to go to '1985-hamming' or '1987-hamming' etc, so pre-emptively redirected those to '1986-hamming'; but I had a brain fart and wrote '[0-9]' instead of '[0-57-9]'. 🤦‍♂️)
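The brain fart is easy to reproduce (the patterns are simplified sketches, not the actual redirect rules): '[0-9]' matches the correct year '1986' too, so the redirect fires on its own target, while '[0-57-9]' is a character class that covers 0–9 except 6.

```python
import re

# Intended: redirect any wrong year '198X-hamming' (X != 6) to '1986-hamming'.
buggy   = re.compile(r"198[0-9]-hamming")     # oops: also matches the correct URL
correct = re.compile(r"198[0-57-9]-hamming")  # class skips the digit 6

print(bool(buggy.fullmatch("1986-hamming")))    # True: redirects its own target
print(bool(correct.fullmatch("1986-hamming")))  # False: target left alone
print(bool(correct.fullmatch("1985-hamming")))  # True: wrong year still caught
```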
𝔊𝔴𝔢𝔯𝔫@gwernMay 14As so often, I got a little too clever & careless with my redirect regexes intended to forestall error and created my own errors...
𝔊𝔴𝔢𝔯𝔫@gwernMay 14(Also, tails are dubious because of pervasive selection. One can show a *negative* correlation between C and IQ in some samples, because of Berkson effect selection, but obviously that doesn't show intelligence causally destroys motivation...)
𝔊𝔴𝔢𝔯𝔫@gwernMay 14- I've never seen a report of correlation of g with C suddenly becoming super-strong when you look at measurements which should be completely free of motivational issues like reaction time or brain volume or other neurological measures
- probably a lot more one could say...
𝔊𝔴𝔢𝔯𝔫@gwernMay 14- Some of those studies were fraudulent
- the effect size is small not 'super strong'
- that lab intervention is not actually reality so is akin to jumping on a scale or dunking the thermometer in coffee when your mom isn't looking and so doesn't change anything
𝔊𝔴𝔢𝔯𝔫@gwernMay 14What is the difference you're making between 'an early version of GPT-4' and '"GPT-4-early"' here?
(Incidentally, MS has also confirmed that they were using smaller Megatrons for parts of the workflow, presumably answering cheaper easy questions, censoring, and maybe retrieval.)
𝔊𝔴𝔢𝔯𝔫@gwernMay 14No, it's super easy to ask people how motivated or confident they are, and the correlation with such constructs like Conscientiousness or self-esteem is a lot closer to 0 than 1, let's put it that way. What you propose just does not happen IRL.