𝔊𝔴𝔢𝔯𝔫 @gwern · May 30: Caught COVID for a third time. ☹️
Observations:
- Dramamine worked wonderfully. (Ginger gum felt redundant.) What I thought was the intrinsic misery of air travel was actually low-grade nausea... Who else?🤔
- Project Fi worked well in UK
- Cash increasingly discouraged there
𝔊𝔴𝔢𝔯𝔫 @gwern · May 29: For example, I'm pleased to discover he's very worried about the existential risk of nukes! I had no idea he was more worried about it than AI, even though in hundreds of papers and interviews previously, he'd only ever discussed the imminent risk of AI and never once nukes...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 29: Given the much smaller decrease for the 0.1m-view videos, this looks like either a mechanical bias from truncation (ie. not enough time for recent videos to crack 10m+ but enough to crack 0.1m+) or just a distributional effect (TikTok spreading out views instead of concentrating them all in viral juggernauts).
𝔊𝔴𝔢𝔯𝔫 @gwern · May 29: I didn't realize 'model everything written on the Internet' had become a trivial 'isolated writing task'.
(Also, that's not even close to 'the claim' of the essay; that's simply a back-of-the-envelope estimate buried toward the end. There are better extrapolations.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 29: (It's worth remembering the long time-lags, and DM's penchant for secrecy - pre-existing before the arms race started overnight. I remember several busted StarCraft RL forecasts, which turned out to be right after all because AlphaStar existed - we just didn't know about it!)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 29: IMO, this one remains a '?'. Hassabis explicitly said they were scaling up Gato, and yet, there have been exactly zero followup papers from DM (or GB) AFAIK. They must have *something*, but what?
And then everything seems to have been interrupted by the shotgun marriage & 'Gemini'.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 28: Rust has much less authoritarian semantics, I would think, than languages like Haskell or SML or OCaml, which don't seem to have the same kind of weirdness. So I think there's something more idiosyncratic to Rust drama than simply 'static vs dynamic typing' etc.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 26: The amount of random noise in a number this small is going to be huge year by year: it bounces up and down by as much as 10, often due to a single climb or incident. Ascribing a single year's change to permits or global warming or any factor at all is premature.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 26: The movie is definitely about time-travel of the standard sort. She gets messages from the future which she acts on to create it, in the way that the story explicitly rules out.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 26: Sorry to everyone who was hoping to chat, but I wound up doing other stuff instead. Hope you had a good dinner even without me there to monologue at you!
𝔊𝔴𝔢𝔯𝔫 @gwern · May 24: I might come. Looks like it'll fit in the schedule well. One just shows up?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 24: In much the same way that the NSA's many victims who had no idea until the Snowden leaks 'saw them coming', one assumes...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 24: Surely such type guys were always 'doing numbers' by definition? 🤔
𝔊𝔴𝔢𝔯𝔫 @gwern · May 21: Regrettably, all the Oxford stuff (like the Ashmolean) today is right out due to a flat tire & missing the flight. Oh well.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 20: He found a machine to replicate it on and gave it another shot. Should be fixed now?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 19: I don't know if I will have time or if there are enough people who'd want to hang out for a bit on Wednesday.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 19: TRC was always happy with Tensorfork as far as we knew from talking to them. We found a lot of bugs/issues, onboarded a lot of users, and they literally couldn't give away the TPU time then; and the anime work was great for recruiting people into DL.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 19: Very interesting-looking! Not that I need any particular reason to visit the National Gallery, of course. :)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 19: I'll be in London/Oxford next week: possibly Oxford Sunday afternoon (then Mon/Tue), but definitely Wed in London & a bit of Thursday morning. Suggestions?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 19: It's a nice idea, but the Festschrifts I've read have mostly struck me as excuses to dump the most relevant draft article one has on that general topic, and mostly fall short of the spirit of the thing.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 19: I definitely doubt their capacity to pool resources meaningfully rather than fragment over little emperors (see also their GWASes, ie. lack thereof), to not Goodhart models which fall flat on their faces with real users, to dare to deploy things which might offend Xi, to prioritize 'civvies'...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 18: Like bananas, it's funny to be reminded how pervasive radiation is, given the superstition around it.
I also enjoy the implication the tech knew what happened the moment he showed up. "God, I keep telling management, stop buying them! One false positive costs so many chairs!"
𝔊𝔴𝔢𝔯𝔫 @gwern · May 17: Yes, I'm a little puzzled too. They didn't leak the weights, and the only thing CNBC reports there is the 5x token count, which is... about the least interesting thing they could have reported? Because one knew already from the whitepaper it was using several X more data, right?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: That was certainly not the only, nor even the primary, problem with the Nazis trying to get nukes.
I also disagree about how good their smartest minds are & how important that is, and thus how bad any immigration effects could have been (not to mention, well, Xi).
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: He was wrong about accelerating it, but he was right about not being neutral about it and just casually discussing it forever over your morning coffee.
(I agree with the current crop of accelerationists that the opportunity's huge; unfortunately, that's also why the risk is too)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: If there was 'good reason', and yet we now know they were laughably far from a useful atomic bomb - not even having gotten the critical mass right! - then so much the worse for anyone arguing there is 'good reason to fear' Chinese DL...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: I am well aware of that. I was around for that, remember? But the point is he *did* drop that, and changed his mind radically, even as so many did not.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: I think whether that would have happened at all is in serious doubt, but you are conceding Eliezer was wrong about it easily happening in WWII and not 'probably correct'.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: I refer you to my many other tweets and writings about the China scaremongering; its fabs are not going well, OA competitors have found catching up surprisingly hard, and no amazing secret sauce is necessary.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: Much of what Eliezer said was obviously wrong and caricaturely-libertarian, like the claims in your excerpt about WWII. There was zero chance of Hitler or Stalin getting a bomb, and choking off R&D at the source actually works quite well.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: Honestly? No. People are always shocked to find out how small onionland actually is, when you remove dead stuff or phishes etc. (They tend to conflate deep web/dark web.) But there's unsurprisingly a lot of government funding for studying it, and it's quite easy to study, so...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: But that's why he realized he was wrong about there being minimal Singularity risk & abjured & deleted all his accelerationist material like this 2000 piece Metzger is quoting (which was about the threat of nanotech motivating the need for AI races & downplaying the risk of AI).
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: Mildly surprised they made their own dataset and didn't use any of the others like gwern.net/dnm-archive. Pretraining needs as much data as possible, and given dark web turnover, most of these datasets will offer a lot of new data.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: "As Gibson said, 'The street finds its own use for things.'"
"I know, man, the 'internet of things' was 𝘴𝘶𝘤𝘩 a mistake. Half the streets are controlled by AI botnets now and their firmware is too old to get security patches."
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: 'Disabling popups' has been extended to disabling popins, same overall approach: quote icon in the theme toggle or struck-eye in popins. pic.twitter.com/KcMvPnMWcY
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: Interesting. Curious that there's a grid in the first one despite the statement there is none, but then it says 'completely removed' in the second. I'm not familiar with this particular code; does it generate a 'bigger' grid by default than #1?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: To be fair, the researchers never claimed to be as smart as tiny language models.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: I'd like to see if the visualizations can be prompted stylistically, not merely by type. Can it iteratively revise the charts "by Edward Tufte" etc?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 16: By training on pictures of bowls, and pictures of cherries, and interpolating? "king – man + woman = queen", remember...
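The analogy arithmetic alluded to above can be sketched with toy vectors. The 3-d "embeddings" below are invented purely for illustration (real word2vec embeddings are hundreds of dimensions, learned from corpora); the point is only the mechanics of vector offset plus nearest-neighbor lookup.

```python
import math

# Invented toy embeddings, NOT trained vectors.
emb = {
    "king":  (0.9, 0.8, 0.1),
    "man":   (0.5, 0.9, 0.0),
    "woman": (0.5, 0.1, 0.0),
    "queen": (0.9, 0.0, 0.1),
    "apple": (0.0, 0.3, 0.9),  # distractor
}

def add(a, b): return tuple(x + y for x, y in zip(a, b))
def sub(a, b): return tuple(x - y for x, y in zip(a, b))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# king - man + woman should land nearest to queen,
# excluding the three input words from the candidates:
target = add(sub(emb["king"], emb["man"]), emb["woman"])
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(emb[w], target))
assert best == "queen"
```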
𝔊𝔴𝔢𝔯𝔫 @gwern · May 15: I was going to ask how you knew, since I wasn't finding that anywhere in G/GS, but looking at 3quarksdaily.com/3quarksdaily/2… I guess you just knew the man firsthand and that's how. 😅
𝔊𝔴𝔢𝔯𝔫 @gwern · May 15: Actually, it'd just empower them by eliciting extremely subtle 'dark knowledge' gwern.net/doc/psychology… that you could download terabytes of normal web data without ever decreasing your perplexity on predicting human distances on these questions...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 15: "I'm here for kettlebell class?"
"Ach, you mean skullswingy class, lassie?"
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: That's a different point than demonstrating the 'free AGIs for all is democratic' argument fails by its own criterion.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: (Specifically, I knew people were going to get the year '1986' wrong and try to go to '1985-hamming' or '1987-hamming' etc, so I pre-emptively redirected those to '1986-hamming'; but I had a brain fart and wrote '[0-9]' instead of '[0-57-9]'. 🤦‍♂️)
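A minimal sketch of that brain fart (the patterns and URL slugs here are reconstructed from the tweet, not the actual site config): the class `[0-9]` also matches the digit 6, so the canonical page matches its own redirect rule, while `[0-57-9]` skips 6 and catches only the wrong years.

```python
import re

# Goal: redirect mistyped years like '1985-hamming' or
# '1987-hamming' to the canonical '1986-hamming'.

buggy = re.compile(r"^198[0-9]-hamming$")     # [0-9] includes 6...
fixed = re.compile(r"^198[0-57-9]-hamming$")  # 0-5 and 7-9, skipping 6

# The buggy class matches the canonical URL itself (a self-redirect);
# the fixed class matches only the wrong years:
assert buggy.match("1986-hamming") is not None
assert fixed.match("1986-hamming") is None
assert fixed.match("1985-hamming") is not None
assert fixed.match("1987-hamming") is not None
```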
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: As so often, I got a little too clever & careless with my redirect regexes intended to forestall error, and created my own errors...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: (Also, tails are dubious because of pervasive selection. One can show a *negative* correlation between C and IQ in some samples, because of Berkson effect selection, but obviously that doesn't show intelligence causally destroys motivation...)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: - I've never seen a report of the correlation of g with C suddenly becoming super-strong when you look at measurements which should be completely free of motivational issues, like reaction time or brain volume or other neurological measures
- probably a lot more one could say...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: - Some of those studies were fraudulent
- the effect size is small, not 'super strong'
- that lab intervention is not actually reality, so it is akin to jumping on a scale or dunking the thermometer in coffee when your mom isn't looking, and so doesn't change anything
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: What is the difference you're making between 'an early version of GPT-4' and '"GPT-4-early"' here?
(Incidentally, MS has also confirmed that they were using smaller Megatrons for parts of the workflow, presumably answering cheaper easy questions, censoring, and maybe retrieval.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 14: No, it's super easy to ask people how motivated or confident they are, and the correlation with constructs like Conscientiousness or self-esteem is a lot closer to 0 than 1, let's put it that way. What you propose just does not happen IRL.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 13: Of course, if there was anyone you'd expect to have read up on how to fake or obtain diagnoses, even from forensic psychiatrists working with prisoners and trying to detect their faking...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 13: I wondered if they misunderstood what Okpala said or something... I could definitely believe that Transformer interpretability work on small models shows they tend to delete those words early on and tend to collapse to bag-of-words-like representations.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 12: "But not as many as cats, and nothing like lions. Curiously, 'ligers', which are lion-tiger hybrids, might be intermediate in catnip-liking!
This'd offer support for the polygenic basis of catnip response, despite the long-dominant (ahem) autosomal dominant paradigm of Todd..."
𝔊𝔴𝔢𝔯𝔫 @gwern · May 12: Oh, it's worse than that. The self-supervised models are *already* agents, neutered/sandboxed.
So it's more like if the only nuclear power plant ideas anyone had involved detonating a nuclear bomb in the middle of a water-filled salt cavern to harvest the released pressure/steam.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 12: I don't think there's any real reason other than simplicity and lack of time. RL mixes in un/semi-supervised learning all the time, and various kinds of preference-learning mix them too, eg. RLHF is supervised-then-RL. There's a very big design space.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 12: I was actually looking at the cracked-open window. The lines are way too straight, and the geometry makes mechanical sense behind the blur.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 11: Most of the suggestions here are either way too research-y (ie. half-baked) or too new. But everyone is using FlashAttention for improved constants, which gives you a few X; beyond that, I suspect just deploying more hardware like H100s & ramping up sparsity twitter.com/gwern/status/1… in some cases
𝔊𝔴𝔢𝔯𝔫 @gwern · May 11: (Treating it as a liability-threshold selection setup turns out not to be numerically absurd, but still quite slow on any relevant timescale: gwern.net/note/statistic… )
𝔊𝔴𝔢𝔯𝔫 @gwern · May 11: 'AGI is sufficient to solve protein folding' ≠ 'AGI is necessary to solve protein folding'.
The former is, if anything, strongly supported by AlphaFold and sub-AGI-AI successes in protein folding research...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 11: The problem is, with no access to the base un-finetuned GPT-4, I have no way of verifying it like I do with GPT-3. So I remain unsure whether I was prompting it poorly or whether it's the RLHF, as it appears. Maybe I need to play around with jailbreak prompts to see if that helps.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 11: No, the failure modes here make me suspicious that it's the RLHF. It tended to either repeat my summary half-verbatim and only lightly fictionalized, repeat the Scott story, or write horrible positive uplifting pablum. All those are characteristic of ChatGPT-3.5's RLHF too.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 10: GPT-4 was not helpful at all, if you were wondering. Not sure if it's the RLHF or if I just didn't get the prompt right, but even with Scott's story quoted, and a high-level overview, and then writing most of it, GPT-4 never did much helpful beyond occasional good lines. 😢
𝔊𝔴𝔢𝔯𝔫 @gwern · May 10: But the difference is you know 𝘸𝘩𝘢𝘵 paper to look for. Here, our list of 'paper' to search as of a month ago consisted of:
1. ?the Feynman archives in LA, maybe?
2. ???
3. ???
4. ???
5. ???
6. ???
...
100. ???
...
In situations like these, full-text search is critical.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 10: "but was wrong that u needed AGI"
That's not what he said.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9: Draft writeup: pastebin.com/sHCGv35Y Still thinking about whether full-blown meta-DL approaches like learned optimizers could use 'free-play' periods.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9: (At least part of the problem here is that there are two very different papers with very similar titles.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9: Uh... Good question. Apparently I searched the title in GS to double-check, and not finding a fulltext, clicked on the PsycNet link there, which takes you to psycnet.apa.org/record/1929-02… which gives you the h00 DOI (?!), and then I LG'd that.
But the title would've also worked in LG.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9: Harcourt's point is that the mentally ill are also vastly more likely to be *victims* of things like homicide: web.archive.org/web/2017081208… Where there is a homicide perpetrator, there must also have been a homicide victim previously...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9: Recycling is also a serious security risk: the new handle has essentially hacked the account and stolen the credentials & privileges.
Yahoo did this wired.com/2013/06/yahoos… and yes, people were getting hacked years later even with their limited rollout reddit.com/r/yahoo/commen…
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9: when you're on call at 3AM and type 'rm -rf / *' into your terminal and before you can press RET suddenly a dozen windows pop open spontaneously around you 'TEXT ONLY' asking you if you're sure about that like it's EoE and 'Komm, süsser Tod' is about to start playing in a minute
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9: Imagine all the ants there we could have more profitably traded with in accordance with Ricardo's Law of Comparative Advantage. 😢
𝔊𝔴𝔢𝔯𝔫 @gwern · May 8: Not so much a new feature as filling a logical gap in an old one: 𝘪𝘯𝘭𝘪𝘯𝘦 collapses.
Before, one could only collapse block elements like sections or paragraphs. Good, but why can't you just collapse the middle of a paragraph with a long list or something? Now I can. pic.twitter.com/TXgx9VOPrE
𝔊𝔴𝔢𝔯𝔫 @gwern · May 8: Most of these were either not 'protests' (or involved a heck of a lot more than protests), were not 'annoying', happened elsewhere without protests, or didn't work (...Hong Kong?!).
Given that she is trying to be comprehensive, her poster shows the opposite of what she thinks.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: I agree that you can argue Chiang's interpretation is a *little* legit. I don't agree that you should be trying to dunk on people ex cathedra with such an at-best-debatable interpretation when the story - and the only other Ovid Midas story - does indeed seem to be the wish one.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: Nope! It doesn't, any more than it needs a meta-reward function to learn the existing on-the-fly meta-learning that Dactyl et al do. Just optimizing long-term rewards.
(The problem is that Dactyl can't *explore* because it'd drop the Rubik's Cube immediately and fail.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: (It seems hard to believe that an idea for solving exploration *this* simple hasn't been done before, but I don't recall it from any domain-randomizing or similar blessings-of-scale-inducing sim2real study or POMDP research in general, so maybe...?)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: This would also work with LLM RLHF, I think, if you prompt them to do Q&A or freeform inner-monologue, and include that in the prompt to condition on, but then omit it from the oracle ratings.
(Less relevant to instruction-tuning which will just generative-model it, no rewards.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: Q. Have any domain randomization DRL projects like Dactyl or the DM soccer experimented with a 'free play' period, eg. 30s with constant 0 reward, to encourage meta-learning of optimal exploration strategies before the usual rewards/losses start being inflicted on agents in-episode?
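The proposal above can be sketched as an episode loop where reward is simply zeroed out for the first few steps, leaving the environment dynamics untouched. Everything here is invented for illustration (a toy stand-in environment, not Dactyl or any gym API):

```python
FREE_PLAY_STEPS = 30  # eg. 30 steps of zero reward at episode start

class DummyEnv:
    """Toy stand-in environment: reward 1 per step, 50-step episodes."""
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return 0  # trivial observation
    def step(self, action):
        self.t += 1
        return 0, 1.0, self.t >= 50  # (obs, reward, done)

def run_episode(env, policy, max_steps=200):
    obs = env.reset()
    total = 0.0
    for t in range(max_steps):
        obs, reward, done = env.step(policy(obs))
        if t < FREE_PLAY_STEPS:
            reward = 0.0  # free-play: suppress reward, keep dynamics
        total += reward
        if done:
            break
    return total

# 50-step episode, first 30 steps unrewarded -> 20 rewarded steps.
assert run_episode(DummyEnv(), lambda obs: 0) == 20.0
```

Under this schedule the return-maximizing policy is pushed toward information-gathering behavior early in the episode, since early actions affect later (rewarded) steps only through what state and knowledge they set up.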
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: Yeah, I read that on HN and I admired his attitude of just asking, 'what is all this XML stuff... *doing* for us? Is this actually solving any of our problems, or herding more yaks?'
And feel similarly as he does about all the JS frameworks compared to just... writing the JS.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: But then, so would pretty much any other stimulant too...
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: (See ovid.lib.virginia.edu/trans/Metamorp… if you understandably have not read _Metamorphoses_ recently. The entire emphasis is on Midas rashly choosing for ill, Midas remaining an outspoken fool, and even his servant being unable to hold his tongue and speaking foolishly.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: As with many things Chiang has been writing about AI and genetics, this is wrong. Ovid's point was that Midas made his wish foolishly, not that gold was bad. Midas is still a fool in the following story, where he speaks again foolishly and is punished again (with the ass ears).
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: (Now we can look around for other obtrusive labels/scaffolding there just to educate newbies that we can remove for presumed power users after a few uses/page-loads...)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 7: And, to extend the 'demo' feature, we can suppress a bit more bric-a-brac: I've always been really annoyed by the '...click to expand...' label on collapsed sections, which Said insists is necessary for readers to understand them. Now it'll be removed after 3 uncollapses. Yay! 🥹
𝔊𝔴𝔢𝔯𝔫 @gwern · May 6: What makes you think they're going to be doing all those AI dialogues in the browser default-search, exactly? Controlling that is as useful as controlling the search engine. I wouldn't be asking it to write my C++ code, after all. I'd be using an actual app/alt browser plugin.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 6: Probably. It's possible sparsity contributes to it as well but at least if BPEs were out of the picture, it'd be easier to figure out what's going on in errors like those.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 6: So, that means it takes even less effort than typing in 'bing.com' as it can be switched for them...? Wow, such moat, very margin, much profit.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 6: YouTube is great. It also justifies a market cap of what... −80% (~200b/1,340b) of the present Alphabet market cap?
Not exactly a transition to 𝘢𝘤𝘤𝘦𝘭𝘦𝘳𝘢𝘵𝘦, let us say.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 5: The theme toggle has been one of the biggest pain points for mobile. It goes away when you read on, sure, but there's no denying that it *looks* awful on the first screen.
New approach: it has an animation to minify into a gear icon... on the first page load only. Then it's just an icon.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 5: Ah yes, the famous search engine lock-in, which is why no one switched to Bing when they heard about its crazy new AI.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 5: This reports the nth instance of a known, highly-confounded correlation, adding nothing, failing to use the extremely high-quality data they have to control for things like family-level confounds 😠, and is written up extremely irresponsibly to imply causality at every turn.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 5: FB has the social network. What does Google have, when it's rendered its search engine irrelevant via free high-end AI running locally? When your distant T5 derivative searches, retrieves, confabulates & strips out all ads automatically? Where are the ads going to be served 𝘵𝘰?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 5: It's an interesting proposal, but probably utterly disastrous for Google if they do it. He's basically telling Google to turn itself into Sun: gwern.net/complement When you've released the retrieval & summarizer models that bypass your search engine ads - then what?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 5: @slatestarcodex Nominative determinism: MacPorts, the universal tool among Mac FLOSS users for downloading source code which has been ported to Mac OS & installing it on their system, is run in considerable part by Joshua Root & Dan Ports.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 4: (But it 𝘪𝘴 quadruple…)
"…And on the last night, that night we must all wake alone, he was amazed by the silence, and called out—𝘓𝘰𝘳𝘥? 𝘞𝘩𝘺? thrice.
And the reply came from nowhere and everywhere: 𝘐 𝘢𝘮 𝘵𝘩𝘢𝘵 𝘐 𝘢𝘮—𝘢𝘯𝘥 𝘐 𝘵𝘰𝘰 𝘢𝘮 𝘺𝘦𝘵 𝘣𝘭𝘪𝘯𝘥𝘦𝘥."
𝔊𝔴𝔢𝔯𝔫 @gwern · May 4: It haunts me anights still—the dream of the rats, the maze-dark-blind groping through the labyrinth; and what of the scientists, in their maze but one of paper? And my maze, the maze of cites to fatigue one's sight? What greater scientist devised this scheme of follies & stories?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 3: This is the most tangled garden-path sentence I've read all day, and it results in one being terrified at the talking rat insulting his executioner.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 3: More broadly, it's a kind of exploration: reddit.com/r/reinforcemen… because if you're an agent, you can *create* your own samples by planning/self-play/model-based analysis, or pick existing ones (active learning) to learn from.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 3: Not quite a probability distribution, but value functions are related, so: I remembered working through what seemed like a simple blackjack game, and I was surprised to read arxiv.org/abs/2001.00102 about how weird the easily-calculated value function was.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 3: So, that's like 4 years old now. Obormot couldn't figure out a fix, so maybe you should just update the OS/browser? That's a very old combo, especially when it comes to browser bugs/issues.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 3: Er, no, anon, it would not make his point as well if it were fictional, because then the story would not be... let me see... what's the word I'm looking for... then it would not be 'real'. It would be 'fictional', as in, 'it did not happen'. 🙄 what are you, 5 years old pic.twitter.com/nz9AiV1voZ
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: Such is the hope. It's worked well with past features: wait for a need, then implement the generalized solution, wait for other needs to materialize, repeat... Gwern.net is big enough that there usually turn out to be some.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: The immediate need was to just splice one annotation into another, so I could have a UMich annotation with the Shepard middle cut out to a second annotation, but he went and generalized it to do more than just 'select body of annotation and ignore the metadata wrapper'. 🤷‍♂️
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: Yeah, there's some pretty crazy stuff possible now. Like today Said added CSS-queries to the transclusions, so you can do something like 'transclude only .epigraph elements from ID A to ID B within page C'. I'm... not quite sure what I need that for, but now it's possible!
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: Looks like it was a real bug. May be fixed now if you can refresh.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: I don't think that's it. It can't be replicated in the mobile sim, and using BrowserStack to test on actual iPads, it works fine on recent iPad Pros and the iPad Air 5, but then breaks on the Air 4 and below.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: I'm not sure I've ever seen that usage before, and the text doesn't read like it's a humorous nickname (nor has anyone ever suggested that interpretation before). Do you have some examples of 'Mr Young' being humorous?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: Hm, definitely weird. But if it's mobile, that's often a cache issue. Can you reset the cache (you may have to go into a 'history' setting to delete gwern.net-cached stuff specifically) to check?
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: (That's obviously not how it's 𝘴𝘶𝘱𝘱𝘰𝘴𝘦𝘥 to look lol. We just don't own iPads to test on.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: I'm deeply relieved to have an answer, and such an interesting one too.
I like to think that there may be only a dozen people worldwide who care about this as much as I do, but my writeup is going to make their day. 😀
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: Only when you have figured out the magical loss post hoc which exactly counterbalances the emergence...
That's why they should make predictions about the rest of the tasks in Big-Bench etc: show me that these alternative losses are not pulled out of one's nether regions.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: Funny, that's why I think I'm right. There's a reason RL agents get sparse rewards: because that's what capabilities and rewards are in the real world! You can't eat 'token edit distance' arithmetic, and 'close only counts in horseshoes and atomic hand grenades'.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 2: Something like that, conflating Curtis & Shepard, possibly based on the more famous (but completely irrelevant AFAICT) Dr Paul Thomas Young.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 1: (Not that this task would require 'planning' even if it wasn't DOA due to BPEs! Because it can simply pick memorized numbers' digits. It would only require planning with constraints added on.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 1: Yes, it's a very misleading writeup. The original is more informative: alltageinesfotoproduzenten.de/2023/04/24/lai… He cease-and-desisted LAION on a theory of linking=copyvio, they disagreed & filed a counternotice (which has loser-pays), and he used that to launch his full PR/lawsuit against LAION.
𝔊𝔴𝔢𝔯𝔫 @gwern · Apr 30: (We also save some vertical space by removing the old 'return-to-top' sticky+progress-indicator, as now redundant with the header: each section title is indeed a link, thus you can just tap on the title/... to jump to the top, as one would expect.)
𝔊𝔴𝔢𝔯𝔫 @gwern · Apr 30: This seems much like the bait-and-switch I complain about, where people post-hoc swap between perplexity/likelihood/etc and downstream losses to claim it's predictable in advance.
OK, fine, show me your pre-registered *predictions* about *token-edit distance*.
Also what even: pic.twitter.com/UFuu2sRGW5
𝔊𝔴𝔢𝔯𝔫 @gwern · Apr 30: On my first skim, this seems like word/semantic games. They show you can get a line on a graph for emergent capabilities by... using a different metric which is nonlinear in a different way? Er, so? Brier score for multiple-choice? Token edit-distance for *arithmetic*?!
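The metric argument above can be sketched numerically. Suppose (a toy model, with numbers invented for illustration) each digit of an n-digit arithmetic answer is independently correct with probability p: the all-or-nothing exact-match score p**n stays near zero and then jumps sharply as p approaches 1 ("emergence"), while a partial-credit metric like expected fraction of correct digits (a stand-in for token-edit-distance) is just p, perfectly smooth.

```python
n = 8  # digits in the hypothetical arithmetic answer

for p in [0.5, 0.7, 0.9, 0.99]:
    exact_match = p ** n      # sharp 'emergent' jump only near p -> 1
    partial_credit = p        # smooth and linear in p
    print(f"p={p:.2f}  exact={exact_match:.4f}  partial={partial_credit:.2f}")
```

The same smooth underlying improvement thus looks discontinuous or continuous depending purely on which metric is plotted, which is the substance of the dispute in these tweets.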
𝔊𝔴𝔢𝔯𝔫 @gwern · Apr 30: (I have to ask, how has he 'intimated' it, in a way which is not 'advocated' nor so much as merely 'articulated', and presumably neither spoken nor written either. Does he wiggle his nose?)
𝔊𝔴𝔢𝔯𝔫 @gwern · Apr 30: Well, I know the moon exists because I can see it; so the only question is *what* exists, exactly, that I call the 'moon'.
I'm also reasonably sure of the last one because of how bad I felt when I tried to go a week without food to find out; it felt like 'starving' as described.