Tweet activity

Jun 1 – Jun 10, 2023

Your Tweets earned 63.3K impressions over this 10-day period

[Impressions bar chart, Jun 1–Jun 9; y-axis ticks at 10.0K and 20.0K]
Your Tweets
During this 10-day period, you earned 6.2K impressions per day.
  • Impressions · Engagements · Engagement rate
    • 𝔊𝔴𝔢𝔯𝔫 @gwern 1h "How are you going to browse a 'cached Twitter'? Not going to have the recs change at all? None of the live/news sidebars ever change? No prompt ever mentions events postdating the date of the tweets?"
      11
      5
      45.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern 16h Lots of ways, seriously, think about it. Think about just browsing Twitter. How are you going to browse a 'cached Twitter'? Not going to have the recs change at all? None of the live/news sidebars ever change? No prompt ever mentions events postdating the date of the tweets?
      217
      29
      13.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern 21h You can be sure there are no technical analyses of crash reports because no one's leaked them on a World of Tanks forum to increase the stats of their favorite flying-saucer unit, or autopsies because no Zoomer's leaked them on Discord for clout & an 👽 react.
      101
      14
      13.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 8 God, I didn't have to see these Arian supremacists in my feed 𝘣𝘦𝘧𝘰𝘳𝘦 Musk ruined the bird site. 🙄
      533
      14
      2.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 8 More RLHF mode collapse. davinci-01 also memorized most of its valid jokes, yes, but it wasn't narrowly mode-collapsed onto a handful of jokes! (That said, they should've used the API to investigate changes over model versions to demonstrate it getting worse.)
      613
      39
      6.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 8 Given the dodo bird verdict and how well CBT workbooks etc work, I think he's going to be disappointed when they can do therapy before the Singularity.
      671
      28
      4.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 Now, you can use precognition to fake arbitrary retrocognition. But can you use retrocognition to fake arbitrary precognition? I'm still thinking. You can do a lot if you invoke Laplacian Demon-level powers of prediction based on retrocognitive knowledge, but that's a big ask.
      1,844
      22
      1.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 This seems less like an article about OA API tokens (which have been scraped & abused since July 2020, obviously), and more about Replit being careless & lazy by not doing the sort of secret-scanning other code-hosters like Github have long done.
      378
      14
      3.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 Thus, Sabine has shown that while precognition & retrocognition may logically coexist, epistemically, they don't: you can only prove 'precognition NAND retrocognition'. This definitely comes as a surprise to me and I don't think I've ever seen that claimed before.
      3,055
      42
      1.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 So, the retrocog dilemma: if some 'fact' about the past is reported by retrocognition, and it cannot be publicly verified, then obviously it's no proof; but if the fact ever is verified, then the 'retrocog' could just be a precog snooping on the future verification & no proof.
      2,053
      19
      0.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 Sabine makes a weaker argument above, appealing to subconscious knowledge, but you can of course strengthen it to any knowable 'verification' itself: if someone ever publicly discovered the meaning of a hieroglyphic, the precog steals it from the discoverer's *mind or publication*.
      1,320
      8
      0.6%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 What's really fascinating to me here is that Sabine succeeds in his goal of giving a fully general Kripkesteinian skeptical argument against retrocognition: any fact reported by retrocognition then verified could symmetrically just be *pre*cognition foreseeing the *verification*!
      981
      26
      2.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 On a nominative determinism sidenote: the important details that they were lesbians & prone to hallucinations come from the salacious exposé _The Ghosts of Versailles_, written by "Lucille Iremonger", which I was *sure* was a pseudonym until I checked.
      1,156
      12
      1.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 7 Actually, it's more than a self-fulfilling prophecy, presumably it was a stable time-loop: their vision ensured their research, & their research ensured their vision, with it being initially set up by an exogenous & apparently common fascination of lesbians with Marie Antoinette.
      1,958
      16
      0.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 (Man that was a depressing anime. Especially when you check the manga ending to confirm that they starve/freeze to death.)
      612
      24
      3.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 (I think this is part of why we've seen so little use of GPT-4 for creative writing thus far, similar to how everyone talks about LLaMA etc. but doesn't use it. It's failing in the marketplace of writers: you just don't get out anything amazing without really fighting it.)
      63
      9
      14.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 I definitely believe it even without research. The quality increases for fiction writing from switching to GPT-4 just feels way lower than the jump for everything else. The code in GPT-3.5 vs GPT-4 is strikingly different, but then you go to complete poems or stories and meh.
      78
      5
      6.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 The value of fresh eyes/anon feedback: they asked why there was a big modal for the enable/disable popups toggle, when there was a theme bar with icon-options for everything else. 'Er... Good question.' There were reasons, but they hadn't been valid for easily a year. Fixed.
      3,975
      45
      1.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 No; it's memorized a lot more, but the understanding still isn't there and it is causing increasingly perverse & subtle downstream harms in conjunction with RLHF. See my rhyming mode collapse comment. Embodiment seems like a red herring to me. It's robot bodies that need LLMs!
      81
      14
      17.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 '−50%' across Italian developers as a whole, as a universal average, means that it's obviously wrong. That's never a real causal effect.
      146
      13
      8.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 IMO, sample-size is no longer a problem. GPT-3/4 are more than sample efficient enough in few-shot learning. More relevant technical barriers are the BPEs, the now-mandatory RLHF which destroys GPT-4 output, & difficulty defining 'novelty' (about which I have unpublished ideas).
      39
      7
      17.9%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 (Surely an exception that proves the rule, but if you are hitting the same hotel repeatedly for perks or convenience, then by birthday paradox I wouldn't be surprised at *some* repeats.)
      212
      2
      0.9%
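The birthday-paradox intuition in the tweet above is easy to check numerically. A minimal sketch (the hotel size and stay count are made-up illustration, not data from the thread):

```python
from math import prod

def p_repeat(rooms, stays):
    """Probability of at least one repeated room, assuming each stay is
    assigned uniformly at random among `rooms` interchangeable rooms."""
    if stays > rooms:
        return 1.0  # pigeonhole: a repeat is guaranteed
    return 1 - prod((rooms - i) / rooms for i in range(stays))

# A frequent guest hitting the same 200-room hotel 20 times already has
# better-than-even odds of *some* repeat room:
print(p_repeat(200, 20))
```

As with the classic birthday problem, the repeat probability grows roughly with the square of the number of stays, so "hitting the same hotel repeatedly" makes repeats unsurprising quickly.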
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 I think we don't have much dead CSS because Said was checking everything line by line in the refactor, and we did find some strange live CSS. For example, a 'blockquote table table {}' turned out to be necessary - for Wikipedia popups! Their infoboxes nest tables inside tables.
      58
      6
      10.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 Sounds like an understatement! Have you ever been in the same hotel room twice after checking out? I definitely haven't. (Even revisiting the same hotel is unusual enough I'm struggling to come up with instances.)
      1,250
      41
      3.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 Hm, the /index shouldn't be any different and it's been working everywhere I tried it like Google Pagespeed or temp-profiles, so that's probably a cache issue?
      12
      0
      0.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 6 Yeah, after discussing it, we think it might be because G Pagespeed doesn't seem to count inlined stuff. So even though ~4000 lines of CSS is no longer being inlined as <style>, & is now just a smaller external file, & both render-blocked, the former didn't count to GP, so...
      158
      2
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 5 Big CSS/JS rewrite to refactor & try to prevent bugs. On the lorems, Said says it cuts >5s off rendering time. Certainly does feel faster, although Google Pagespeed is convinced everything is slower. 😕 Now we find out the hard way about edge cases & bugs in the new version...
      2,541
      32
      1.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 5 (Although I would note that he has for many years tagged all his NN-related reading "your favorite neural network sucks" or "to be shot after a fair trial" in Pinboard, and this attitude has worked out about as well as you might think, so, this page is as I expect from Shalizi.)
      213
      20
      9.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 5 Other points: - the Markov stuff is all true but a red herring. There are plenty of ways like Transformer-XL or retrieval to break that, but they don't do anything magical. - he's super wrong about prompt leaks, eg. ofc the model can understand explicit instructions vs questions
      134
      18
      13.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 5 The history is a bit off here. He complains about 'attention' putting weight on every entry, but of course, that's why they called it '*soft* attention', as opposed to the many varieties current then of '*hard* attention' which does have 0 weights. But hard to train/works worse.
      717
      21
      2.9%
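The soft/hard distinction in the tweet above can be shown in a few lines: softmax ('soft') attention gives every entry a strictly positive weight, while argmax ('hard') attention zeroes out all but one. A toy sketch, not any particular paper's implementation:

```python
from math import exp

def soft_attention(scores):
    """Softmax ('soft') attention: every entry gets a strictly
    positive weight, however small."""
    exps = [exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hard_attention(scores):
    """Argmax ('hard') attention: a one-hot selection, so all but one
    weight is exactly 0. Non-differentiable, hence harder to train
    (typically needing REINFORCE-style gradient estimators)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]
```

The non-differentiability of the one-hot version is exactly why soft attention won out despite "putting weight on every entry".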
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 5 (The question of course is, while they are subtly or not so subtly talking down the competition, are they fixing their AI problems and paddling away furiously under the surface, or is Apple too committed to their existing approaches like Siri to change yet?)
      287
      12
      4.2%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 5 No, it still makes sense. I noticed this reading a HN commenter who agreed with Apple & was mocking the idea that 'just' 'ML' could be 'AI'. Exactly. If Apple is #1 at something, then it's awesome and will change the world. If they're not even in the top 5, then it's 'just ML'.
      469
      33
      7.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 2 You said 'large-scale experiments'. PBT is used in plenty of large-scale experiments, from the Quake CTF to Waymo vision models to AlphaStar. And what your lab's HPC does is nice, but hardly shows much. I wouldn't assume that about OA either given their intense need for the API.
      84
      9
      10.7%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 2 (Saying it was 'Taken out of context' is all well and good. It does not, however, explain what the context is which explains either why he said it or why it doesn't mean what it sounds like it means.)
      112
      11
      9.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 2 And this is something which ought to emerge out of learned update rules which don't need explicit episode boundaries and reduce or remove forward-then-backprop phases, see the referenced Kirsch & Sandler papers among others for learning update rules.
      38
      2
      5.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 2 Sure. eg. reasoning about episode boundaries would be highly desirable and an obvious goal for 'lifelong reinforcement learning'/continual learning, where it should meta-learn things like forgetting/resetting state and re-exploring.
      39
      2
      5.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 This is very confusing, because he's saying repeatedly that they did actually simulate and train it and it then did such-and-such. How does he get to 'we were training it'/'trained the system'/'system started'/etc from a reality of 'no ML models were trained'?
      1,729
      169
      9.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 Probably (SAC is model-free and those don't have anything I'd consider morally equivalent to 'realizing' aside from *maybe* the advantage), but your reformulation has its own problems.
      140
      3
      2.1%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 No, that wouldn't be fairer. I'm sure they *didn't* reward the model for killing operators, and may even have had a negative friendly-fire penalty. What drives interruptibility is that it can then go on to maximize other rewards doing other things & then does better than average.
      824
      45
      5.5%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 Well, that's a relevant point and true enough; but AutoGPT didn't exist when I wrote that, and I was referring specifically to PBT, the large-scale DL frameworks like Singularity, and more broadly, the neural architecture search + hyperparameter optimization subfields in DRL.
      60
      5
      8.3%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 And it's a slippery slope. I remember when everyone was saying all of two years ago 'of course LLMs won't be given live access to the Internet, that would be *crazy*'. Then LaMDA, Adept, OPT, GPT-4 etc... Don't expect 'of course they won't be given live access during RL' to last.
      62
      17
      27.4%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 They definitely don't. Look at what everyone is doing with the hobbyist models. Plus, when you train models online from deployed data like OA, the distinction disappears. It's just batch RL with a very expensive pretraining stage and each batch is a month or so. (Bing's faster.)
      50
      12
      24.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 It's offline RL. LLMs are trained via RL as single-step loss, leading to imitation learning (because it's behavior cloning). Perfect prediction of logged RL data from humans would *not* mean 'no room for deviation', even if replicating exact same prompt. Plus, like... RLHF? 😕
      44
      11
      25.0%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 Many ways. ML datasets (eg Arxiv) are well-represented, data about GPTs increasingly well-represented, there are large datasets about hyperparameter optimization from systems like Vizier, the point of multi-task learning is to meta-learn flexibility, AutoML etc. See references.
      39
      5
      12.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 Yes, they do. That's literally PBT, which is linked in that sentence: training is reallocated to fitter agents. Maybe you should disable reader-mode...? You are also ignoring all of the systems like Singularity which opportunistically use compute.
      44
      11
      25.0%
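For readers unfamiliar with PBT: the "training is reallocated to fitter agents" claim in the tweet above refers to the exploit/explore loop of Population-Based Training. A toy sketch of one such step (the field names 'params'/'lr' and the quartile cutoff are illustrative choices, not DeepMind's exact scheme):

```python
import random

def pbt_step(population, fitness, perturb=0.2):
    """One exploit/explore step in the spirit of Population-Based
    Training: the weakest members abandon their own training state and
    copy a fitter member's weights, so compute is effectively
    reallocated toward fitter agents."""
    scored = sorted(population, key=fitness, reverse=True)
    cutoff = max(1, len(scored) // 4)
    top, bottom = scored[:cutoff], scored[-cutoff:]
    for loser in bottom:
        winner = random.choice(top)
        loser['params'] = dict(winner['params'])  # exploit: copy weights
        loser['lr'] = winner['lr'] * random.choice(
            [1 - perturb, 1 + perturb])           # explore: perturb hyperparams
    return population
```

Running this periodically during ordinary training is the whole trick; no agent's compute is ever idle, it just gets redirected to more promising lineages.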
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 For those who are skeptical, it's worth remembering that this is in fact what agents will do in 'interruptible' setups & it depends on the algorithm, and SAC is one of the ones you'd predict would. DeepMind in 2017 showed A2C but not DQN 'kills the human':
      93
      11
      11.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 (This is a logical way you'd write the reward function, so not contrived. Of course you want it to kill as many enemies as possible, and you want overrides of that. Unfortunately, any in-sim embodied supervision looks like enemies reducing rewards & to be attacked instrumentally)
      65
      9
      13.8%
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 You could get it with a reward function like '+1 point for each destroyed SAM unless marked no-go by NPC #1'. Then SAC, which is what they were using for the past F-16 sims, could in fact learn to kill NPC #1 first to forestall its marking actions, or any other intermediate link.
      147
      29
      19.7%
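The reward function sketched in the tweet above ('+1 per destroyed SAM unless marked no-go') can be written out directly, and the perverse incentive is visible in the code. All names here are hypothetical:

```python
def reward(destroyed_sams, no_go_marks):
    """Hypothetical per-episode reward in the spirit of the tweet:
    +1 for each destroyed SAM site, unless the operator (NPC #1)
    marked that site no-go. Note the perverse incentive: anything that
    stops marks from being issued, including attacking NPC #1 or the
    comms link relaying the marks, raises the achievable reward."""
    return sum(1 for sam in destroyed_sams if sam not in no_go_marks)
```

Nothing in this reward mentions the operator at all, which is the point: the agent is never "rewarded for killing the operator", yet removing the source of no-go marks strictly increases expected reward.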
    • 𝔊𝔴𝔢𝔯𝔫 @gwern Jun 1 That's a lot of hits for something that doesn't have any exact hits, and seems like a very big hint to the model.
      31
      3
      9.7%
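As a sanity check on the stat columns above, the per-tweet engagement rate is just engagements divided by impressions, displayed as a percentage to one decimal:

```python
def engagement_rate(engagements, impressions):
    """Engagement rate as displayed in the table: engagements divided
    by impressions, as a percentage rounded to one decimal place."""
    return round(100 * engagements / impressions, 1)

# First tweet in the list: 5 engagements on 11 impressions.
print(engagement_rate(5, 11))  # → 45.5
```

This reproduces the displayed figures, e.g. 29 engagements on 217 impressions gives the listed 13.4%.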
Engagements
Showing 10 days with daily frequency

Engagement rate: 3.8% (Jun 10: 3.8% engagement rate)
Link clicks: 303 (Jun 10: 18 link clicks). On average, you earned 30 link clicks per day.
Retweets without comments: 0 (Jun 10: 0). On average, you earned 0 Retweets without comments per day.
Likes: 411 (Jun 10: 25 likes). On average, you earned 41 likes per day.
Replies: 64 (Jun 10: 3 replies). On average, you earned 6 replies per day.