After the quoted thread generated furious debate about whether Bing AI keeps track of board position when given a chess game, it occurred to me to just ask it! It got 63 of 64 squares correct; it missed the pawn on b5. Accuracy degrades as the game gets longer.
Wouldn't the fact that accuracy degrades suggest that it is not an abstracted model or internal representation but simply a probabilistic reply to your very specific prompt?
Reasonable people can disagree and I know I won't change anyone's mind, but here's why I see it differently. Gradual accuracy degradation suggests that an internal representation exists, but with a small chance of error on each update, and those errors accumulate with move count. 1/n
Without an internal representation, I think accuracy would drop exponentially, and there's no way it would get to 25 ply without losing the plot completely. I don't know of any probabilistic mechanism that can simulate state updates without actually doing state updates.
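To make that intuition concrete, here is a minimal toy sketch (mine, not anything measured from Bing): it contrasts a state-tracking process with a small per-update error rate against a stateless process whose per-square accuracy compounds away each ply. EPS and DECAY are illustrative guesses, not fitted values.

# Toy sketch: expected number of correct squares on a 64-square board
# as the game gets longer, under two hypothetical error models.
# EPS and DECAY are assumptions for illustration only.

SQUARES = 64
EPS = 0.0006    # per-square, per-ply corruption probability (state-tracking model)
DECAY = 0.90    # per-ply retention factor (stateless, compounding-loss model)

def correct_with_state(plies: int) -> float:
    # Each square is independently corrupted with small probability EPS per ply
    # and then stays wrong, so accuracy degrades gradually
    # (roughly linearly while EPS * plies is small).
    return SQUARES * (1 - EPS) ** plies

def correct_without_state(plies: int) -> float:
    # Accuracy on every square decays by a constant factor each ply,
    # so correctness collapses exponentially.
    return SQUARES * DECAY ** plies

for plies in (5, 10, 25, 50):
    print(f"{plies:>3} plies: "
          f"state-tracking ~ {correct_with_state(plies):5.1f} correct squares, "
          f"stateless ~ {correct_without_state(plies):5.1f}")

With those assumed numbers, the state-tracking curve still gives roughly 63 of 64 squares at 25 ply, which matches what was observed above, while the compounding-loss curve has already collapsed to a handful. That is the shape of the argument.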
I suspect degradation is just due to the context size limitation. You should try it again in a couple of weeks.
Maybe you are correct, but the inner monologue can be much longer than the output, and I suspect it is in this case, though I haven't looked.
Mar 4, 2023 · 6:36 PM UTC