I saw a somewhat astonishing thing today. GPT was asked a question that it needed to write code to answer, and given access to a Python REPL. It wrote buggy code, then based on the error message it fixed its own code until it worked (and gave the correct answer). It debugged.

Feb 22, 2023 · 5:46 AM UTC

The question posed was "What is the 10th fibonacci number?" The GPT-based agent's first attempt was "fibonacci(10)" ("Action Input" is what the LLM sends to the Python REPL). This raised a NameError, since fibonacci isn't a built-in Python function, and the LLM figured that out from the error message.
It then defined a fibonacci function, called it, and got the right answer. All fully automated, with no human intervention (using @LangChainAI). Technically simple but kinda mind-blowing: It debugged its own code until it worked.
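For flavor, here's a rough reconstruction of those two REPL turns; the exact code is my guess, not a transcript (and the first attempt is wrapped in try/except just to keep the snippet runnable):

```python
# Attempt 1: the agent calls fibonacci() as if it already existed.
try:
    fibonacci(10)
except NameError as e:
    print(e)  # name 'fibonacci' is not defined

# Attempt 2: after reading the error, it defines the function first.
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(10))  # 55
```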
Here are the 40 lines of code that will let you demonstrate GPT automatically debugging its own code. The prereqs: pip install openai langchain, plus an @OpenAI API key. gist.github.com/wiseman/4a70…
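The core of it is just handing the agent a Python REPL as a tool. A minimal sketch, using the early-2023 LangChain API (which has changed since; the gist has the full working version):

```python
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)  # reads OPENAI_API_KEY from the environment

# The "python_repl" tool gives the agent a Python REPL to send code to.
tools = load_tools(["python_repl"])

# A zero-shot ReAct agent: it decides when to run code, and the REPL's
# output (including tracebacks) comes back to it as the observation.
agent = initialize_agent(tools, llm, agent="zero-shot-react-description",
                         verbose=True)
agent.run("What is the 10th fibonacci number?")
```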
In this run I asked it to calculate the 100th Fibonacci number, and it realized its naive approach would take too long and that it needed a more efficient algorithm. (I modified the Python interface to raise an exception if code takes longer than 1 second to run.)
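One simple way to impose that limit (a sketch of the idea, not necessarily the exact modification in the gist) is a SIGALRM-based wrapper around the REPL's run function, so it's Unix-only:

```python
import signal

class ReplTimeout(Exception):
    pass

def run_with_timeout(run_code, code, seconds=1):
    """Run run_code(code), raising ReplTimeout if it exceeds `seconds`."""
    def handler(signum, frame):
        raise ReplTimeout(f"code took longer than {seconds}s to run")
    old_handler = signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)
    try:
        return run_code(code)
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
```

Wrap the REPL tool's func with this and the timeout message becomes the observation the LLM sees, which is what nudges it away from exponential-time recursion.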
I’m imagining programming language error messages becoming optimized for giving feedback to LLMs, and thinking it would probably provide a significant benefit to humans as well.
6 million views on my post about GPT automatically debugging its own code (which it did), but only @voooooogel mentioned that GPT didn't actually use the result of the code to figure out the answer.
Replying to @lemonodor
Couldn't it just have googled the answer?
I don’t give it access to Google. :)
Replying to @lemonodor
I’ve seen it do that, correcting a SQL join that it had written incorrectly