Using GPT-3 to rewrite a regex legibly, construct positive/negative examples, and write Python 3 unit tests, all in a single prompt. Playground link: beta.openai.com/playground/p…

Aug 24, 2022 · 12:22 AM UTC
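The linked playground isn't reproduced here, but a minimal sketch of the kind of output the prompt asks for might look like this. The pattern itself (a date matcher) and all names are illustrative assumptions, not from the original prompt; the legible rewrite uses Python's `re.VERBOSE` flag so each part of the regex can carry a comment.

```python
import re

# Hypothetical opaque regex the prompt might start from (not the one
# in the original thread).
OPAQUE = r"\d{4}-\d{2}-\d{2}"

# The same pattern rewritten legibly with re.VERBOSE and comments.
LEGIBLE = re.compile(r"""
    \d{4}   # four-digit year
    -       # literal hyphen
    \d{2}   # two-digit month
    -       # literal hyphen
    \d{2}   # two-digit day
""", re.VERBOSE)

# Positive/negative examples of the kind the prompt also requests.
assert LEGIBLE.fullmatch("2022-08-24")      # positive case matches
assert not LEGIBLE.fullmatch("2022-8-24")   # negative case rejected
```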

Note the output isn’t perfect: in this example, one of the positive test cases isn’t matched by the regex. My point is that GPT-3 can help you write the code needed to verify its own work, so you can find value despite its imperfections.
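The verification step can be purely mechanical. A sketch, with a hypothetical generated pattern and case lists standing in for the model's output: every positive case must fully match and every negative case must not, so a bad case like the unmatched positive test mentioned above surfaces immediately.

```python
import re

# Hypothetical model output: a generated pattern plus the generated
# positive/negative example strings (placeholders, not the thread's).
pattern = re.compile(r"[a-z]+\d+")
positives = ["abc123", "x9"]
negatives = ["123abc", "abc", ""]

# Collect every case where the regex and the examples disagree.
failures = [s for s in positives if not pattern.fullmatch(s)]
failures += [s for s in negatives if pattern.fullmatch(s)]
print(failures)  # → [] (empty means regex and examples agree)
```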
This also illustrates the importance of ordering in multi-task generations. Note that the Python code tests exactly the cases we listed above it, even though we didn’t request that explicitly. Each step in a generation guides the later ones, and you can use this to your advantage.
Similarly, we request a step-by-step analysis of the regex before we summarize its purpose. Asking for a high-level summary first makes it more likely to be wrong, because it takes more mental work to produce without the commented code in front of you.
Replying to @goodside
Since it knows languages and is able to translate between them, can it write a “unit test” for a spoken/written language? Does it know enough of its limitations to not answer, or at least state its low confidence level?
By default, the model is tuned to always make an attempt, even at things it can’t do well. It won’t, unprompted, refuse an unreasonable request.
Replying to @goodside
This is great! Have you tried running this multiple times to see 1) how often it passes the unit tests and 2) whether checking against the unit tests actually increases the overall probability of correctness?
If you really want this to be accurate, what I’d do is: 1) let it generate both regexes and analyses; 2) filter to those whose unit tests pass; 3) fine-tune on those examples; 4) repeat. Optionally, start with a known corpus of diverse regexes, then trust the model to generate its own later.
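Steps 1 and 2 of that loop can be sketched as a filter, under assumed data shapes: each candidate is a (pattern, positives, negatives) triple produced by the model, and only candidates whose own test cases all pass survive to the fine-tuning set. All names and sample triples here are hypothetical.

```python
import re

def passes_own_tests(pattern, positives, negatives):
    """Keep a candidate only if its regex agrees with its own examples."""
    try:
        rx = re.compile(pattern)
    except re.error:
        return False  # malformed pattern: discard
    return (all(rx.fullmatch(s) for s in positives)
            and not any(rx.fullmatch(s) for s in negatives))

# Placeholder candidates standing in for model generations.
candidates = [
    (r"\d+", ["42", "7"], ["x", ""]),     # self-consistent: kept
    (r"[a-z]+", ["abc", "123"], ["9"]),   # "123" is a bad positive: dropped
]

kept = [c for c in candidates if passes_own_tests(*c)]
print(len(kept))  # → 1
```

The survivors in `kept` would then feed the fine-tuning step (3), and the loop repeats.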