all 15 comments

[–]redpnd 31 points  (4 children)

This might be the transcript: https://chat.openai.com/share/44a0c5b6-c629-470a-992f-8cdbbecd64a2

From: https://twitter.com/DongwooKim/status/1667444368129785862

Some takeaways:

  • Focus on building durable businesses on top of the API
  • Structured API responses coming (e.g. JSON)
  • "we do a lot of quantization"
  • Whole year of failed attempts at exceeding GPT-3; had to rebuild the whole stack
  • Took months until Code Interpreter started working; plugins still don't really work
  • GPT-V is the internal name for the vision model
  • Slow rollout due to GPU shortage
  • Function call model is coming to the API in ~2 weeks (uses same mechanism as the plugins model)
  • They're surprised by the number of non-English users, future models will take this into account (tokenization!)
  • They did the 10x price reduction for 3.5, can do the same for 4 (in 6-12 months)
  • More model customization coming (swapping the encoder?)
  • Fine tuning will enable Korean alphabet?
  • Conversations will be more interactive -- going back and forth will enable more creativity (been waiting for this personally)
  • Semiconductors are a good analogy for how they make progress: "solve hard problems at every layer of the stack"

Update (video): OpenAI Sam Altman & Greg Brockman: Fireside Chat in Seoul, Korea | SoftBank Ventures Asia

Edit: transcript is different, this seems to be the fireside chat, not the roundtable one

[–]gwern (gwern.net) 15 points  (2 children)

Whole year of failed attempts at exceeding GPT-3; had to rebuild the whole stack

Interesting. I speculated that this might've happened because the timelines for GPT-4 didn't make much sense unless you included roughly a wasted year. This also suggests that GPT-3 in some sense got lucky (which is something I've been thinking given how hard it seemed for all the competitors to simply match GPT-3, never mind beat it, despite all the advantages they should've had in starting later).

They're surprised by the number of non-English users, future models will take this into account (tokenization!)

Finally.

[–]-ZeroRelevance- 0 points  (0 children)

Finally I'll be able to translate novels from Asian languages without an unreasonable amount of tokens.
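A rough sketch of why this is so expensive today: tokenizers with byte-level fallback can end up spending roughly one token per UTF-8 byte on scripts that are poorly covered by the merge vocabulary, and Hangul syllables are 3 bytes each in UTF-8. (This is a worst-case illustration, not the exact behavior of any specific OpenAI tokenizer — real BPE vocabularies do merge some non-English sequences, just far fewer than for English.)

```python
# Worst-case illustration: byte-level fallback means token count ~ UTF-8 bytes.
english = "Hello"      # 5 characters -> 5 UTF-8 bytes
korean = "안녕하세요"    # 5 characters -> 15 UTF-8 bytes (3 bytes per Hangul syllable)

print(len(english.encode("utf-8")))  # 5
print(len(korean.encode("utf-8")))   # 15 -- ~3x the tokens for the same length text
```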

[–]someone_else_today 0 points  (0 children)

A well-known rumor is that OpenAI's initial 175B run completely failed and they had to restart it with tricks for stability. That's probably why it was so hard for non-Google/Anthropic competitors to match GPT-3: they had to re-derive those tricks. And where it wasn't those tricks, it was other issues, like not spending enough time on getting proper data.

[–]hold_my_fish 1 point  (0 children)

This is incredible info, thanks for summarizing. (I wonder why OpenAI chooses to release information in these awkward game-of-telephone type ways.)

[–]sanxiyn[S] 14 points  (0 children)

The implication was that it wasn't running quantized before the Turbo update.

[–]NNOTM 1 point  (2 children)

What does quantized mean in this context?

[–]YouAgainShmidhoobuh 7 points  (0 children)

Each weight is stored in fewer than 4 bytes (or exactly 4, but that's the float32 baseline).

[–]ItsJustMeJerk 4 points  (0 children)

Basically they compressed the model's parameters, so it might run much more efficiently but with slightly reduced output quality.
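For a concrete picture of the trade-off: a minimal sketch of symmetric int8 quantization (a common scheme, not necessarily what OpenAI actually uses), where each float32 weight is mapped to a 1-byte integer plus a shared scale factor — 4x less memory, at the cost of a small per-weight rounding error.

```python
# Hypothetical illustration, not OpenAI's actual scheme: symmetric int8
# quantization stores each weight as a 1-byte integer instead of 4-byte float32.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127  # largest weight maps to +/-127
    q = [round(w / scale) for w in weights]     # integers in [-127, 127]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9981, -0.5554]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error is bounded by half a quantization step -- the "slightly
# reduced output quality" part of the trade-off.
print(all(abs(a - b) <= scale / 2 for a, b in zip(weights, restored)))  # True
```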

[–]markschmidty 0 points  (5 children)

Is there a recording of this somewhere?

[–]sanxiyn[S] 1 point  (3 children)

Maybe there is, but I don't have one.

[–]markschmidty 0 points  (2 children)

Were you at the round table? I'm just wondering if there's a way for me to verify this is more than hearsay.

[–]sanxiyn[S] 3 points  (1 child)

I heard it from Dongwoo Kim, who was at the round table. See https://twitter.com/DongwooKim/status/1666799741001424896.

[–]markschmidty 0 points  (0 children)

thanks

[–]Lumpy-Warning-2317 1 point  (0 children)

The twitter thread (https://twitter.com/DongwooKim/status/1667444368129785862) links to a youtube video (https://www.youtube.com/watch?v=JwtKAspJRzA&t=762s) where the first comment says "If you are curious about the full event video: https://www.youtube.com/watch?v=MyTYAz82-V4" .

That last video is an hour of Sam Altman & Greg Brockman talking in Seoul. But it doesn't seem to match the transcript at https://chat.openai.com/share/44a0c5b6-c629-470a-992f-8cdbbecd64a2 -- I think because the "fireside chat" is different from the "round table talk"?