The 🌟Flan Collection🌟 (1st used in Flan-PaLM bit.ly/3Zu7bU2):
➕ Merges the Flan 2021, P3, NIv2, and CoT instruction datasets into a single collection of 1800+ datasets
➕ Data augmentations and mixing strategies
➕ 100s of new templates
2/
This yields the best-performing instruction tuning collection compiled and released in a single repo to date.
See our survey figure of the prior works we built on to produce this compilation.
3/
Q: But why are the results strong?
Our breakdown of the Flan Collection shows *why* it works. The most important methods:
🌟Finding 1🌟 Fine-tuning on zero-shot and few-shot prompts together significantly improves both settings (not a trade-off)!
4/
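To make Finding 1 concrete, here's a minimal, hypothetical sketch of what mixing the two prompt formats can look like (toy templates and field names, not the Flan Collection's actual seqio pipeline): each example is rendered both zero-shot and few-shot, and both renderings go into the same fine-tuning mixture.

```python
import random

# Toy task instances: an instruction, an input, and a target.
examples = [
    {"instruction": "Answer the question.", "input": "What is the capital of France?", "target": "Paris"},
    {"instruction": "Answer the question.", "input": "What is 2 + 2?", "target": "4"},
    {"instruction": "Answer the question.", "input": "Who wrote Hamlet?", "target": "Shakespeare"},
]

def zero_shot(ex):
    # Zero-shot template: instruction + input only.
    return {"inputs": f"{ex['instruction']}\n{ex['input']}", "targets": ex["target"]}

def few_shot(ex, exemplars, k=2):
    # Few-shot template: k solved exemplars prepended before the query.
    demos = "\n\n".join(
        f"{d['instruction']}\n{d['input']}\n{d['target']}" for d in exemplars[:k]
    )
    return {"inputs": f"{demos}\n\n{ex['instruction']}\n{ex['input']}", "targets": ex["target"]}

# Mix both prompt formats in one fine-tuning set, rather than picking one.
train_set = []
for ex in examples:
    others = [e for e in examples if e is not ex]
    train_set.append(zero_shot(ex))
    train_set.append(few_shot(ex, random.sample(others, k=2)))
```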
🌟Finding 2🌟 Input inversion and data source balancing (as proposed and corroborated by MetaICL, T0, OPT-IML and others...) are incredibly important for successful instruction tuning.
See our ablation table.
5/
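For illustration only, a toy sketch of the two techniques in Finding 2 (hypothetical function and field names, not the Flan Collection's actual code): input inversion swaps the roles of input and target under a new template, and source balancing here is a simplified per-source cap standing in for mixture-rate weighting.

```python
def invert(example):
    # Input inversion: swap input and target under a new template,
    # e.g. ask the model to generate the question given the answer.
    return {
        "inputs": f"Write a question whose answer is: {example['target']}",
        "targets": example["input"],
    }

def balance(examples_by_source, cap=1000):
    # Data source balancing (simplified): cap how many examples any one
    # source contributes, so huge datasets don't dominate the mixture.
    mixture = []
    for source, examples in examples_by_source.items():
        mixture.extend(examples[:cap])
    return mixture

print(invert({"input": "What is the capital of France?", "target": "Paris"}))
# {'inputs': 'Write a question whose answer is: Paris',
#  'targets': 'What is the capital of France?'}
```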
🌟Finding 3🌟 Flan-T5 converges faster and to higher performance than T5 when fine-tuned on single tasks.
➡️ Recommendation: Use Flan-T5 as your base model for new tasks.
✅ Better computational efficiency and performance!
6/
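If you want to try the recommendation, here's a minimal sketch using the public Hugging Face checkpoints (e.g. google/flan-t5-base) rather than the original T5X setup:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Start from an instruction-tuned checkpoint instead of vanilla T5.
model_name = "google/flan-t5-base"  # also available: small, large, xl, xxl
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Fine-tune on your single task from here (e.g. with Seq2SeqTrainer);
# the finding is that this converges faster and higher than starting from T5.
inputs = tokenizer("Summarize: The quick brown fox jumped over the lazy dog.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```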
➡️ Promisingly, these results don't use any RLHF or human "alignment" data, which is expensive to collect and less publicly available.
We hope this release supports the open-source community and improves instruction tuning methods and research!
arxiv.org/abs/2301.13688
7/
Please don’t hesitate to reach out with questions, thoughts, and critiques. We are always open to feedback! 😃
8/
Lastly, a heartfelt thank you to my awesome mentors at Google – @_jasonwei, @barret_zoph, @YiTayML, @denny_zhou, @quocleix, and @ada_rob.
9/
As well as my fantastic co-contributors and colleagues @Hou_Le, @hwchung27, @tuvuumass, and @albertwebson, who ran many experiments and led the open sourcing infrastructure.
10/10