SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
State Soup: In-Context Skill Learning, Retrieval and Mixing
Auto Evol-Instruct: Automatic Instruction Evolving for Large Language Models
Instruction Modeling: Instruction Tuning With Loss Over Instructions
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Best Practices and Lessons Learned on Synthetic Data for Language Models
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning
MetaAligner: Conditional Weak-to-Strong Correction for Generalizable Multi-Objective Alignment of Language Models
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Rephrasing the Web (WRAP): A Recipe for Compute and Data-Efficient Language Modeling
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions
When ‘A Helpful Assistant’ Is Not Really Helpful: Personas in System Prompts Do Not Improve Performances of Large Language Models
Language Models are Super Mario (DARE): Absorbing Abilities from Homologous Models as a Free Lunch
LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
LLaVA-1.5: Improved Baselines with Visual Instruction Tuning
UltraFeedback: Boosting Language Models with High-quality Feedback
Can Programming Languages Boost Each Other via Instruction Tuning?
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
Instruction Mining: High-Quality Instruction Data Selection for Large Language Models
Dr. LLaMA: Improving Small Language Models in Domain-Specific QA via Generative Data Augmentation
SELF-ALIGN: Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
WizardLM: Empowering Large Language Models to Follow Complex Instructions
TANGO: Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
How well do Large Language Models perform in Arithmetic tasks?
Self-Instruct: Aligning Language Models with Self-Generated Instructions
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
One Embedder, Any Task: Instruction-Finetuned Text Embeddings (INSTRUCTOR)
BLOOMZ/mT0: Crosslingual Generalization through Multitask Finetuning
Help me write a poem: Instruction Tuning as a Vehicle for Collaborative Poetry Writing (CoPoet)
Language Models are Multilingual Chain-of-Thought Reasoners
LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging
Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Tk-Instruct: Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
UnifiedQA-v2: Stronger Generalization via Broader Cross-Format Training
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
T0: Multitask Prompted Training Enables Zero-Shot Task Generalization
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Muppet: Massive Multi-task Representations with Pre-Finetuning
UnifiedQA: Crossing Format Boundaries With a Single QA System
The Natural Language Decathlon: Multitask Learning as Question Answering
No Robots: Look Ma, an instruction dataset that wasn’t generated by GPTs!
Wu et al. 2023, Figure 5: human evaluation of instruction-finetuned models by size on 114 tasks vs. the GPT-3.5-Turbo teacher
Chung et al. 2022, Figure 2: 1,836 tasks for instruction finetuning (Flan-PaLM)
Chung et al. 2022, Figure 4: scaling of instruction finetuning by model size and task count
Chung et al. 2022, main results figure: Codex vs. davinci vs. PaLM vs. Flan-PaLM
Chung et al. 2022, Table 1: average 5-shot MMLU scores for Flan-PaLM, shattering Metaculus/Hypermind forecasts about AI progress
Chung et al. 2022, Table 2: the small cost of instruction tuning vs. original pretraining cost
Gupta et al. 2022, Figure 4: InstructDial instruction-tuned model performance increases with the number of training tasks, showing blessings of scale
Su et al. 2022, Figure 5: INSTRUCTOR models benefit from longer, more detailed descriptions of the desired embedding functionality
Su et al. 2022, Figure 6: INSTRUCTOR models benefit from scaling up model size
Wang et al. 2022, Figure 5: scaling trends of models by number of training tasks vs. datapoints per task
Xu et al. 2022, Figure 1: ZeroPrompt task scaling vs. model scaling on AUC
Aghajanyan et al. 2021, Figure 1: pre-finetuning scaling with dataset count
https://github.com/bigscience-workshop/architecture-objective
https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints
https://research.google/blog/introducing-flan-more-generalizable-language-models-with-instruction-fine-tuning/
https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html
https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm