“‘T5 Transformer’ Tag”, 2020-04-22 (backlinks):
Bibliography for tag ai/nn/transformer/t5, most recent first: 111 annotations & 30 links (parent).
- See Also
- Links
- “Chronos: Learning the Language of Time Series”, et al 2024
- “ELLA: Equip Diffusion Models With LLM for Enhanced Semantic Alignment”, et al 2024
- “How to Train Data-Efficient LLMs”, et al 2024
- “Time Vectors: Time Is Encoded in the Weights of Finetuned Language Models”, et al 2023
- “Rich Human Feedback for Text-To-Image Generation”, et al 2023
- “Helping or Herding? Reward Model Ensembles Mitigate but Do Not Eliminate Reward Hacking”, et al 2023
- “Instruction-Tuning Aligns LLMs to the Human Brain”, et al 2023
- “PEARL: Personalizing Large Language Model Writing Assistants With Generation-Calibrated Retrievers”, et al 2023
- “UT5: Pretraining Non Autoregressive T5 With Unrolled Denoising”, et al 2023
- “FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”, et al 2023
- “MADLAD-400: A Multilingual And Document-Level Large Audited Dataset”, et al 2023
- “LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models”, et al 2023
- “RAVEN: In-Context Learning With Retrieval-Augmented Encoder-Decoder Language Models”, et al 2023
- “Learning to Model the World With Language”, et al 2023
- “DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI”, et al 2023
- “No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-Based Language Models”, et al 2023
- “GKD: Generalized Knowledge Distillation for Auto-Regressive Sequence Models”, et al 2023
- “PaLI-X: On Scaling up a Multilingual Vision and Language Model”, et al 2023
- “Learning to Generate Novel Scientific Directions With Contextualized Literature-Based Discovery”, et al 2023
- “SoundStorm: Efficient Parallel Audio Generation”, et al 2023
- “Distilling Step-By-Step! Outperforming Larger Language Models With Less Training Data and Smaller Model Sizes”, et al 2023
- “LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions”, et al 2023
- “TANGO: Text-To-Audio Generation Using Instruction-Tuned LLM and Latent Diffusion Model”, et al 2023
- “Learning to Compress Prompts With Gist Tokens”, et al 2023
- “BiLD: Big Little Transformer Decoder”, et al 2023
- “Speak, Read and Prompt (SPEAR-TTS): High-Fidelity Text-To-Speech With Minimal Supervision”, et al 2023
- “BLIP-2: Bootstrapping Language-Image Pre-Training With Frozen Image Encoders and Large Language Models”, et al 2023
- “InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers”, et al 2023
- “Muse: Text-To-Image Generation via Masked Generative Transformers”, et al 2023
- “Character-Aware Models Improve Visual Text Rendering”, et al 2022
- “Unnatural Instructions: Tuning Language Models With (Almost) No Human Labor”, et al 2022
- “One Embedder, Any Task: Instruction-Finetuned Text Embeddings (INSTRUCTOR)”, et al 2022
- “ERNIE-Code: Beyond English-Centric Cross-Lingual Pretraining for Programming Languages”, et al 2022
- “Sparse Upcycling: Training Mixture-Of-Experts from Dense Checkpoints”, et al 2022
- “Fast Inference from Transformers via Speculative Decoding”, et al 2022
- “I Can’t Believe There’s No Images! Learning Visual Tasks Using Only Language Data”, et al 2022
- “TART: Task-Aware Retrieval With Instructions”, et al 2022
- “BLOOMZ/mT0: Crosslingual Generalization through Multitask Finetuning”, et al 2022
- “EDiff-I: Text-To-Image Diffusion Models With an Ensemble of Expert Denoisers”, et al 2022
- “ProMoT: Preserving In-Context Learning Ability in Large Language Model Fine-Tuning”, et al 2022
- “Help Me Write a Poem: Instruction Tuning As a Vehicle for Collaborative Poetry Writing (CoPoet)”, et al 2022
- “FLAN: Scaling Instruction-Finetuned Language Models”, et al 2022
- “Table-To-Text Generation and Pre-Training With TabT5”, et al 2022
- “GLM-130B: An Open Bilingual Pre-Trained Model”, et al 2022
- “SAP: Bidirectional Language Models Are Also Few-Shot Learners”, et al 2022
- “FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation”, et al 2022
- “PaLI: A Jointly-Scaled Multilingual Language-Image Model”, et al 2022
- “Training a T5 Using Lab-Sized Resources”, 2022
- “PEER: A Collaborative Language Model”, et al 2022
- “Z-Code++: A Pre-Trained Language Model Optimized for Abstractive Summarization”, et al 2022
- “Limitations of Language Models in Arithmetic and Symbolic Induction”, et al 2022
- “RealTime QA: What’s the Answer Right Now?”, et al 2022
- “Forecasting Future World Events With Neural Networks”, et al 2022
- “RST: ReStructured Pre-Training”, 2022
- “Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems”, FitzGerald et al 2022
- “Boosting Search Engines With Interactive Agents”, et al 2022
- “EdiT5: Semi-Autoregressive Text-Editing With T5 Warm-Start”, et al 2022
- “CT0: Fine-Tuned Language Models Are Continual Learners”, et al 2022
- “Imagen: Photorealistic Text-To-Image Diffusion Models With Deep Language Understanding”, et al 2022
- “Automated Crossword Solving”, et al 2022
- “Unifying Language Learning Paradigms”, et al 2022
- “Tk-Instruct: Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks”, et al 2022
- “What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?”, et al 2022
- “ByT5 Model for Massively Multilingual Grapheme-To-Phoneme Conversion”, et al 2022
- “Pathways: Asynchronous Distributed Dataflow for ML”, et al 2022
- “HyperPrompt: Prompt-Based Task-Conditioning of Transformers”, et al 2022
- “Using Natural Language Prompts for Machine Translation”, 2022
- “UnifiedQA-V2: Stronger Generalization via Broader Cross-Format Training”, et al 2022
- “Mixture-Of-Experts With Expert Choice Routing”, et al 2022
- “InPars: Data Augmentation for Information Retrieval Using Large Language Models”, et al 2022
- “Reasoning Like Program Executors”, et al 2022
- “CommonsenseQA 2.0: Exposing the Limits of AI through Gamification”, et al 2022
- “QuALITY: Question Answering With Long Input Texts, Yes!”, et al 2021
- “FRUIT: Faithfully Reflecting Updated Information in Text”, Logan et al 2021
- “Large Dual Encoders Are Generalizable Retrievers”, et al 2021
- “LongT5: Efficient Text-To-Text Transformer for Long Sequences”, et al 2021
- “Scaling Language Models: Methods, Analysis & Insights from Training Gopher”, et al 2021
- “ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning”, et al 2021
- “Fast Model Editing at Scale”, et al 2021
- “T0: Multitask Prompted Training Enables Zero-Shot Task Generalization”, et al 2021
- “LFPT5: A Unified Framework for Lifelong Few-Shot Language Learning Based on Prompt Tuning of T5”, 2021
- “Can Machines Learn Morality? The Delphi Experiment”, et al 2021
- “Scale Efficiently: Insights from Pre-Training and Fine-Tuning Transformers”, et al 2021
- “TruthfulQA: Measuring How Models Mimic Human Falsehoods”, et al 2021
- “General-Purpose Question-Answering With Macaw”, 2021
- “Sentence-T5: Scalable Sentence Encoders from Pre-Trained Text-To-Text Models”, et al 2021
- “Time-Aware Language Models As Temporal Knowledge Bases”, et al 2021
- “Implicit Representations of Meaning in Neural Language Models”, et al 2021
- “Explainable Multi-Hop Verbal Reasoning Through Internal Monologue”, et al 2021
- “ByT5: Towards a Token-Free Future With Pre-Trained Byte-To-Byte Models”, et al 2021
- “Carbon Emissions and Large Neural Network Training”, et al 2021
- “The Power of Scale for Parameter-Efficient Prompt Tuning”, et al 2021
- “UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark”, et al 2021
- “GLM: General Language Model Pretraining With Autoregressive Blank Infilling”, et al 2021
- “Investigating the Limitations of the Transformers With Simple Arithmetic Tasks”, et al 2021
- “VL-T5: Unifying Vision-And-Language Tasks via Text Generation”, et al 2021
- “Switch Transformers: Scaling to Trillion Parameter Models With Simple and Efficient Sparsity”, et al 2021
- “MT5: A Massively Multilingual Pre-Trained Text-To-Text Transformer”, et al 2020
- “TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling”, et al 2020
- “MMLU: Measuring Massive Multitask Language Understanding”, et al 2020
- “ProtTrans: Towards Cracking the Language of Life’s Code Through Self-Supervised Deep Learning and High Performance Computing”, et al 2020
- “Leveraging Passage Retrieval With Generative Models for Open Domain Question Answering”, 2020
- “UnifiedQA: Crossing Format Boundaries With a Single QA System”, et al 2020
- “TTTTTackling WinoGrande Schemas”, et al 2020
- “How Much Knowledge Can You Pack Into the Parameters of a Language Model?”, et al 2020
- “CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning”, et al 2019
- “T5: Exploring the Limits of Transfer Learning With a Unified Text-To-Text Transformer”, et al 2019
- “Colin Raffel”
- “Transformer-VAE for Program Synthesis”
- “What Happened to BERT & T5? On Transformer Encoders, PrefixLM and Denoising Objectives”, 2024
- colinraffel
- Sort By Magic
- Miscellaneous
- Bibliography