Links
- “Gzip versus Bag-of-words for Text Classification With k-NN”, Opitz 2023
- “Text Embeddings Reveal (Almost) As Much As Text”, Anonymous 2023
- “Copy Is All You Need”, Lan et al 2023
- “Lost in the Middle: How Language Models Use Long Contexts”, Liu et al 2023
- “LeanDojo: Theorem Proving With Retrieval-Augmented Language Models”, Yang et al 2023
- “Voice Conversion With Just Nearest Neighbors”, Baas et al 2023
- “TTT-NN: Test-Time Training on Nearest Neighbors for Large Language Models”, Hardt & Sun 2023
- “Landmark Attention: Random-Access Infinite Context Length for Transformers”, Mohtashami & Jaggi 2023
- “ImageBind: One Embedding Space To Bind Them All”, Girdhar et al 2023
- “Unlimiformer: Long-Range Transformers With Unlimited Length Input”, Bertsch et al 2023
- “Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Bitton et al 2023
- “CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval”, Wu et al 2023
- “Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes”, Arora et al 2023
- “Shall We Pretrain Autoregressive Language Models With Retrieval? A Comprehensive Study”, Wang et al 2023
- “MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks”, Kuo et al 2023
- “Mitigating YouTube Recommendation Polarity Using BERT and K-Means Clustering”, Ahmad et al 2023
- “Tag2Text: Guiding Vision-Language Model via Image Tagging”, Huang et al 2023
- “ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics”, Azerbayev et al 2023
- “Not What You’ve Signed up For: Compromising Real-World LLM-Integrated Applications With Indirect Prompt Injection”, Greshake et al 2023
- “Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models”, Aksitov et al 2023
- “In-Context Retrieval-Augmented Language Models”, Ram et al 2023
- “Large Language Models Are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning”, Ye et al 2023
- “Crawling the Internal Knowledge-Base of Language Models”, Cohen et al 2023
- “InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers”, Boytsov et al 2023
- “Why Do Nearest Neighbor Language Models Work?”, Xu et al 2023
- “Precise Zero-Shot Dense Retrieval without Relevance Labels”, Gao et al 2022
- “One Embedder, Any Task: Instruction-Finetuned Text Embeddings (INSTRUCTOR)”, Su et al 2022
- “Less Is More: Parameter-Free Text Classification With Gzip”, Jiang et al 2022
- “Text Embeddings by Weakly-Supervised Contrastive Pre-training”, Wang et al 2022
- “NPM: Nonparametric Masked Language Modeling”, Min et al 2022
- “Retrieval-Augmented Multimodal Language Modeling”, Yasunaga et al 2022
- “TART: Task-aware Retrieval With Instructions”, Asai et al 2022
- “RARR: Attributed Text Generation via Post-hoc Research and Revision”, Gao et al 2022
- “Noise-Robust De-Duplication at Scale”, Silcock et al 2022
- “Self-Ask: Measuring and Narrowing the Compositionality Gap in Language Models (Bamboogle)”, Press et al 2022
- “ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022
- “Sparrow: Improving Alignment of Dialogue Agents via Targeted Human Judgements”, Glaese et al 2022
- “FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation”, Hofstätter et al 2022
- “Generate rather than Retrieve (GenRead): Large Language Models Are Strong Context Generators”, Yu et al 2022
- “Vote-K: Selective Annotation Makes Language Models Better Few-Shot Learners”, Su et al 2022
- “Nearest Neighbor Non-autoregressive Text Generation”, Niwa et al 2022
- “CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks”, Chen et al 2022
- “RealTime QA: What’s the Answer Right Now?”, Kasai et al 2022
- “NewsStories: Illustrating Articles With Visual Summaries”, Tan et al 2022
- “Text-Guided Synthesis of Artistic Images With Retrieval-Augmented Diffusion Models”, Rombach et al 2022
- “Re2G: Retrieve, Rerank, Generate”, Glass et al 2022
- “Large-Scale Retrieval for Reinforcement Learning”, Humphreys et al 2022
- “A Neural Corpus Indexer for Document Retrieval”, Wang et al 2022
- “Boosting Search Engines With Interactive Agents”, Ciaramita et al 2022
- “Hopular: Modern Hopfield Networks for Tabular Data”, Schäfl et al 2022
- “NaturalProver: Grounded Mathematical Proof Generation With Language Models”, Welleck et al 2022
- “Down and Across: Introducing Crossword-Solving As a New NLP Benchmark”, Kulshreshtha et al 2022
- “RankGen: Improving Text Generation With Large Ranking Models”, Krishna et al 2022
- “PLAID: An Efficient Engine for Late Interaction Retrieval”, Santhanam et al 2022
- “Unifying Language Learning Paradigms”, Tay et al 2022
- “Semi-Parametric Neural Image Synthesis”, Blattmann et al 2022
- “KNN-Diffusion: Image Generation via Large-Scale Retrieval”, Ashual et al 2022
- “Language Models That Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion”, Shuster et al 2022
- “Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment”, Zhou et al 2022
- “Retrieval Augmented Classification for Long-Tail Visual Recognition”, Long et al 2022
- “Retrieval-Augmented Reinforcement Learning”, Goyal et al 2022
- “Transformer Memory As a Differentiable Search Index”, Tay et al 2022
- “InPars: Data Augmentation for Information Retrieval Using Large Language Models”, Bonifacio et al 2022
- “Text and Code Embeddings by Contrastive Pre-Training”, Neelakantan et al 2022
- “LaMDA: Language Models for Dialog Applications”, Thoppilan et al 2022
- “Memory-assisted Prompt Editing to Improve GPT-3 After Deployment”, Madaan et al 2022
- “A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering”, Gao et al 2022
- “WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing”, Hilton et al 2021
- “WebGPT: Browser-assisted Question-answering With Human Feedback”, Nakano et al 2021
- “Contriever: Towards Unsupervised Dense Information Retrieval With Contrastive Learning”, Izacard et al 2021
- “Learning To Retrieve Prompts for In-Context Learning”, Rubin et al 2021
- “Large Dual Encoders Are Generalizable Retrievers”, Ni et al 2021
- “Boosted Dense Retriever”, Lewis et al 2021
- “Spider: Learning to Retrieve Passages without Supervision”, Ram et al 2021
- “You Only Need One Model for Open-domain Question Answering”, Lee et al 2021
- “Improving Language Models by Retrieving from Trillions of Tokens”, Borgeaud et al 2021
- “Human Parity on CommonsenseQA: Augmenting Self-Attention With External Attention”, Xu et al 2021
- “Florence: A New Foundation Model for Computer Vision”, Yuan et al 2021
- “LiT: Zero-Shot Transfer With Locked-image Text Tuning”, Zhai et al 2021
- “HTCN: Harmonious Text Colorization Network for Visual-Textual Presentation Design”, Yang et al 2021c
- “CLOOB: Modern Hopfield Networks With InfoLOOB Outperform CLIP”, Fürst et al 2021
- “Memorizing Transformers”, Wu et al 2021
- “One Loss for All: Deep Hashing With a Single Cosine Similarity Based Learning Objective”, Hoe et al 2021
- “SPLADE V2: Sparse Lexical and Expansion Model for Information Retrieval”, Formal et al 2021
- “EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident Learning and Language Modeling”, Wang et al 2021
- “Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models”, Ni et al 2021
- “Contrastive Language-Image Pre-training for the Italian Language”, Bianchi et al 2021
- “Billion-Scale Pretraining With Vision Transformers for Multi-Task Visual Representations”, Beal et al 2021
- “MuSiQue: Multi-hop Questions via Single-hop Question Composition”, Trivedi et al 2021
- “Internet-Augmented Dialogue Generation”, Komeili et al 2021
- “CLIP2Video: Mastering Video-Text Retrieval via Image CLIP”, Fang et al 2021
- “A Multi-Level Attention Model for Evidence-Based Fact Checking”, Kruengkrai et al 2021
- “Towards Mental Time Travel: a Hierarchical Memory for Reinforcement Learning Agents”, Lampinen et al 2021
- “RetGen: A Joint Framework for Retrieval and Grounded Text Generation Modeling”, Zhang et al 2021
- “Not All Memories Are Created Equal: Learning to Forget by Expiring”, Sukhbaatar et al 2021
- “Rethinking Search: Making Domain Experts out of Dilettantes”, Metzler et al 2021
- “SimCSE: Simple Contrastive Learning of Sentence Embeddings”, Gao et al 2021
- “BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models”, Thakur et al 2021
- “Retrieval Augmentation Reduces Hallucination in Conversation”, Shuster et al 2021
- “NaturalProofs: Mathematical Theorem Proving in Natural Language”, Welleck et al 2021
- “China’s GPT-3? BAAI Introduces Superscale Intelligence Model ‘Wu Dao 1.0’: The Beijing Academy of Artificial Intelligence (BAAI) Releases Wu Dao 1.0, China’s First Large-scale Pretraining Model.”, Synced 2021
- “Get Your Vitamin C! Robust Fact Verification With Contrastive Evidence (VitaminC)”, Schuster et al 2021
- “ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision”, Jia et al 2021
- “Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers”, Hendricks et al 2021
- “Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup”, Gao et al 2021
- “Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps”, Ho et al 2020
- “Current Limitations of Language Models: What You Need Is Retrieval”, Komatsuzaki 2020
- “Leveraging Passage Retrieval With Generative Models for Open Domain Question Answering”, Izacard & Grave 2020
- “Pre-training via Paraphrasing”, Lewis et al 2020
- “Memory Transformer”, Burtsev et al 2020
- “M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training”, Ni et al 2020
- “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”, Lewis et al 2020
- “Dense Passage Retrieval for Open-Domain Question Answering”, Karpukhin et al 2020
- “Learning to Scale Multilingual Representations for Vision-Language Tasks”, Burns et al 2020
- “How Much Knowledge Can You Pack Into the Parameters of a Language Model?”, Roberts et al 2020
- “REALM: Retrieval-Augmented Language Model Pre-Training”, Guu et al 2020
- “Generalization through Memorization: Nearest Neighbor Language Models”, Khandelwal et al 2019
- “MULE: Multimodal Universal Language Embedding”, Kim et al 2019
- “Language Models As Knowledge Bases?”, Petroni et al 2019
- “Metalearned Neural Memory”, Munkhdalai et al 2019
- “ELI5: Long Form Question Answering”, Fan et al 2019
- “Large Memory Layers With Product Keys”, Lample et al 2019
- “OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge”, Marino et al 2019
- “Dynamic Evaluation of Transformer Language Models”, Krause et al 2019
- “LIGHT: Learning to Speak and Act in a Fantasy Text Adventure Game”, Urbanek et al 2019
- “Top-K Off-Policy Correction for a REINFORCE Recommender System”, Chen et al 2018
- “FEVER: a Large-scale Dataset for Fact Extraction and VERification”, Thorne et al 2018
- “Towards Deep Modeling of Music Semantics Using EEG Regularizers”, Raposo et al 2017
- “Learning to Organize Knowledge and Answer Questions With N-Gram Machines”, Yang et al 2017
- “Seq2SQL: Generating Structured Queries from Natural Language Using Reinforcement Learning”, Zhong et al 2017
- “Bolt: Accelerated Data Mining With Fast Vector Compression”, Blalock & Guttag 2017
- “Get To The Point: Summarization With Pointer-Generator Networks”, See et al 2017
- “Neural Episodic Control”, Pritzel et al 2017
- “Improving Neural Language Models With a Continuous Cache”, Grave et al 2016
- “Scaling Memory-Augmented Neural Networks With Sparse Reads and Writes”, Rae et al 2016
- “Deep Neural Networks for YouTube Recommendations”, Covington et al 2016
- “One-shot Learning With Memory-Augmented Neural Networks”, Santoro et al 2016
- “PlaNet—Photo Geolocation With Convolutional Neural Networks”, Weyand et al 2016
- “Learning to Win by Reading Manuals in a Monte-Carlo Framework”, Branavan et al 2014
- “This Week's Citation Classic: Nearest Neighbor Pattern Classification”, Cover 1982
- “Nearest Neighbor Pattern Classification”, Cover & Hart 1967
- “ANN-Benchmarks Is a Benchmarking Environment for Approximate Nearest Neighbor Algorithms Search. This Website Contains the Current Benchmarking Results. Please Visit https://github.com/erikbern/ann-benchmarks/ to Get an Overview over Evaluated Data Sets and Algorithms. Make a Pull Request on Github to Add Your Own Code or Improvements to the Benchmarking System.”
- “This Anime Does Not Exist, Search: This Notebook Uses the Precomputed CLIP Feature Vectors for 100k Images from TADNE”
Miscellaneous
- https://blog.research.google/2021/05/kelm-integrating-knowledge-graphs-with.html
- https://every.to/chain-of-thought/gpt-4-is-a-reasoning-engine
- https://jalammar.github.io/illustrated-retrieval-transformer/
- https://openai.com/blog/introducing-text-and-code-embeddings/
- https://platform.openai.com/docs/guides/embeddings/use-cases
- https://til.simonwillison.net/llms/claude-hacker-news-themes
- https://twitter.com/andrewwhite01/status/1616933106786738176
- https://twitter.com/mathemagic1an/status/1595410144522813440
- https://www.deepmind.com/blog/differentiable-neural-computers
- https://www.reddit.com/r/ChatGPT/comments/12a0ajb/i_gave_gpt4_persistent_memory_and_the_ability_to/
Link Bibliography
- https://arxiv.org/abs/2305.18466: “TTT-NN: Test-Time Training on Nearest Neighbors for Large Language Models”, Moritz Hardt, Yu Sun
- https://arxiv.org/abs/2305.16300: “Landmark Attention: Random-Access Infinite Context Length for Transformers”, Amirkeivan Mohtashami, Martin Jaggi
- https://arxiv.org/abs/2305.05665#facebook: “ImageBind: One Embedding Space To Bind Them All”, Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra
- https://arxiv.org/abs/2305.01625: “Unlimiformer: Long-Range Transformers With Unlimited Length Input”, Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley
- https://arxiv.org/abs/2304.14318#google: “Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb
- https://arxiv.org/abs/2304.06762#nvidia: “Shall We Pretrain Autoregressive Language Models With Retrieval? A Comprehensive Study”
- https://arxiv.org/abs/2302.12433: “ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics”, Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf, Edward W. Ayers, Dragomir Radev, Jeremy Avigad
- https://arxiv.org/abs/2212.10496: “Precise Zero-Shot Dense Retrieval without Relevance Labels”, Luyu Gao, Xueguang Ma, Jimmy Lin, Jamie Callan
- https://arxiv.org/abs/2212.09741: “One Embedder, Any Task: Instruction-Finetuned Text Embeddings (INSTRUCTOR)”
- https://arxiv.org/abs/2212.09410: “Less Is More: Parameter-Free Text Classification With Gzip”, Zhiying Jiang, Matthew Y. R. Yang, Mikhail Tsirlin, Raphael Tang, Jimmy Lin
- https://arxiv.org/abs/2212.03533#microsoft: “Text Embeddings by Weakly-Supervised Contrastive Pre-training”, Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei
- https://arxiv.org/abs/2212.01349#facebook: “NPM: Nonparametric Masked Language Modeling”, Sewon Min, Weijia Shi, Mike Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer
- https://arxiv.org/abs/2211.12561#facebook: “Retrieval-Augmented Multimodal Language Modeling”
- https://arxiv.org/abs/2210.08726#google: “RARR: Attributed Text Generation via Post-hoc Research and Revision”
- https://arxiv.org/abs/2210.03350#allen: “Self-Ask: Measuring and Narrowing the Compositionality Gap in Language Models (Bamboogle)”, Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith, Mike Lewis
- https://arxiv.org/abs/2209.01975: “Vote-K: Selective Annotation Makes Language Models Better Few-Shot Learners”
- https://arxiv.org/abs/2207.13061: “NewsStories: Illustrating Articles With Visual Summaries”, Reuben Tan, Bryan A. Plummer, Kate Saenko, J. P. Lewis, Avneesh Sud, Thomas Leung
- https://arxiv.org/abs/2207.06300#ibm: “Re2G: Retrieve, Rerank, Generate”, Michael Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Ankita Rajaram Naik, Pengshan Cai, Alfio Gliozzo
- https://arxiv.org/abs/2206.05314#deepmind: “Large-Scale Retrieval for Reinforcement Learning”, Peter C. Humphreys, Arthur Guez, Olivier Tieleman, Laurent Sifre, Théophane Weber, Timothy Lillicrap
- https://openreview.net/forum?id=0ZbPmmB61g#google: “Boosting Search Engines With Interactive Agents”
- https://arxiv.org/abs/2205.12910#allen: “NaturalProver: Grounded Mathematical Proof Generation With Language Models”, Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi
- https://arxiv.org/abs/2205.05131#google: “Unifying Language Learning Paradigms”
- https://arxiv.org/abs/2203.13224#facebook: “Language Models That Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion”, Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston
- https://arxiv.org/abs/2201.10005#openai: “Text and Code Embeddings by Contrastive Pre-Training”
- https://openai.com/research/webgpt: “WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing”, Jacob Hilton, Suchir Balaji, Reiichiro Nakano, John Schulman
- https://arxiv.org/abs/2112.09332#openai: “WebGPT: Browser-assisted Question-answering With Human Feedback”
- https://arxiv.org/abs/2112.09118#facebook: “Contriever: Towards Unsupervised Dense Information Retrieval With Contrastive Learning”, Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave
- https://arxiv.org/abs/2112.07899#google: “Large Dual Encoders Are Generalizable Retrievers”
- https://arxiv.org/abs/2112.07381#samsung: “You Only Need One Model for Open-domain Question Answering”, Haejun Lee, Akhil Kedia, Jongwon Lee, Ashwin Paranjape, Christopher D. Manning, Kyoung-Gu Woo
- https://arxiv.org/abs/2112.04426#deepmind: “Improving Language Models by Retrieving from Trillions of Tokens”
- https://arxiv.org/abs/2111.11432#microsoft: “Florence: A New Foundation Model for Computer Vision”
- https://arxiv.org/abs/2111.07991#google: “LiT: Zero-Shot Transfer With Locked-image Text Tuning”, Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, Lucas Beyer
- https://openreview.net/forum?id=qw674L9PfQE: “CLOOB: Modern Hopfield Networks With InfoLOOB Outperform CLIP”
- https://arxiv.org/abs/2203.08913#google: “Memorizing Transformers”, Yuhuai Wu, Markus Norman Rabe, DeLesley Hutchins, Christian Szegedy
- https://arxiv.org/abs/2108.08877#google: “Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models”, Jianmo Ni, Gustavo Hernández Ábrego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Cer, Yinfei Yang
- https://arxiv.org/abs/2107.07566#facebook: “Internet-Augmented Dialogue Generation”, Mojtaba Komeili, Kurt Shuster, Jason Weston
- https://arxiv.org/abs/2106.11097: “CLIP2Video: Mastering Video-Text Retrieval via Image CLIP”, Han Fang, Pengfei Xiong, Luhui Xu, Yu Chen
- https://arxiv.org/abs/2104.07567#facebook: “Retrieval Augmentation Reduces Hallucination in Conversation”, Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
- https://syncedreview.com/2021/03/23/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0/#baai: “China’s GPT-3? BAAI Introduces Superscale Intelligence Model ‘Wu Dao 1.0’: The Beijing Academy of Artificial Intelligence (BAAI) Releases Wu Dao 1.0, China’s First Large-scale Pretraining Model.”, Synced
- https://arxiv.org/abs/2102.05918#google: “ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision”
- https://arxiv.org/abs/1904.08378: “Dynamic Evaluation of Transformer Language Models”, Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals
- 1982-cover.pdf: “This Week's Citation Classic: Nearest Neighbor Pattern Classification”, Thomas M. Cover