“‘ML Dataset’ Tag”, 2019-09-12:
Bibliography for tag ai/dataset, most recent first: 5 related tags, 389 annotations, & 38 links (parent).
- See Also
- Links
- “Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?”, et al 2024
- “HtmlRAG: HTML Is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems”, et al 2024
- “Centaur: a Foundation Model of Human Cognition”, et al 2024
- “SimpleStrat: Diversifying Language Model Generation With Stratification”, et al 2024
- “MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering”, et al 2024
- “Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making”, et al 2024
- “Seeing Faces in Things: A Model and Dataset for Pareidolia”, et al 2024
- “H-ARC: A Robust Estimate of Human Performance on the Abstraction and Reasoning Corpus Benchmark”, et al 2024
- “How to Evaluate Jailbreak Methods: A Case Study With the StrongREJECT Benchmark”, et al 2024
- “To Code, or Not To Code? Exploring Impact of Code in Pre-Training”, et al 2024
- “Tails Tell Tales: Chapter-Wide Manga Transcriptions With Character Names”, et al 2024
- “ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning”, 2024
- “Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs”, et al 2024
- “Future Events As Backdoor Triggers: Investigating Temporal Vulnerabilities in LLMs”, et al 2024
- “Sonnet or Not, Bot? Poetry Evaluation for Large Models and Datasets”, et al 2024
- “APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets”, et al 2024
- “Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?”, et al 2024
- “OlympicArena: Benchmarking Multi-Discipline Cognitive Reasoning for Superintelligent AI”, et al 2024
- “DataComp-LM: In Search of the next Generation of Training Sets for Language Models”, et al 2024
- “GUI-WORLD: A Dataset for GUI-Oriented Multimodal LLM-Based Agents”, et al 2024
- “Newswire: A Large-Scale Structured Database of a Century of Historical News”, et al 2024
- “Are We Done With MMLU?”, et al 2024
- “MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark”, et al 2024
- “LLMs Achieve Adult Human Performance on Higher-Order Theory of Mind Tasks”, et al 2024
- “DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches With TikZ”, et al 2024
- “Sakuga-42M Dataset: Scaling Up Cartoon Research”, et al 2024
- “Can Language Models Explain Their Own Classification Behavior?”, et al 2024
- “Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models”, et al 2024
- “ImageInWords: Unlocking Hyper-Detailed Image Descriptions”, et al 2024
- “GSM1k: A Careful Examination of Large Language Model Performance on Grade School Arithmetic”, et al 2024
- “Building a Large Japanese Web Corpus for Large Language Models”, et al 2024
- “CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs’ (Lack Of) Multicultural Knowledge”, et al 2024
- “VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?”, et al 2024
- “Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators”, et al 2024
- “How Tech Giants Cut Corners to Harvest Data for AI: OpenAI, Google and Meta Ignored Corporate Policies, Altered Their Own Rules and Discussed Skirting Copyright Law As They Sought Online Information to Train Their Newest Artificial Intelligence Systems”, et al 2024
- “Vulnerability Detection With Code Language Models: How Far Are We?”, et al 2024
- “Long-Form Factuality in Large Language Models”, et al 2024
- “COIG-CQIA: Quality Is All You Need for Chinese Instruction Fine-Tuning”, et al 2024
- “RewardBench: Evaluating Reward Models for Language Modeling”, et al 2024
- “Evaluating Text to Image Synthesis: Survey and Taxonomy of Image Quality Metrics”, et al 2024
- “Hierarchical Feature Warping and Blending for Talking Head Animation”, et al 2024
- “Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models”, et al 2024
- “ELLA: Equip Diffusion Models With LLM for Enhanced Semantic Alignment”, et al 2024
- “Investigating Continual Pretraining in Large Language Models: Insights and Implications”, et al 2024
- “Hal-Eval: A Universal and Fine-Grained Hallucination Evaluation Framework for Large Vision Language Models”, et al 2024
- “ArtPrompt: ASCII Art-Based Jailbreak Attacks against Aligned LLMs”, et al 2024
- “DE-COP: Detecting Copyrighted Content in Language Models Training Data”, et al 2024
- “I Think, Therefore I Am: Benchmarking Awareness of Large Language Models Using AwareBench”, et al 2024
- “Can AI Assistants Know What They Don’t Know?”, et al 2024
- “AnimeDiffusion: Anime Diffusion Colorization”, et al 2024
- “I Am a Strange Dataset: Metalinguistic Tests for Language Models”, et al 2024
- “Generative AI for Math: Part I—MathPile: A Billion-Token-Scale Pretraining Corpus for Math”, et al 2023
- “WaveCoder: Widespread And Versatile Enhanced Instruction Tuning With Refined Data Generation”, et al 2023
- “Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach”, et al 2023
- “StarVector: Generating Scalable Vector Graphics Code from Images”, et al 2023
- “Rich Human Feedback for Text-To-Image Generation”, et al 2023
- “TinyGSM: Achieving >80% on GSM8k With Small Language Models”, et al 2023
- “EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models”, 2023
- “Retrieving Conditions from Reference Images for Diffusion Models”, et al 2023
- “Sequential Modeling Enables Scalable Learning for Large Vision Models”, et al 2023
- “BioCLIP: A Vision Foundation Model for the Tree of Life”, et al 2023
- “Efficient Transformer Knowledge Distillation: A Performance Review”, et al 2023
- “GPQA: A Graduate-Level Google-Proof Q&A Benchmark”, et al 2023
- “Dazed & Confused: A Large-Scale Real-World User Study of ReCAPTCHAv2”, et al 2023
- “Instruction-Following Evaluation for Large Language Models”, et al 2023
- “In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search”, et al 2023
- “AnyText: Multilingual Visual Text Generation And Editing”, et al 2023
- “GLaMM: Pixel Grounding Large Multimodal Model”, et al 2023
- “Don’t Make Your LLM an Evaluation Benchmark Cheater”, et al 2023
- “CommonCanvas: An Open Diffusion Model Trained With Creative-Commons Images”, et al 2023
- “FANToM: A Benchmark for Stress-Testing Machine Theory of Mind in Interactions”, et al 2023
- “MuSR: Testing the Limits of Chain-Of-Thought With Multistep Soft Reasoning”, et al 2023
- “Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition”, et al 2023
- “Llemma: An Open Language Model For Mathematics”, et al 2023
- “From Scarcity to Efficiency: Improving CLIP Training via Visual-Enriched Captions”, et al 2023
- “TabLib: A Dataset of 627M Tables With Context”, et al 2023
- “SWE-Bench: Can Language Models Resolve Real-World GitHub Issues?”, et al 2023
- “OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text”, et al 2023
- “FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”, et al 2023
- “UltraFeedback: Boosting Language Models With High-Quality Feedback”, et al 2023
- “MTOB: A Benchmark for Learning to Translate a New Language from One Grammar Book”, et al 2023
- “Demystifying CLIP Data”, et al 2023
- “The Cambridge Law Corpus: A Corpus for Legal AI Research”, Östling et al 2023
- “MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models”, et al 2023
- “LongLoRA: Efficient Fine-Tuning of Long-Context Large Language Models”, et al 2023
- “SlimPajama-DC: Understanding Data Combinations for LLM Training”, et al 2023
- “MADLAD-400: A Multilingual And Document-Level Large Audited Dataset”, et al 2023
- “GoodWiki”, 2023
- “From Sparse to Dense: GPT-4 Summarization With Chain of Density (CoD) Prompting”, et al 2023
- “FIMO: A Challenge Formal Dataset for Automated Theorem Proving”, et al 2023
- “American Stories: A Large-Scale Structured Text Dataset of Historical U.S. Newspapers”, et al 2023
- “LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models”, et al 2023
- “The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain”, et al 2023
- “Android in the Wild: A Large-Scale Dataset for Android Device Control”, et al 2023
- “DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI”, et al 2023
- “AlpaGasus: Training A Better Alpaca With Fewer Data”, et al 2023
- “InternVid: A Large-Scale Video-Text Dataset for Multimodal Understanding and Generation”, et al 2023
- “Instruction Mining: High-Quality Instruction Data Selection for Large Language Models”, et al 2023
- “Test-Time Training on Video Streams”, et al 2023
- “HEADLINES: A Massive Scale Semantic Similarity Dataset of Historical English”, 2023
- “LeanDojo: Theorem Proving With Retrieval-Augmented Language Models”, et al 2023
- “SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality”, et al 2023
- “ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews”, et al 2023
- “Understanding Social Reasoning in Language Models With Language Models”, et al 2023
- “OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents”, et al 2023
- “AI Is a Lot of Work: As the Technology Becomes Ubiquitous, a Vast Tasker Underclass Is Emerging—And Not Going Anywhere”, 2023
- “Anime Character Identification and Tag Prediction by Multimodality Modeling: Dataset and Model”, et al 2023
- “ChessGPT: Bridging Policy Learning and Language Modeling”, et al 2023
- “Why YouTube Could Give Google an Edge in AI”, 2023
- “Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks”, et al 2023
- “The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora With Web Data, and Web Data Only”, et al 2023
- “Let’s Verify Step by Step”, et al 2023
- “WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia”, et al 2023
- “SeeGULL: A Stereotype Benchmark With Broad Geo-Cultural Coverage Leveraging Generative Models”, et al 2023
- “C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models”, et al 2023
- “TinyStories: How Small Can Language Models Be and Still Speak Coherent English?”, 2023
- “Pick-A-Pic: An Open Dataset of User Preferences for Text-To-Image Generation”, et al 2023
- “LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions”, et al 2023
- “Multi-Party Chat (MultiLIGHT): Conversational Agents in Group Settings With Humans and Models”, et al 2023
- “ImageNet-Hard: The Hardest Images Remaining from a Study of the Power of Zoom and Spatial Biases in Image Classification”, et al 2023
- “Parsing-Conditioned Anime Translation: A New Dataset and Method”, et al 2023c
- “Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling”, et al 2023
- “Abstraction-Perception Preserving Cartoon Face Synthesis”, et al 2023
- “How Well Do Large Language Models Perform in Arithmetic Tasks?”, et al 2023
- “The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset”, et al 2023
- “Large Language Models Are State-Of-The-Art Evaluators of Translation Quality”, 2023
- “Benchmarks for Automated Commonsense Reasoning: A Survey”, 2023
- “Data Selection for Language Models via Importance Resampling”, et al 2023
- “Off-The-Grid MARL (OG-MARL): Datasets With Baselines for Offline Multi-Agent Reinforcement Learning”, et al 2023
- “The BabyLM Challenge: Sample-Efficient Pretraining on a Developmentally Plausible Corpus”, et al 2023
- “The Semantic Scholar Open Data Platform”, et al 2023
- “Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, et al 2023
- “How Close Is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection”, et al 2023
- “Med-PaLM: Large Language Models Encode Clinical Knowledge”, et al 2022
- “Unnatural Instructions: Tuning Language Models With (Almost) No Human Labor”, et al 2022
- “HALIE: Evaluating Human-Language Model Interaction”, et al 2022
- “A Whack-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others”, et al 2022
- “Text Embeddings by Weakly-Supervised Contrastive Pre-Training”, et al 2022
- “The Stack: 3 TB of Permissively Licensed Source Code”, et al 2022
- “UniSumm: Unified Few-Shot Summarization With Multi-Task Pre-Training and Prefix-Tuning”, et al 2022
- “A Creative Industry Image Generation Dataset Based on Captions”, et al 2022
- “AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities”, et al 2022
- “AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies”, et al 2022
- “MMDialog: A Large-Scale Multi-Turn Dialogue Dataset Towards Multi-Modal Open-Domain Conversation”, et al 2022
- “BLOOMZ/mT0: Crosslingual Generalization through Multitask Finetuning”, et al 2022
- “Dungeons and Data: A Large-Scale NetHack Dataset”, et al 2022
- “Will We Run out of Data? An Analysis of the Limits of Scaling Datasets in Machine Learning”, et al 2022
- “Large Language Models Can Self-Improve”, et al 2022
- “CARP: Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning”, et al 2022
- “MTEB: Massive Text Embedding Benchmark”, et al 2022
- “Most Language Models Can Be Poets Too: An AI Writing Assistant and Constrained Text Generation Studio”, et al 2022
- “Self-Ask: Measuring and Narrowing the Compositionality Gap in Language Models (Bamboogle)”, et al 2022
- “Dynamic Prompt Learning via Policy Gradient for Semi-Structured Mathematical Reasoning”, et al 2022
- “Brain Imaging Generation With Latent Diffusion Models”, et al 2022
- “PaLI: A Jointly-Scaled Multilingual Language-Image Model”, et al 2022
- “FOLIO: Natural Language Reasoning With First-Order Logic”, et al 2022
- “Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned”, et al 2022
- “Bugs in the Data: How ImageNet Misrepresents Biodiversity”, 2022
- “Discovering Bugs in Vision Models Using Off-The-Shelf Image Generation and Captioning”, et al 2022
- “Benchmarking Compositionality With Formal Languages”, et al 2022
- “Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP”, et al 2022
- “Learning to Generalize With Object-Centric Agents in the Open World Survival Game Crafter”, et al 2022
- “Few-Shot Adaptation Works With UnpredicTable Data”, et al 2022
- “Language Models Can Teach Themselves to Program Better”, et al 2022
- “RealTime QA: What’s the Answer Right Now?”, et al 2022
- “NewsStories: Illustrating Articles With Visual Summaries”, et al 2022
- “CelebV-HQ: A Large-Scale Video Facial Attributes Dataset”, et al 2022
- “Why Do Tree-Based Models Still Outperform Deep Learning on Tabular Data?”, et al 2022
- “Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset”, et al 2022
- “Forecasting Future World Events With Neural Networks”, et al 2022
- “RST: ReStructured Pre-Training”, 2022
- “Learning to Generate Artistic Character Line Drawing”, et al 2022
- “Dataset Condensation via Efficient Synthetic-Data Parameterization”, et al 2022
- “Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions”, et al 2022
- “Fine-Grained Image Captioning With CLIP Reward”, et al 2022
- “FLEURS: Few-Shot Learning Evaluation of Universal Representations of Speech”, et al 2022
- “InstructDial: Improving Zero and Few-Shot Generalization in Dialogue through Instruction Tuning”, et al 2022
- “Learning to Model Editing Processes”, 2022
- “Flexible Diffusion Modeling of Long Videos”, et al 2022
- “Housekeep: Tidying Virtual Households Using Commonsense Reasoning”, et al 2022
- “Instruction Induction: From Few Examples to Natural Language Task Descriptions”, et al 2022
- “Down and Across: Introducing Crossword-Solving As a New NLP Benchmark”, et al 2022
- “Automated Crossword Solving”, et al 2022
- “Dialog Inpainting: Turning Documents into Dialogues”, et al 2022
- “SymphonyNet: Symphony Generation With Permutation Invariant Language Model”, et al 2022
- “Building Machine Translation Systems for the Next Thousand Languages”, et al 2022
- “When Does Dough Become a Bagel? Analyzing the Remaining Mistakes on ImageNet”, et al 2022
- “Data Determines Distributional Robustness in Contrastive Language Image Pre-Training (CLIP)”, et al 2022
- “A Challenging Benchmark of Anime Style Recognition”, et al 2022
- “Tk-Instruct: Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks”, et al 2022
- “Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality”, et al 2022
- “KNN-Diffusion: Image Generation via Large-Scale Retrieval”, et al 2022
- “ByT5 Model for Massively Multilingual Grapheme-To-Phoneme Conversion”, et al 2022
- “STaR: Bootstrapping Reasoning With Reasoning”, et al 2022
- “CLIP Meets GamePhysics: Towards Bug Identification in Gameplay Videos Using Zero-Shot Transfer Learning”, et al 2022
- “Bamboo: Building Mega-Scale Vision Dataset Continually With Human-Machine Synergy”, et al 2022
- “Self-Distilled StyleGAN: Towards Generation from Internet Photos”, et al 2022
- “RuCLIP—New Models and Experiments: a Technical Report”, et al 2022
- “Wukong: 100 Million Large-Scale Chinese Cross-Modal Pre-Training Dataset and A Foundation Framework”, et al 2022
- “ROME: Locating and Editing Factual Associations in GPT”, et al 2022
- “DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-To-Image Generative Transformers”, et al 2022
- “PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts”, et al 2022
- “StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets”, et al 2022
- “BLIP: Bootstrapping Language-Image Pre-Training for Unified Vision-Language Understanding and Generation”, et al 2022
- “Can Wikipedia Help Offline Reinforcement Learning?”, et al 2022
- “SWAG: Revisiting Weakly Supervised Pre-Training of Visual Perception Models”, et al 2022
- “CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities”, et al 2022
- “WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation”, et al 2022
- “SynthBio: A Case Study in Faster Curation of Text Datasets”, et al 2022
- “BigDatasetGAN: Synthesizing ImageNet With Pixel-Wise Annotations”, et al 2022
- “ERNIE-ViLG: Unified Generative Pre-Training for Bidirectional Vision-Language Generation”, et al 2021
- “A Fistful of Words: Learning Transferable Visual Models from Bag-Of-Words Supervision”, et al 2021
- “GLIDE: Towards Photorealistic Image Generation and Editing With Text-Guided Diffusion Models”, et al 2021
- “QuALITY: Question Answering With Long Input Texts, Yes!”, et al 2021
- “FRUIT: Faithfully Reflecting Updated Information in Text”, IV et al 2021
- “Models in the Loop: Aiding Crowdworkers With Generative Annotation Assistants”, et al 2021
- “WebGPT: Browser-Assisted Question-Answering With Human Feedback”, et al 2021
- “GLaM: Efficient Scaling of Language Models With Mixture-Of-Experts”, et al 2021
- “MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions”, et al 2021
- “BASIC: Combined Scaling for Open-Vocabulary Image Classification”, et al 2021
- “It’s About Time: Analog Clock Reading in the Wild”, et al 2021
- “Solving Probability and Statistics Problems by Program Synthesis”, et al 2021
- “Few-Shot Self-Rationalization With Natural Language Prompts”, et al 2021
- “AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment”, et al 2021
- “RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning”, et al 2021
- “An Explanation of In-Context Learning As Implicit Bayesian Inference”, et al 2021
- “LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs”, et al 2021
- “Training Verifiers to Solve Math Word Problems”, et al 2021
- “A Connectome of the Drosophila Central Complex Reveals Network Motifs Suitable for Flexible Navigation and Context-Dependent Action Selection”, et al 2021
- “HTCN: Harmonious Text Colorization Network for Visual-Textual Presentation Design”, et al 2021c
- “T0: Multitask Prompted Training Enables Zero-Shot Task Generalization”, et al 2021
- “Can Machines Learn Morality? The Delphi Experiment”, et al 2021
- “Situated Dialogue Learning through Procedural Environment Generation”, et al 2021
- “MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research”, et al 2021
- “TruthfulQA: Measuring How Models Mimic Human Falsehoods”, et al 2021
- “MiniF2F: a Cross-System Benchmark for Formal Olympiad-Level Mathematics”, et al 2021
- “LAION-400-Million Open Dataset”, 2021
- “Transfer Learning for Pose Estimation of Illustrated Characters”, 2021
- “MuSiQue: Multi-Hop Questions via Single-Hop Question Composition”, et al 2021
- “Scaling Vision Transformers”, et al 2021
- “QASPER: A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers”, et al 2021
- “XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond”, et al 2021
- “BEIR: A Heterogenous Benchmark for Zero-Shot Evaluation of Information Retrieval Models”, et al 2021
- “SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network”, et al 2021
- “Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks”, et al 2021
- “NaturalProofs: Mathematical Theorem Proving in Natural Language”, et al 2021
- “Get Your Vitamin C! Robust Fact Verification With Contrastive Evidence (VitaminC)”, et al 2021
- “Are NLP Models Really Able to Solve Simple Math Word Problems?”, et al 2021
- “Measuring Mathematical Problem Solving With the MATH Dataset”, et al 2021
- “WIT: Wikipedia-Based Image Text Dataset for Multimodal Multilingual Machine Learning”, et al 2021
- “A Massive 7T FMRI Dataset to Bridge Cognitive and Computational Neuroscience”, et al 2021
- “Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts”, et al 2021
- “ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision”, et al 2021
- “Mind the Gap: Assessing Temporal Generalization in Neural Language Models § Scaling”, et al 2021
- “Scaling Laws for Transfer”, et al 2021
- “Automatic Curation of Large-Scale Datasets for Audio-Visual Representation Learning”, et al 2021
- “MSR-VTT: A Large Video Description Dataset for Bridging Video and Language”, et al 2021
- “CLIP: Learning Transferable Visual Models From Natural Language Supervision”, et al 2021
- “CLIP: Connecting Text and Images: We’re Introducing a Neural Network Called CLIP Which Efficiently Learns Visual Concepts from Natural Language Supervision. CLIP Can Be Applied to Any Visual Classification Benchmark by Simply Providing the Names of the Visual Categories to Be Recognized, Similar to the ‘Zero-Shot’ Capabilities of GPT-2 and GPT-3”, et al 2021
- “The Pile: An 800GB Dataset of Diverse Text for Language Modeling”, et al 2021
- “Selective Eye-Gaze Augmentation To Enhance Imitation Learning In Atari Games”, et al 2020
- “VoxLingua107: a Dataset for Spoken Language Recognition”, Valk & 2020
- “MoGaze: A Dataset of Full-Body Motions That Includes Workspace Geometry and Eye-Gaze”, et al 2020
- “End-To-End Chinese Landscape Painting Creation Using Generative Adversarial Networks”, 2020
- “Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding”, et al 2020
- “Constructing A Multi-Hop QA Dataset for Comprehensive Evaluation of Reasoning Steps”, et al 2020
- “Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus”, et al 2020
- “Open-Domain Question Answering Goes Conversational via Question Rewriting”, et al 2020
- “Digital Voicing of Silent Speech”, 2020
- “A C/C++ Code Vulnerability Dataset With Code Changes and CVE Summaries”, et al 2020
- “MMLU: Measuring Massive Multitask Language Understanding”, et al 2020
- “ETHICS: Aligning AI With Shared Human Values”, et al 2020
- “Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing”, et al 2020
- “CoVoST 2 and Massively Multilingual Speech-To-Text Translation”, et al 2020
- “The Many Faces of Robustness: A Critical Analysis of Out-Of-Distribution Generalization”, et al 2020
- “The NetHack Learning Environment”, et al 2020
- “Anime Crop Datasets: Faces, Figures, & Hands”, et al 2020
- “ForecastQA: A Question Answering Challenge for Event Forecasting With Temporal Text Data”, et al 2020
- “Shortcut Learning in Deep Neural Networks”, et al 2020
- “D4RL: Datasets for Deep Data-Driven Reinforcement Learning”, et al 2020
- “TyDiQA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages”, et al 2020
- “SAYCam: A Large, Longitudinal Audiovisual Dataset Recorded from the Infant’s Perspective”, et al 2020
- “ImageNet-A: Natural Adversarial Examples”, et al 2020
- “Measuring Compositional Generalization: A Comprehensive Method on Realistic Data”, et al 2019
- “Libri-Light: A Benchmark for ASR With Limited or No Supervision”, et al 2019
- “How Can We Know What Language Models Know?”, et al 2019
- “SimpleBooks: Long-Term Dependency Book Dataset With Simplified English Vocabulary for Word-Level Language Modeling”, 2019
- “How Machine Learning Can Help Unlock the World of Ancient Japan”, 2019
- “Compressive Transformers for Long-Range Sequence Modeling”, et al 2019
- “CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning”, et al 2019
- “CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data”, et al 2019
- “T5: Exploring the Limits of Transfer Learning With a Unified Text-To-Text Transformer”, et al 2019
- “Restoring Ancient Text Using Deep Learning (Pythia): a Case Study on Greek Epigraphy”, et al 2019
- “CATER: A Diagnostic Dataset for Compositional Actions and TEmporal Reasoning”, 2019
- “PubMedQA: A Dataset for Biomedical Research Question Answering”, et al 2019
- “ObjectNet: A Large-Scale Bias-Controlled Dataset for Pushing the Limits of Object Recognition Models”, et al 2019
- “No Press Diplomacy: Modeling Multi-Agent Gameplay”, et al 2019
- “Language Modeling State-Of-The-Art Leaderboards”, paperswithcode.com 2019
- “LVIS: A Dataset for Large Vocabulary Instance Segmentation”, et al 2019
- “Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank”, et al 2019
- “A Large Single-Participant FMRI Dataset for Probing Brain Responses to Naturalistic Stimuli in Space and Time”, et al 2019
- “OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge”, et al 2019
- “ImageNet-Sketch: Learning Robust Global Representations by Penalizing Local Predictive Power”, et al 2019
- “Cold Case: The Lost MNIST Digits”, 2019
- “SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems”, et al 2019
- “The MineRL 2019 Competition on Sample Efficient Reinforcement Learning Using Human Priors”, et al 2019
- “ProductNet: a Collection of High-Quality Datasets for Product Representation Learning”, et al 2019
- “Benchmarking Neural Network Robustness to Common Corruptions and Perturbations”, 2019
- “Atari-HEAD: Atari Human Eye-Tracking and Demonstration Dataset”, et al 2019
- “LIGHT: Learning to Speak and Act in a Fantasy Text Adventure Game”, et al 2019
- “DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs”, et al 2019
- “A Replication Study: Machine Learning Models Are Capable of Predicting Sexual Orientation From Facial Images”, 2019
- “Language Models Are Unsupervised Multitask Learners”, et al 2019
- “The Omniglot Challenge: a 3-Year Progress Report”, et al 2019
- “Do We Train on Test Data? Purging CIFAR of Near-Duplicates”, 2019
- “The RobotriX: An EXtremely Photorealistic and Very-Large-Scale Indoor Dataset of Sequences With Robot Trajectories and Interactions”, Garcia- et al 2019
- “FIGR: Few-Shot Image Generation With Reptile”, 2019
- “Natural Questions: A Benchmark for Question Answering Research”, et al 2019
- “A Style-Based Generator Architecture for Generative Adversarial Networks”, et al 2018
- “ImageNet-Trained CNNs Are Biased towards Texture; Increasing Shape Bias Improves Accuracy and Robustness”, et al 2018
- “CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge”, et al 2018
- “The Open Images Dataset V4: Unified Image Classification, Object Detection, and Visual Relationship Detection at Scale”, et al 2018
- “HotpotQA: A Dataset for Diverse, Explainable Multi-Hop Question Answering”, et al 2018
- “Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization”, et al 2018
- “CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images”, et al 2018
- “A Short Note about Kinetics-600”, et al 2018
- “Cartoon Set”, et al 2018
- “Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations”, 2018
- “Conceptual Captions: A Cleaned, Hypernymed, Image Alt-Text Dataset For Automatic Image Captioning”, et al 2018
- “Know What You Don’t Know: Unanswerable Questions for SQuAD”, et al 2018
- “BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning”, et al 2018
- “Exploring the Limits of Weakly Supervised Pretraining”, et al 2018
- “Newsroom: A Dataset of 1.3 Million Summaries With Diverse Extractive Strategies”, et al 2018
- “GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding”, et al 2018
- “The Sound of Pixels”, et al 2018
- “FEVER: a Large-Scale Dataset for Fact Extraction and VERification”, et al 2018
- “Think You Have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge”, et al 2018
- “SCUT-FBP5500: A Diverse Benchmark Dataset for Multi-Paradigm Facial Beauty Prediction”, et al 2018
- “11K Hands: Gender Recognition and Biometric Identification Using a Large Dataset of Hand Images”, 2017
- “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, et al 2017
- “OpenML Benchmarking Suites”, et al 2017
- “WebVision Database: Visual Learning and Understanding from Web Data”, et al 2017
- “A Downsampled Variant of ImageNet As an Alternative to the CIFAR Datasets”, et al 2017
- “Revisiting Unreasonable Effectiveness of Data in Deep Learning Era”, et al 2017
- “Driver Identification Using Automobile Sensor Data from a Single Turn”, et al 2017
- “StreetStyle: Exploring World-Wide Clothing Styles from Millions of Photos”, et al 2017
- “The Kinetics Human Action Video Dataset”, et al 2017
- “WebVision Challenge: Visual Learning and Understanding With Web Data”, et al 2017
- “TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension”, et al 2017
- “Dense-Captioning Events in Videos”, et al 2017
- “BAM! The Behance Artistic Media Dataset for Recognition Beyond Photography”, et al 2017
- “SearchQA: A New Q&A Dataset Augmented With Context from a Search Engine”, et al 2017
- “RACE: Large-Scale ReAding Comprehension Dataset From Examinations”, et al 2017
- “NewsQA: A Machine Comprehension Dataset”, et al 2016
- “MS MARCO: A Human Generated MAchine Reading COmprehension Dataset”, et al 2016
- “Lip Reading Sentences in the Wild”, et al 2016
- “Pointer Sentinel Mixture Models”, et al 2016
- “Deep Learning the City: Quantifying Urban Perception At A Global Scale”, et al 2016
- “Solving General Arithmetic Word Problems”, 2016
- “The LAMBADA Dataset: Word Prediction Requiring a Broad Discourse Context”, et al 2016
- “SQuAD: 100,000+ Questions for Machine Comprehension of Text”, et al 2016
- “Matching Networks for One Shot Learning”, et al 2016
- “Convolutional Sketch Inversion”, et al 2016
- “The MovieLens Datasets: History and Context”, 2015
- “Neural Module Networks”, et al 2015
- “Sketch-Based Manga Retrieval Using Manga109 Dataset”, et al 2015
- “Amazon Reviews: Image-Based Recommendations on Styles and Substitutes”, et al 2015
- “Teaching Machines to Read and Comprehend”, et al 2015
- “LSUN: Construction of a Large-Scale Image Dataset Using Deep Learning With Humans in the Loop”, et al 2015
- “VQA: Visual Question Answering”, et al 2015
- “YFCC100M: The New Data in Multimedia Research”, et al 2015
- “ImageNet Large Scale Visual Recognition Challenge”, et al 2014
- “Microsoft COCO: Common Objects in Context”, et al 2014
- “N-Gram Counts and Language Models from the Common Crawl”, et al 2014
- “Ukiyo-E Search”, 2013
- “UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild”, et al 2012
- “The Caltech-UCSD Birds-200-2011 Dataset”, et al 2011
- “Unbiased Look at Dataset Bias”, 2011
- “Caltech-UCSD Birds 200”, et al 2010
- “Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments”, et al 2008
- “Building a Large Annotated Corpus of English: The Penn Treebank”, et al 1993
- “About the Test Data”
- “DataGemma: AI Open Models Connecting LLMs to Google’s Data Commons”
- “Scale AI Secures $1B Funding at $14B Valuation As Its CEO Predicts Big Revenue Growth and Profitability by Year-End [On Very High Quality Data]”
- “No Robots: Look Ma, an Instruction Dataset That Wasn’t Generated by GPTs!”, Hugging Face 2024
- “Psych-101 Dataset [For Centaur]”
- “FineWeb: Decanting the Web for the Finest Text Data at Scale”
- “Solving Math Word Problems: We’ve Trained a System That Solves Grade School Math Problems With Nearly Twice the Accuracy of a Fine-Tuned GPT-3 Model. It Solves about 90% As Many Problems As Real Kids: a Small Sample of 9-12 Year Olds Scored 60% on a Test from Our Dataset, While Our System Scored 55% on Those Same Problems. This Is Important Because Today’s AI Is Still Quite Weak at Commonsense Multistep Reasoning, Which Is Easy Even for Grade School Kids. We Achieved These Results by Training Our Model to Recognize Its Mistakes, so That It Can Try Repeatedly Until It Finds a Solution That Works”
- “Lip Reading Sentences in the Wild [Video]”
- Wikipedia
- Miscellaneous
- Bibliography