Bibliography:

  1. ‘RL exploration’ tag

  2. ‘data pruning’ tag

  3. ‘statistical comparison’ tag

  4. Probing the Decision Boundaries of In-context Learning in Large Language Models (Aditya Grover et al.; https://arxiv.org/abs/2406.11233)

  5. Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement

  6. Artificial Intelligence for Retrosynthetic Planning Needs Both Data and Expert Knowledge (/doc/reinforcement-learning/model/alphago/2024-striethkalthoff.pdf)

  7. Sparse Universal Transformer (Aaron Courville et al.; https://arxiv.org/abs/2310.07096#ibm)

  8. Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models

  9. AlpaGasus: Training A Better Alpaca with Fewer Data (https://arxiv.org/abs/2307.08701#samsung)

  10. Instruction Mining: High-Quality Instruction Data Selection for Large Language Models

  11. No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models (https://arxiv.org/abs/2307.06440)

  12. Estimating label quality and errors in semantic segmentation data via any model (https://arxiv.org/abs/2307.05080)

  13. Self Expanding Neural Networks

  14. DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining (Percy Liang et al.; https://arxiv.org/abs/2305.10429#google)

  15. Chatting with GPT-3 for Zero-Shot Human-Like Mobile Automated GUI Testing

  16. TinyStories: How Small Can Language Models Be and Still Speak Coherent English? (https://arxiv.org/abs/2305.07759#microsoft)

  17. q2d: Turning Questions into Dialogs to Teach Models How to Search (https://arxiv.org/abs/2304.14318#google)

  18. Segment Anything

  19. Scaling Expert Language Models with Unsupervised Domain Discovery

  20. Modern Bayesian Experimental Design

  21. Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities (https://openreview.net/forum?id=UVDAKQANOW)

  22. Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula

  23. CDCD: Continuous diffusion for categorical data

  24. Query by Committee Made Real

  25. Weakly supervised structured output learning for semantic segmentation

  26. The Power of Ensembles for Active Learning in Image Classification

  27. Multi-class active learning for image classification

  28. Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization

  29. The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

  30. Detecting Label Errors in Token Classification Data

  31. RHO-LOSS: Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt (https://arxiv.org/abs/2206.07137)

  32. Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy

  33. Multi-Task Self-Training for Learning General Representations

  34. Predictive Coding: a Theoretical and Experimental Review

  35. Dataset Distillation with Infinitely Wide Convolutional Networks

  36. Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning

  37. Adapting the Function Approximation Architecture in Online Reinforcement Learning

  38. B-Pref: Benchmarking Preference-Based Reinforcement Learning

  39. Fully General Online Imitation Learning

  40. When Do Curricula Work?

  41. Dataset Meta-Learning from Kernel Ridge-Regression

  42. Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics

  43. BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits (https://arxiv.org/abs/2006.06856)

  44. Exploring Bayesian Optimization: Breaking Bayesian Optimization into small, sizeable chunks

  45. Small-GAN: Speeding Up GAN Training Using Core-sets

  46. A deep active learning system for species identification and counting in camera trap images

  47. On Warm-Starting Neural Network Training

  48. Accelerating Deep Learning by Focusing on the Biggest Losers

  49. Data Valuation using Reinforcement Learning

  50. BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning

  51. BADGE: Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds

  52. Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules (https://arxiv.org/abs/1905.05393)

  53. Learning Loss for Active Learning

  54. A Recipe for Training Neural Networks (https://karpathy.github.io/2019/04/25/recipe/)

  55. ProductNet: a Collection of High-Quality Datasets for Product Representation Learning

  56. End-to-End Robotic Reinforcement Learning without Reward Engineering

  57. Data Shapley: Equitable Valuation of Data for Machine Learning

  58. Learning from Dialogue after Deployment: Feed Yourself, Chatbot!

  59. The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

  60. Computational mechanisms of curiosity and goal-directed exploration

  61. Conditional Neural Processes

  62. Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning

  63. More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch

  64. Fingerprint Policy Optimization for Robust Reinforcement Learning

  65. AutoAugment: Learning Augmentation Policies from Data (Barret Zoph et al.; https://arxiv.org/abs/1805.09501#google)

  66. Optimization, fast and slow: optimally switching between local and Bayesian optimization

  67. Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks with Existing Applications

  68. Active Learning with Partial Feedback (https://arxiv.org/abs/1802.07427)

  69. Active, Continual Fine Tuning of Convolutional Neural Networks for Reducing Annotation Efforts

  70. Less is more: sampling chemical space with active learning

  71. The Eighty Five Percent Rule for Optimal Learning

  72. ScreenerNet: Learning Self-Paced Curriculum for Deep Neural Networks

  73. Learning a Generative Model for Validity in Complex Discrete Structures

  74. Learning by Asking Questions

  75. BlockDrop: Dynamic Inference Paths in Residual Networks

  76. Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent

  77. Classification with Costly Features using Deep Reinforcement Learning

  78. Decomposition of Uncertainty in Bayesian Deep Learning for Efficient and Risk-sensitive Learning

  79. Why Pay More When You Can Pay Less: A Joint Learning Framework for Active Feature Acquisition and Classification

  80. Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

  81. Active Learning for Convolutional Neural Networks: A Core-Set Approach

  82. Interpretable Active Learning

  83. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era

  84. A Tutorial on Thompson Sampling

  85. Learning to Learn from Noisy Web Videos

  86. Teaching Machines to Describe Images via Natural Language Feedback

  87. Ask the Right Questions: Active Question Reformulation with Reinforcement Learning

  88. BAM! The Behance Artistic Media Dataset for Recognition Beyond Photography

  89. PBO: Preferential Bayesian Optimization

  90. OHEM: Training Region-based Object Detectors with Online Hard Example Mining

  91. The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition (Jonathan Krause et al.; https://arxiv.org/abs/1511.06789#google)

  92. LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

  93. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

  94. Just Sort It! A Simple and Effective Approach to Active Preference Learning

  95. Learning with Intelligent Teacher: Similarity Control and Knowledge Transfer (/doc/reinforcement-learning/exploration/active-learning/2015-vapnik.pdf)

  96. Minimax Analysis of Active Learning

  97. Algorithmic and Human Teaching of Sequential Decision Tasks

  98. Bayesian Active Learning for Classification and Preference Learning

  99. Rates of convergence in active learning (https://projecteuclid.org/journals/annals-of-statistics/volume-39/issue-1/Rates-of-convergence-in-active-learning/10.1214/10-AOS843.full)

  100. The true sample complexity of active learning (/doc/reinforcement-learning/exploration/active-learning/2010-balcan.pdf)

  101. Active Testing for Face Detection and Localization

  102. The wisdom of the few: a collaborative filtering approach based on expert opinions from the web

  103. Learning and Example Selection for Object and Pattern Detection

  104. Information-Based Objective Functions for Active Data Selection

  105. Active Learning Literature Survey

  106. Brief Summary of the Panel Discussion at DL Workshop @ICML 2015

  107. 415a798fa1f3d6bade4858ab71ab735aad551407.html

  108. Active Learning

  109. Aurora’s Approach to Development

  110. Active Learning for High Dimensional Inputs Using Bayesian Convolutional Neural Networks

  111. 41cc1ed0e2e1ca39190300331f08f2f21aaa66ac.pdf

  112. AI-Guided Robots Are Ready to Sort Your Recyclables

  113. When Self-Driving Cars Can’t Help Themselves, Who Takes the Wheel?

  114. How a Feel-Good AI Story Went Wrong in Flint: A Machine-Learning Model Showed Promising Results, but City Officials and Their Engineering Contractor Abandoned It.

  115. design#future-tag-features

  116. 2024-zhao-figure11-improvmeentofllmtransformerdecisionboundariesbyusingactivelearning.png

  117. 2023-kaddour-figure3-validationlossesforbertusingselectivebackpropvsreducibleholdoutvsrandomsampling.png

  118. 2023-xie-figure2-doremioptimizationoftrainingperformancetrainstwiceasfast.jpg

  119. 2009-amatriain-figure5-accuracyofnetflixmovierecommendationsbyhowmanynearbyexpertratingsareusedandweighted.jpg

  120. 2009-amatriain-figure6-expertcfvsnearnestneighborerrorrates.jpg

  121. https://bair.berkeley.edu/blog/2019/06/07/data_aug/

  122. https://blog.mldb.ai/blog/posts/2016/10/deepteach/

  123. https://explosion.ai/blog/prodigy-annotation-tool-active-learning

  124. 0c90d6205548307d1b745bf6adeb040954c07e80.html

  125. https://github.com/cranmer/active_sciencing/blob/master/README.md

  126. https://medium.com/cruise/cruise-continuous-learning-machine-30d60f4c691b

  127. https://medium.com/pytorch/road-defect-detection-using-deep-active-learning-98d94fe854d

  128. https://oatml.cs.ox.ac.uk/blog/2019/06/24/batchbald.html

  129. https://openai.com/research/dall-e-2-pre-training-mitigations

  130. https://proceedings.neurips.cc/paper_files/paper/2007/file/a1519de5b5d44b31a01de013b9b51a80-Paper.pdf

  131. e3ed0223e7847719caa1bc18858f0f8e7c404917.pdf

  132. https://research.google/blog/estimating-the-impact-of-training-data-with-reinforcement-learning/

  133. https://research.google/blog/fluid-annotation-an-exploratory-machine-learningpowered-interface-for-faster-image-annotation/

  134. https://research.google/blog/open-sourcing-active-question-reformulation-with-reinforcement-learning/

  135. https://www.cs.ox.ac.uk/people/yarin.gal/website/blog_2248.html

  136. https://www.forbes.com/sites/bradtempleton/2019/04/22/tesla-bets-farm-on-neural-network-based-autonomy-with-impressive-presentation/

  137. https://www.marble.onl/posts/data_takers_and_makers.html

  138. 6f3e5b2cde9ee7119be98b60faa666e24c48f65f.html

  139. https://www.probabilistic-numerics.org/assets/ProbabilisticNumerics.pdf#page=3

  140. https://www.youtube.com/watch?v=Q0nGo2-y0xY

  141. https://www.youtube.com/watch?v=_Ql5vfOPxZU&t=735

  142. https://x.com/elonmusk/status/1787768103449010597

  143. https://x.com/polynoamial/status/1676971503261454340
