See Also

Gwern

- “Making Anime Faces With StyleGAN”, Gwern 2019

Links

- “Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models”, Ankner et al 2024
- “Rho-1: Not All Tokens Are What You Need”, Lin et al 2024
- “A Study in Dataset Pruning for Image Super-Resolution”, Moser et al 2024
- “How to Train Data-Efficient LLMs”, Sachdeva et al 2024
- “Autonomous Data Selection With Language Models for Mathematical Texts”, Zhang et al 2024
- “Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding”, Evans et al 2023
- “Does CLIP’s Generalization Performance Mainly Stem from High Train-Test Similarity?”, Mayilvahanan et al 2023
- “Data Filtering Networks”, Fang et al 2023
- “SlimPajama-DC: Understanding Data Combinations for LLM Training”, Shen et al 2023
- “Anchor Points: Benchmarking Models With Much Fewer Examples”, Vivek et al 2023
- “When Less Is More: Investigating Data Pruning for Pretraining LLMs at Scale”, Marion et al 2023
- “Beyond Neural Scaling Laws: Beating Power Law Scaling via Data Pruning”, Sorscher et al 2022
- “Unadversarial Examples: Designing Objects for Robust Vision”, Salman et al 2020
- “Generative Models Are Unsupervised Predictors of Page Quality: A Colossal-Scale Study”, Bahri et al 2020
- “Dataset Distillation”, Wang et al 2018
- “FineWeb: Decanting the Web for the Finest Text Data at Scale”
Sort By Magic
Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.
Beginning with the newest annotation, it uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics. For more details, see the link.
- data-selection
- pruning
- distillation
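The embedding-based ordering described above can be sketched as follows. This is a hypothetical minimal version, assuming a greedy nearest-neighbor chain over cosine similarity of annotation embeddings; the actual Gwern.net pipeline additionally clusters the resulting list into auto-labeled sections, which is not shown here.

```python
import numpy as np

def sort_by_similarity(embeddings, start=0):
    """Greedy nearest-neighbor ordering: begin at one item (e.g. the newest
    annotation) and repeatedly hop to the most similar unvisited item,
    producing a progression of topics rather than a date-ordered list.

    Hypothetical sketch; not the actual Gwern.net implementation."""
    emb = np.asarray(embeddings, dtype=float)
    # Normalize rows so a dot product equals cosine similarity.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    order = [start]
    remaining = set(range(len(emb))) - {start}
    while remaining:
        cur = order[-1]
        # Pick the unvisited item most similar to the current one.
        nxt = max(remaining, key=lambda i: float(emb[cur] @ emb[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Toy 2-D "embeddings": two loose topic clusters. The ordering walks one
# cluster before crossing to the other.
vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
print(sort_by_similarity(vecs))  # → [0, 2, 3, 1]
```

A greedy chain is the simplest choice here; a production system might instead solve an approximate traveling-salesman ordering or cluster first and order within clusters, trading simplicity for a globally smoother topic progression.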
Wikipedia
- Coreset
Miscellaneous
- https://aclanthology.org/2023.findings-emnlp.18/
Link Bibliography
- https://arxiv.org/abs/2405.20541: “Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models”, Ankner et al 2024
- https://arxiv.org/abs/2404.07965#microsoft: “Rho-1: Not All Tokens Are What You Need”, Lin et al 2024
- https://arxiv.org/abs/2402.07625: “Autonomous Data Selection With Language Models for Mathematical Texts”, Zhang et al 2024
- https://arxiv.org/abs/2312.05328#deepmind: “Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding”, Evans et al 2023
- https://arxiv.org/abs/2309.17425#apple: “Data Filtering Networks”, Fang et al 2023
- https://arxiv.org/abs/2309.10818#cerebras: “SlimPajama-DC: Understanding Data Combinations for LLM Training”, Shen et al 2023
- https://arxiv.org/abs/2206.14486: “Beyond Neural Scaling Laws: Beating Power Law Scaling via Data Pruning”, Sorscher et al 2022