- Recurrent Neural Network Based Language Model § Dynamic Evaluation
- CQK Is The First Unused TLA § Effective GPT-4 Programming
- Generative AI for Professional Services
- Sudowrite - Best AI Writing Partner for Fiction
- design#future-tag-features
- Optical Character Recognition (OCR) in Google Docs
- ‘self-attention’ directory
- Generating Sequences With Recurrent Neural Networks
- Dynamic Evaluation of Transformer Language Models
- In-Context Pretraining (ICP): Language Modeling Beyond Document Boundaries
- TTT-NN: Test-Time Training on Nearest Neighbors for Large Language Models
- Improving Neural Language Models with a Continuous Cache
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
- Compressive Transformers for Long-Range Sequence Modeling
- Memorizing Transformers
- Scaling Data-Constrained Language Models
- A Neural Corpus Indexer for Document Retrieval
- Absolute Unit NNs: Regression-Based MLPs for Everything § Memorize All The Things
- Many-Shot In-Context Learning
- https://openai.com/pricing#fine-tuning-models
- Faster SGD training by minibatch persistency
- LoRA: Low-Rank Adaptation of Large Language Models
- LoRA vs Full Fine-tuning: An Illusion of Equivalence
- When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
- A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Instruction Following Large Language Model
- ‘low-precision NN’ directory
- Decision Transformer: Reinforcement Learning via Sequence Modeling
- Gato: A Generalist Agent
- Toolformer: Language Models Can Teach Themselves to Use Tools
- DAgger: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
- holy-war#bitrot
- https://maggieappleton.com/lm-sketchbook#daemons
- The Turing Complete User
- Naked objects: a technique for designing more expressive systems