- AI and Compute
- T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- Towards a Human-like Open-Domain Chatbot
- GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- GPT-3: Language Models are Few-Shot Learners
- The Evolved Transformer
- https://mlcommons.org/
- https://www.hpcwire.com/2019/03/19/aws-upgrades-its-gpu-backed-ai-inference-platform/
- https://aws.amazon.com/blogs/aws/amazon-ec2-update-inf1-instances-with-aws-inferentia-chips-for-high-performance-cost-effective-inferencing/
- https://arxiv.org/pdf/2104.10350.pdf#page=6
- Attention Is All You Need
- Energy and Policy Considerations for Deep Learning in NLP
- The Evolved Transformer
- https://arxiv.org/pdf/2104.10350.pdf#page=21
- https://arxiv.org/pdf/2104.10350.pdf#page=9
- https://arxiv.org/pdf/2104.10350.pdf#page=3
- https://www.gstatic.com/gumdrop/sustainability/google-2020-environmental-report.pdf
- https://arxiv.org/pdf/2104.10350.pdf#page=14
- GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding