- See Also
- Links
- “Beyond Memorization: Violating Privacy Via Inference With Large Language Models”, Staab et al 2023
- “FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”, Vu et al 2023
- “CausalLM Is Not Optimal for In-context Learning”, Ding et al 2023
- “Simple Synthetic Data Reduces Sycophancy in Large Language Models”, Wei et al 2023
- “Large Language Models Are Few-Shot Health Learners”, Liu et al 2023
- “SeeGULL: A Stereotype Benchmark With Broad Geo-Cultural Coverage Leveraging Generative Models”, Jha et al 2023
- “Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Bitton et al 2023
- “Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models”, Aksitov et al 2023
- “Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, Pilault et al 2023
- “Memory Augmented Large Language Models Are Computationally Universal”, Schuurmans 2023
- “Med-PaLM: Large Language Models Encode Clinical Knowledge”, Singhal et al 2022
- “Efficiently Scaling Transformer Inference”, Pope et al 2022
- “Large Language Models Can Self-Improve”, Huang et al 2022
- “FLAN: Scaling Instruction-Finetuned Language Models”, Chung et al 2022
- “U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, Tay et al 2022
- “Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022
- “RARR: Attributed Text Generation via Post-hoc Research and Revision”, Gao et al 2022
- “ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022
- “Language Models Are Multilingual Chain-of-Thought Reasoners”, Shi et al 2022
- “AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”, Soltan et al 2022
- “Inner Monologue: Embodied Reasoning through Planning With Language Models”, Huang et al 2022
- “Solving Quantitative Reasoning Problems With Language Models”, Lewkowycz et al 2022
- “Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, Zhou et al 2022
- “Unifying Language Learning Paradigms”, Tay et al 2022
- “PaLM: Scaling Language Modeling With Pathways”, Chowdhery et al 2022
- “Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”, Ahn et al 2022
- “Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance”, Chowdhery & Narang 2022
- “PaLM § Figure 19: [Explaining a Joke / Inference Chaining] Each ‘Input’ Was Independently Prepended With the Same 2-shot Exemplar Shown at the Top, and ‘Model Output’ Shows the Greedy Decoding Output of PaLM 540B. The Two Exemplar Jokes Are Known Jokes (explanations Written by Authors), While All Evaluated Jokes Were Written by the Authors. Of Course, These Jokes Do Share Abstract Premises With Existing Jokes (wordplay, Reliability, Humorous Analogies, Reversal-of-expectations). The Inference Chaining Examples Were Also Written by the Authors.”
- Miscellaneous
- Link Bibliography
See Also
Links
“Beyond Memorization: Violating Privacy Via Inference With Large Language Models”, Staab et al 2023
“FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”, Vu et al 2023
“CausalLM Is Not Optimal for In-context Learning”, Ding et al 2023
“Simple Synthetic Data Reduces Sycophancy in Large Language Models”, Wei et al 2023
“Large Language Models Are Few-Shot Health Learners”, Liu et al 2023
“SeeGULL: A Stereotype Benchmark With Broad Geo-Cultural Coverage Leveraging Generative Models”, Jha et al 2023
“Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Bitton et al 2023
“Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models”, Aksitov et al 2023
“Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, Pilault et al 2023
“Memory Augmented Large Language Models Are Computationally Universal”, Schuurmans 2023
“Med-PaLM: Large Language Models Encode Clinical Knowledge”, Singhal et al 2022
“Efficiently Scaling Transformer Inference”, Pope et al 2022
“Large Language Models Can Self-Improve”, Huang et al 2022
“FLAN: Scaling Instruction-Finetuned Language Models”, Chung et al 2022
“U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, Tay et al 2022
“Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022
“RARR: Attributed Text Generation via Post-hoc Research and Revision”, Gao et al 2022
“ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022
“Language Models Are Multilingual Chain-of-Thought Reasoners”, Shi et al 2022
“AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”, Soltan et al 2022
“Inner Monologue: Embodied Reasoning through Planning With Language Models”, Huang et al 2022
“Solving Quantitative Reasoning Problems With Language Models”, Lewkowycz et al 2022
“Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, Zhou et al 2022
“Unifying Language Learning Paradigms”, Tay et al 2022
“PaLM: Scaling Language Modeling With Pathways”, Chowdhery et al 2022
“Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”, Ahn et al 2022
“Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance”, Chowdhery & Narang 2022
“PaLM § Figure 19: [Explaining a Joke / Inference Chaining] Each ‘Input’ Was Independently Prepended With the Same 2-shot Exemplar Shown at the Top, and ‘Model Output’ Shows the Greedy Decoding Output of PaLM 540B. The Two Exemplar Jokes Are Known Jokes (explanations Written by Authors), While All Evaluated Jokes Were Written by the Authors. Of Course, These Jokes Do Share Abstract Premises With Existing Jokes (wordplay, Reliability, Humorous Analogies, Reversal-of-expectations). The Inference Chaining Examples Were Also Written by the Authors.”
Miscellaneous
- /doc/ai/nn/transformer/gpt/palm/2022-ahn-figure2-saycanqueryinglanguagemodelforoptions.png
- https://blog.research.google/2022/06/minerva-solving-quantitative-reasoning.html
- https://blog.research.google/2023/01/google-research-2022-beyond-language.html
- https://www.lesswrong.com/posts/EHbJ69JDs4suovpLw/testing-palm-prompts-on-gpt3
- https://www.lesswrong.com/posts/mLuQfS7gmfr4nwTdv/google-s-new-540-billion-parameter-language-model
- https://www.reddit.com/r/GPT3/comments/twxtwg/how_gpt3_answers_the_google_pathway_sample/
Link Bibliography
- https://arxiv.org/abs/2310.03214#google: “FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”
- https://arxiv.org/abs/2308.03958#deepmind: “Simple Synthetic Data Reduces Sycophancy in Large Language Models”, Jerry Wei, Da Huang, Yifeng Lu, Denny Zhou, Quoc V. Le
- https://arxiv.org/abs/2304.14318#google: “Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb
- https://arxiv.org/abs/2212.13138#google: “Med-PaLM: Large Language Models Encode Clinical Knowledge”
- https://arxiv.org/abs/2211.05102#google: “Efficiently Scaling Transformer Inference”
- https://arxiv.org/abs/2210.11610#google: “Large Language Models Can Self-Improve”, Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
- https://arxiv.org/abs/2210.11416#google: “FLAN: Scaling Instruction-Finetuned Language Models”
- https://arxiv.org/abs/2210.11399#google: “U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”
- https://arxiv.org/abs/2210.09261#google: “Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”
- https://arxiv.org/abs/2210.08726#google: “RARR: Attributed Text Generation via Post-hoc Research and Revision”
- https://arxiv.org/abs/2210.03057#google: “Language Models Are Multilingual Chain-of-Thought Reasoners”
- https://arxiv.org/abs/2208.01448#amazon: “AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”
- https://arxiv.org/abs/2207.05608#google: “Inner Monologue: Embodied Reasoning through Planning With Language Models”
- https://arxiv.org/abs/2205.10625#google: “Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”
- https://arxiv.org/abs/2205.05131#google: “Unifying Language Learning Paradigms”
- https://arxiv.org/abs/2204.02311#google: “PaLM: Scaling Language Modeling With Pathways”
- https://arxiv.org/abs/2204.01691#google: “Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”