Bibliography:

  1. ‘GPT-3’ tag

  2. Inside the OpenAI ChatGPT Launch—And Future

  3. Can LLMs be Scammed? A Baseline Measurement Study

  4. The Rise of AI-Generated Content in Wikipedia

  5. On scalable oversight with weak LLMs judging strong LLMs

  6. APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

  7. Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data

  8. Designing a Dashboard for Transparency and Control of Conversational AI

  9. Delving into ChatGPT usage in academic writing through excess vocabulary

  10. Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays

  11. LLMs achieve adult human performance on higher-order theory of mind tasks

  12. Can Language Models Explain Their Own Classification Behavior?

  13. The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

  14. FABLES: Evaluating faithfulness and content selection in book-length summarization

  15. Vulnerability Detection with Code Language Models: How Far Are We?

  16. The NSA Warns That US Adversaries Free to Mine Private Data May Have an AI Edge: Gilbert Herrera, who leads research at the National Security Agency, says large language models are incredibly useful—and a bit of a headache—for America’s intelligence machine

  17. Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews

  18. Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap

  19. Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs

  20. Who Is AI Replacing? The Impact of Generative AI on Online Freelancing Platforms

  21. ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

  22. Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models

  23. The Non-Effect of Sampling Temperature on Problem Solving in GPT-3.5/GPT-4

  24. I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench

  25. Does Using ChatGPT Result in Human Cognitive Augmentation?

  26. A Vision Check-up for Language Models

  27. Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach

  28. TinyGSM: achieving >80% on GSM8k with small language models

  29. Universal Self-Consistency for Large Language Model Generation

  30. PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers

  31. Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations

  32. InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews

  33. Data Contamination Through the Lens of Time

  34. Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams

  35. Large language models can replicate cross-cultural differences in personality

  36. Beyond Memorization: Violating Privacy Via Inference with Large Language Models

  37. GeoLLM: Extracting Geospatial Knowledge from Large Language Models

  38. Can a computer outfake a human [personality]?

  39. Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

  40. Using Large Language Models for Qualitative Analysis Can Introduce Serious Bias

  41. MTOB: A Benchmark for Learning to Translate a New Language from One Grammar Book

  42. Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve

  43. The Cambridge Law Corpus: A Corpus for Legal AI Research

  44. Assessing the nature of large language models: A caution against anthropocentrism

  45. A boy saw 17 doctors over 3 years for chronic pain. ChatGPT found the diagnosis

  46. Taken out of context: On measuring situational awareness in LLMs

  47. Investigating the Existence of ‘Secret Language’ in Language Models

  48. Are Large Language Models a Threat to Digital Public Goods? Evidence from Activity on Stack Overflow

  49. Machine-Assisted Social Psychology Hypothesis Generation

  50. Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events

  51. Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

  52. Explaining Competitive-Level Programming Solutions using LLMs

  53. Lost in the Middle: How Language Models Use Long Contexts

  54. Hoodwinked: Deception and Cooperation in a Text-Based Game for Language Models

  55. Language models are weak learners

  56. Understanding Social Reasoning in Language Models with Language Models

  57. Evaluating Superhuman Models with Consistency Checks

  58. Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks

  59. Can large language models democratize access to dual-use biotechnology?

  60. Iterative Translation Refinement with Large Language Models

  61. Don’t Want Students to Rely on ChatGPT? Have Them Use It: It’s easy to forget how little students and educators understand generative AI’s flaws. Once they actually try it out, they’ll see that it can’t replace them

  62. The Exciting Potential for ChatGPT in Obstetrics and Gynecology

  63. Do GPTs Produce Less Literal Translations?

  64. The False Promise of Imitating Proprietary LLMs

  65. Learning to Generate Novel Scientific Directions with Contextualized Literature-based Discovery

  66. How Language Model Hallucinations Can Snowball

  67. LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions

  68. Evaluating Transformer Language Models on Arithmetic Operations Using Number Decomposition

  69. Generative AI at Work

  70. Humans in Humans Out: On GPT Converging Toward Common Sense in both Success and Failure

  71. Language Models can Solve Computer Tasks

  72. Performance of ChatGPT on free-response, clinical reasoning exams

  73. How well do Large Language Models perform in Arithmetic tasks?

  74. Larger language models do in-context learning differently

  75. Is ChatGPT a General-Purpose Natural Language Processing Task Solver?

  76. Predicting Consumer Contracts [With GPT-3]

  77. Use GPT-3 incorrectly: reduce costs 40× and increase speed by 5×

  78. A Judge Just Used ChatGPT to Make a Court Decision: The case is the first time a court has admitted to using the AI text generator’s answers in a legal ruling

  79. Co-Writing with Opinionated Language Models Affects Users’ Views

  80. The inside story of ChatGPT: How OpenAI founder Sam Altman built the world’s hottest technology with billions from Microsoft

  81. How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection

  82. Can GPT-3 produce new ideas? Partially automating Robin Hanson and others § If you never miss a plane…

  83. How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment

  84. GPT-3 Takes the Bar Exam

  85. Precise Zero-Shot Dense Retrieval without Relevance Labels

  86. Self-Instruct: Aligning Language Models with Self-Generated Instructions

  87. Emergent Analogical Reasoning in Large Language Models

  88. Harvey, which uses AI to answer legal questions, lands cash from OpenAI

  89. LMentry: A Language Model Benchmark of Elementary Language Tasks

  90. Self-Ask: Measuring and Narrowing the Compositionality Gap in Language Models (Bamboogle)

  91. How persuasive is AI-generated argumentation? An analysis of the quality of an argumentative text produced by the GPT-3 AI text generator

  92. Out of One, Many: Using Language Models to Simulate Human Samples

  93. What does a platypus look like? Generating customized prompts for zero-shot image classification (CuPL)

  94. Using Large Language Models to Simulate Multiple Humans

  95. Limitations of Language Models in Arithmetic and Symbolic Induction

  96. RealTime QA: What’s the Answer Right Now?

  97. GODEL: Large-Scale Pre-Training for Goal-Directed Dialog

  98. Can GPT-3 write an academic paper on itself, with minimal human input?

  99. NaturalProver: Grounded Mathematical Proof Generation with Language Models

  100. OPT: Open Pre-trained Transformer Language Models

  101. InstructGPT: Training language models to follow instructions with human feedback

  102. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

  103. Impact of Pretraining Term Frequencies on Few-Shot Reasoning

  104. Contracts in the Age of Smart Readers

  105. Memory-assisted prompt editing to improve GPT-3 after deployment

  106. CommonsenseQA 2.0: Exposing the Limits of AI through Gamification

  107. Limits of Using Artificial Intelligence and GPT-3 in Patent Prosecution

  108. What Can a Generative Language Model Answer About a Passage?

  109. Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets

  110. Scaling Laws for Autoregressive Generative Modeling

  111. GPT-3: Its Nature, Scope, Limits, and Consequences

  112. MMLU: Measuring Massive Multitask Language Understanding

  113. GPT-3: Language Models are Few-Shot Learners

  114. Extrapolating to Unnatural Language Processing With GPT-3’s In-Context Learning: The Good, the Bad, and the Mysterious

  116. Janus

  117. Fine-Tuning Is Not Sufficient for Capability Elicitation

  119. Connecting the Dots: LLMs Can Infer & Verbalize Latent Structure from Training Data

  120. Reward Hacking Behavior Can Generalize across Tasks

  121. Who Models the Models That Model Models? An Exploration of GPT-3’s In-Context Model Fitting Ability

  122. GPT-3 Catching Fish in Morse Code

  123. A Robot Wrote This Entire Article. Are You Scared Yet, Human? We Asked GPT-3, OpenAI’s Powerful New Language Generator, to Write an Essay for Us from Scratch. The Assignment? To Convince Us Robots Come in Peace | For More about GPT-3 and How This Essay Was Written and Edited, Please Read Our Editor’s Note Below

  124. You’re Right, Spaces Make All the Difference! Copycat Is Toast! (Except for the Last One :-) (GPT-3 Output in Red).

  125. Playing #chess With GPT-3. Built Using Chess.js, Chessboard.js and @OpenAI’s GPT-3. White Is Me, Black Is GPT-3. GPT-3 Went for the Capture First and Did a Castling Move. Amazing!

  126. I Think ‘GPT-3 Can’t Do Parity Checking’ Isn’t Quite Right. It Can Clearly Pattern Match the Algorithm, Almost Perfectly. It’s Just a Little Mistake Prone. Here, I Invented a Syntax for Having It Evaluate Parity on Each Pair of Digits. It...almost Gets It Right.

  127. I Asked GPT-3 about Xinjiang and It Broke...The Pro-CCP Responses Seem to Have Worse English, like including ‘the’ in ‘the Stability Maintenance’. Unnecessary Articles Are a Tic of ESL Speakers. The Topic Seems to Prompt GPT to Draw from Either Western or Chinese State Media Sources, With the Politics That Come With It.

  128. GPT-3 calculating derivatives

  129. The Examples Are Indeed Extremely Simple on Purpose (otherwise It’s Hard to Communicate Efficiently What’s Happening to Non-Metamath Experts). That Being Said, We’re Still Pretty Far Away from IMOs; but This Is Definitely a Goal for Us, and One We’re Actively Working Towards!

  130. design#future-tag-features

  131. 2023-manvi-figure3-performanceofllmsandtabularmethodstopredictpopulationworldwide.png

  132. 2023-brynjolfsson-w31161-improvementincustomercomplaintresolutionperhourusinggpt3.jpg

  133. https://apnews.com/article/brazil-artificial-intelligence-porto-alegre-5afd1240afe7b6ac202bb0bbc45e08d4

  135. https://archive.is/BtuOG

  136. https://automated.beehiiv.com/p/aiimmunity-challenge-lessons-clinical-research-exam

  138. https://chat.openai.com/share/25124525-0bad-4c13-ae5a-ae4beac60360

  140. https://davidabell.substack.com/p/playing-around-with-machine-translation

  141. https://dropbox.tech/machine-learning/prompt-injection-with-control-characters-openai-chatgpt-llm

  142. https://github.com/desik1998/MathWithLLMs

  143. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2812620

  144. https://jxnl.github.io/instructor/blog/2023/11/05/chain-of-density/

  146. https://medium.com/@JarrettYe/casting-a-spell-on-chatgpt-let-it-write-anki-cards-for-you-a-prompt-engineering-case-fd7d577b9d94

  147. https://model-checking.github.io/kani-verifier-blog/2023/05/01/writing-code-with-chatgpt-improve-it-with-kani.html

  148. https://news.ycombinator.com/item?id=39557213

  149. https://openai.com/blog/function-calling-and-other-api-updates#function-calling

  150. https://osf.io/preprints/psyarxiv/dc6tz/

  151. https://restofworld.org/2023/ai-revolution-outsourced-workers/

  152. https://tytonpartners.com/app/uploads/2023/10/GenAI-IN-HIGHER-EDUCATION-FALL-2023-UPDATE-TIME-FOR-CLASS-STUDY.pdf#page=4

  154. https://www.ft.com/content/9aeb482d-f781-45c0-896f-38fdcc912139

  155. https://www.getlibretto.com/blog/does-it-matter-which-examples-you-choose-for-few-shot-prompting

  157. https://www.integrity-research.com/ai-fails-insider-trading-test/

  159. https://www.lesswrong.com/posts/3ou8DayvDXxufkjHD/openai-api-base-models-are-not-sycophantic-at-any-size

  160. https://www.lesswrong.com/posts/qbbaF79uJqvmWZELv/real-life-sort-by-controversial

  162. https://www.nytimes.com/2023/06/08/business/khan-ai-gpt-tutoring-bot.html

  163. https://www.nytimes.com/2023/12/13/technology/chatbot-cheating-schools-students.html

  164. https://www.pewresearch.org/short-reads/2024/03/26/americans-use-of-chatgpt-is-ticking-up-but-few-trust-its-election-information/

  166. https://www.pnas.org/doi/abs/10.1073/pnas.2405460121

  167. https://www.pnas.org/doi/full/10.1073/pnas.2317967121

  168. https://www.reddit.com/r/ChatGPT/comments/15et6f2/well_i_got_what_i_asked_for/

  169. https://www.reddit.com/r/OpenAI/comments/xlvygv/artifical_intelligence_allows_me_to_get_straight/

  171. https://www.reddit.com/r/TrueOffMyChest/comments/12zjiwq/my_wifes_company_has_started_replacing_positions/jhtkckq/

  173. https://www.supersimple.io/blog/gpt-4-fine-tuning-early-access

  174. https://www.theguardian.com/technology/commentisfree/2020/sep/11/artificial-intelligence-robot-writing-gpt-3

  175. https://www.vice.com/en/article/5d93p3/what-happens-when-you-ask-ai-to-control-your-life

  176. https://www.wired.com/story/china-chatgpt-opportunists-grifters-hard-at-work/

  177. https://x.com/DrJimFan/status/1733904000745955352

  178. https://x.com/Sirupsen/status/1673309920769323008

  179. https://x.com/ahr_like_air/status/1682885469632360448

  180. https://x.com/amasad/status/1704323196944527624

  181. https://x.com/ben_golub/status/1665030874272866305

  182. https://x.com/grantslatton/status/1703913578036904431

  183. https://x.com/joshgans/status/1656307700244815872

  184. https://x.com/kenshinsamurai9/status/1662510532585291779

  185. https://x.com/michaeltefula/status/1285505897108832257

  186. https://x.com/mplappert/status/1663892732652273664

  187. https://x.com/nabeelqu/status/1703967073150304728

  188. https://x.com/paulnovosad/status/1655925767333658626

  189. https://x.com/shinboson/status/1794570054165729303

  190. Can LLMs be Scammed? A Baseline Measurement Study

  191. https://arxiv.org/abs/2410.13893

  192. APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

  193. Caiming Xiong—Home Page

  194. https://arxiv.org/abs/2406.18518#salesforce

  195. Designing a Dashboard for Transparency and Control of Conversational AI

  196. https://arxiv.org/abs/2406.07882

  197. Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays

  198. https://www.sciencedirect.com/science/article/pii/S2666920X24000109

  199. LLMs achieve adult human performance on higher-order theory of mind tasks

  200. https://arxiv.org/abs/2405.18870#google

  201. Vulnerability Detection with Code Language Models: How Far Are We?

  202. https://arxiv.org/abs/2403.18624

  203. The NSA Warns That US Adversaries Free to Mine Private Data May Have an AI Edge: Gilbert Herrera, who leads research at the National Security Agency, says large language models are incredibly useful—and a bit of a headache—for America’s intelligence machine

  204. https://www.wired.com/story/fast-forward-nsa-warns-us-adversaries-private-data-ai-edge/

  205. Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap

  206. https://arxiv.org/abs/2402.19450

  207. Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs

  208. https://arxiv.org/abs/2402.14903

  209. Who Is AI Replacing? The Impact of Generative AI on Online Freelancing Platforms

  210. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4602944

  211. ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

  212. https://arxiv.org/abs/2402.11753

  213. Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams

  214. https://arxiv.org/abs/2310.08678

  215. GeoLLM: Extracting Geospatial Knowledge from Large Language Models

  216. Stefano Ermon

  217. https://arxiv.org/abs/2310.06213

  218. Can a computer outfake a human [personality]?

  219. /doc/psychology/personality/2023-phillips.pdf

  220. Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

  221. https://arxiv.org/abs/2310.04406

  222. The Cambridge Law Corpus: A Corpus for Legal AI Research

  223. https://arxiv.org/abs/2309.12269

  224. Taken out of context: On measuring situational awareness in LLMs

  225. Owain Evans, AI Alignment Researcher

  226. https://arxiv.org/abs/2309.00667

  227. Machine-Assisted Social Psychology Hypothesis Generation

  228. /doc/ai/nn/transformer/gpt/3/nonfiction/2024-banker.pdf

  229. Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events

  230. https://arxiv.org/abs/2307.06439#microsoft

  231. Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

  232. Furu Wei

  233. https://arxiv.org/abs/2307.05300#microsoft

  234. Hoodwinked: Deception and Cooperation in a Text-Based Game for Language Models

  235. https://arxiv.org/abs/2308.01404

  236. Understanding Social Reasoning in Language Models with Language Models

  237. https://arxiv.org/abs/2306.15448

  238. The False Promise of Imitating Proprietary LLMs

  239. Sergey Levine

  240. https://arxiv.org/abs/2305.15717

  241. How Language Model Hallucinations Can Snowball

  242. https://arxiv.org/abs/2305.13534

  243. Performance of ChatGPT on free-response, clinical reasoning exams

  244. https://www.medrxiv.org/content/10.1101/2023.03.24.23287731.full

  245. How well do Large Language Models perform in Arithmetic tasks?

  246. https://arxiv.org/abs/2304.02015#alibaba

  247. Larger language models do in-context learning differently

  248. Jason Wei

  249. Yi Tay

  250. https://arxiv.org/abs/2303.03846#google

  251. Is ChatGPT a General-Purpose Natural Language Processing Task Solver?

  252. https://arxiv.org/abs/2302.06476

  253. Predicting Consumer Contracts [With GPT-3]

  254. /doc/law/2022-kolt.pdf

  255. A Judge Just Used ChatGPT to Make a Court Decision: The case is the first time a court has admitted to using the AI text generator’s answers in a legal ruling

  256. https://www.vice.com/en/article/k7bdmv/judge-used-chatgpt-to-make-court-decision

  257. Co-Writing with Opinionated Language Models Affects Users’ Views

  258. https://arxiv.org/abs/2302.00560

  259. Can GPT-3 produce new ideas? Partially automating Robin Hanson and others § If you never miss a plane…

  260. https://nunosempere.com/blog/2023/01/11/can-gpt-produce-ideas/#if-you-never-miss-a-plane

  261. How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment

  262. https://mededu.jmir.org/2023/1/e45312/

  263. GPT-3 Takes the Bar Exam

  264. https://arxiv.org/abs/2212.14402

  265. Precise Zero-Shot Dense Retrieval without Relevance Labels

  266. https://arxiv.org/abs/2212.10496

  267. Self-Instruct: Aligning Language Models with Self-Generated Instructions

  268. Yizhong Wang—University of Washington

  269. Hannaneh Hajishirzi—University of Washington

  270. https://arxiv.org/abs/2212.10560

  271. Harvey, which uses AI to answer legal questions, lands cash from OpenAI

  272. https://techcrunch.com/2022/11/23/harvey-which-uses-ai-to-answer-legal-questions-lands-cash-from-openai/

  273. Self-Ask: Measuring and Narrowing the Compositionality Gap in Language Models (Bamboogle)

  274. Noah A. Smith

  275. Mike Lewis

  276. https://arxiv.org/abs/2210.03350#allen

  277. How persuasive is AI-generated argumentation? An analysis of the quality of an argumentative text produced by the GPT-3 AI text generator

  278. https://content.iospress.com/articles/argument-and-computation/aac210026

  279. What does a platypus look like? Generating customized prompts for zero-shot image classification (CuPL)

  280. https://arxiv.org/abs/2209.03320

  281. Can GPT-3 write an academic paper on itself, with minimal human input?

  282. /doc/ai/nn/transformer/gpt/3/nonfiction/2022-gpt3.pdf#page=2

  283. NaturalProver: Grounded Mathematical Proof Generation with Language Models

  284. Hannaneh Hajishirzi—University of Washington

  285. https://arxiv.org/abs/2205.12910#allen

  286. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

  287. Mike Lewis

  288. Hannaneh Hajishirzi—University of Washington

  289. Luke Zettlemoyer

  290. https://arxiv.org/abs/2202.12837#facebook

  291. CommonsenseQA 2.0: Exposing the Limits of AI through Gamification

  292. https://arxiv.org/abs/2201.05320#allen

  293. Limits of Using Artificial Intelligence and GPT-3 in Patent Prosecution

  294. /doc/law/2022-tu.pdf

  295. What Can a Generative Language Model Answer About a Passage?

  296. https://aclanthology.org/2021.mrqa-1.7.pdf

  297. Scaling Laws for Autoregressive Generative Modeling

  298. Jared Kaplan

  299. Speaker Details: EmTech MIT 2023

  300. Alec Radford

  301. Aditya A. Ramesh

  302. John Schulman’s Homepage

  303. Sam McCandlish

  304. https://arxiv.org/abs/2010.14701#openai

  305. MMLU: Measuring Massive Multitask Language Understanding

  306. https://people.eecs.berkeley.edu/~hendrycks/

  307. Steven's Web Thoughts

  308. Andy Zou

  309. Mantas Mazeika

  310. Jacob Steinhardt

  311. https://arxiv.org/abs/2009.03300