Bibliography:

  1. ‘GPT’ tag

  2. latex2unicode.py

  3. CQK Is The First Unused TLA

  4. They all use it

  5. Business Spending on AI Surged 500% This Year to $13.8 Billion

  6. Alphabet Q3 Earnings Call: CEO Sundar Pichai’s Remarks

  7. Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks

  8. MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

  9. Project Zero: From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code

  10. Evaluation of OpenAI o1: Opportunities and Challenges of AGI

  11. Language Models Learn to Mislead Humans via RLHF

  12. Using ChatGPT to Reverse Engineer Minified JavaScript

  13. 89d194a9bd2f95cb5e035f371810f20842f2f652.html

  14. SWE-Bench Technical Report: 22%

  15. AI-powered coding pulls in almost $1bn of funding to claim ‘killer app’ status

  16. Prompt Injection in ‘Resolve Vulnerabilty’ Results in Arbitrary Command Execution in Victim’s Pipeline

  17. To Code, or Not To Code? Exploring Impact of Code in Pre-training

  18. Replacing My Right Hand With AI

  19. 076e50f5dc692923bc072d387bd8f3911e9cad53.html

  20. APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

  21. Diffusion On Syntax Trees For Program Synthesis

  22. A Peter Thiel-Backed AI Startup, Cognition Labs, Seeks $2 Billion Valuation: Funding round could increase startup’s valuation nearly sixfold in a matter of weeks, reflecting AI frenzy

  23. Vulnerability Detection with Code Language Models: How Far Are We?

  24. Gold-Medalist Coders Build an AI That Can Do Their Job for Them: A new startup called Cognition AI can turn a user’s prompt into a website or video game

  25. TestGen-LLM: Automated Unit Test Improvement using Large Language Models at Meta

  26. The Impact of AI Tool on Engineering at ANZ Bank: An Empirical Study on GitHub Copilot Within a Corporate Environment

  27. CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay

  28. Coding on Copilot: 2023 Data Shows Downward Pressure on Code Quality, Plus Projections for 2024

  29. Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

  30. Leveraging Large Language Models to Boost Dafny’s Developers Productivity

  31. WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation

  32. StarVector: Generating Scalable Vector Graphics Code from Images

  33. Universal Self-Consistency for Large Language Model Generation

  34. LLM-Assisted Code Cleaning For Training Accurate Code Generators

  35. A Coder Considers the Waning Days of the Craft: Coding has always felt to me like an endlessly deep and rich domain. Now I find myself wanting to write a eulogy for it

  36. ChipNeMo: Domain-Adapted LLMs for Chip Design

  37. CodeFusion: A Pre-trained Diffusion Model for Code Generation

  38. Eureka: Human-Level Reward Design via Coding Large Language Models

  39. Data Contamination Through the Lens of Time

  40. SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

  41. Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

  42. PassUntil: Predicting Emergent Abilities with Infinite Resolution Evaluation

  43. Security Weaknesses of Copilot Generated Code in GitHub

  44. Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

  45. Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems

  46. Insights into Stack Overflow’s traffic: We’re setting the record straight

  47. Are Large Language Models a Threat to Digital Public Goods? Evidence from Activity on Stack Overflow

  48. Explaining Competitive-Level Programming Solutions using LLMs

  49. InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback

  50. AI Is a Lot of Work: As the technology becomes ubiquitous, a vast tasker underclass is emerging—and not going anywhere

  51. When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming (CDHF)

  52. CodeCompose: A Large-Scale Industrial Deployment of AI-assisted Code Authoring

  53. Chatting with GPT-3 for Zero-Shot Human-Like Mobile Automated GUI Testing

  54. Large Language Model Programs

  55. StarCoder: may the source be with you!

  56. Decomposition Enhances Reasoning via Self-Evaluation Guided Decoding

  57. LLM+P: Empowering Large Language Models with Optimal Planning Proficiency

  58. Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes

  59. How Secure is Code Generated by ChatGPT?

  60. Today was the first day that I could definitively say that GPT-4 has saved me a substantial amount of tedious work

  61. Language Models can Solve Computer Tasks

  62. Introducing Microsoft 365 Copilot—your copilot for work

  63. Reflexion: Language Agents with Verbal Reinforcement Learning

  64. Large Language Models and Simple, Stupid Bugs

  65. Larger language models do in-context learning differently

  66. ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics

  67. CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code

  68. Faithful Chain-of-Thought Reasoning

  69. Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning

  70. Google is asking employees to test potential ChatGPT competitors, including a chatbot called ‘Apprentice Bard’

  71. An Analysis of the Automatic Bug Fixing Performance of ChatGPT

  72. Connor Leahy on Aliens, Ethics, Economics, Memetics, and Education § GPT-4

  73. General availability of Azure OpenAI Service expands access to large, advanced AI models with added enterprise benefits

  74. SantaCoder: don’t reach for the stars!

  75. TrojanPuzzle: Covertly Poisoning Code-Suggestion Models

  76. ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages

  77. The Stack: 3 TB of permissively licensed source code

  78. PAL: Program-aided Language Models

  79. Do Users Write More Insecure Code with AI Assistants?

  80. Broken Neural Scaling Laws

  81. Programming Possibility: Kevin Scott on AI’s Impact on Cognitive Work

  82. Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them

  83. Vote-K: Selective Annotation Makes Language Models Better Few-Shot Learners

  84. Repair Is Nearly Generation: Multilingual Program Repair with LLMs

  85. Limitations of Language Models in Arithmetic and Symbolic Induction

  86. Language Models Can Teach Themselves to Program Better

  87. PanGu-Coder: Program Synthesis with Function-Level Language Modeling

  88. CodeT: Code Generation with Generated Tests

  89. Can large language models reason about medical questions?

  90. Craft an Iron Sword: Dynamically Generating Interactive Game Characters by Prompting Large Language Models Tuned on Code

  91. Code Translation with Compiler Representations

  92. Repository-Level Prompt Generation for Large Language Models of Code

  93. Learning to Model Editing Processes

  94. Productivity Assessment of Neural Code Completion

  95. End-to-end symbolic regression with transformers

  96. InCoder: A Generative Model for Code Infilling and Synthesis

  97. PaLM: Scaling Language Modeling with Pathways

  98. A Conversational Paradigm for Program Synthesis

  99. Evaluating the Text-to-SQL Capabilities of Large Language Models

  100. Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models

  101. PolyCoder: A Systematic Evaluation of Large Language Models of Code

  102. Pop Quiz! Can a Large Language Model Help With Reverse Engineering?

  103. Text and Code Embeddings by Contrastive Pre-Training

  104. Neural Language Models are Effective Plagiarists

  105. Deep Symbolic Regression for Recurrent Sequences

  106. Discovering the Syntax and Strategies of Natural Language Programming with Generative Language Models

  107. A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More

  108. Few-Shot Semantic Parsing with Language Models Trained On Code

  109. WebGPT: Browser-assisted question-answering with human feedback

  110. WebGPT: Improving the factual accuracy of language models through web browsing

  111. Scaling Language Models: Methods, Analysis & Insights from Training Gopher

  112. Jigsaw: Large Language Models meet Program Synthesis

  113. Can Pre-trained Language Models be Used to Resolve Textual and Semantic Merge Conflicts?

  114. Solving Linear Algebra by Program Synthesis

  115. Solving Probability and Statistics Problems by Program Synthesis

  116. Automatic Program Repair with OpenAI’s Codex: Evaluating QuixBugs

  117. GenLine and GenForm: Two Tools for Interacting with Generative Language Models in a Code Editor

  118. An Empirical Cybersecurity Evaluation of GitHub Copilot’s Code Contributions

  119. Learning C to x86 Translation: An Experiment in Neural Compilation

  120. Program Synthesis with Large Language Models

  121. TAPEX: Table Pre-training via Learning a Neural SQL Executor

  122. Evaluating Large Language Models Trained on Code

  123. Research recitation: A first look at rote learning in GitHub Copilot suggestions

  124. Microsoft and OpenAI have a new AI tool that will give coding suggestions to software developers

  125. SymbolicGPT: A Generative Transformer Model for Symbolic Regression

  126. Measuring Coding Challenge Competence With APPS

  127. Improving Code Autocompletion with Transfer Learning

  128. LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning

  129. Learning Autocompletion from Real-World Datasets

  130. GraphCodeBERT: Pre-training Code Representations with Data Flow

  131. CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair

  132. TransCoder: Unsupervised Translation of Programming Languages

  133. GPT-3 random sample dump: JavaScript tutorial

  134. IJON: Exploring Deep State Spaces via Fuzzing

  135. IntelliCode Compose: Code Generation Using Transformer

  136. Deep Learning for Symbolic Mathematics

  137. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search

  138. BERTScore: Evaluating Text Generation with BERT

  139. Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

  140. Learning to superoptimize programs

  141. DeepCoder: Learning to Write Programs

  142. Neural Programmer-Interpreters

  143. Computers Doing The Right Thing

  144. OpenAI API Alchemy: Smart Formatting and Code Creation

  145. Building Games and Apps Entirely through Natural Language Using OpenAI’s Code-Davinci Model

  146. Replit

  147. c111945e461baafb3b10187dc65b4ff4256530c4.html

  148. Working With AI (Part 2): Code Conversion

  149. fd94ca950274977e4321f54a45033143e8b87efc.html

  150. An Amazing Journey With Claude 3.5 and ChatGPT-4o Who Helped Me Backwards Engineer an Econometrics Theory Paper and Taught Me a Lot More in the Process

  151. StenographyDev/autopilot-Vsc

  152. Copilot Stops Working on `gender` Related Subjects · Community · Discussion #72603

  153. 240b757ca122975adc355feffb57df79223bfa90.html

  154. Revolutionize Your Project Documentation With the Codex-README Generator, Utilizing OpenAI’s Codex for Intelligent README Creation

  155. LLM Powered Autonomous Agents

  156. The RetroInstruct Guide To Synthetic Text Data

  157. 496263b26aa6d7d6f161844fdec698493f2b0773.html

  158. Fun and Dystopia With AI-Based Code Generation Using GPT-J-6B

  159. 3eaae6137c3ff81ee1ce8b508282148db4e60799.html

  160. There’s a Running Theme in Here of Programming Problems LLMs Solve Where It’s...

  161. 85525c9bb48f9c95680601ccae4284f2c576e93b.html

  162. How Anthropic Built Artifacts

  163. e20cc27ccea0d8ec5d4e7a9a71b5d3e325d41754.html

  164. How I Use ‘AI’

  165. Using GPT-3 to Explain How Code Works

  166. 3a6b69f320048c5f35bf8a02af29ed8831a0ace6.html

  167. Adept Video Demo!

  168. Transformer-VAE for Program Synthesis

  169. Writer

  170. 2ac5f030b3a38c813fbeb999b44a83b650ff3f66.html

  171. Introducing ‘Computer Use’, a New Claude 3.5 Sonnet, and Claude 3.5 Haiku

  172. Claude 3.5 Sonnet on GitHub Copilot

  173. Developing a Computer Use Model

  174. Websim, Worldsim, and The Summer of Simulative AI

  175. I Found >800 Orthogonal ‘Write Code’ Steering Vectors

  176. 441e2c82f2dbe90699728ce7f7fefd27ae4f2a0e.html

  177. Who Models the Models That Model Models? An Exploration of GPT-3’s In-Context Model Fitting Ability

  178. OpenAI Codex: First Impressions

  179. A.I. Can Now Write Its Own Computer Code. That’s Good News for Humans.

  180. Balloons! The Balloon Clicker Game

  181. a99f236d1b9ce4d4f5da9a22ac2ac991c8be1f99.html

  182. Tabnine AI Code Assistant

  183. OpenAI Can Translate English into Code With Its New Machine Learning Software Codex

  184. FROM PLAIN TO EXPLAINED IN FIVE MINUTES: Getting Started With Stenography Autopilot

  185. OpenAI Codex Live Demo

  186. Is Finetuning GPT-4o worth It?

  187. Creating a Space Game With OpenAI Codex

  188. I Built a Todo List App Simply by Describing It to GPT-3. It Generated the React Code for a Fully Functioning App within Seconds. I’m Becoming More Impressed and Aware of Its Capabilities Every Single Day.

  189. I Gave GPT-3 Access to Chrome With the Objective ‘Please Buy Me AirPods’...It Successfully Made It to the Product Page, but Got Sidetracked With Walmart’s Privacy Policy. Since Even a Simplified DOM Is Far Too Large for a Single Prompt, Multiple Prompts Are given Different Chunks of the DOM, Each Generating Their Own ‘Interaction’. Another Prompt Then Takes All the Proposed Interactions and Selects the Best One, Sort of like a Tournament Bracket. For More Complex Web Pages, the Time It Takes to Generate an Action Scales at 𝒪(log n) With the Size of the DOM—Really Fast! It Also Gets around Token Limits, so You Could Technically Process an Infinitely Large DOM!

  190. The Examples Are Indeed Extremely Simple on Purpose (otherwise It’s Hard to Communicate Efficiently What’s Happening to Non-Metamath Experts). That Being Said, We’re Still Pretty Far Away from IMOs; but This Is Definitely a Goal for Us, and One We’re Actively Working Towards!

  191. XBOW Now Matches the Capabilities of a Top Human Pentester

  192. 6488c9703734a04ed02d9d7e6094a6df83b55484.html

  193. design#future-tag-features

  194. 2024-03-07-inflection-inflection25benchmarks.svg

  195. 2024-harding-figure1-codechurnincreasefrom2020to2023.png

  196. 2024-harding-figure2-gitcodemodificationsbytypeovertime.jpg

  197. 2023-08-07-gwernnet-gpt4-scrollmarker.jpg

  198. 2023-01-16-microsoft-timelineofairesearchandproducts.png

  199. 2022-neelakantan-figure1-gpt3textcodeembeddingscalingbymodelsize.png

  200. 2022-ziegler-figure5-githubcopilotcodecompletionsuggestionacceptanceratebyprogramminglanguage.jpg

  201. 2021-austin-figure3-lamdaprogrammingperformancevsmodelscaling.png

  202. 2021-austin-figure4-fractionofsamplessolvingeachtaskbylamdamodelscaling.png

  203. 2021-nakano-figure1-gpt3textbrowserenvironmentobservations.png

  204. 2021-nakano-figure2-humanevaluationsofscalinggpt3questionanswering.png

  205. 2021-nakano-figure3-truthfulqaresultsbyscaling.png

  206. 2021-nakano-figure5-humanpreferencebynumberofrandomsamplesgeneratedforpreferenceranking.png

  207. 2021-nakano-figure6-behaviorcloningscalingbydemonstrationsandparametercount.jpg

  208. 2021-nakano-figure7-bestfnscalingbyflopsandanswerssampled.jpg

  209. 2021-nakano-figure7-rewardmodelscalingbycomparisonsandparametercount.jpg

  210. 2021-rae-figure3-gopherscalingcurvesforfeverfactcheckinginusingevidenceforreasoning.png

  211. 2021-rae-figure4-gopherscalingacrossfamiliesoftasksupto280bparameters.jpg

  212. 2021-rae-figurea17-gopherfewshotcapabilityemergesontruthfulqaby280bparameters.jpg

  213. 2021-zhang-figure8-gpt3vsgptjjavascriptmergeaccuracybynumberofattempts.png

  214. http://antirez.com/news/140

  215. 2912b1a76d585c99519f6cb25132b279f92f0a49.html

  216. http://bit-player.org/2023/ai-and-the-end-of-programming

  217. https://about.sourcegraph.com/blog/cheating-is-all-you-need

  218. https://aider.chat/2024/03/08/claude-3.html

  219. 63b824f385c9d8d24d92b19d7fdc0f95c706e74a.html

  220. https://aider.chat/docs/unified-diffs.html

  221. 1a1f47a0d0731a8033299b28fe81d7459a02f065.html

  222. https://amistrongeryet.substack.com/p/can-ai-do-my-job

  223. https://andrewmayne.com/2023/03/23/chatgpt-code-interpreter-magic/

  224. https://archive.is/BtuOG

  225. https://blog.darklang.com/gpt/

  226. d52235736faea17c5807c3e7dd5c232a1005c5ec.html

  227. https://blog.eleuther.ai/pile-t5/

  228. https://blog.humphd.org/cheatgpt/

  229. 5fed574e814e05393d05b084220934991d0f95af.html

  230. https://blog.mentat.ai/benchmarking-gpt-4-turbo-a-cautionary-tale

  231. https://bloop.ai/blog/evaluating-llms-on-cobol

  232. 021c6dbebb60998705fba7c08886e69644a565b0.html

  233. https://borretti.me/article/astronomical-calculations-for-hard-sf-common-lisp

  234. https://builtin.com/job/customer-success/expert-ai-teacher-contract/1267315

  235. https://dmicz.github.io/machine-learning/openai-changes/

  236. 7d11e59b234f27e96a4808d0b04365daf45263ad.html

  237. https://docs.flux.ai/tutorials/ai-for-hardware-design

  238. https://docs.parea.ai/blog/benchmarking-anthropic-beta-tool-use

  239. cc464c6b2b114fa90055f7723d7955b1d82cd352.html

  240. https://docs.sweep.dev/blogs/sweeps-core-algo

  241. https://finedataproducts.com/posts/2024-03-10-tax-scenarios-with-ai/

  242. https://gist.github.com/harryaskham/68a611bef777525991790bca2f2d324d

  243. https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/

  244. ea6fc226945c891a4d039fc6b80f1264364c3218.html

  245. https://github.blog/2023-02-14-github-copilot-for-business-is-now-available/

  246. 81ad701c3ed8ffa3ee774aa1a5e5823d43f2354f.html

  247. https://github.blog/2023-11-08-universe-2023-copilot-transforms-github-into-the-ai-powered-developer-platform/

  248. https://github.com/E-xyza/Exonerate/blob/master/bench/reports/gpt-bench.md

  249. https://github.com/JusticeRage/Gepetto

  250. https://github.com/RootbeerComputer/backend-GPT

  251. https://github.com/TaxyAI/browser-extension

  252. https://github.com/aiwebb/treenav-bench#interesting-findings

  253. https://github.com/features/copilot/

  254. https://github.com/ggerganov/llama.cpp/pull/1773

  255. https://github.com/greshake/Alice

  256. https://github.com/jart/emacs-copilot

  257. https://github.com/jujumilk3/leaked-system-prompts/blob/main/github-copilot-chat_20230513.md

  258. https://github.com/tldraw/make-real

  259. https://github.com/xenodium/chatgpt-shell/

  260. d5a2ac20a8da4c74dc7b9be0d3503dda9eb760f2.html

  261. https://github.com/ywkim/gpt-commit

  262. https://githubcopilotlitigation.com/

  263. a919798a207db73d92d78e88ef01a41cfeb63f4d.html

  264. https://githubnext.com/projects/ai-for-pull-requests/

  265. https://gpt3demo.com/category/code-generation

  266. 3fac822d700f1dccd623074f9943f2ff95b0eeb5.html

  267. https://huggingface.co/spaces/mullikine/ilambda

  268. https://huyenchip.com/2023/04/11/llm-engineering.html

  269. f049d498dce2e2d93b1c35bcb99d0c2b4f7f1327.html

  270. https://interconnected.org/home/2023/02/07/braggoscope

  271. 0700f3831fd6f891ea08503f104867dc44129f80.html

  272. https://jacobbrazeal.wordpress.com/2022/09/23/gpt-3-can-find-paths-up-to-7-nodes-long-in-random-graphs/

  273. 96a9d314150997e2202d4fae7f91f886cbf8e829.html

  274. https://joel.tools/codegen/

  275. d6082054b87ea939d2d1fb415c101b8b26633f54.html

  276. https://kenkantzer.com/lessons-after-a-half-billion-gpt-tokens/

  277. 6b83bca645ec463296a7f20d72f6bacf6136361e.html

  278. https://koenvangilst.nl/blog/keeping-code-complexity-in-check

  279. https://lemire.me/blog/2023/03/22/can-gpt-pass-my-programming-courses/

  280. 72489a62790e55c446f0dfddd69a822773490f91.html

  281. https://martinfowler.com/articles/2023-chatgpt-xu-hao.html

  282. 7596d0d6c396b18f6ccdb0e8d4aa5ea96915b59f.html

  283. https://mathstodon.xyz/@tao/111158219956220256

  284. https://mathstodon.xyz/@tao/111439273687647142

  285. ead860871749403a1343a71fbb2a23367e15f4d4.html

  286. https://mazzzystar.github.io/2023/05/10/LLM-for-individual/

  287. 1d6aaab3a845ac7ad77dc6669cfa47c4c44f7892.html

  288. https://medium.com/geekculture/i-found-a-loophole-to-successfully-web-scrape-using-chatgpt-heres-how-it-works-135f6c077d4d

  289. https://medium.com/tenable-techblog/g-3po-a-protocol-droid-for-ghidra-4b46fa72f1ff

  290. https://micahflee.com/2023/04/capturing-the-flag-with-gpt-4/

  291. https://minimaxir.com/2023/12/chatgpt-structured-data/

  292. 199ac619f4d3ae61796bc52828c1435d6eddaf0b.html

  293. https://model-checking.github.io/kani-verifier-blog/2023/05/01/writing-code-with-chatgpt-improve-it-with-kani.html

  294. https://news.ycombinator.com/item?id=33847479

  295. 49d94a797c3196a26b375db2933950390e02da48.html

  296. https://news.ycombinator.com/item?id=34865460

  297. b5303c50b9ba3e5738789a719a31a39e9e22c8f0.html

  298. https://news.ycombinator.com/item?id=35199646

  299. b4970a6857f66d79d0630fbc8745e4a5dc5a8df4.html

  300. https://news.ycombinator.com/item?id=35236275

  301. 18c4321d1e7e2b7014fa88b8078725b217a347a9.html

  302. https://news.ycombinator.com/item?id=35604715

  303. https://news.ycombinator.com/item?id=36606573

  304. 9b7b97825c13deaa0bb86864f537ca916200de50.html

  305. https://news.ycombinator.com/item?id=37834750

  306. 8951b22fc89595774385dfd3bfb8f9ef975ae93a.html

  307. https://nickarner.com/notes/llm-powered-assistants-for-complex-interfaces-february-26-2023/

  308. 1349c996e6b9d5fba2c20e0022cbf6598def1240.html

  309. https://old.reddit.com/r/singularity/comments/1atjz9v/ive_put_a_complex_codebase_into_a_single/

  310. 0bda7850f9c1f8b76ce61ec3d844c4aec7bb59bf.html

  311. https://openai.com/blog/chatgpt-plugins

  312. https://openai.com/blog/function-calling-and-other-api-updates#function-calling

  313. https://openai.com/blog/introducing-text-and-code-embeddings/

  314. https://openai.com/blog/openai-codex/

  315. https://openai.com/index/introducing-structured-outputs-in-the-api/#_5PYjnV1iAHOPKPupDztdZk

  316. https://openai.com/index/mle-bench/

  317. https://paperswithcode.com/sota/math-word-problem-solving-on-math

  318. https://platform.openai.com/docs/guides/embeddings/code-search-using-embeddings

  319. 8ad29fb0b395b3556ce9e34cbf321ba260e4dc48.html

  320. https://platform.openai.com/docs/guides/embeddings/use-cases

  321. a3b4bf458001b18ebe1ff66b052631833625ece4.html

  322. https://research.checkpoint.com/2023/opwnai-cybercriminals-starting-to-use-chatgpt/

  323. 50121d9fabf664329d72ca9579c7ed6a6f577535.html

  324. https://research.google/blog/safely-repairing-broken-builds-with-ml/

  325. 64ce129b51932c8310d6fb4e72fbe24ababafc67.html

  326. https://scale.com/blog/chatgpt-vs-claude

  327. https://scale.com/leaderboard/coding

  328. https://security.googleblog.com/2023/08/ai-powered-fuzzing-breaking-bug-hunting.html

  329. c7461a9fdee68f7da7e59c4daa450b0a305a2bfc.html

  330. https://simonwillison.net/2022/Dec/5/rust-chatgpt-copilot/

  331. cb75a13c5edc1c6c6f8b65df3001be90dc649600.html

  332. https://simonwillison.net/2023/Sep/30/cli-tools-python/

  333. 5912783cea3dda73053772b5446f93db7737a130.html

  334. https://smitop.com/post/codex/

  335. fce7d778943fb434f196cda5af95dae50694c6c7.html

  336. https://stability.ai/blog/stablecode-llm-generative-ai-coding

  337. https://stackoverflow.co/company/press/archive/openai-partnership/

  338. 5b5bb6282f9dd81a0cb38eed5327edac57b0a663.html

  339. https://statmodeling.stat.columbia.edu/2023/04/18/chatgpt4-writes-stan-code-so-i-dont-have-to/

  340. https://statmodeling.stat.columbia.edu/2023/08/20/bob-carpenter-thinks-gpt-4-is-awesome/

  341. https://tagide.com/education/writing-a-tokenizer-with-chatgpt/

  342. 375ba455383a5677927cb1eabc5ffbc345544a09.html

  343. https://towardsdatascience.com/can-chatgpt-write-better-sql-than-a-data-analyst-f079518efab2

  344. d538e23920c06b3942d2431617e608532003e1e5.html

  345. https://towardsdatascience.com/codex-by-openai-in-action-83529c0076cc

  346. 1a92b20c4f7dbc876e40bd8a481d1a4a36559870.html

  347. https://tyleransom.substack.com/p/using-llms-to-fuzzy-merge

  348. https://verse.systems/blog/post/2024-03-09-using-llms-to-generate-fuzz-generators/

  349. https://vulcan.io/blog/ai-hallucinations-package-risk

  350. 36af62ddc68a108a558a5afbee2a53774fc9b09d.html

  351. https://web.archive.org/web/20221112033036/https://mullikine.github.io/posts/nlsh-natural-language-shell/

  352. 8dadb11bcb321b0be838428db54f645f812eb633.html

  353. https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/

  354. https://www.autoregex.xyz/

  355. https://www.benkuhn.net/autocomplete/

  356. https://www.chargebackstop.com/blog/card-networks-exploitation

  357. https://www.engraved.blog/building-a-virtual-machine-inside/

  358. https://www.geoffreylitt.com/2023/03/25/llm-end-user-programming

  359. https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm

  360. https://www.kite.com/blog/product/kite-launches-ai-powered-javascript-completions/

  361. https://www.lasso.security/blog/ai-package-hallucinations

  362. https://www.lesswrong.com/posts/KSroBnxCHodGmPPJ8/jailbreaking-gpt-4-s-code-interpreter

  363. https://www.lesswrong.com/posts/u3SueTC44tgKFMMNs/is-the-chatgpt-simulated-linux-virtual-machine-real?commentId=iCAiCah33bBNJqNQE

  364. https://www.lesswrong.com/posts/u6KXXmKFbXfWzoAXn/a-circuit-for-python-docstrings-in-a-4-layer-attention-only

  365. https://www.lesswrong.com/posts/ukTLGe5CQq9w8FMne/inducing-unprompted-misalignment-in-llms

  366. https://www.lesswrong.com/posts/ux93sLHcqmBfsRTvg/gpt-can-write-quines-now-gpt-4

  367. https://www.oneusefulthing.org/p/it-is-starting-to-get-strange

  368. https://www.oneusefulthing.org/p/one-sentence

  369. f07ba7e4aebd02d54bd534f26b0fa8148485513e.html

  370. https://www.oreilly.com/radar/what-we-learned-from-a-year-of-building-with-llms-part-i/

  371. https://www.patterns.app/blog/2023/01/18/crunchbot-sql-analyst-gpt/

  372. d7aaf7b7491492af22c98dae1079fbfa93961b5b.html

  373. https://www.quantamagazine.org/a-team-of-math-proves-a-critical-link-between-addition-and-sets-20231206/

  374. b701bdff7a875e05a46358ee7df347ee6db581e0.html

  375. https://www.reddit.com/r/ChatGPT/comments/12a0ajb/i_gave_gpt4_persistent_memory_and_the_ability_to/

  376. https://www.reddit.com/r/GPT3/comments/106t5gv/compressing_prompt_text_with_lossless_compression/

  377. 34619f69076df04923abeced72d75c069c9f7e26.html

  378. https://www.reddit.com/r/MachineLearning/comments/106q6m9/p_i_built_adrenaline_a_debugger_that_fixes_errors/

  379. 51a80239d3d095f0da9370b1d975c1ddcece61d4.html

  380. https://www.reddit.com/r/OpenAI/comments/1bm305k/what_the_hell_claud_3_opus_is_a_straight/

  381. c88272dae240233080f1bf85f995bb5ed1a64ad7.html

  382. https://www.reddit.com/r/singularity/comments/1atjz9v/ive_put_a_complex_codebase_into_a_single/

  383. https://www.samdickie.me/writing/experiment-1-creating-a-landing-page-using-ai-tools-no-code

  384. https://www.sigarch.org/coping-with-copilot/

  385. e01e0d0a4cccf5a3ec4b10f53b5a5687a8fbeaa2.html

  386. https://www.zdnet.com/article/microsoft-has-over-a-million-paying-github-copilot-users-ceo-nadella/

  387. https://x.com/AdeptAILabs/status/1590396065072951296

  388. https://x.com/Afinetheorem/status/1634516697515261953

  389. https://x.com/AlexKontorovich/status/1678772963183820801

  390. https://x.com/AlexKontorovich/status/1678772964836397056

  391. https://x.com/AlexTamkin/status/1567956315208830976

  392. https://x.com/ArtirKel/status/1588245580160983040

  393. https://x.com/ArtirKel/status/1588246269385838594

  394. https://x.com/BHolmesDev/status/1587788026637336576

  395. https://x.com/BlackHC/status/1567810869211316224

  396. https://x.com/CFGeek/status/1768024040487453169

  397. https://x.com/ChatGPTapp/status/1732979491071549792

  398. https://x.com/DaveMonlander/status/1612802240582135809

  399. https://x.com/ESYudkowsky/status/1718654143110512741

  400. https://x.com/GabriellaG439/status/1561007332267421696

  401. https://x.com/GrantSlatton/status/1677895737735286785

  402. https://x.com/GrantSlatton/status/1677895739958267905

  403. https://x.com/ItalyPaleAle/status/1409890404615409671

  404. https://x.com/LericDax/status/1635804659448152067

  405. https://x.com/LericDax/status/1635871504138133504

  406. https://x.com/MikePFrank/status/1588212826811772928

  407. https://x.com/MikePFrank/status/1622202768743096320

  408. https://x.com/MikePFrank/status/1622495004810784768

  409. https://x.com/Naman_Bhalla/status/1637578019811340292

  410. https://x.com/NickADobos/status/1634672282005295104

  411. https://x.com/PerksPlus0001/status/1631372820709253120

  412. https://x.com/Sirupsen/status/1673309920769323008

  413. https://x.com/Suhail/status/1635706222514167808

  414. https://x.com/SullyOmarr/status/1769107969872953634

  415. https://x.com/ThePrimeagen/status/1628047727866126336

  416. https://x.com/ThomasMiconi/status/1569408502447374336

  417. https://x.com/VictorTaelin/status/1768070973515800931

  418. https://x.com/VictorTaelin/status/1804665522241294582

  419. https://x.com/VivaLaPanda_/status/1677828821964439553

  420. https://x.com/YaBoyFathoM/status/1647608734175186944

  421. https://x.com/ZachWeiner/status/1694685022236610900

  422. https://x.com/__anjor/status/1830972847759729124

  423. https://x.com/abacaj/status/1736819789841281372

  424. https://x.com/ahr_like_air/status/1682885469632360448

  425. https://x.com/aicrumb/status/1712883451437646027

  426. https://x.com/alexalbert__/status/1636488551817965568

  427. https://x.com/amanrsanger/status/1631029716550549504

  428. https://x.com/amasad/status/1510330409908772867

  429. https://x.com/amasad/status/1587702550349811712

  430. https://x.com/amasad/status/1628546489843863555

  431. https://x.com/amasad/status/1704323196944527624

  432. https://x.com/andrewwhite01/status/1616933106786738176

  433. https://x.com/atlantis__labs/status/1677782219937525760

  434. https://x.com/backus/status/1652433895793516544

  435. https://x.com/ben_golub/status/1665030874272866305

  436. https://x.com/cHHillee/status/1732868066558792189

  437. https://x.com/ccanonne_/status/1639848150495301633

  438. https://x.com/d_feldman/status/1549607411845152770

  439. https://x.com/dandangond/status/1636063902688526339

  440. https://x.com/davidad/status/1639215289677017099

  441. https://x.com/dmvaldman/status/1658689854056853504

  442. https://x.com/emollick/status/1618969731431804929

  443. https://x.com/emollick/status/1639421740358193153

  444. https://x.com/emollick/status/1652170706312896512

  445. https://x.com/emollick/status/1652406848253771778

  446. https://x.com/emollick/status/1652545966480785408

  447. https://x.com/emollick/status/1658537599797977091

  448. https://x.com/emollick/status/1658698874117308417

  449. https://x.com/emollick/status/1736196921541140861

  450. https://x.com/emollick/status/1818009927107174771

  451. https://x.com/emollick/status/1864744770695815234

  452. https://x.com/fabianstelzer/status/1572571003804614657

  453. https://x.com/francoisfleuret/status/1699117856779075949

  454. https://x.com/gd3kr/status/1545370626273120256

  455. https://x.com/geepytee/status/1765428294630179168

  456. https://x.com/geoffreylitt/status/1635757456377917440

  457. https://x.com/gf_256/status/1598104835848798208

  458. https://x.com/goodside/status/1559801520773898240

  459. https://x.com/goodside/status/1559984178862628864

  460. https://x.com/goodside/status/1560273161840898048

  461. https://x.com/goodside/status/1560853596572450816

  462. https://x.com/goodside/status/1560867589835968513

  463. https://x.com/goodside/status/1560906792330199040

  464. https://x.com/goodside/status/1561296820373954560

  465. https://x.com/goodside/status/1561437390576750593

  466. https://x.com/goodside/status/1562233738863452160

  467. https://x.com/goodside/status/1562284846059339776

  468. https://x.com/goodside/status/1562417843542654976

  469. https://x.com/goodside/status/1562991379915341824

  470. https://x.com/goodside/status/1563989550808154113

  471. https://x.com/goodside/status/1568448128495534081

  472. https://x.com/goodside/status/1598129631609380864

  473. https://x.com/goodside/status/1614089728890130435

  474. https://x.com/goodside/status/1652059541301866496

  475. https://x.com/goodside/status/1652496489241878533

  476. https://x.com/goodside/status/1657396491676164096

  477. https://x.com/harryaskham/status/1636376676329455617

  478. https://x.com/hwchase17/status/1634606661137731584

  479. https://x.com/jamesbrandecon/status/1639709460762624001

  480. https://x.com/javilopen/status/1719363669685916095

  481. https://x.com/jeremyphoward/status/1688793283034779648

  482. https://x.com/jeremyphoward/status/1765529891343339804

  483. https://x.com/jeremyphoward/status/1779311134656671872

  484. https://x.com/jkronand/status/1638054742386679810

  485. https://x.com/kenshinsamurai9/status/1662510532585291779

  486. https://x.com/kuizinas/status/1562086476690644992

  487. https://x.com/lacker/status/1655685341649719296

  488. https://x.com/lemonodor/status/1628270074074398720

  489. https://x.com/marvinvonhagen/status/1657060506371346432

  490. https://x.com/mathemagic1an/status/1636121914849792000

  491. https://x.com/mattshumer_/status/1636512490195501056

  492. https://x.com/mattshumer_/status/1653060363972124673

  493. https://x.com/mattshumer_/status/1766157714411942055

  494. https://x.com/mattshumer_/status/1782468402293903662

  495. https://x.com/maxkriegers/status/1663372146696138752

  496. https://x.com/mckaywrigley/status/1642948620604538880

  497. https://x.com/moreisdifferent/status/1612489352105365511

  498. https://x.com/moyix/status/1598081204846489600

  499. https://x.com/moyix/status/1601056131681771521

  500. https://x.com/moyix/status/1603848600253042693

  501. https://x.com/mplappert/status/1663892732652273664

  502. https://x.com/natfriedman/status/1575631194032549888

  503. https://x.com/natfriedman/status/1712141463876776415

  504. https://x.com/negamuhia/status/1569616507256115205

  505. https://x.com/oegerikus/status/1610945035888955392

  506. https://x.com/packyM/status/1598405769669771264

  507. https://x.com/pararths/status/1598047138097033216

  508. https://x.com/patio11/status/1677890745683025920

  509. https://x.com/patio11/status/1721722777705603432

  510. https://x.com/patrickmineault/status/1591874392279351297

  511. https://x.com/paulnovosad/status/1655925767333658626

  512. https://x.com/perrymetzger/status/1632004276883947520

  513. https://x.com/perrymetzger/status/1635811092654858240

  514. https://x.com/perrymetzger/status/1639968357607698433

  515. https://x.com/philhawksworth/status/1720106515300860230

  516. https://x.com/rasbt/status/1530684760913285120

  517. https://x.com/rharang/status/1641899743608463365

  518. https://x.com/scottleibrand/status/1430753899460194310

  519. https://x.com/sergeykarayev/status/1569377881440276481

  520. https://x.com/sergeykarayev/status/1569571367833714688

  521. https://x.com/sharifshameem/status/1672852345259180037

  522. https://x.com/shesek/status/1603902050504478721

  523. https://x.com/shinboson/status/1794570054165729303

  524. https://x.com/skirano/status/1635736107949195278

  525. https://x.com/swyx/status/1776771329066500589

  526. https://x.com/thisiswrenn/status/1523182708385452032

  527. https://x.com/tunguz/status/1628075460230885381

  528. https://x.com/volokuleshov/status/1619906183955095558

  529. https://x.com/vykthur/status/1598148850837250049

  530. https://x.com/wunderwuzzi23/status/1849637648274686129

  531. https://x.com/yoheinakajima/status/1670557048743010305

  532. https://x.com/zswitten/status/1631190068970012675

  533. https://xenaproject.wordpress.com/2022/09/12/beyond-the-liquid-tensor-experiment/

  534. c1e4fdc61de44b3b4d07407e88f4a86cd7888bee.html

  535. They all use it

  536. https%253A%252F%252Fregisterspill.thorstenball.com%252Fp%252Fthey-all-use-it.html

  537. MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

  538. Lil'Log

  539. Homepage: Aleksander Mądry

  540. https%253A%252F%252Farxiv.org%252Fabs%252F2410.07095%2523openai.html

  541. AI-powered coding pulls in almost $1bn of funding to claim ‘killer app’ status

  542. https%253A%252F%252Fwww.ft.com%252Fcontent%252F4868bd38-613c-4fa9-ba9d-1ed8fa8a40c8.html

  543. APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

  544. Caiming Xiong—Home Page

  545. https%253A%252F%252Farxiv.org%252Fabs%252F2406.18518%2523salesforce.html

  546. A Peter Thiel-Backed AI Startup, Cognition Labs, Seeks $2 Billion Valuation: Funding round could increase startup’s valuation nearly sixfold in a matter of weeks, reflecting AI frenzy

  547. https%253A%252F%252Fwww.wsj.com%252Ftech%252Fai%252Fa-peter-thiel-backed-ai-startup-cognition-labs-seeks-2-billion-valuation-998fa39d.html

  548. Vulnerability Detection with Code Language Models: How Far Are We?

  549. https%253A%252F%252Farxiv.org%252Fabs%252F2403.18624.html

  550. Gold-Medalist Coders Build an AI That Can Do Their Job for Them: A new startup called Cognition AI can turn a user’s prompt into a website or video game

  551. https%253A%252F%252Fwww.bloomberg.com%252Fnews%252Farticles%252F2024-03-12%252Fcognition-ai-is-a-peter-thiel-backed-coding-assistant.html

  552. Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

  553. About Me

  554. https://jack-clark.net/about/

  555. Sam Bowman

  556. Jared Kaplan

  557. https%253A%252F%252Farxiv.org%252Fabs%252F2401.05566%2523anthropic.html

  558. StarVector: Generating Scalable Vector Graphics Code from Images

  559. https%253A%252F%252Farxiv.org%252Fabs%252F2312.11556.html

  560. Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

  561. https%253A%252F%252Farxiv.org%252Fabs%252F2310.04406.html

  562. PassUntil: Predicting Emergent Abilities with Infinite Resolution Evaluation

  563. Search

  564. Ning Ding

  565. https%253A%252F%252Farxiv.org%252Fabs%252F2310.03262.html

  566. Security Weaknesses of Copilot Generated Code in GitHub

  567. https%253A%252F%252Farxiv.org%252Fabs%252F2310.02059.html

  568. Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

  569. https%253A%252F%252Farxiv.org%252Fabs%252F2308.07921.html

  570. AI Is a Lot of Work: As the technology becomes ubiquitous, a vast tasker underclass is emerging—and not going anywhere

  571. https%253A%252F%252Fwww.theverge.com%252Ffeatures%252F23764584%252Fai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots.html

  572. When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming (CDHF)

  573. https%253A%252F%252Farxiv.org%252Fabs%252F2306.04930%2523microsoft.html

  574. Introducing Microsoft 365 Copilot—your copilot for work

  575. https%253A%252F%252Fblogs.microsoft.com%252Fblog%252F2023%252F03%252F16%252Fintroducing-microsoft-365-copilot-your-copilot-for-work%252F.html

  576. Large Language Models and Simple, Stupid Bugs

  577. https%253A%252F%252Farxiv.org%252Fabs%252F2303.11455.html

  578. Larger language models do in-context learning differently

  579. Jason Wei

  580. Yi Tay

  581. https%253A%252F%252Farxiv.org%252Fabs%252F2303.03846%2523google.html

  582. ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics

  583. https%253A%252F%252Farxiv.org%252Fabs%252F2302.12433.html

  584. Google is asking employees to test potential ChatGPT competitors, including a chatbot called 'Apprentice Bard'

  585. https%253A%252F%252Fwww.cnbc.com%252F2023%252F01%252F31%252Fgoogle-testing-chatgpt-like-chatbot-apprentice-bard-with-employees.html.html

  586. An Analysis of the Automatic Bug Fixing Performance of ChatGPT

  587. https%253A%252F%252Farxiv.org%252Fabs%252F2301.08653.html

  588. General availability of Azure OpenAI Service expands access to large, advanced AI models with added enterprise benefits

  589. https%253A%252F%252Fazure.microsoft.com%252Fen-us%252Fblog%252Fgeneral-availability-of-azure-openai-service-expands-access-to-large-advanced-ai-models-with-added-enterprise-benefits%252F.html

  590. The Stack: 3 TB of permissively licensed source code

  591. Thomas Wolf

  592. Dzmitry Bahdanau

  593. https%253A%252F%252Farxiv.org%252Fabs%252F2211.15533.html

  594. Programming Possibility: Kevin Scott on AI’s Impact on Cognitive Work

  595. https%253A%252F%252Fgreylock.com%252Fgreymatter%252Fkevin-scott-ai-programming-possibility%252F.html

  596. Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them

  597. Yi Tay

  598. Jason Wei

  599. https%253A%252F%252Farxiv.org%252Fabs%252F2210.09261%2523google.html

  600. Vote-K: Selective Annotation Makes Language Models Better Few-Shot Learners

  601. Luke Zettlemoyer

  602. https%253A%252F%252Farxiv.org%252Fabs%252F2209.01975.html

  603. Can large language models reason about medical questions?

  604. https%253A%252F%252Farxiv.org%252Fabs%252F2207.08143.html

  605. Productivity Assessment of Neural Code Completion

  606. https%253A%252F%252Farxiv.org%252Fabs%252F2205.06537%2523github.html

  607. InCoder: A Generative Model for Code Infilling and Synthesis

  608. Luke Zettlemoyer

  609. Mike Lewis

  610. https%253A%252F%252Farxiv.org%252Fabs%252F2204.05999%2523facebook.html

  611. PaLM: Scaling Language Modeling with Pathways

  612. Yi Tay

  613. https://x.com/jekbradbury

  614. Vedant Misra

  615. Barret Zoph

  616. Jason Wei

  617. https%253A%252F%252Farxiv.org%252Fabs%252F2204.02311%2523google.html

  618. Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models

  619. %252Fdoc%252Fai%252Fnn%252Ftransformer%252Fgpt%252Fcodex%252F2022-vaithilingam.pdf.html

  620. Text and Code Embeddings by Contrastive Pre-Training

  621. Alec Radford

  622. Jong Wook Kim

  623. Gretchen Krueger

  624. Lil'Log

  625. https%253A%252F%252Farxiv.org%252Fabs%252F2201.10005%2523openai.html

  626. A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More

  627. https%253A%252F%252Farxiv.org%252Fabs%252F2112.15594.html

  628. WebGPT: Browser-assisted question-answering with human feedback

  629. Jacob Hilton's Homepage

  630. Gretchen Krueger

  631. John Schulman’s Homepage

  632. https%253A%252F%252Farxiv.org%252Fabs%252F2112.09332%2523openai.html

  633. WebGPT: Improving the factual accuracy of language models through web browsing

  634. Jacob Hilton's Homepage

  635. John Schulman’s Homepage

  636. https%253A%252F%252Fopenai.com%252Fresearch%252Fwebgpt.html

  637. Scaling Language Models: Methods, Analysis & Insights from Training Gopher

  638. Karen Simonyan

  639. https://x.com/jekbradbury

  640. Koray Kavukcuoglu

  641. https%253A%252F%252Farxiv.org%252Fabs%252F2112.11446%2523deepmind.html

  642. Can Pre-trained Language Models be Used to Resolve Textual and Semantic Merge Conflicts?

  643. https%253A%252F%252Farxiv.org%252Fabs%252F2111.11904%2523microsoft.html

  644. Solving Probability and Statistics Problems by Program Synthesis

  645. https%253A%252F%252Farxiv.org%252Fabs%252F2111.08267.html

  646. GenLine and GenForm: Two Tools for Interacting with Generative Language Models in a Code Editor

  647. %252Fdoc%252Fai%252Fnn%252Ftransformer%252Fgpt%252Flamda%252F2021-jiang-2.pdf.html