‘Codex’ directory

Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.

The sort begins with the newest annotation and uses the embedding of each annotation to build a list of nearest-neighbor annotations, so that each entry is followed by the one most similar to it, creating a progression of topics. For more details, see the link.
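The ordering described above can be sketched as a greedy nearest-neighbor chain. This is only an illustration under stated assumptions, not the site's actual implementation: the `(title, date, embedding)` tuple layout, the function names, and the choice of cosine distance are all hypothetical, and the clustering and auto-labeling steps are not shown.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def sort_by_similarity(annotations):
    """Order annotations as a greedy nearest-neighbor chain.

    `annotations` is a list of (title, iso_date, embedding) tuples
    (a hypothetical layout chosen for this sketch). The chain starts
    at the newest annotation and repeatedly appends the not-yet-used
    annotation closest to the one just added.
    """
    if not annotations:
        return []
    remaining = list(annotations)
    # ISO dates sort correctly as strings, so max() finds the newest.
    current = max(remaining, key=lambda a: a[1])
    remaining.remove(current)
    ordered = [current]
    while remaining:
        nearest = min(remaining, key=lambda a: cosine_distance(current[2], a[2]))
        remaining.remove(nearest)
        ordered.append(nearest)
        current = nearest
    return [title for title, _, _ in ordered]


# Toy data: two-dimensional "embeddings" made up for the example.
items = [
    ("A", "2021-01-01", [1.0, 0.0]),
    ("B", "2023-01-01", [0.0, 1.0]),
    ("C", "2022-01-01", [0.1, 0.9]),
    ("D", "2020-01-01", [0.9, 0.1]),
]
print(sort_by_similarity(items))
```

A greedy chain like this is quadratic in the number of annotations, which is acceptable for a single tag's worth of entries; it optimizes each step locally, so the overall path is a progression of related topics rather than a globally shortest tour.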

Wikipedia (2)

GitHub Copilot
OpenAI Codex: https://en.wikipedia.org/wiki/OpenAI_Codex

Miscellaneous

Bibliography

https://arxiv.org/abs/2502.06807#openai: “Competitive Programming With Large Reasoning Models ”⁠, Ahmed El-Kishky, Alexander Wei, Andre Saraiva …, Borys Minaev, Daniel Selsam⁠, David Dohan, Francis Song, Hunter Lightman, Ignasi Clavera, Jakub Pachocki, Jerry Tworek, Lorenz Kuhn, Łukasz Kaiser⁠, ⁠Mark Chen, Max Schwarzer, Mostafa Rohaninejad, Nat McAleese⁠, o3 contributors, Oleg Mürk, Rhythm Garg, Rui Shu, Szymon Sidor, Vineet Kosaraju, Wenda Zhou
https://registerspill.thorstenball.com/p/they-all-use-it: “They All Use It ”, Thorsten Ball
https://arxiv.org/abs/2410.06992: “SWE-Bench+: Enhanced Coding Benchmark for LLMs ”⁠, Reem Aleithan, Haoran Xue, Mohammad Mahdi Mohajer …, Elijah Nnorom, Gias Uddin, Song Wang
https://arxiv.org/abs/2410.07095#openai: “MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering ”⁠, Jun Shern Chan, Neil Chowdhury, Oliver Jaffe …, James Aung, Dane Sherburn, Evan Mays, Giulio Starace, Kevin Liu, Leon Maksin, Tejal Patwardhan, ⁠Lilian Weng, ⁠Aleksander Madry
https://www.ft.com/content/4868bd38-613c-4fa9-ba9d-1ed8fa8a40c8: “AI-Powered Coding Pulls in Almost $1bn of Funding to Claim ‘Killer App’ Status ”⁠, Madhumita Murgia
https://arxiv.org/abs/2406.18518#salesforce: “APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets ”⁠, Zuxin Liu, Thai Hoang, Jianguo Zhang …, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, ⁠Caiming Xiong
https://arxiv.org/abs/2405.15793: “SWE-Agent: Agent-Computer Interfaces Enable Automated Software Engineering ”⁠, John Yang, Carlos E. Jimenez, Alexander Wettig …, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
https://www.wsj.com/tech/ai/a-peter-thiel-backed-ai-startup-cognition-labs-seeks-2-billion-valuation-998fa39d: “A Peter Thiel-Backed AI Startup, Cognition Labs, Seeks $2 Billion Valuation: Funding round Could Increase Startup’s Valuation Nearly Sixfold in a Matter of Weeks, Reflecting AI Frenzy ”⁠, Berber Jin
https://arxiv.org/abs/2403.18624: “Vulnerability Detection With Code Language Models: How Far Are We? ”⁠, Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim …, Chawin Sitawarin, Xinyun Chen, Basel Alomair, David Wagner, Baishakhi Ray, Yizheng Chen
https://www.bloomberg.com/news/articles/2024-03-12/cognition-ai-is-a-peter-thiel-backed-coding-assistant: “Gold-Medalist Coders Build an AI That Can Do Their Job for Them: A New Startup Called Cognition AI Can Turn a User’s Prompt into a Website or Video Game ”⁠, Ashlee Vance⁠
2024-harding.pdf: “Coding on Copilot: 2023 Data Shows Downward Pressure on Code Quality, Plus Projections for 2024 ”⁠, William Harding, Matthew Kloster
https://arxiv.org/abs/2401.05566#anthropic: “Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training ”⁠, Evan Hubinger, Carson Denison, Jesse Mu …, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, ⁠Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, ⁠Deep Ganguli, Fazl Barez, ⁠Jack Clark⁠, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, Yuntao Bai⁠, Zachary Witten, Marina Favaro, Jan Brauner, Holden Karnofsky⁠, Paul Christiano⁠, ⁠Samuel R. Bowman, Logan Graham, Jared Kaplan, Sören Mindermann, Ryan Greenblatt, Buck Shlegeris, Nicholas Schiefer, ⁠Ethan Perez
https://arxiv.org/abs/2312.11556: “StarVector: Generating Scalable Vector Graphics Code from Images ”⁠, Juan A. Rodriguez, Shubham Agarwal⁠, Issam H. Laradji …, Pau Rodriguez, David Vazquez, Christopher Pal, Marco Pedersoli
https://arxiv.org/abs/2310.04406: “Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models ”⁠, Andy Zhou, Kai Yan, Michal Shlapentokh-Rothman …, Haohan Wang, Yu-Xiong Wang
https://arxiv.org/abs/2310.03262: “PassUntil: Predicting Emergent Abilities With Infinite Resolution Evaluation ”⁠, Shengding Hu, Xin Liu, Xu Han⁠ …, Xinrong Zhang, Chaoqun He, Weilin Zhao, Yankai Lin, Ning Ding⁠, Zebin Ou, Guoyang Zeng, ⁠Zhiyuan Liu, ⁠Maosong Sun
https://arxiv.org/abs/2310.02059: “Security Weaknesses of Copilot Generated Code in GitHub ”⁠, Yujia Fu, Peng Liang, Amjed Tahir …, Zengyang Li, Mojtaba Shahin, Jiaxin Yu, Jinfu Chen
https://arxiv.org/abs/2308.07921: “Solving Challenging Math Word Problems Using GPT-4 Code Interpreter With Code-Based Self-Verification ”⁠, Aojun Zhou, Ke Wang⁠, Zimu Lu …, Weikang Shi, Sichun Luo, Zipeng Qin, Shaoqing Lu, Anya Jia, Linqi Song, Mingjie Zhan, Hongsheng Li
https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots: “AI Is a Lot of Work: As the Technology Becomes Ubiquitous, a Vast Tasker Underclass Is Emerging—And Not Going Anywhere ”⁠, Josh Dzieza
https://arxiv.org/abs/2306.04930#microsoft: “When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming (CDHF) ”⁠, Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz⁠
https://arxiv.org/abs/2303.11455: “Large Language Models and Simple, Stupid Bugs ”⁠, Kevin Jesse, Toufique Ahmed, Premkumar T. Devanbu, Emily Morgan
https://blogs.microsoft.com/blog/2023/03/16/introducing-microsoft-365-copilot-your-copilot-for-work/: “Introducing Microsoft 365 Copilot—Your Copilot for Work ”⁠, Jared Spataro
https://arxiv.org/abs/2303.03846#google: “Larger Language Models Do In-Context Learning Differently ”⁠, Jerry Wei, Jason Wei, ⁠Yi Tay …, Dustin Tran, Albert Webson, Yifeng Lu, Xinyun Chen, Hanxiao Liu, Da Huang, ⁠Denny Zhou, ⁠Tengyu Ma
https://arxiv.org/abs/2302.12433: “ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics ”⁠, Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf …, Edward W. Ayers, Dragomir Radev⁠, Jeremy Avigad⁠
https://www.cnbc.com/2023/01/31/google-testing-chatgpt-like-chatbot-apprentice-bard-with-employees.html: “Google Is Asking Employees to Test Potential ChatGPT Competitors, including a Chatbot Called 'Apprentice Bard' ”⁠, Jennifer Elias
https://arxiv.org/abs/2301.08653: “An Analysis of the Automatic Bug Fixing Performance of ChatGPT ”⁠, Dominik Sobania, Martin Briesch, Carol Hanna, Justyna Petke
https://azure.microsoft.com/en-us/blog/general-availability-of-azure-openai-service-expands-access-to-large-advanced-ai-models-with-added-enterprise-benefits/: “General Availability of Azure OpenAI Service Expands Access to Large, Advanced AI Models With Added Enterprise Benefits ”⁠, Eric Boyd
https://arxiv.org/abs/2211.15533: “The Stack: 3 TB of Permissively Licensed Source Code ”⁠, Denis Kocetkov, Raymond Li, Loubna Ben Allal …, Jia Li⁠, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell⁠, Sean Hughes, ⁠Thomas Wolf, ⁠Dzmitry Bahdanau, Leandro von Werra, Harm de Vries
https://greylock.com/greymatter/kevin-scott-ai-programming-possibility/: “Programming Possibility: Kevin Scott on AI’s Impact on Cognitive Work ”, Reid Hoffman⁠, Kevin Scott⁠
https://arxiv.org/abs/2210.09261#google: “Challenging BIG-Bench Tasks (BBH) and Whether Chain-Of-Thought Can Solve Them ”⁠, Mirac Suzgun, Nathan Scales, Nathanael Schärli …, Sebastian Gehrmann, ⁠Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le⁠, Ed H. Chi⁠, ⁠Denny Zhou, Jason Wei
https://arxiv.org/abs/2209.01975: “Vote-K: Selective Annotation Makes Language Models Better Few-Shot Learners ”⁠, Hongjin Su, Jungo Kasai, Chen Henry Wu …, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf⁠, Luke Zettlemoyer⁠, Noah Smith⁠, Tao Yu
https://arxiv.org/abs/2207.08143: “Can Large Language Models Reason about Medical Questions? ”⁠, Valentin Liévin, Christoffer Egeberg Hother, Ole Winther
https://arxiv.org/abs/2205.06537#github: “Productivity Assessment of Neural Code Completion ”⁠, Albert Ziegler⁠, Eirini Kalliamvakou, Shawn Simister …, Ganesh Sittampalam⁠, Alice Li, Andrew Rice⁠, Devon Rifkin⁠, Edward Aftandilian
https://arxiv.org/abs/2204.05999#facebook: “InCoder: A Generative Model for Code Infilling and Synthesis ”⁠, Daniel Fried⁠, Armen Aghajanyan, Jessy Lin …, Sida Wang, Eric Wallace⁠, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer⁠, Mike Lewis⁠
https://arxiv.org/abs/2204.02311#google: “PaLM: Scaling Language Modeling With Pathways ”⁠, Aakanksha Chowdhery, Sharan Narang, Jacob Devlin …, Maarten Bosma, Gaurav Mishra, Adam Roberts⁠, Paul Barham⁠, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes⁠, ⁠Yi Tay, Noam Shazeer⁠, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury⁠, Jacob Austin⁠, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat⁠, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, ⁠Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, ⁠Barret Zoph, Alexander Spiridonov⁠, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz⁠, Erica Moreira, ⁠Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean⁠, Slav Petrov, Noah Fiedel
2022-vaithilingam.pdf: “Expectation versus Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models ”⁠, Priyan Vaithilingam, Tianyi Zhang⁠, Elena Glassman
https://arxiv.org/abs/2201.10005#openai: “Text and Code Embeddings by Contrastive Pre-Training ”⁠, Arvind Neelakantan, Tao Xu, Raul Puri …, Alec Radford⁠, Jesse Michael Han, Jerry Tworek, Qiming Yuan, Nikolas Tezak, ⁠Jong Wook Kim, Chris Hallacy, Johannes Heidecke, Pranav Shyam, Boris Power, Tyna Eloundou Nekoul, Girish Sastry, Gretchen Krueger⁠, David Schnurr, Felipe Petroski Such, Kenny Hsu, Madeleine Thompson⁠, Tabarak Khan, Toki Sherbakov, Joanne Jang, Peter Welinder⁠, ⁠Lilian Weng
https://arxiv.org/abs/2112.15594: “A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More ”⁠, Iddo Drori, Sunny Tran, Roman Wang …, Newman Cheng, Kevin Liu, Leonard Tang, Elizabeth Ke, Nikhil Singh, Taylor L. Patti, Jayson Lynch, Avi Shporer, Nakul Verma⁠, Eugene Wu⁠, Gilbert Strang⁠
https://arxiv.org/abs/2112.09332#openai: “WebGPT: Browser-Assisted Question-Answering With Human Feedback”, Reiichiro Nakano, Jacob Hilton, Suchir Balaji …, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman
https://openai.com/research/webgpt: “WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing ”⁠, ⁠Jacob Hilton, Suchir Balaji, Reiichiro Nakano, ⁠John Schulman
https://arxiv.org/abs/2112.11446#deepmind: “Scaling Language Models: Methods, Analysis & Insights from Training Gopher ”⁠, Jack W. Rae, Sebastian Borgeaud, Trevor Cai …, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson⁠, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese⁠, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan⁠, Michela Paganini, Laurent Sifre⁠, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d’Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury⁠, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac⁠, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals⁠, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis⁠, Koray Kavukcuoglu⁠, ⁠Geoffrey Irving
https://arxiv.org/abs/2111.11904#microsoft: “Can Pre-Trained Language Models Be Used to Resolve Textual and Semantic Merge Conflicts? ”⁠, Jialu Zhang, Todd Mytkowicz, Mike Kaufman …, Ruzica Piskac, Shuvendu K. Lahiri
https://arxiv.org/abs/2111.08267: “Solving Probability and Statistics Problems by Program Synthesis ”⁠, Leonard Tang, Elizabeth Ke, Nikhil Singh …, Nakul Verma⁠, Iddo Drori
2021-jiang-2.pdf: “GenLine and GenForm: Two Tools for Interacting With Generative Language Models in a Code Editor ”⁠, Ellen Jiang, Edwin Toh, Alejandra Molina …, Aaron Donsbach, Carrie Cai, Michael Terry