Over 18 months ago, we (me, @YiTayML, @dara_bahri, and @marc_najork) released our "Rethinking Search" paper (arxiv.org/abs/2105.02274), which envisioned how LMs could deliver deep, direct answers in response to a user's information needs. A 🧵on what's played out since then.
After the paper was released and picked up some media attention (like technologyreview.com/2021/05…), we received a lot of feedback - some positive and some critical. Someone even asked if the paper was science fiction. It was an interesting few weeks. Then things quieted down.
In Dec '21, WebGPT (openai.com/blog/webgpt/) was announced. It leveraged some ideas from our work, specifically the ability to synthesize attributed responses from multiple sources of evidence. LaMDA and GopherCite came out around this time and had some attribution capabilities.
Things got interesting last month when several systems were announced that resemble both what we had in mind (see Fig. 3 from our paper below) and WebGPT. These were the first "real-world" implementations of multi-source generative answers with attribution.
Example #1: perplexity.ai.
This direction continues to be of interest to the research community, as these systems are useful but far from perfect. Our recent work on attributed question answering is meant to help stimulate further research along these lines.
New preprint: Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models. arxiv.org/abs/2212.08037
In this work we ask two key questions: 1) How do we measure attribution? and 2) How well do current SotA models perform on AQA?
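On the first question, one common way to approximate attribution automatically is to ask an NLI model whether the cited evidence entails the answer. Below is a minimal sketch of that idea in Python; the MNLI checkpoint, the way the claim is built from question + answer, and the 0.5 cutoff are all illustrative assumptions, not the paper's exact evaluation protocol.

```python
# Sketch (assumed setup, not the paper's protocol): score attribution as the
# probability that the cited evidence passage entails the model's answer,
# using an off-the-shelf NLI model.
from transformers import pipeline

# Any MNLI-style NLI model works here; this checkpoint is just an example.
nli = pipeline("text-classification", model="roberta-large-mnli")

def attribution_score(question: str, answer: str, evidence: str) -> float:
    """Return the entailment probability that `evidence` supports the answer."""
    # Simplification: treat question + answer as the claim to be verified.
    claim = f"{question} {answer}"
    scores = nli({"text": evidence, "text_pair": claim}, top_k=None)
    for label in scores:
        if label["label"].upper() == "ENTAILMENT":
            return label["score"]
    return 0.0

# Toy usage: treat scores >= 0.5 as "attributed" (threshold is an assumption).
score = attribution_score(
    question="Who wrote Pride and Prejudice?",
    answer="Jane Austen wrote Pride and Prejudice.",
    evidence="Pride and Prejudice is an 1813 novel by Jane Austen.",
)
print(f"attribution score: {score:.2f}")
```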