Over 18 months ago, we (me, @YiTayML, @dara_bahri, and @marc_najork) released our "Rethinking Search" paper (arxiv.org/abs/2105.02274), which envisioned how LMs could deliver deep, direct answers in response to a user's information needs. A 🧵 on what's played out since then.

Jan 13, 2023 · 10:40 PM UTC

After the paper was released and picked up some media attention (like technologyreview.com/2021/05…), we received a lot of feedback - some positive and some critical. Someone even asked if the paper was science fiction. It was an interesting few weeks. Then things quieted down.
In Dec '21, WebGPT (openai.com/blog/webgpt/) was announced. It leveraged some ideas from our work, specifically the ability to synthesize attributed responses from multiple sources of evidence. LaMDA and GopherCite came out around this time and also had some attribution capabilities.
Things got interesting last month when several technologies were announced that are similar to what we had in mind (see Fig. 3 from our paper below) and to WebGPT. These represented the first "real-world" implementations of multi-source generative answers with attribution.
Example #1: perplexity.ai.
Inspired by OpenAI WebGPT, instead of displaying a list of links, we summarize the search results and include citations so that you can easily verify the accuracy of the information provided.
Example #2: NeevaAI.
At @neeva, we've been revolutionizing search w/ an ad-free, privacy-first model. But we've also been quietly upgrading the experience entirely w/ cutting-edge AI & LLMs. ChatGPT cannot give you real-time data or fact verification. In our upcoming upgrades, @neeva can
Example #3: YouChat (search).
This direction continues to be of interest to the research community, as these systems are useful but far from perfect. Our recent work on attributed question answering is meant to help stimulate further research in this important direction.
New preprint: Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models. arxiv.org/abs/2212.08037 In this work we ask two key questions: 1) How do we measure attribution? and 2) How well do current SotA models perform on AQA?
This is an exciting time for those working at the intersection of NLP, ML, and IR, and I suspect all of this is just the tip of the iceberg in terms of how these quickly evolving technologies will continue to bring value to users.