- See Also
- Links
- “How Tech Giants Cut Corners to Harvest Data for AI: OpenAI, Google and Meta Ignored Corporate Policies, Altered Their Own Rules and Discussed Skirting Copyright Law As They Sought Online Information to Train Their Newest Artificial Intelligence Systems”, Metz et al 2024
- “Careless Whisper: Speech-To-Text Hallucination Harms”, Koenecke et al 2024
- “Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling”, Gandhi et al 2023
- “Whisper-AT: Noise-Robust Automatic Speech Recognizers Are Also Strong General Audio Event Taggers”, Gong et al 2023
- “Why YouTube Could Give Google an Edge in AI”, Victor 2023
- “WhisperX: Time-Accurate Speech Transcription of Long-Form Audio”, Bain et al 2023
- “Whisper: Robust Speech Recognition via Large-Scale Weak Supervision”, Radford et al 2022
- “ESB: A Benchmark For Multi-Domain End-To-End Speech Recognition”, Gandhi et al 2022
- “The History of Speech Recognition to the Year 2030”, Hannun 2021
- “SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network”, Chan et al 2021
- Miscellaneous
- Bibliography
See Also
Links
“How Tech Giants Cut Corners to Harvest Data for AI: OpenAI, Google and Meta Ignored Corporate Policies, Altered Their Own Rules and Discussed Skirting Copyright Law As They Sought Online Information to Train Their Newest Artificial Intelligence Systems”, Metz et al 2024
“Careless Whisper: Speech-To-Text Hallucination Harms”, Koenecke et al 2024
“Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling”, Gandhi et al 2023
“Whisper-AT: Noise-Robust Automatic Speech Recognizers Are Also Strong General Audio Event Taggers”, Gong et al 2023
“Why YouTube Could Give Google an Edge in AI”, Victor 2023
“WhisperX: Time-Accurate Speech Transcription of Long-Form Audio”, Bain et al 2023
“Whisper: Robust Speech Recognition via Large-Scale Weak Supervision”, Radford et al 2022
“ESB: A Benchmark For Multi-Domain End-To-End Speech Recognition”, Gandhi et al 2022
“The History of Speech Recognition to the Year 2030”, Hannun 2021
“SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network”, Chan et al 2021
Miscellaneous
- /doc/ai/nn/transformer/gpt/whisper/2022-radford-figure1-overviewofwhispertransformerarchitecture.png
- /doc/ai/nn/transformer/gpt/whisper/2022-radford-figure8-whisperscalingbymodelsize.png
- https://cookbook.openai.com/examples/whisper_prompting_guide
- https://github.com/openai/whisper/discussions/1762#discussion-5819873
- https://openai.com/blog/introducing-chatgpt-and-whisper-apis
- https://www.lesswrong.com/posts/KbRxdBCcJqwtbiPzm/whisper-s-wild-implications-1
- https://www.lesswrong.com/posts/thePw6qdyabD8XR4y/interpreting-openai-s-whisper
Bibliography
- https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html: “How Tech Giants Cut Corners to Harvest Data for AI: OpenAI, Google and Meta Ignored Corporate Policies, Altered Their Own Rules and Discussed Skirting Copyright Law As They Sought Online Information to Train Their Newest Artificial Intelligence Systems”, Metz et al 2024
- https://www.theinformation.com/articles/why-youtube-could-give-google-an-edge-in-ai: “Why YouTube Could Give Google an Edge in AI”, Victor 2023
- https://arxiv.org/abs/2212.04356#openai: “Whisper: Robust Speech Recognition via Large-Scale Weak Supervision”, Radford et al 2022
- https://arxiv.org/abs/2210.13352#huggingface: “ESB: A Benchmark For Multi-Domain End-To-End Speech Recognition”, Gandhi et al 2022