Creator of OpenWebText and OpenGPT2. PyTorch Core Reviewer. PhD Student at @Cornell (interning at @MosaicML). Previously at @FacebookAI and @BrownUniversity.

Joined September 2015
Aaron Gokaslan retweeted
New year, new MME 🎉 @dskhudia and I profiled @Intel Gaudi2 accelerators for LLM training and inference, and found great performance and perf/$! databricks.com/blog/llm-trai…
Aaron Gokaslan retweeted
ByteDance announces Diffusion Model with Perceptual Loss

paper page: huggingface.co/papers/2401.0…

Diffusion models trained with mean squared error loss tend to generate unrealistic samples. Current state-of-the-art models rely on classifier-free guidance to improve sample quality, yet its surprising effectiveness is not fully understood. In this paper, we show that the effectiveness of classifier-free guidance partly originates from it being a form of implicit perceptual guidance. As a result, we can directly incorporate perceptual loss in diffusion training to improve sample quality. Since the score matching objective used in diffusion training strongly resembles the denoising autoencoder objective used in unsupervised training of perceptual networks, the diffusion model itself is a perceptual network and can be used to generate meaningful perceptual loss. We propose a novel self-perceptual objective that results in diffusion models capable of generating more realistic samples. For conditional generation, our method only improves sample quality without entanglement with the conditional input and therefore does not sacrifice sample diversity. Our method can also improve sample quality for unconditional generation, which was not possible with classifier-free guidance before.
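To make the "diffusion model as its own perceptual network" idea concrete, here is a minimal sketch of what such a self-perceptual loss could look like; the paper's exact objective differs, and the names here (self_perceptual_loss, noise_sched, the x0-prediction convention) are illustrative assumptions, not the paper's API:

```python
import copy
import torch
import torch.nn.functional as F

def self_perceptual_loss(model, x0, t, noise_sched):
    """Hypothetical sketch: score the denoiser's x0 prediction through
    the diffusion model's own features instead of raw pixel MSE.
    Assumes model(x_t, t) predicts x0 and noise_sched(t) returns
    (alpha_t, sigma_t) broadcastable against x0."""
    # In practice you would keep one frozen (or EMA) copy around,
    # not deepcopy the model on every training step.
    frozen = copy.deepcopy(model).eval().requires_grad_(False)

    eps = torch.randn_like(x0)
    alpha, sigma = noise_sched(t)
    x_t = alpha * x0 + sigma * eps
    x0_hat = model(x_t, t)                     # online model's prediction

    # Re-noise both the target and the prediction at a fresh timestep t2
    # with the SAME noise, then compare the frozen model's outputs.
    t2 = torch.rand_like(t)
    a2, s2 = noise_sched(t2)
    eps2 = torch.randn_like(x0)
    f_real = frozen(a2 * x0 + s2 * eps2, t2)
    f_pred = frozen(a2 * x0_hat + s2 * eps2, t2)
    return F.mse_loss(f_pred, f_real)
```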
Aaron Gokaslan retweeted
the OpenAI-NYT lawsuit is a big deal for copyright precedent. literally all popular models right now were trained on copyrighted data... except for one. my friend from school @SkyLi0n developed a diffusion model that's not trained on any copyrighted data. it's called CommonCanvas
Aaron Gokaslan retweeted
Heads up to SLURM users: Does your SLURM task get a single cpu-core instead of many? If so, you need to be aware that in recent SLURM versions srun no longer inherits --cpus-per-task. I explained how to diagnose and fix this issue here: github.com/stas00/ml-enginee… Even though our SLURM env isn't impacted, I updated our templates to future-proof them for when the SLURM version gets updated. This change in behavior was reported here: github.com/Lightning-AI/pyto…
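For reference, the symptom and the two common fixes look roughly like this (a hedged sbatch sketch; the behavior change landed around SLURM 22.05, but check your cluster's version and the linked write-up for specifics):

```bash
#!/bin/bash
#SBATCH --cpus-per-task=8      # the job allocation gets 8 cores per task...

# ...but on newer SLURM, srun no longer inherits --cpus-per-task,
# so a bare `srun python train.py` may end up pinned to a single core.

# Fix 1: pass the value to srun explicitly
srun --cpus-per-task=$SLURM_CPUS_PER_TASK python train.py

# Fix 2 (equivalent): export the variable srun reads
# export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
# srun python train.py
```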
The Fourier Transform, explained in one sentence by Stuart Riffle. [bityl.co/NGqj]
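If you want that one sentence as runnable code, here is a minimal numpy sketch of the idea (spin the signal around a circle at the frequency of interest and average the points along the path); dft_bin is an illustrative name, not from the linked post:

```python
import numpy as np

def dft_bin(signal, k):
    """Strength of frequency k: rotate each sample to its spot on a
    circle spun at that frequency, then average the resulting points."""
    n = np.arange(len(signal))
    spun = signal * np.exp(-2j * np.pi * k * n / len(signal))
    return spun.mean()

# Sanity check against numpy's FFT (which omits the 1/N averaging)
x = np.random.randn(64)
assert np.allclose(dft_bin(x, 5) * len(x), np.fft.fft(x)[5])
```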
Aaron Gokaslan retweeted
999 github issues on the wall, 999 github issues, take one down, fix it, all done, 998 github issues on the wall 🎵
✨Introducing diffusion with learned adaptive noise, a new state-of-the-art model for density estimation✨ Our key idea is to learn the diffusion process from data (instead of it being fixed). This yields a tighter ELBO, faster training, and more! Paper: arxiv.org/pdf/2312.13236.pdf
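As a toy illustration of "learning the diffusion process from data": a VDM-style scalar sketch where the log-SNR schedule gamma(t) is a small monotone network trained jointly with the denoiser. This is a simplification for intuition (the paper learns a multivariate, adaptive schedule; see the link above), and all names below are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotoneLogSNR(nn.Module):
    """Toy learned noise schedule gamma(t), monotone in t by construction
    (positive weights + monotone activations), trainable through the ELBO."""
    def __init__(self, hidden=64):
        super().__init__()
        self.w1 = nn.Parameter(torch.randn(hidden, 1))
        self.b1 = nn.Parameter(torch.zeros(hidden))
        self.w2 = nn.Parameter(torch.randn(1, hidden))

    def forward(self, t):                        # t: (batch, 1) in [0, 1]
        h = torch.tanh(t @ F.softplus(self.w1).T + self.b1)
        return h @ F.softplus(self.w2).T         # gamma(t), increasing in t

def diffuse(x0, gamma_t):
    """Sample z_t ~ q(z_t | x0) under the VDM parameterization
    alpha_t^2 = sigmoid(-gamma_t), sigma_t^2 = sigmoid(gamma_t)."""
    alpha = torch.sigmoid(-gamma_t).sqrt()
    sigma = torch.sigmoid(gamma_t).sqrt()
    return alpha * x0 + sigma * torch.randn_like(x0)
```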
It's crazy how many modern generative models are 15-year-old Aapo Hyvarinen papers.

Noise contrastive estimation => GANs
Score matching => diffusion
Ratio matching => discrete diffusion

If I were a student today, I'd carefully read Aapo's papers; they're a gold mine of ideas.
Aaron Gokaslan retweeted
Really, really thankful to @willknight for covering this *critical* piece of the AI puzzle. It's something few would otherwise pay attention to, yet will affect everything that AI will become in the future. Also great quotes from @ruchowdh & @YJernite wired.com/story/americas-ai-…
Aaron Gokaslan retweeted
Scrutiny into open-source datasets used for ML is a good thing. That being said, we should collectively aim to direct at least as much scrutiny (and require as much transparency) into closed-source datasets. Otherwise this likely leads to (even) less transparency in the data & datasets used to train AI models in the future.
Aaron Gokaslan retweeted
Academics: "You should finish your PhD with three papers that you are decidedly passionate about." Job market: "Minimum requirement: 8 top-tier conference papers for research scientist roles."
Since we just wrapped up an AI megaconference, it felt like a good day to plead for fewer papers. argmin.net/p/too-much-inform…
Aaron Gokaslan retweeted
Stories have 6 primary arcs:
• “Rags to riches” (rise)
• “Tragedy” (fall)
• “Man in a hole” (fall-rise)
• “Icarus” (rise-fall)
• “Cinderella” (rise-fall-rise)
• “Oedipus” (fall-rise-fall)

Programmatic analysis validates Kurt Vonnegut's legendary rejected master's thesis.
Aaron Gokaslan retweeted
One paper can change your life. But which one? Overproductivity doesn't just come from paper counting, but from the desperate acts of young researchers under extreme pressure to be part of that one paper.
Since we just wrapped up an AI megaconference, it felt like a good day to plead for fewer papers. argmin.net/p/too-much-inform…
At posters #1202 and #1203
Come check out my posters on MuLAN (Multivariate Learned Adaptive Noise) and CommonCanvas at the Diffusion Model Workshop at #NeurIPS #NeurIPS2023
Aaron Gokaslan retweeted
To the ACs of @CVPR #CVPR2024: if you have not, log in to OpenReview and re-check your assignments. There is a hard limit of 1 student/paper, and this caused some rather random matches to happen. Reviewers with ~0 topic experience and ~0 citations appointed instead of rising stars 🤯
2-bit LLaMAs are here! 🦙✨ The new QuIP# ("quip-sharp") algorithm enables running the largest 70B models on consumer-level 24GB GPUs with only a minimal drop in accuracy. Amazing work led by Cornell students @tsengalb99 @CheeJerry + colleagues @qingyao_sun @chrismdesa [1/n]
🧵 (1/n) 👉 Introducing QuIP#, a new SOTA LLM quantization method that uses incoherence processing from QuIP & lattices to achieve 2-bit LLMs with near-fp16 performance! Now you can run LLaMA 2 70B on a 24GB GPU w/out offloading! 💻 cornell-relaxml.github.io/qu…
Aaron Gokaslan retweeted
🧵 (1/n) 👉 Introducing QuIP#, a new SOTA LLM quantization method that uses incoherence processing from QuIP & lattices to achieve 2-bit LLMs with near-fp16 performance! Now you can run LLaMA 2 70B on a 24GB GPU w/out offloading! 💻 cornell-relaxml.github.io/qu…
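Rough intuition for the incoherence-processing step, as a toy sketch: rotate the weight matrix by random orthogonal matrices so no single entry is an outlier, quantize, then rotate back. QuIP# actually uses randomized Hadamard transforms and E8 lattice codebooks; the scalar 2-bit quantizer and helper names below are illustrative only:

```python
import torch

def random_orthogonal(n, seed=0):
    """Random orthogonal matrix via QR (the real method uses fast
    structured transforms; this is just the idea)."""
    g = torch.Generator().manual_seed(seed)
    q, r = torch.linalg.qr(torch.randn(n, n, generator=g))
    return q * torch.sign(torch.diag(r))    # fix column signs

def quantize_2bit(w):
    """Toy 4-level (2-bit) scalar quantizer; QuIP# uses lattice
    codebooks instead of a scalar grid."""
    levels = torch.tensor([-1.5, -0.5, 0.5, 1.5])
    scale = w.abs().mean()
    idx = torch.argmin((w.unsqueeze(-1) - levels * scale).abs(), dim=-1)
    return levels[idx] * scale

# Incoherence processing: rotate, quantize, rotate back.
W = torch.randn(256, 256)
U, V = random_orthogonal(256, seed=1), random_orthogonal(256, seed=2)
W_hat = U @ quantize_2bit(U.T @ W @ V) @ V.T
print((W - W_hat).norm() / W.norm())        # relative reconstruction error
```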
Aaron Gokaslan retweeted
Me on arXiv this week.
The more you know…