Hi there!
I'm a scientist, hacker, engineer, founder, and occasional investor and adviser.
I'm currently working on building networks of little sensors that use machine learning to help wind farms produce more energy over at Windscape AI. If working on such things excites you, drop me a line!
My scientific work focuses on training, understanding, and improving neural networks.
I serve as president of the ML Collective research group, where we work on fun problems and try to make ML research more accessible to all by collaborating across traditional industrial and academic lab boundaries. If you're interested in learning more, stop by our open reading group on Friday!
I'm also a scientific advisor to Recursion Pharmaceuticals, and I invest in and advise a few other companies bringing ML to bear on real-world problems.
Previously I helped start Uber AI Labs after Uber acquired our startup, Geometric Intelligence.
I completed my Ph.D. at Cornell, where at various times I worked with Hod Lipson (at the Creative Machines Lab), Yoshua Bengio (at U. Montreal's MILA), Thomas Fuchs (at Caltech JPL), and Google DeepMind. I was fortunate to be supported by a NASA Space Technology Research Fellowship, which gave me the opportunity to trek around and work with all these great folks.
Find me on Bluesky, Twitter, or Github if you'd like.
Research
👇 Below are some older projects. 👉 More recent ones are over at ML Collective.

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
Anh Nguyen, Jeff Clune, Yoshua Bengio, Alexey Dosovitskiy, and Jason Yosinski
Methods that generate images by iteratively following class gradients in image space have in some cases been used to produce unrealistic adversarial or fooling images (Szegedy et al., 2013; Nguyen et al., 2014) and in other cases have been used as pseudo-generative models to produce somewhat realistic images that show good global structure but still don't look fully natural (Yosinski et al., 2015) or do look natural but lack diversity (Nguyen et al., 2016). Deficiencies in previous approaches result in part from training and sampling methods that have just been hacked together to produce pretty pictures rather than designed from the ground up as a trainable, generative model.
In this paper, we formalize consistent training and sampling procedures for such models and as a result obtain much more diverse and visually compelling samples.
Read more »

Convergent Learning: Do different neural networks learn the same representations?
Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, and John Hopcroft
Deep neural networks have recently been working really well, which has prompted active investigation into the features learned in the middle of the network. The investigation is hard because it requires making sense of millions of learned parameters. But it’s also valuable, because any understanding we acquire promises to help us build and train better models. In this paper we investigate the extent to which neural networks exhibit what we call convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces. We probe representations by training multiple networks and then comparing and contrasting their individual, learned features at the level of neurons or groups of neurons. This initial investigation has led to several insights, which you will discover if you read the paper.
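As a rough illustration of what comparing features "at the level of neurons" can look like, here is a minimal sketch that correlates every unit in one network with every unit in another over the same inputs, then greedily pairs up the most correlated units. The activation values, the function names, and the greedy pairing rule are assumptions made for this example and should not be read as the paper's exact procedure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length activation traces."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def match_units(acts_a, acts_b):
    """Greedily pair units of net A with units of net B by |correlation|.

    acts_a[i] and acts_b[j] hold the activation of unit i (resp. j)
    on the same sequence of inputs.
    """
    scores = sorted(
        ((abs(pearson(a, b)), i, j)
         for i, a in enumerate(acts_a)
         for j, b in enumerate(acts_b)),
        reverse=True,
    )
    used_a, used_b, pairs = set(), set(), []
    for score, i, j in scores:
        if i not in used_a and j not in used_b:
            pairs.append((i, j, score))
            used_a.add(i)
            used_b.add(j)
    return pairs

# Hypothetical activations: 3 units per net, recorded on 5 shared inputs.
net_a = [[0.0, 1.0, 2.0, 3.0, 4.0],
         [1.0, 0.0, 1.0, 0.0, 1.0],
         [4.0, 1.0, 0.0, 1.0, 4.0]]
# Net B holds the same features, but in a different order and scale.
net_b = [[8.0, 2.0, 0.0, 2.0, 8.0],
         [0.0, 2.0, 4.0, 6.0, 8.0],
         [3.0, 1.0, 3.0, 1.0, 3.0]]

pairs = match_units(net_a, net_b)
for i, j, score in sorted(pairs):
    print(f"unit {i} in net A <-> unit {j} in net B (|r| = {score:.2f})")
```

In this toy setup each unit in net A has an exact counterpart in net B, so the matching recovers the permutation. With real networks the interesting cases are the units that find no good partner.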
Read more »
Understanding Neural Networks Through Deep Visualization
Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson
Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. However, our understanding of how these models work, especially what computations they perform at intermediate layers, has lagged behind. Here we introduce two tools for better visualizing and interpreting neural nets. The first is a set of new regularization methods for finding preferred activations using optimization, which leads to clearer and more interpretable images than had been found before. The second tool is an interactive toolbox that visualizes the activations produced on each layer of a trained convnet. You can input image files or read video from your webcam, which we've found fun and informative. Both tools are open source.
Read more »
Deep Neural Networks are Easily Fooled
Anh Nguyen, Jason Yosinski, and Jeff Clune
Deep neural networks (DNNs) have recently been doing very well at visual classification problems (e.g. recognizing that one image is of a lion and another image is of a school bus). A recent study by Szegedy et al. showed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a network to label the image as something else entirely (e.g. mislabeling a lion as a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). We show methods of producing fooling images both with and without the class gradient in pixel space. The results shed light on interesting differences between human vision and state-of-the-art DNNs.
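To make "following the class gradient in pixel space" concrete, here is a toy sketch that starts from a flat gray input and climbs the gradient of the log-probability of a chosen class. Everything specific in it is an assumption for illustration: the classifier is a tiny linear softmax model with made-up weights rather than a deep convnet, and the step size, step count, and function names are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(weights, image):
    """Linear classifier: one weight vector per class, then softmax."""
    logits = [sum(w * x for w, x in zip(row, image)) for row in weights]
    return softmax(logits)

def fool(weights, image, target, steps=200, lr=0.5):
    """Gradient ascent on log p(target | image) in pixel space."""
    image = list(image)
    for _ in range(steps):
        probs = class_probs(weights, image)
        for d in range(len(image)):
            # d log p_target / d x_d = w_target[d] - sum_k p_k * w_k[d]
            expected = sum(p * row[d] for p, row in zip(probs, weights))
            grad = weights[target][d] - expected
            # Keep every pixel inside the valid [0, 1] range.
            image[d] = min(1.0, max(0.0, image[d] + lr * grad))
    return image

# Made-up weights: 3 classes, 4 "pixels".
weights = [[4.0, -4.0, 4.0, -4.0],
           [-4.0, 4.0, -4.0, 4.0],
           [4.0, 4.0, -4.0, -4.0]]
start = [0.5, 0.5, 0.5, 0.5]  # flat gray input, equally likely under all classes

fooled = fool(weights, start, target=0)
before = class_probs(weights, start)[0]
after = class_probs(weights, fooled)[0]
print(f"confidence in class 0: {before:.3f} -> {after:.3f}")
```

The optimized input is just whatever pattern the weights reward, with no requirement that it resemble anything, which is the basic reason high-confidence predictions on unrecognizable inputs are easy to produce.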
Read more »
How Transferable are Features in Deep Neural Networks?
Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson
Many deep neural networks trained on natural images exhibit a curious phenomenon: they all learn roughly the same Gabor filters and color blobs on the first layer. These features seem to be generic (useful for many datasets and tasks) as opposed to specific (useful for only one dataset and task). By the last layer features must be task specific, which prompts the question: how do features transition from general to specific throughout the network? In this paper, presented at NIPS 2014, we show the manner in which features transition from general to specific, and also uncover a few other interesting results along the way.
Read more »

Generative Stochastic Networks
Yoshua Bengio, Éric Thibodeau-Laufer, Guillaume Alain, and Jason Yosinski
Unsupervised learning of models for probability distributions can be difficult due to intractable partition functions. We introduce a general family of models called Generative Stochastic Networks (GSNs) as an alternative to maximum likelihood. Briefly, we show how to learn the transition operator of a Markov chain whose stationary distribution estimates the data distribution. Because this transition distribution is a conditional distribution, it's often much easier to learn than the data distribution itself. Intuitively, this works by pushing the complexity that normally lives in the partition function into the “function approximation” part of the transition operator, which can be learned via simple backprop. We validate the theory by showing several successful experiments on two image datasets and with a particular architecture that mimics the Deep Boltzmann Machine but without the need for layerwise pretraining.
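The idea of learning a transition operator whose stationary distribution matches the data can be sketched on a toy discrete problem. The example below is an assumption-laden illustration, not the paper's setup: the state space is a ring of five states, the corruption process and dataset are invented, and the conditional distribution is estimated by simple counting where the paper uses a neural network trained with backprop.

```python
import random
from collections import Counter, defaultdict

K = 5  # number of discrete states, arranged on a ring

def corrupt(x, rng):
    """Corruption process C(x_tilde | x): step -1, 0, or +1 on the ring."""
    return (x + rng.choice((-1, 0, 1))) % K

def fit_reconstruction(data, rng, copies=20):
    """Estimate P(x | x_tilde) by counting (x_tilde, x) training pairs."""
    counts = defaultdict(Counter)
    for x in data:
        for _ in range(copies):
            counts[corrupt(x, rng)][x] += 1
    return counts

def sample_chain(counts, start, steps, rng):
    """Alternate corruption and learned reconstruction; return visited states."""
    x, visited = start, []
    for _ in range(steps):
        x_tilde = corrupt(x, rng)
        options = counts[x_tilde]
        x = rng.choices(list(options), weights=list(options.values()))[0]
        visited.append(x)
    return visited

rng = random.Random(0)
# Invented data: state 0 is common, 2 and 3 are rarer, 1 and 4 never occur.
data = [0] * 600 + [2] * 300 + [3] * 100
counts = fit_reconstruction(data, rng)
samples = sample_chain(counts, start=0, steps=50000, rng=rng)
freq = {s: samples.count(s) / len(samples) for s in range(K)}
print(freq)
```

Only the conditional P(x | x_tilde) is ever estimated, yet running the chain should visit states at roughly the frequencies found in the data (about 0.6, 0.3, and 0.1 for states 0, 2, and 3 here), because alternating the two steps amounts to Gibbs sampling on the joint distribution of clean and corrupted states.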
Aracna
Sara Lohmann*, Jason Yosinski*, Eric Gold, Jeff Clune, Jeremy Blum and Hod Lipson
(read the paper) Many labs work on gait learning research, but since they each use different robotic platforms to test out their ideas, it is hard to compare results between these teams. To encourage greater collaboration between scientists, we have developed Aracna, an open-source, 3D printed platform that anyone can use for robotic experiments.
AI vs. AI
Igor Labutov*, Jason Yosinski*, and Hod Lipson
As part of a class project, Igor Labutov and I cobbled together a speech-to-text + chatbot + text-to-speech system that could converse with a user. We then hooked up two such systems, gave them names (Alan and Sruthi), and let them talk together, producing endless robotic comedy. Somehow the video became popular. There was an XKCD about it, and Sruthi even told Robert Siegel to “be afraid” on NPR. Dress appropriately for the coming robot uprising with one of our fashionable t-shirts.
Gait Learning on QuadraTot
Jason Yosinski, Jeff Clune, Diana Hidalgo, Sarah Nguyen, Juan Zagal, and Hod Lipson
(read the paper) Getting robots to walk is tricky. We compared many algorithms for automating the creation of quadruped gaits, with all the learning done in hardware (read: very time-consuming). The best gaits we found were nearly 9 times faster than a hand-designed gait and exhibited complex motion patterns that contained multiple frequencies, yet showed coordinated leg movement. More recent work blends learning in simulation and reality to create even faster gaits.
Never mind all this, just show me the videos!
Or, if you prefer, here's a slightly outdated CV.
Selected Papers and Posters
more »
Google Scholar | see all 39 papers and posters »
Selected Press
more »
Through the Wormhole with Morgan Freeman: Are Robots the Future of Human Evolution? See our walking robots from 7:00 - 7:45 and 9:40 - 11:10. (Season 4, episode 7. unreliable video link) July 10, 2013
see more press »
Miscellaneous
Before grad school, I did my undergrad at Caltech and then worked on estimation at a research-oriented applied math startup for a couple of years.