Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
CMD: Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
TF-T2V: A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
W.A.L.T: Photorealistic Video Generation with Diffusion Models
StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Where Memory Ends and Generative AI Begins: New photo manipulation tools from Google and Adobe are blurring the lines between real memories and those dreamed up by AI
Parsing-Conditioned Anime Translation: A New Dataset and Method
OpenAI CEO Sam Altman on GPT-4: ‘people are begging to be disappointed and they will be’
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Latent Video Diffusion Models for High-Fidelity Video Generation with Arbitrary Lengths
AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies
Imagen Video: High Definition Video Generation with Diffusion Models
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Make-A-Video: Text-to-Video Generation without Text-Video Data
InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images
NUWA-∞: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
OmniMAE: Single Model Masked Pretraining on Images and Videos
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
TATS: Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
Reinforcement Learning with Action-Free Pre-Training from Videos
Transframer: Arbitrary Frame Prediction with Generative Models
General-purpose, long-context autoregressive modeling with Perceiver AR
Microdosing: Knowledge Distillation for GAN based Compression
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN-2
U.S. vs. China Rivalry Boosts Tech—and Tensions: Militarized AI threatens a new arms race
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Learning a perceptual manifold with deep features for animation video resequencing
Autoregressive Latent Video Prediction with High-Fidelity Image Generator
GANs N’ Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)
NWT: Towards natural audio-to-video generation with representation learning
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
China’s GPT-3? BAAI Introduces Superscale Intelligence Model ‘Wu Dao 1.0’: The Beijing Academy of Artificial Intelligence (BAAI) releases Wu Dao 1.0, China’s first large-scale pretraining model.
Greedy Hierarchical Variational Autoencoders (GHVAEs) for Large-Scale Video Prediction
SIREN: Implicit Neural Representations with Periodic Activation Functions
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks
Learning to Predict Without Looking Ahead: World Models Without Forward Prediction
Learning to Predict Without Looking Ahead: World Models Without Forward Prediction [blog]
NoGAN: Decrappification, DeOldification, and Super Resolution
THUDM/CogVideo: Text-To-Video Generation. The Repo for ICLR2023 Paper "CogVideo: Large-Scale Pretraining for Text-To-Video Generation via Transformers"
PaintsUndo: A Base Model of Drawing Behaviors in Digital Paintings
https://blog.metaphysic.ai/the-road-to-realistic-full-body-deepfakes/
https://lilianweng.github.io/posts/2024-04-12-diffusion-video/
https://research.google/blog/google-research-2022-beyond-language-vision-and-generative-models/
https://research.google/blog/videopoet-a-large-language-model-for-zero-shot-video-generation/
https://www.bloomberg.com/news/articles/2023-04-27/fed-s-powell-tricked-by-russian-pranksters-posing-as-zelenskiy
https://www.chinatalk.media/p/reflections-from-neurips-the-worlds#%C2%A7chinas-ai-generated-youtube-propaganda
https://www.reddit.com/r/OpenAI/comments/1bgcvut/the_world_will_never_be_the_same_after_sora/
https://www.reddit.com/r/StableDiffusion/comments/119vvzg/bad_apple_but_its_rendered_and_colorized_with/
https://www.reddit.com/r/StableDiffusion/comments/12pvhhm/animov01_highresolution_anime_finetune_of/
https://www.reddit.com/r/StableDiffusion/comments/161qkeb/ai_burger_commercial_source_matancohengrumi/
https://www.reddit.com/r/StableDiffusion/comments/17b4dfc/my_first_try_with_video/
https://www.reddit.com/r/StableDiffusion/comments/1avou9y/the_current_state_of_img2vid_will_smith_eating/
https://www.reddit.com/r/StableDiffusion/comments/1bhs3rl/openai_keeps_dropping_more_insane_sora_videos/
https://www.reddit.com/r/StableDiffusion/comments/1f5x795/movement_is_almost_human_with_klingai/
https://www.reddit.com/r/StableDiffusion/comments/ys434h/animating_generated_face_test/
https://www.reddit.com/r/midjourney/comments/12xw3d2/definitely_wasted_3_hours_of_my_life_making_this/
https://www.reddit.com/r/midjourney/comments/1g7hk22/cursed_shore/
https://www.reddit.com/r/midjourney/comments/1gi1ptl/morphing_within_a_morphing/
https://www.samdickie.me/writing/experiment-1-creating-a-landing-page-using-ai-tools-no-code
https://www.theguardian.com/world/2023/nov/06/chinese-influencers-using-ai-digital-clones-of-themselves-to-pump-out-content
https://www.tomshardware.com/news/nvidia-hints-at-dlss-10-delivering-full-neural-rendering-potentially-replacing-rasterization-and-ray-tracing
https://www.wired.com/story/yahoo-boys-real-time-deepfake-scams/
https://yosefk.com/blog/the-state-of-ai-for-hand-drawn-animation-inbetweening.html
TF-T2V: A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
https://arxiv.org/abs/2312.15770#alibaba
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
https://arxiv.org/abs/2311.18829#microsoft
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
https://arxiv.org/abs/2311.04145#alibaba
https://arxiv.org/abs/2302.01329#google
OpenAI CEO Sam Altman on GPT-4: ‘people are begging to be disappointed and they will be’
https://www.theverge.com/23560328/openai-gpt-4-rumor-release-date-sam-altman-interview
https://arxiv.org/abs/2212.05199#google
NUWA-∞: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
https://arxiv.org/abs/2207.09814#microsoft
OmniMAE: Single Model Masked Pretraining on Images and Videos
https://arxiv.org/abs/2206.08356#facebook
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
TATS: Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
https://arxiv.org/abs/2204.03638#facebook
General-purpose, long-context autoregressive modeling with Perceiver AR
https://arxiv.org/abs/2202.07765#deepmind
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN-2
U.S. vs. China Rivalry Boosts Tech—and Tensions: Militarized AI threatens a new arms race
https://spectrum.ieee.org/china-us-militarized-ai
https://arxiv.org/abs/2106.04615#deepmind
China’s GPT-3? BAAI Introduces Superscale Intelligence Model ‘Wu Dao 1.0’: The Beijing Academy of Artificial Intelligence (BAAI) releases Wu Dao 1.0, China’s first large-scale pretraining model.
https://syncedreview.com/2021/03/23/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0/#baai
https://arxiv.org/abs/2010.14701#openai
Wikipedia Bibliography: