VASA-1 generates realistic talking videos from images
PLUS: Meta. Meta everywhere.
Good morning, human brains, and welcome back to your daily munch of AI news.
Here’s what’s on the menu today:
Turn your shallow-fake into a deepfake 🦾 🤖
VASA-1 generates editable, realistic talking head videos.
53.7% say it beats reality? 🌹 💨
PhysDreamer adds realistic motion to objects in AI-generated videos.
Meta has its own ChatGPT now 😳 💻
Meta launched a chat interface powered by Llama 3.
MAIN COURSE
Boom. Headshot. 🥸 👀
Last week, Microsoft researchers introduced VASA-1. It generates realistic talking head videos from an image and an audio clip.
What does it do?
VASA-1 generates videos with precise lip-syncing, lifelike facial expressions, natural head movements, and more.
Is it good?
It outperforms existing state-of-the-art methods in lip synchronization, head motion realism, and video quality. It also lets you control gaze direction, head distance, emotional expressions, and more.
What kind of vids can it make?
It produces 512 × 512 pixel videos at up to 40 frames per second. It’s highly versatile and tackles artistic photographs, singing vocals, and speech in many languages.
Why do I care?
VASA-1 could enhance learning experiences, provide support for individuals with disabilities, improve patient care through advanced diagnostics and personalized treatment plans, and more.
SPONSORED BY VANTA
Free SOC 2 Compliance Checklist from Vanta
Are you building a business? Achieving SOC 2 compliance can help you win bigger deals, enter new markets, and deepen trust with your customers — but it can also cost you real time and money.
Vanta automates up to 90% of the work for SOC 2 (along with other in-demand frameworks like ISO 27001, HIPAA, and GDPR), getting you audit-ready in weeks instead of months and saving you up to 85% of associated costs.
SIDE SALAD
AI-generated motion > Real motion 🌹 💨
What kind of dreams do hotels have? 🏨 Suite dreams… 🤭
On Friday, researchers from MIT, Stanford, Columbia, and Cornell introduced PhysDreamer. It adds realistic, interactive dynamics to objects in AI-generated videos.
It does what, now?
PhysDreamer is a technique that gives 3D objects realistic motion. It uses principles from physics and insights from video models to make 3D objects react more naturally to wind, touch, movement, and more.
How does it work?
It uses 3D Gaussians to represent objects, a neural field to model their physical properties, and a physics simulation technique called the Material Point Method (MPM) to simulate the motion.
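For the curious, here's a toy Python sketch of the core idea: an object's physical properties decide how it moves when you poke it. This is not PhysDreamer's code and it is not MPM. It swaps in a single damped spring, and every name and number in it (`simulate_poke`, the stiffness values, the push) is made up for illustration.

```python
def simulate_poke(stiffness, mass=1.0, damping=0.5, push=1.0,
                  dt=0.001, steps=5000):
    """Poke a point mass on a damped spring and return its peak displacement.

    Hypothetical stand-in for a physics simulator: `stiffness` plays the
    role of a material property that a model would have to estimate.
    """
    x, v = 0.0, 0.0          # position and velocity, starting at rest
    peak = 0.0
    for step in range(steps):
        # Spring pulls back toward rest, damping bleeds off energy.
        force = -stiffness * x - damping * v
        if step < 100:
            force += push    # brief poke during the first 0.1 seconds
        # Semi-implicit Euler: update velocity first, then position.
        v += (force / mass) * dt
        x += v * dt
        peak = max(peak, abs(x))
    return peak


soft_peak = simulate_poke(stiffness=5.0)     # floppy, like a flower stem
stiff_peak = simulate_poke(stiffness=500.0)  # rigid, like a branch

print(f"soft object peak displacement:  {soft_peak:.4f}")
print(f"stiff object peak displacement: {stiff_peak:.4f}")
```

Same poke, different stiffness, very different wobble. A real system has to estimate properties like this for every part of an object, which is roughly the job the neural field does above.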
Dope or nope?
In a user study, 53.7% of participants judged the generated movements more realistic than those from other methods, even when compared against real video footage.
A LITTLE SOMETHING EXTRA
Meta. Meta everywhere… 😳 💻
The internet had a… Meta-morphosis 🥴
On Thursday, Meta launched meta.ai. It’s a website with a chat interface that’s powered by Meta’s AI model, Llama 3.
It looks… Familiar… 🧐
What can it do?
Meta claims Meta AI, built on Llama 3, is the most powerful free AI assistant available on the web. It can help you plan trips, generate images from text prompts, convert images into GIFs, and more.
When can I use it?
Meta.ai is available now in the US, Australia, Canada, several African countries, and more.
YOUR DAILY MUNCH
Otio
Enhance your research efficiency with Otio. Automatically summarize content from papers, videos, tweets, and more. Engage directly with insights through our interactive AI. Write, edit, and paraphrase with precision using our AI-powered text editor.
Tools
Direqt — a chatbot trained on your data that integrates onto your site.
CoachVox — clone yourself and charge people to interact with you.
MailSplash — an email marketing assistant that designs, writes, and more.
PhoneScreen — a recruitment tool that ranks your candidates with AI.
Think Pieces
How much AI is too much AI? A look at 11 recently introduced models and how they’re minuscule in the grand scheme of AI.
Open-source AI isn’t really open-source. Many of the largest open-source AI models don’t have their training data openly available.
The age of internet users is decreasing. How U.K. regulators are looking into combating malicious content involving children.
Startup News
OpenAI unveiled enhancements to its Assistants API. The improvements include efficient data handling, improved file search, vector storage, and more.
Meta launched chatbots powered by Llama 3. You’ll now see “Meta AI” on Facebook and Instagram.
Hugging Face released Open-Medical-LLM. It’s a benchmark that tests how generative AI models perform on various health-related tasks.
Research
HQ-Edit — generates pairs of before and after images that show detailed changes based on written instructions to enhance image editing software.
SIMA — a generalist AI agent trained on gameplay from various video games to follow natural-language instructions, navigate 3D virtual environments, and more.
TRIFORCE — a method that speeds up LLMs’ generation of long text sequences and minimizes memory usage and computational delays.
MEMES FOR DESSERT
TWEET OF THE DAY
AI can create art utilizing radial gradients, random placements of color particles, and more.
Tag us on Twitter @BotEatBrain for a chance to be featured here tomorrow.
AI ART-SHOW
Until next time 🤖😋🧠