Bot Eat Brain

Meta announced Video Avatars

PLUS: Meta's new peeping tom

Michael Parrish
September 27, 2024

Good morning, human brains, and welcome back to your daily munch of AI news.

Here’s what’s on the menu today:

Clone yourself, ditch meetings 🎥 👤
Meta announced its video avatars.
Siri just wet herself 🗣️ 🤖
OpenAI rolled out Advanced Voice.
Your future: Now with more ads 👓 🔮
Meta unveiled Orion, its prototype AR device.

Peep today's ‘What Would You Do?’ at the bottom. 👇

MAIN COURSE

Zuckerberg stole your face 🎥 👤

Yesterday, Meta announced its video avatars. These allow you to generate realistic digital versions of yourself.

Source: Meta

What's new?

They can now engage in realistic voice conversations across Messenger, Instagram, and WhatsApp.

Is it impressive?

In the demo, the avatar's speech patterns and lip sync were far from perfect, but they were pretty convincing.

Can I use them now?

Not yet. Meta plans to test these video avatars next year.

Aren’t there already AI avatars?

Oh, yeah. Last month, we reported on Synthesia’s Personal Avatars. You can create a digital clone by reading a 60-second script to a webcam and waiting a day for it to generate.

Last year, we covered a paralyzed woman who spoke for the first time in 18 years. She used a brain implant developed by UCSF and UC Berkeley that let her speak through a digital avatar.

SIDE SALAD

AI remembers. Delete your chats. 🗣️ 🤖

On Tuesday, OpenAI rolled out Advanced Voice to ChatGPT Plus and Team users. The update aims to make AI-powered voice conversations more natural and responsive.

Source: OpenAI

What's new with Advanced Voice?

It uses GPT-4o, a multimodal model that natively understands speech, including non-verbal cues.

What are the features?

5 new voices, for a total of 9 distinct personalities.
Improved accent recognition.
Support for over 50 languages.
Custom Instructions and Memory now work with voice.

Are there any usage limits?

Yes. There's a daily cap, and you get a warning when you have 15 minutes left.

Is it safe to use?

Audio clips are stored with chat transcripts but deleted within 30 days if you remove the chat. OpenAI doesn't train on voice data without an opt-in.

A LITTLE SOMETHING EXTRA

Meta's new peeping tom 👓 🔮

On Wednesday, Meta unveiled Orion, a prototype augmented reality (AR) device.

Source: Meta

What is Orion?

It's Meta's vision for AR glasses that look like everyday eyewear and blend digital information with the physical world.

Is it available now?

No, it's still a prototype and not ready for consumer release.

Has it been in development long?

Orion has been a decade in the making, starting with Meta's acquisition of Oculus in 2014.

Are there any other AR glasses?

Last week, we reported on Snap’s 5th-generation AI glasses. They offer voice-controlled AR filters and integration with OpenAI's AI models.

In April, we covered Meta’s updates to its smart glasses. These include video calling, hands-free queries, and more.

Back in February, we reported on Oppo’s Air Glass 3 smart glasses. They boasted smart technology, AI integration, and more.

YOUR DAILY MUNCH

Think Piece 🧠

Is AI the most cost-depreciating tech ever? OpenAI’s cost per million tokens fell from $36 to $0.25 in the last 18 months.
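
For the number crunchers, here's a rough back-of-the-napkin sketch of what that drop works out to. It uses only the two prices and the 18-month window quoted above. The constant monthly rate is our own simplifying assumption (real price cuts arrive in steps), and the variable names are just for illustration.

```python
import math

# Figures quoted above: OpenAI's cost per million tokens, 18 months apart.
start_price = 36.00  # USD per million tokens
end_price = 0.25     # USD per million tokens
months = 18

# How many times cheaper, and the total percentage drop.
fold_drop = start_price / end_price
percent_drop = (1 - end_price / start_price) * 100

# Assumption: the price fell by the same fraction every month.
monthly_factor = (end_price / start_price) ** (1 / months)
monthly_decline = (1 - monthly_factor) * 100

# Under that same assumption, how long it takes the price to halve.
halving_months = math.log(0.5) / math.log(monthly_factor)

print(f"{fold_drop:.0f}x cheaper, a {percent_drop:.1f}% drop overall")
print(f"About {monthly_decline:.0f}% cheaper each month")
print(f"Price halves roughly every {halving_months:.1f} months")
```

Going from $36 to $0.25 makes tokens 144 times cheaper. Swap in any two prices and a time window to compare against other tech.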

Startup News 💰

Meta released Llama 3.2. The family includes lightweight models for edge devices (1B and 3B) and larger multimodal models with vision capabilities (11B and 90B).

MIT researchers developed PDF2Audio. It’s an open-source tool that converts complex documents into audio formats such as podcasts, lectures, and summaries.

Figma launched First Draft. It's a design tool, previously known as Make Designs, that was initially pulled after it generated designs resembling Apple's iPhone weather app.

Research 👨‍🔬

RACER — a language-guided failure recovery system for robotic manipulation, combining vision-language models (VLM) and language-conditioned visuomotor policies.

PixWizard — a versatile image-to-image visual assistant designed to generate, manipulate, and translate images based on open-language instructions.

HelloBench — a new benchmark designed to evaluate the long text generation capabilities of Large Language Models (LLMs).

FURRY FRIENDS

Respond to this email with your pet’s name and pic for a chance to be featured here tomorrow!

MEMES FOR DESSERT

AI ART SHOW

“Broken Laughter” by @digitallywired

WHAT WOULD YOU DO?

Let’s go camping… 🏕️

Which is your favorite? 👇