Bot Eat Brain

Combating AI Hallucination

PLUS: Wall Street AI

Anthony Castrio, Michael Parrish & Dan Bogachek
June 02, 2023

Good morning, human brains. Welcome back to your daily munch of AI news.

Here’s what’s on the menu today:

Latest world AI regulations 🌎
The scoop on the US-EU Trade & Technology Council
Combating AI hallucination 🧠
OpenAI cracks down on LLM mistakes, starting with math 🤓
AI on Wall Street 💸
JPMorgan files a new trademark for their upcoming AI tool

APPETIZER

The latest on worldwide AI regulation 🌎

A split emerges in the Biden administration: should we strictly control new AI tools like Europe is doing, or should we keep rules lax for a competitive edge against the rest of the world?

The US-EU Trade and Technology Council

As part of the US-EU Trade and Technology Council in Sweden, both sides are discussing newly proposed EU laws, which would require AI developers to document copyrighted material used for training and track its usage.

With the U.S. Congress unlikely to pass similar rules, it may be the EU that ends up calling the regulatory shots, requiring more paperwork and oversight for future AI development through legislation similar to GDPR.

The U.S. Secretary of State and European Commission Executive Vice President Margrethe Vestager at the council.

Regulatory attention is shifting to AI risks like the spread of misinformation and privacy violations.

Our take: AI regulation is essential, but striking the right balance between nurturing innovation and maintaining control is tricky. If the EU moves forward with stiff AI regulations, it risks pushing AI development out of Europe.

One more thing: it’s a long read, but for even more context, here’s the White House’s Official National AI Research and Development Strategic Plan.

WITH SCRIBE

The first AI tool for documenting processes, so you don’t have to.

Scribe AI (powered by GPT-4) captures your screen as you click and type, then auto-generates easy step-by-step guides complete with screenshots and text instructions.

A screenshot of the introduction page for Scribe AI

Using your guides, Scribe AI can create full process documentation (including headings, subheadings, and detailed text) with your guides embedded.

No more staring at a blank document thinking, "ok, I have to teach someone how to do this, where do I start? What are all the steps?" Scribe does it all for you.

BUZZWORD OF THE DAY

RLHF

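RLHF is an acronym for Reinforcement Learning from Human Feedback. It’s an AI training method that keeps humans in the loop: people rate the model’s outputs, and those ratings become the reward signal that steers it toward accurate and relevant results.

OpenAI’s InstructGPT models (built on GPT-3), ChatGPT, and GPT-4 were fine-tuned using RLHF, which resulted in a big step up in performance.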
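For the curious, here is a rough sketch of one common way human feedback gets turned into a training signal: a person picks the better of two responses, and a reward model is nudged to score the chosen one higher. This is a minimal illustration, assuming the pairwise preference loss used in published RLHF work; the function name and the scores are made up for the example and are not OpenAI's actual code.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).

    Hypothetical helper for illustration. The loss is small when the reward
    model already scores the human-preferred response higher, and large when
    it ranks the two responses the wrong way around.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(x)) is the same as log(1 + exp(-x))
    return math.log1p(math.exp(-margin))


# Illustrative scores: the reward model agrees with the human rater...
agrees = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
# ...versus the reward model preferring the response the human rejected.
disagrees = preference_loss(reward_chosen=0.0, reward_rejected=2.0)

print(agrees < disagrees)  # True
```

Minimizing this loss over many human comparisons is what teaches the reward model to imitate human judgment; the chatbot is then tuned to earn high scores from it.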

MAIN COURSE

Solving the tripping AI problem 🍄

Ok, it’s not THAT kind of hallucination…

AI chatbots sometimes produce factually incorrect responses or faulty logic. These are known as hallucinations.

In their recent paper, OpenAI introduced process supervision. An improvement on outcome supervision, process supervision is a new method designed to combat hallucination.

Process Supervision Outperforming Outcome Supervision

In this study, OpenAI trained their model on a math data set. While ChatGPT has famously struggled with basic arithmetic in the past, OpenAI’s new research model is stepping through some pretty gnarly trigonometry problems.

While it’s unclear how well this will translate to more general applications, the difference already looks like night and day.

Middle & high school math teachers… watch out.

Quick Definitions:

Outcome Supervision: The status quo RLHF training method. The model gets rewarded for a correct end result, regardless of how it got there. In this method, skipped or shaky steps go unpenalized as long as the final answer is right.

Process Supervision: A new approach. During training, the model is rewarded for each right step, encouraging sequentially correct logic and a human-like thought process.
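To make the two definitions concrete, here is a toy sketch that scores the same kind of worked solution both ways. It is only an illustration under loud assumptions: the `Step` class, the function names, and the automatic arithmetic checker are all hypothetical stand-ins (real process supervision relies on step-level feedback, not a calculator), and none of this is OpenAI's code.

```python
from dataclasses import dataclass

# Toy operators the checker understands (an assumption for this example).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


@dataclass
class Step:
    left: int
    op: str
    right: int
    claimed: int  # the value the model says this step produces


def step_is_correct(step: Step) -> bool:
    """Stand-in for a step-level label: does the arithmetic check out?"""
    return OPS[step.op](step.left, step.right) == step.claimed


def outcome_reward(steps: list[Step], correct_answer: int) -> float:
    """Outcome supervision: one reward, based only on the final result."""
    return 1.0 if steps[-1].claimed == correct_answer else 0.0


def process_rewards(steps: list[Step]) -> list[float]:
    """Process supervision: one reward per step, based on that step alone."""
    return [1.0 if step_is_correct(s) else 0.0 for s in steps]


# Problem: (2 + 3) * 4 = 20
lucky = [Step(2, "+", 3, 6), Step(6, "*", 4, 20)]  # two errors that cancel out
sound = [Step(2, "+", 3, 5), Step(5, "*", 4, 20)]  # every step is right

print(outcome_reward(lucky, 20), process_rewards(lucky))  # 1.0 [0.0, 0.0]
print(outcome_reward(sound, 20), process_rewards(sound))  # 1.0 [1.0, 1.0]
```

Both solutions land on 20, so outcome supervision can't tell them apart. Only the per-step rewards flag the "lucky" solution as bad reasoning.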

In a time when misinformation spreads like wildfire, the importance of accurate, trustworthy AI cannot be overstated. ChatGPT’s hallucinations are making front-page headlines all over…

The courthouse where ChatGPT’s false claims came to light.

In a federal case in New York, lawyers who relied on ChatGPT cited non-existent legal cases and arguments in a personal injury lawsuit. These lawyers could face sanctions, maybe even disbarment.

Order in the court!

A Texas judge now requires lawyers to declare (and fact-check) all AI-generated content in their cases. He’s the first to put strict requirements in place, but he probably won’t be the last.

Our take: OpenAI’s new process supervision method looks like it’s unlocking some big new reasoning abilities and aligns much better with how humans think. Let’s see how it does in subsequent updates — we’re skeptically optimistic for now.

A LITTLE SOMETHING EXTRA

JPMorgan’s new AI 💸

JPMorgan Chase is developing a generative AI tool. They’ve dubbed it: IndexGPT.

J.P. Morgan’s official trademark application for IndexGPT

The finance giant intends to apply it in advertising, business consulting, and various finance-focused software services.

CEO Jamie Dimon is bullish on AI and big data.

He revealed that the company is already leveraging AI in 300 different applications, ranging from risk management to customer experience.

As of now, there is no official release date for IndexGPT.

Our take: It’ll be interesting to see how JPMorgan navigates privacy and AI safety when its sole mission is to maximize profit…

YOUR DAILY MUNCH

Think Pieces

Pew Research: Only 14% of American adults have used ChatGPT even though a majority have heard of it.

Sam Altman: future plans for OpenAI.

The Center for AI Safety: Official statement on the risks of AI. If you wish, you can add your signature.

Startup News

OpenAI releases its security portal, covering data privacy, legal and compliance policies, etc…

OpenAI is awarding several $100,000 grants to better understand AI governance.

Research

Diffusion Language Models Based on Likelihood: Clever tweaks that make language models 10x more efficient.

LLMs: An A-Z guide covering everything from training and building to getting the most out of your large language model.

Tools

Learn.xyz: Learn anything via AI-generated and curated courses. Use code “boteatbrain” for early access.

Roll: A new iOS app that brings film-quality production to iPhone videos.

AI ART-SHOW

G R U N T by @BenDover4Ai

Until next time 🤖😋🧠