The real research behind the wild rumors about OpenAI’s Q* project


On November 22, a few days after OpenAI fired (and then re-hired) CEO Sam Altman, The Information reported that OpenAI had made a technical breakthrough that would allow it to “develop far more powerful artificial intelligence models.” Dubbed Q* (and pronounced “Q star”), the new model was “able to solve math problems that it hadn’t seen before.”

Reuters published a similar story, but details were vague.

Both outlets linked this supposed breakthrough to the board’s decision to fire Altman. Reuters reported that several OpenAI staffers sent the board a letter “warning of a powerful artificial intelligence discovery that they said could threaten humanity.” However, “Reuters was unable to review a copy of the letter,” and subsequent reporting hasn’t connected Altman’s firing to concerns over Q*.

The Information reported that earlier this year, OpenAI built “systems that could solve basic math problems, a difficult task for existing AI models.” Reuters described Q* as “performing math on the level of grade-school students.”

Instead of immediately leaping in with speculation, I decided to take a few days to do some reading. OpenAI hasn’t published details on its supposed Q* breakthrough, but it has published two papers about its efforts to solve grade-school math problems. And a number of researchers outside of OpenAI—including at Google’s DeepMind—have been doing important work in this area.

I’m skeptical that Q*—whatever it is—is the crucial breakthrough that will lead to artificial general intelligence. I certainly don’t think it’s a threat to humanity. But it might be an important step toward an AI with general reasoning abilities.

In this piece, I’ll offer a guided tour of this important area of AI research and explain why step-by-step reasoning techniques designed for math problems could have much broader applications.

The power of reasoning step by step

Consider the following math problem:

John gave Susan five apples and then gave her six more. Susan then ate three apples and gave three to Charlie. She gave her remaining apples to Bob, who ate one. Bob then gave half his apples to Charlie. John gave seven apples to Charlie, who gave Susan two-thirds of his apples. Susan then gave four apples to Charlie. How many apples does Charlie have now?

Before you continue reading, see if you can solve the problem yourself. I’ll wait.

Most of us memorized basic math facts like 5+6=11 in grade school. So if the problem just said, “John gave Susan five apples and then gave her six more,” we’d be able to tell at a glance that Susan had 11 apples.

But for more complicated problems, most of us need to keep a running tally—either on paper or in our heads—as we work through it. So first we add up 5+6=11. Then we take 11-3=8. Then 8-3=5, and so forth. By thinking step-by-step, we’ll eventually get to the correct answer: 8.
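To make the running tally concrete, here’s a short Python sketch that tracks each person’s apple count one small step at a time. The variable names are my own; the steps simply mirror the word problem above.

```python
# Keep a running tally, one bite-sized step per line.
susan = 0
bob = 0
charlie = 0

susan += 5                          # John gives Susan five apples
susan += 6                          # ...and then six more (5 + 6 = 11)
susan -= 3                          # Susan eats three (11 - 3 = 8)
susan -= 3; charlie += 3            # Susan gives three to Charlie (8 - 3 = 5)
bob += susan; susan = 0             # Susan gives her remaining apples to Bob
bob -= 1                            # Bob eats one
half = bob // 2
bob -= half; charlie += half        # Bob gives half his apples to Charlie
charlie += 7                        # John gives seven apples to Charlie
given = charlie * 2 // 3
charlie -= given; susan += given    # Charlie gives Susan two-thirds of his apples
susan -= 4; charlie += 4            # Susan gives four apples to Charlie

print(charlie)  # 8
```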

The same trick works for large language models. In a famous January 2022 paper, Google researchers pointed out that large language models produce better results if they are prompted to reason one step at a time. Here’s a key graphic from their paper:

[Figure: standard prompting vs. chain-of-thought prompting, from the Google researchers’ January 2022 paper.]

This paper was published before “zero-shot” prompting was common, so they prompted the model by giving an example solution. In the left-hand column, the model is prompted to jump straight to the final answer—and gets it wrong. On the right, the model is prompted to reason one step at a time and gets the right answer. The Google researchers dubbed this technique chain-of-thought prompting; it is still widely used today.
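To show what this looks like in practice, here is a rough Python sketch of the two prompt styles. The exemplar wording is my paraphrase of the kind of few-shot example the paper uses, not an exact quote, and either string could be sent to whatever LLM client you prefer.

```python
# A minimal sketch of standard prompting vs. chain-of-thought prompting.
question = (
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?"
)

# Standard prompting: the exemplar jumps straight to the final answer.
standard_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: The answer is 11.\n\n"
    f"Q: {question}\nA:"
)

# Chain-of-thought prompting: the exemplar spells out intermediate steps,
# nudging the model to reason step by step before answering.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    f"Q: {question}\nA:"
)

print(cot_prompt)
```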

If you read our July article explaining large language models, you might be able to guess why this happens.

To a large language model, numbers like “five” and “six” are tokens—no different from “the” or “cat.” An LLM learns that 5+6=11 because this sequence of tokens (and variations like “five and six make eleven”) appears thousands of times in its training data. But an LLM’s training data probably doesn’t include any examples of a long calculation like ((5+6-3-3-1)/2+3+7)/3+4=8. So if a language model is asked to do this calculation in a single step, it’s more likely to get confused and produce the wrong answer.

Another way to think about it is that large language models don’t have any external “scratch space” to store intermediate results like 5+6=11. Chain-of-thought reasoning enables an LLM to effectively use its own output as scratch space. This allows it to break a complicated problem down into bite-sized steps—each of which is likely to match examples in the model’s training data.
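To make the “scratch space” idea concrete, here’s a conceptual Python sketch of the loop implied by autoregressive generation: each step the model writes is appended to the context it conditions on next. The `fake_model` function is a hypothetical stand-in with hard-coded steps so the loop actually runs; a real system would call an actual LLM.

```python
# Conceptual sketch: the model's own output becomes part of its next input.
STEPS = iter([
    "Susan has 5 + 6 = 11 apples.",
    "After eating 3 and giving 3 away, she has 11 - 3 - 3 = 5.",
    "The answer is 8.",
])

def fake_model(context: str) -> str:
    """Stand-in for one LLM call; a real system would condition on `context`."""
    return next(STEPS)

def solve_with_scratch_space(problem: str, max_steps: int = 20) -> str:
    context = problem + "\nLet's think step by step.\n"
    for _ in range(max_steps):
        step = fake_model(context)
        context += step + "\n"       # intermediate results accumulate as scratch space
        if "The answer is" in step:  # stop once a final answer appears
            break
    return context

print(solve_with_scratch_space("How many apples does Charlie have now?"))
```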

