AI 'gold rush' for chatbot training data could run out of human-written text

MATT O'BRIEN
June 06, 2024

Artificial intelligence systems like ChatGPT could soon run out of what keeps making them smarter -- the tens of trillions of words people have written and shared online.

A new study released Thursday by research group Epoch AI projects that tech companies will exhaust the supply of publicly available training data for AI language models by roughly the turn of the decade -- sometime between 2026 and 2032.

Comparing it to a "literal gold rush" that depletes finite natural resources, Tamay Besiroglu, an author of the study, said the AI field might face challenges in maintaining its current pace of progress once it drains the reserves of human-generated writing.

In the short term, tech companies like ChatGPT-maker OpenAI and Google are racing to secure and sometimes pay for high-quality data sources to train their AI large language models -- for instance, by signing deals to tap into the steady flow of sentences coming out of Reddit forums and news media outlets.

In the longer term, there won't be enough new blogs, news articles and social media commentary to sustain the current trajectory of AI development, putting pressure on companies to tap into sensitive data now considered private -- such as emails or text messages -- or to rely on less-reliable "synthetic data" spit out by the chatbots themselves.

"There is a serious bottleneck here," Besiroglu said. "If you start hitting those constraints about how much data you have, then you can't really scale up your models efficiently anymore. And scaling up models has been probably the most important way of expanding their capabilities and improving the quality of their output."

The researchers first made their projections two years ago -- shortly before ChatGPT's debut -- in a working paper that forecast a more imminent 2026 cutoff of high-quality text data. Much has changed since then, including new techniques that enabled AI researchers to make better use of the data they already have and sometimes "overtrain" on the same sources multiple times.

But there are limits, and after further research, Epoch now foresees running out of public text data sometime in the next two to eight years.

The team's latest study is peer-reviewed and due to be presented at this summer's International Conference on Machine Learning in Vienna, Austria. Epoch is a nonprofit institute hosted by San Francisco-based Rethink Priorities and funded by proponents of effective altruism -- a philanthropic movement that has poured money into mitigating AI's worst-case risks.

Besiroglu said AI researchers realized more than a decade ago that aggressively expanding two key ingredients -- computing power and vast stores of internet data -- could significantly improve the performance of AI systems.

The amount of text data fed into AI language models has been growing about 2.5 times per year, while computing has grown about 4 times per year, according to the Epoch study. Facebook parent company Meta Platforms recently claimed the largest version of its upcoming Llama 3 model -- which has not yet been released -- has been trained on up to 15 trillion tokens, each of which can represent a piece of a word.

But how much it's worth worrying about the data bottleneck is debatable.

"I think it's important to keep in mind that we don't necessarily need to train larger and larger models," said Nicolas Papernot, an assistant professor of computer engineering at the University of Toronto and researcher at the nonprofit Vector Institute for Artificial Intelligence.

Papernot, who was not involved in the Epoch study, said building more skilled AI systems can also come from training models that are more specialized for specific tasks. But he has concerns about training generative AI systems on the same outputs they're producing, leading to degraded performance known as "model collapse."

Training on AI-generated data is "like what happens when you photocopy a piece of paper and then you photocopy the photocopy. You lose some of the information," Papernot said. Not only that, but Papernot's research has also found it can further encode the mistakes, bias and unfairness that are already baked into the information ecosystem.

If real human-crafted sentences remain a critical AI data source, those who are stewards of the most sought-after troves -- websites like Reddit and Wikipedia, as well as news and book publishers -- have been forced to think hard about how they're being used.

"Maybe you don't lop off the tops of every mountain," jokes Selena Deckelmann, chief product and technology officer at the Wikimedia Foundation, which runs Wikipedia. "It's an interesting problem right now that we're having natural resource conversations about human-created data. I shouldn't laugh about it, but I do find it kind of amazing."

While some have sought to close off their data from AI training -- often after it's already been taken without compensation -- Wikipedia has placed few restrictions on how AI companies use its volunteer-written entries. Still, Deckelmann said she hopes there continue to be incentives for people to keep contributing, especially as a flood of cheap and automatically generated "garbage content" starts polluting the internet.

AI companies should be "concerned about how human-generated content continues to exist and continues to be accessible," she said.

From the perspective of AI developers, Epoch's study says paying millions of humans to generate the text that AI models will need "is unlikely to be an economical way" to drive better technical performance.

As OpenAI begins work on training the next generation of its GPT large language models, CEO Sam Altman told the audience at a United Nations event last month that the company has already experimented with "generating lots of synthetic data" for training.

"I think what you need is high-quality data. There is low-quality synthetic data. There's low-quality human data," Altman said. But he also expressed reservations about relying too heavily on synthetic data over other technical methods to improve AI models.

"There'd be something very strange if the best way to train a model was to just generate, like, a quadrillion tokens of synthetic data and feed that back in," Altman said. "Somehow that seems inefficient."

------------

The Associated Press and OpenAI have a licensing and technology agreement that allows OpenAI access to part of AP's text archives.
