Our Last Hope Before The AI Bubble Detonates: Taming LLMs

The best way to soften the AI bubble’s looming explosion would be to boost AI’s realized value. How? A new reliability layer that tames large language models.

Eric Siegel

To know that we’re in an AI bubble, you don’t need OpenAI chair Bret Taylor or Databricks CEO Ali Ghodsi to admit it, as they have. Nor do you need to analyze the telltale economics of inflated valuations, underwhelming revenues and circular financing.

Instead, just examine the outlandish claim that’s been driving the hype: We’re nearing artificial general intelligence, computers that would amount to “artificial humans,” capable of almost everything humans can do.

But there’s still hope: AI could realize some of its overzealous promise of great autonomy with the introduction of a new reliability layer that tames large language models. By boosting AI’s realized value, this would be the best way to soften the AI bubble’s burst. Here’s how it works.

The Culprit: AI’s Deadly Reliability Problem

On the one hand, we’ve genuinely entered a new age. The capabilities of LLMs are unprecedented. For example, they can often reliably handle dialogues (chat sessions) that pertain to, say, ten or fifteen written pages of background information.

But it’s easy as hell to think up an unrealistic goal for AI. LLMs are so seemingly humanlike, people envision computers replacing all customer service agents, summarizing or answering questions about a collection of thousands of documents, taking on the wholesale role of a data scientist or even making a company’s executive decisions.

Even modest ambitions test AI’s limitations. Crippling failures quickly overshadow an AI system’s potential value as its intended scope of capabilities widens. Things might go awry if, for example, you increase the system’s knowledge base from ten written pages to a few dozen documents, if you involve sensitive data that the system must divulge only selectively or if you empower the system to enact consequential transactions – such as purchases or changes to paid reservations.

What goes wrong? It’s more than just hallucination. AI systems address topics outside their purpose (such as a healthcare administration bot advising on personal finances), produce unethical or offensive content, purchase the wrong kind of product or just plain fail to address a user’s fundamental need. Accordingly, 95% of generative AI pilots fail to reach production.

A New Reliability Layer That Tames LLMs

Here’s our last hope: taming LLMs. If we can succeed, this represents AI’s brave new frontier. By curbing the problematic behavior of LLMs, we can progress from promising genAI pilots to reliable products.

A reliability layer installed on top of an LLM can tame it. This reliability layer must 1) continually expand and adapt, 2) strategically embed humans in the loop – indefinitely – and 3) form-fit the project with extensive customization.

1) Continually-Expanding Guardrails

Impressive AI pilots abound, but it’s become painfully clear that developing one only gets you five percent of the way toward a robust, production-ready system.

Now the real work begins: The team must engage in a prolific variation of “whack-a-mole,” identifying gotchas and improving the system accordingly. As the MIT report famed for reporting genAI’s 95% failure rate puts it, “Organizations on the right side of the GenAI Divide share a common approach: they build adaptive, embedded systems that learn from feedback.”

For example, the communications leader Twilio has launched a conversational AI assistant that continually evolves. This system, named Isa, performs both customer support and sales roles, assisting the user by responding to questions and by proactively guiding them throughout the customer lifecycle as the user increases their adoption of Twilio solutions.

Isa continually expands, semi-automatically. With human oversight, its array of guardrails lengthens, placing a hold when it’s about to make missteps such as:

  • Going too far off topic.
  • Providing a fictional URL or an incorrect product price.
  • Promising to set up an unauthorized meeting with a human or to “check with my legal team.”

As this list grows to multitudes, an AI system becomes robust. The continual expansion and refinement of such guardrails becomes fundamental to the system’s development. In this way, the reliability layer learns where the LLM falls short. This isn’t only how the system keeps adapting to the changing world in which it operates – it’s how the system evolves to be production-ready in the first place.
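To make the idea concrete, here is a minimal Python sketch of what such a growing guardrail registry might look like. The rule names and checks are hypothetical illustrations, not any vendor’s actual implementation, and proposed rules only take effect once a human approves them:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Guardrail:
    """One rule that can place a draft response on hold for human review."""
    name: str
    check: Callable[[str], bool]        # returns True when the rule is violated
    approved_by_human: bool = False     # proposed rules stay inactive until reviewed

@dataclass
class GuardrailRegistry:
    rules: List[Guardrail] = field(default_factory=list)

    def propose(self, rule: Guardrail) -> None:
        """Add a suggested rule; it takes effect only after human approval."""
        self.rules.append(rule)

    def violations(self, draft_response: str) -> List[str]:
        """Names of approved rules the draft violates (empty list = safe to send)."""
        return [r.name for r in self.rules
                if r.approved_by_human and r.check(draft_response)]

# Illustrative rules only -- real guardrails would be far more nuanced
registry = GuardrailRegistry()
registry.propose(Guardrail("no_unverified_urls",
                           lambda text: "http://" in text or "https://" in text,
                           approved_by_human=True))
registry.propose(Guardrail("no_legal_promises",
                           lambda text: "legal team" in text.lower(),
                           approved_by_human=True))

held = registry.violations("I'll check with my legal team and get back to you.")
print(held)  # ['no_legal_promises'] -> route this reply to a human instead of sending it
```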

2) Humans Strategically Embedded In The Loop, Indefinitely

The widely accepted promise of AI has become too audacious: complete autonomy. If that goal isn’t sensibly compromised, the AI industry will continue to realize returns far below its potential.

Thankfully, there’s a feasible alternative: a semi-automatic process that iteratively refines the system until it’s robust and production-worthy. In this paradigm, humans play two roles: They oversee how each new guardrail is defined and implemented, and they remain in the loop moving forward as gatekeepers, reviewing each case that’s placed on hold when a guardrail triggers.

Except for more modestly scoped AI projects, humans must remain in the loop – indefinitely, yet always decreasingly so. The more the reliability layer improves, the more autonomous the AI system will become. Its demand on humans will continually diminish as a result of their help in expanding the guardrails. But for AI systems that take on substantial tasks, the need for looped-in humans will never reach zero (short of achieving artificial general intelligence, which, I argue, we are not approaching).

3) A Bespoke Architecture Customized For Each AI Project

AI is generally oversold. A common, overzealous message positions the LLM as a stand-alone, general-purpose solution. With only lightweight effort, the story goes, it can succeed at almost any task. This “one and done” fallacy is known as solutionism.

But AI is not plug-and-play. Developing an AI system is a consulting gig, not a technology install. We can stand on the shoulders of giants and leverage the unprecedented potential of LLMs, but only with an extensive, highly problem-specific customization effort to design a workable reliability layer. Each such project intrinsically involves an “R&D” experimental aspect.

To build a reliability layer that tames an LLM, begin with another LLM (or a different session with the same LLM). LLMs help themselves – to a certain degree. Depending on the project, another LLM (or “agent,” if you have to call it that) may serve as a central component of the reliability layer. Each time the base LLM delivers content, the reliability LLM can review it, actively checking and enforcing the guardrails – thereby deciding which cases to hold for human review – and generating suggestions for new guardrails, also screened by humans.
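As a rough illustration of that pattern, here is a minimal Python sketch. It assumes a placeholder call_llm() function standing in for whatever model API a project actually uses; the guardrail wording, JSON schema and helper functions are hypothetical, not taken from any specific product:

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder for the project's actual model call (hosted API, local model, etc.)."""
    raise NotImplementedError

GUARDRAILS = [
    "Stays on topic for customer support and sales of the company's products",
    "States no prices, URLs or product facts that are absent from the knowledge base",
    "Makes no promises of meetings, escalations or legal/financial commitments",
]

def review_response(user_message: str, draft_response: str) -> dict:
    """Ask a second LLM session to audit the base LLM's draft against the guardrails."""
    prompt = (
        "You are a reliability reviewer. Check the draft reply against each rule.\n"
        "Rules:\n" + "\n".join(f"- {g}" for g in GUARDRAILS) + "\n\n"
        f"User message: {user_message}\n"
        f"Draft reply: {draft_response}\n\n"
        'Answer in JSON: {"violations": [...], "hold_for_human": true or false, '
        '"suggested_new_rule": "..." or null}'
    )
    return json.loads(call_llm(prompt))

def escalate_to_human(user_message: str, draft: str, violations: list) -> str:
    """Placeholder: queue the case for a human agent and send a safe holding reply."""
    return "A human teammate will follow up shortly."

def queue_for_human_screening(rule_suggestion: str) -> None:
    """Placeholder: store the proposed guardrail so a human can approve or reject it."""
    print("Proposed new guardrail:", rule_suggestion)

def handle_turn(user_message: str) -> str:
    draft = call_llm(user_message)                  # base LLM drafts a reply
    verdict = review_response(user_message, draft)  # reliability LLM audits it
    if verdict.get("suggested_new_rule"):
        queue_for_human_screening(verdict["suggested_new_rule"])
    if verdict["hold_for_human"]:
        return escalate_to_human(user_message, draft, verdict["violations"])
    return draft
```

In practice a team would also validate the reviewer’s JSON and retry on malformed output before trusting it; the point of the sketch is only the division of labor between the base LLM, the reviewing LLM and the human gatekeepers.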

An effective reliability layer doesn’t necessarily hinge on advanced tech. For many projects, this simple architecture – an LLM serving as a “guardrail manager” – can serve as the basis for reliability layer development. Alternatively, more advanced technical methods can respond to feedback by modifying the weights of the foundational LLM model itself – but that approach is often overkill. Weight-adjusting has likely already been employed in the development of the LLM in the first place, so that it’s aligned with requirements that pertain to many possible use cases. But now, the customized use of the LLM can often be guardrailed with a separate, simpler layer.

Think of it this way. AI can heal itself – to some extent. When it comes to overcoming its own limitations, an LLM is still not a stand-alone panacea.

Reliability layers also depend on the other main form of AI: predictive AI. After all, we’re talking about improving a system by learning from feedback and experience. That’s the very function of machine learning. When machine learning is applied to optimize large-scale enterprise operations, we call it predictive AI. Here, a deployed LLM is just one more large-scale operation that benefits from “playing the odds” – predictively flagging the riskiest cases where humans should best target their efforts, just the same as for targeting fraud investigations, factory machine maintenance and medical testing. I cover how this works in the article “How Predictive AI Will Solve GenAI’s Deadly Reliability Problem,” and will do so during my presentation, “Seven Ways to Hybridize Predictive AI and GenAI That Deliver Business Value,” at the free online event IBM Z Day (live on November 12, 2025, and available on-demand thereafter).
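A minimal sketch of that “playing the odds” step: assuming a team has logged past LLM responses with a few simple numeric features and a human label recording whether each one needed correction (the features and threshold below are hypothetical), a basic classifier can rank new cases by risk so reviewers see the worst first:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features logged for each past LLM response:
# [reply length, count of guardrail near-misses, novelty of the user's topic (0-1)]
X_history = np.array([
    [120, 0, 0.1],
    [450, 2, 0.8],
    [300, 1, 0.4],
    [ 80, 0, 0.0],
    [500, 3, 0.9],
    [200, 0, 0.2],
])
y_history = np.array([0, 1, 0, 0, 1, 0])   # 1 = a human had to correct this case

# Learn from past feedback which cases tend to need correction
model = LogisticRegression(max_iter=1000).fit(X_history, y_history)

# Score incoming cases and route only the riskiest to human reviewers
X_new = np.array([[350, 1, 0.7], [90, 0, 0.1]])
risk = model.predict_proba(X_new)[:, 1]
REVIEW_THRESHOLD = 0.5                      # tune to the team's review capacity
for features, p in zip(X_new, risk):
    route = "hold for human review" if p >= REVIEW_THRESHOLD else "send automatically"
    print(f"risk={p:.2f} -> {route}")
```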

An Entire New Paradigm, Discipline And Opportunity

The reliability layer is AI’s new frontier – but it’s not yet firmly established, well-known, or even properly named. What should we call it? AI “reliability,” “customization” or “guardrailing” are platitudes. “Taming LLMs” describes the end, not the means. “Agentic AI” inherently overpromises by suggesting supreme autonomy and by anthropomorphizing. But a paradigm can’t take off without a name.

No matter what you call it, developing the reliability layer is a critical, emerging discipline. It’s vital for establishing system robustness that can make an AI pilot ready for deployment. And it’s a fruitful way to test the limits of LLMs, exploring and expanding the feasibility of ever-increasing AI ambitions.

Source: https://www.forbes.com/sites/ericsiegel/2025/10/20/our-last-hope-before-the-ai-bubble-detonates-taming-llms/
