PAN is a new AI model that uses language to predict the future. It uses a Large Language Model (LLM) as its "autoregressive world model backbone," and a clever mechanism to solve the problem of rapid quality decay over long simulations.

Beyond Pretty Videos: 5 Surprising Ideas Behind PAN, The AI That Simulates Reality

2025/12/02 01:12
6 min read
For feedback or questions about this content, contact us at crypto.news@mexc.com.

Introduction: The Hidden Flaw in Today's AI Video Generators

Recent breakthroughs in AI have flooded our feeds with stunningly realistic videos generated from simple text prompts. But beneath the visual magic lies a critical flaw. Today’s top models are like artists who can paint a beautiful, static image of a river; they can show you the water, the rocks, and the trees with breathtaking detail. What they can’t do is tell you where the water will flow next. They operate in an “open-loop” fashion, lacking the “causal control, interactivity, or long-horizon consistency required for purposeful reasoning.”

This is the difference between making a movie and running a simulation. A new class of AI, called "world models," aims to become the physicist who can model the entire river system. A major leap forward in this quest is PAN, a model whose goal is not just to produce plausible video but to create an interactive “sandbox for simulative reasoning.” It's a platform for an AI agent to explore complex “what if” scenarios, turning video generation from a parlor trick into a tool for genuine foresight. Here are five surprising ideas that power its approach.


1. The Secret Ingredient is Language: Using an LLM to Understand the Visual World

When building an AI to see the world, the last thing you'd expect to use as its brain is a model trained on text. Yet, that's exactly where PAN starts, and the reason is surprisingly logical.

Raw video data, on its own, suffers from “information sparsity.” A video shows you what happens, but it doesn't contain the underlying principles of why. To bridge this gap, PAN uses a Large Language Model (LLM) as its "autoregressive world model backbone." By grounding its visual perception in the massive real-world knowledge contained in text corpora, PAN learns about physics, cause-and-effect, and the properties of objects. In short, it uses the endless descriptions of how our world works, written by humans, to make smarter predictions about what it sees.
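To make the "LLM as backbone" idea concrete, here is a minimal toy sketch of an autoregressive world-model rollout. Everything here (the `WorldState` type, the `llm_backbone` stand-in, the trivial update rule) is hypothetical illustration, not PAN's actual architecture; the point is the closed-loop shape: each predicted state is fed back in as the input for the next prediction.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class WorldState:
    latent: List[float]  # compressed representation of the current scene

def llm_backbone(latent, action_tokens):
    # Stand-in for an LLM that maps (scene latent, action description)
    # to the next scene latent. Here: a trivial deterministic update
    # so the rollout shape is visible.
    shift = float(len(action_tokens))
    return [x + shift for x in latent]

def rollout(state, actions):
    # Autoregressively predict future latents, feeding each prediction
    # back in as the next input -- the closed-loop property the article
    # contrasts with "open-loop" video generators.
    trajectory = [state]
    for act in actions:
        nxt = WorldState(llm_backbone(trajectory[-1].latent, act.split()))
        trajectory.append(nxt)
    return trajectory

traj = rollout(WorldState([0.0, 0.0]), ["push the red block left", "lift it"])
# traj holds the initial state plus one predicted state per action.
```

The design choice to highlight is the loop itself: because each step conditions on the model's own previous output, errors can compound, which is exactly the problem the sliding-window mechanism in section 4 addresses.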


2. The Counter-Intuitive Leap: Embracing Uncertainty to Model Reality

Predicting the future is hard for anyone, and it's especially brutal for an AI. The real world is a chaotic storm of random details: the precise flutter of a leaf, the exact pattern of a shadow, the contents of a room just around the corner. Most AI models see this inherent unpredictability as an obstacle to be minimised or avoided.

PAN takes a radically different and counterintuitive path. Its Generative Latent Prediction (GLP) architecture doesn't fight uncertainty; it embraces it as a fundamental feature of reality. The model is designed to "absorb and utilize" these unpredictable elements during training, treating them as intrinsic to the physical world.

This is a breakthrough because it allows the model to separate what is predictable (a ball will fall when dropped) from what is not (the exact way it bounces and the dust it kicks up). By modelling uncertainty instead of being paralysed by it, PAN's simulations become more robust, realistic, and useful.
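The split between "predictable trend" and "unpredictable detail" can be sketched in a few lines. This is a hypothetical illustration, not PAN's GLP code: the model outputs a distribution over next states (a deterministic mean plus sampled noise) rather than a single point estimate that averages the detail away.

```python
import random

def predict_next(latent, rng):
    # Deterministic, predictable component (e.g. "the ball falls").
    mean = [x * 0.9 for x in latent]
    # Stochastic component standing in for unpredictable detail
    # (e.g. the exact bounce); it is sampled, not averaged away.
    return [m + rng.gauss(0.0, 0.1) for m in mean]

rng = random.Random(0)
samples = [predict_next([1.0, 2.0], rng) for _ in range(3)]
# Each sample shares the predictable trend but differs in fine detail.
```

A point-estimate model forced to predict one exact future would smear these samples into a blurry average; sampling keeps each simulated future sharp and plausible.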


3. The Grounding Principle: Learning by Re-Drawing, Not Just Matching

Some predictive AI models face a crippling issue known as the "collapse" problem. This is like a student who, when asked to predict the next word in any sentence, always answers "the." They might be right often enough to minimise certain kinds of errors, but they haven't learned anything meaningful about language. Similarly, these AI models can learn a trivial shortcut by mapping all their predictions to a single, constant value, rendering their internal "thoughts" meaningless.

PAN avoids this trap with a solution called "generative supervision." Instead of just matching abstract ideas in a hidden digital space, PAN’s training demands that it fully reconstruct the next observable video frame from its internal prediction. This simple but powerful requirement forces every internal thought to "correspond to a realizable sensory change." It can't cheat, because its success is measured by its ability to actually "re-draw" a coherent future. This re-drawing task is made feasible by the LLM backbone, which provides the common-sense knowledge of what a "realizable" future should even look like.
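The contrast between the two training signals can be shown numerically. This is an illustrative sketch (not PAN's loss functions): a latent-matching loss is trivially zero if both encoder and predictor collapse to a constant, but a reconstruction loss still penalises the model for failing to re-draw the actual next frame.

```python
def latent_match_loss(pred_latent, target_latent):
    # Compares abstract internal representations only.
    return sum((p - t) ** 2 for p, t in zip(pred_latent, target_latent))

def reconstruction_loss(decoded_frame, true_frame):
    # Compares against the observable next frame itself.
    return sum((d - t) ** 2 for d, t in zip(decoded_frame, true_frame))

# Collapsed model: everything maps to the same constant latent.
constant = [0.0, 0.0]
collapse_latent_loss = latent_match_loss(constant, constant)  # zero: "perfect"

# But a decoder fed that constant cannot match a real, varying frame.
true_frame = [0.2, 0.8, 0.5]
decoded_from_constant = [0.0, 0.0, 0.0]
collapse_recon_loss = reconstruction_loss(decoded_from_constant, true_frame)
# Nonzero: the shortcut no longer pays off.
```

The collapsed model "wins" under latent matching and loses under reconstruction, which is why generative supervision keeps the internal representations meaningful.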


4. The Mechanism for Consistency: A "Fuzzy" Sliding Window Through Time

Anyone who has tried to chain together AI-generated video clips has seen the jarring results: abrupt visual jumps and a rapid decay in quality as tiny errors snowball over time. To solve this, PAN uses a clever mechanism that acts like a sophisticated film editor working on a long movie.

Imagine editing two adjacent clips to ensure a seamless transition. Instead of looking at the last frame of the first clip with perfect, pixel-level clarity, you might look at it in a slightly blurred, "fuzzy" way. This forces you to focus on the major shapes, colors, and movements (the high-level story) rather than the exact position of a single leaf blowing in the wind. This is the core idea behind PAN's "Causal Shift-Window Denoising Process Model" (Causal Swin-DPM). It works on a sliding temporal window of video chunks, conditioning its next prediction on a "fuzzy, partially noised" version of the recent past. This forces the model to prioritize "high-level, persistent semantic consistency," ensuring simulations are smooth and stable over long horizons. In this way, the Causal Swin-DPM is the practical application of the philosophy of embracing uncertainty, ensuring the model isn't derailed by details it can't possibly know.
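The sliding-window idea can be sketched as follows. The function names, the Gaussian "fuzzing," and the trivial denoiser are all illustrative assumptions, not details from the paper; the structure to notice is that each new chunk is conditioned on a deliberately noised copy of the previous one, so only coarse structure carries forward.

```python
import random

def partially_noise(chunk, noise_level, rng):
    # Condition on a "fuzzy" version of the previous chunk so only
    # high-level structure, not exact pixel values, carries forward.
    return [x + rng.gauss(0.0, noise_level) for x in chunk]

def denoise_next_chunk(fuzzy_context):
    # Stand-in for the denoising model: predict the next chunk from
    # the blurred context (here, a simple smoothed continuation).
    mean = sum(fuzzy_context) / len(fuzzy_context)
    return [mean for _ in fuzzy_context]

def generate(chunks_to_make, first_chunk, rng, noise_level=0.3):
    # Slide the window forward one chunk at a time.
    video = [first_chunk]
    for _ in range(chunks_to_make):
        context = partially_noise(video[-1], noise_level, rng)
        video.append(denoise_next_chunk(context))
    return video

rng = random.Random(42)
video = generate(3, [0.0, 1.0, 2.0, 3.0], rng)
```

Because the model never sees the exact previous output, it cannot latch onto (and amplify) pixel-level errors, which is the snowballing failure mode described above.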


5. The Ultimate Goal: Creating a Sandbox for AI "Thought Experiments"

The ultimate purpose of a world model like PAN isn't just to make videos; it's to enable "simulative reasoning and planning." It functions as an internal simulator that allows an AI agent to conduct "thought experiments": running through different plans in its "mind" before committing to a single action in the real world.

The research provides powerful evidence that this isn't just a theoretical goal. When integrated with a Vision-Language Model (VLM) agent, PAN led to "consistent and substantial improvements" in complex planning tasks. Specifically, it increased the agent's task success rate by 26.7% in Open-Ended Planning and 23.4% in Structured Planning compared to the agent working alone. This proves PAN has moved beyond simply generating pretty pictures. Its simulations are causally reliable enough to guide an agent's decisions, turning it from a passive picture-maker into a functional tool for reasoning.
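The planning loop itself is simple to sketch. Everything here is a toy stand-in (the additive `simulate`, the distance-based `score`, the hand-written candidate plans); the shape is what matters: evaluate every candidate plan inside the internal model, then commit only to the winner in the real world.

```python
def simulate(world_state, plan):
    # Roll the plan forward in the internal world model and return
    # the predicted outcome (toy: outcomes just add up).
    return world_state + sum(plan)

def score(outcome, goal):
    # Closer to the goal is better.
    return -abs(goal - outcome)

def plan_with_world_model(world_state, goal, candidate_plans):
    # "Thought experiments": try every plan in simulation,
    # act only on the best-scoring one.
    return max(candidate_plans,
               key=lambda p: score(simulate(world_state, p), goal))

best = plan_with_world_model(world_state=0, goal=10,
                             candidate_plans=[[1, 2], [4, 5], [9, 9]])
# best is the plan whose simulated outcome lands nearest the goal.
```

This loop is only as good as the simulator inside it, which is why the causal reliability of PAN's predictions, not their visual polish, is what drives the reported planning gains.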


Conclusion: From Picture-Makers to World-Builders

The ideas behind PAN represent a fundamental shift in AI development. We are moving away from models that are passive video generators and toward active world simulators that understand cause and effect. By weaving together linguistic knowledge, embracing uncertainty, grounding itself in reconstruction, and ensuring long-term consistency, PAN takes a crucial step toward building AIs that can reason, plan, and act with genuine foresight.

As these world models mature, moving from showing us what is plausible to helping us reason about what is possible, what is the first complex "what if" scenario you would want to see simulated?



