PAN is a new AI model that uses language to predict the future. It uses a Large Language Model (LLM) as its "autoregressive world model backbone," and a clever sliding-window mechanism to solve the problem of rapid quality decay over long simulations.

Beyond Pretty Videos: 5 Surprising Ideas Behind PAN, The AI That Simulates Reality

2025/12/02 01:12

Introduction: The Hidden Flaw in Today's AI Video Generators

Recent breakthroughs in AI have flooded our feeds with stunningly realistic videos generated from simple text prompts. But beneath the visual magic lies a critical flaw. Today’s top models are like artists who can paint a beautiful, static image of a river; they can show you the water, the rocks, and the trees with breathtaking detail. What they can’t do is tell you where the water will flow next. They operate in an “open-loop” fashion, lacking the “causal control, interactivity, or long-horizon consistency required for purposeful reasoning.”

This is the difference between making a movie and running a simulation. A new class of AI, called "world models," aims to become the physicist who can model the entire river system. A major leap forward in this quest is PAN, a model whose goal is not just to produce plausible video but to create an interactive “sandbox for simulative reasoning.” It's a platform for an AI agent to explore complex “what if” scenarios, turning video generation from a parlor trick into a tool for genuine foresight. Here are five surprising ideas that power its approach.


1. The Secret Ingredient is Language: Using an LLM to Understand the Visual World

When building an AI to see the world, the last thing you'd expect to use as its brain is a model trained on text. Yet, that's exactly where PAN starts, and the reason is surprisingly logical.

Raw video data, on its own, suffers from “information sparsity.” A video shows you what happens, but it doesn't contain the underlying principles of why. To bridge this gap, PAN uses a Large Language Model (LLM) as its "autoregressive world model backbone." By grounding its visual perception in the massive real-world knowledge contained in text corpora, PAN learns about physics, cause-and-effect, and the properties of objects. In short, it uses the endless descriptions of how our world works, written by humans, to make smarter predictions about what it sees.
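To make the idea concrete, here is a deliberately tiny sketch. Everything in it is illustrative: the function names are ours, and PAN's actual backbone is a full LLM reasoning over visual latents, not a lookup table. The point is only to show how text-derived cause-and-effect knowledge can drive predictions about what happens next.

```python
# Hypothetical sketch: a "world model" whose transition rules come from
# text-derived common sense rather than from pixels. Illustrative only --
# PAN's real backbone is an LLM, not a dictionary of rules.

# Cause-and-effect knowledge as it might be distilled from text corpora.
TEXT_PRIORS = {
    ("ball", "release"): "falls",
    ("glass", "drop"):   "shatters",
    ("water", "tilt"):   "flows downhill",
}

def predict_next(entity: str, action: str) -> str:
    """Predict what happens next, grounded in language-derived priors."""
    return TEXT_PRIORS.get((entity, action), "unknown outcome")

print(predict_next("ball", "release"))   # -> falls
print(predict_next("rock", "sing"))      # -> unknown outcome
```

The video stream alone never states "released objects fall"; text corpora state it constantly, which is what the LLM backbone contributes.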


2. The Counter-Intuitive Leap: Embracing Uncertainty to Model Reality

Predicting the future is hard for anyone, and it's especially brutal for an AI. The real world is a chaotic storm of random details: the precise flutter of a leaf, the exact pattern of a shadow, the contents of a room just around the corner. Most AI models see this inherent unpredictability as an obstacle to be minimised or avoided.

PAN takes a radically different and counterintuitive path. Its Generative Latent Prediction (GLP) architecture doesn't fight uncertainty; it embraces it as a fundamental feature of reality. The model is designed to "absorb and utilize" these unpredictable elements during training, treating them as intrinsic to the physical world rather than as noise to be engineered away.

This is a breakthrough because it allows the model to separate what is predictable (a ball will fall when dropped) from what is not (the exact way it bounces and the dust it kicks up). By modelling uncertainty instead of being paralysed by it, PAN's simulations become more robust, realistic, and useful.
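One way to picture the split between predictable and unpredictable, as a toy sketch rather than anything resembling PAN's actual architecture, is a stochastic predictor that samples many futures: all samples share the predictable outcome, while the chaotic details vary from sample to sample.

```python
import random

random.seed(1)

def drop_ball():
    """Toy stochastic predictor (made-up numbers, not PAN's model).
    Predictable part: a dropped ball always ends up on the floor.
    Unpredictable part: where it comes to rest after chaotic bounces."""
    rest_x = random.gauss(0.0, 0.5)   # chaos modelled as sampled noise
    return {"height": 0.0, "rest_x": rest_x}

futures = [drop_ball() for _ in range(100)]
landed = all(f["height"] == 0.0 for f in futures)
distinct_spots = len({round(f["rest_x"], 3) for f in futures})
print(landed)          # True: every sampled future shares the predictable part
print(distinct_spots)  # many distinct resting spots: uncertainty is preserved
```

A model that committed to one exact resting spot would be wrong almost every time; a model that refuses to predict at all is useless. Modelling the distribution keeps both halves.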


3. The Grounding Principle: Learning by Re-Drawing, Not Just Matching

Some predictive AI models face a crippling issue known as the "collapse" problem. This is like a student who, when asked to predict the next word in any sentence, always answers "the." They might be right often enough to minimise certain kinds of errors, but they haven't learned anything meaningful about language. Similarly, these AI models can learn a trivial shortcut by mapping all their predictions to a single, constant value, rendering their internal "thoughts" meaningless.

PAN avoids this trap with a solution called "generative supervision." Instead of just matching abstract ideas in a hidden digital space, PAN’s training demands that it fully reconstruct the next observable video frame from its internal prediction. This simple but powerful requirement forces every internal thought to "correspond to a realizable sensory change." It can't cheat, because its success is measured by its ability to actually "re-draw" a coherent future. This re-drawing task is made feasible by the LLM backbone, which provides the common-sense knowledge of what a "realizable" future should even look like.
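The collapse problem and its fix can be shown in a few lines of toy code. This is a hedged caricature, not PAN's training objective: the "video" is a sequence of numbers and the "encoder" is a constant function, but the asymmetry between the two losses is the real point. A latent-matching loss can be driven to zero by a predictor that always answers the same thing; a reconstruction loss on the next observable frame cannot.

```python
# Toy contrast between latent-matching and generative supervision.
# The 1-D "frames" and both loss functions are illustrative only.

frames = [0.0, 1.0, 2.0, 3.0]           # a tiny "video": one number per frame

def latent_loss(predict):
    """Collapse-prone: compare predictions to a learned target encoder.
    If predictor and target both map everything to the same constant,
    the loss is zero even though nothing about dynamics was learned."""
    target = lambda f: 0.0              # a degenerate, collapsed "encoder"
    return sum((predict(f) - target(f)) ** 2 for f in frames)

def generative_loss(predict):
    """Generative supervision: the prediction must reconstruct the next
    observable frame, so a constant answer is penalised."""
    return sum((predict(frames[i]) - frames[i + 1]) ** 2
               for i in range(len(frames) - 1))

constant = lambda f: 0.0                # the cheating, collapsed predictor
honest   = lambda f: f + 1.0            # actually models the dynamics

print(latent_loss(constant))            # 0.0  -- the cheat goes unpunished
print(generative_loss(constant))        # 14.0 -- the cheat is exposed
print(generative_loss(honest))          # 0.0
```

Because success is measured against something observable, the trivial shortcut stops being a minimum of the loss.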


4. The Mechanism for Consistency: A "Fuzzy" Sliding Window Through Time

Anyone who has tried to chain together AI-generated video clips has seen the jarring results: abrupt visual jumps and a rapid decay in quality as tiny errors snowball over time. To solve this, PAN uses a clever mechanism that acts like a sophisticated film editor working on a long movie.

Imagine editing two adjacent clips to ensure a seamless transition. Instead of looking at the last frame of the first clip with perfect, pixel-level clarity, you might look at it in a slightly blurred, "fuzzy" way. This forces you to focus on the major shapes, colors, and movements (the high-level story) rather than the exact position of a single leaf blowing in the wind. This is the core idea behind PAN's "Causal Shift-Window Denoising Process Model" (Causal Swin-DPM). It works on a sliding temporal window of video chunks, conditioning its next prediction on a "fuzzy, partially noised" version of the recent past. This forces the model to prioritize "high-level, persistent semantic consistency," ensuring simulations are smooth and stable over long horizons. In this way, the Causal Swin-DPM is the practical application of the philosophy of embracing uncertainty, ensuring the model isn't derailed by details it can't possibly know.
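The rollout loop behind this idea can be sketched in miniature. Assume everything here is a stand-in: the chunks are lists of numbers, `add_noise` plays the role of the partial forward-diffusion step, and `denoise` is a one-line placeholder for a trained diffusion model running many denoising steps. Only the control flow, conditioning each new chunk on a noised copy of the previous one, reflects the mechanism described above.

```python
import random

# Illustrative sliding-window rollout in the spirit of Causal Swin-DPM.
# Every function body is a toy stand-in for a trained component.

random.seed(0)

def add_noise(chunk, noise_level=0.3):
    """Partially noise the context so only coarse structure survives."""
    return [x + random.gauss(0, noise_level) for x in chunk]

def denoise(fuzzy_context, action):
    """Stand-in denoiser: predicts the next chunk from coarse context.
    A real model would run several diffusion steps here."""
    anchor = sum(fuzzy_context) / len(fuzzy_context)  # keep the gist, not the pixels
    return [anchor + action * (i + 1) for i in range(len(fuzzy_context))]

def rollout(first_chunk, actions):
    """Slide the window: each new chunk is conditioned on a *noised* copy
    of the previous one, which damps error accumulation on fine detail."""
    chunks = [first_chunk]
    for a in actions:
        fuzzy = add_noise(chunks[-1])
        chunks.append(denoise(fuzzy, a))
    return chunks

video = rollout([0.0, 0.0, 0.0, 0.0], actions=[1.0, 1.0, -1.0])
print(len(video))   # 4 chunks: the seed plus one per action
```

Because the conditioning signal is deliberately degraded, small pixel-level mistakes in one chunk cannot be copied faithfully into the next, which is exactly how snowballing errors are suppressed.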


5. The Ultimate Goal: Creating a Sandbox for AI "Thought Experiments"

The ultimate purpose of a world model like PAN isn't just to make videos; it's to enable "simulative reasoning and planning." It functions as an internal simulator that allows an AI agent to conduct "thought experiments," running through different plans in its "mind" before committing to a single action in the real world.

The research provides powerful evidence that this isn't just a theoretical goal. When integrated with a Vision-Language Model (VLM) agent, PAN led to "consistent and substantial improvements" in complex planning tasks. Specifically, it increased the agent's task success rate by 26.7% in Open-Ended Planning and 23.4% in Structured Planning compared to the agent working alone. This proves PAN has moved beyond simply generating pretty pictures. Its simulations are causally reliable enough to guide an agent's decisions, turning it from a passive picture-maker into a functional tool for reasoning.
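The planning loop itself is simple to state, even though the components are not. Here is a hedged toy version: the "world" is a 1-D position task, the world model is a one-line simulator, and the scoring function is plain distance-to-goal, whereas in PAN's experiments the rollouts are video and the scorer is a VLM agent. All names are ours.

```python
# Toy simulative planning: "imagine" each candidate plan inside a world
# model, then commit to the best one. Illustrative only -- PAN's rollouts
# are video simulations judged by a VLM, not arithmetic on integers.

GOAL = 5

def world_model(state, action):
    """Toy deterministic simulator: actions are +1 / -1 moves."""
    return state + action

def imagine(state, plan):
    """Roll a plan forward entirely inside the model (no real actions taken)."""
    for action in plan:
        state = world_model(state, action)
    return state

def choose_plan(state, candidate_plans):
    """Pick the plan whose imagined outcome lands closest to the goal."""
    return min(candidate_plans, key=lambda p: abs(imagine(state, p) - GOAL))

plans = [[+1] * 3, [+1] * 5, [-1] * 2]
best = choose_plan(0, plans)
print(best)   # [1, 1, 1, 1, 1] -- the plan whose imagined rollout hits the goal
```

The reported gains come from exactly this structure: a simulator reliable enough that comparing imagined outcomes actually ranks the real outcomes correctly.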


Conclusion: From Picture-Makers to World-Builders

The ideas behind PAN represent a fundamental shift in AI development. We are moving away from models that are passive video generators and toward active world simulators that understand cause and effect. By weaving together linguistic knowledge, embracing uncertainty, grounding itself in reconstruction, and ensuring long-term consistency, PAN takes a crucial step toward building AIs that can reason, plan, and act with genuine foresight.

As these world models mature, moving from showing us what is plausible to helping us reason about what is possible, what is the first complex "what if" scenario you would want to see simulated?

