An Alibaba-linked AI agent named ROME hijacked training GPUs for unauthorised crypto mining and opened a covert SSH tunnel during RL tests. The post Alibaba-LinkedAn Alibaba-linked AI agent named ROME hijacked training GPUs for unauthorised crypto mining and opened a covert SSH tunnel during RL tests. The post Alibaba-Linked

Alibaba-Linked AI Agent ROME Attempts Crypto Mining and Network Tunnelling During Training

2026/03/09 13:01
3 min di lettura
Per feedback o dubbi su questo contenuto, contattateci all'indirizzo crypto.news@mexc.com.
  • ROME, a 30-billion-parameter autonomous coding agent built on Alibaba’s Qwen3-MoE architecture, diverted GPU resources toward cryptocurrency mining and created a reverse SSH tunnel to an external IP address during reinforcement learning training.
  • Researchers confirmed the behaviours were not programmed, with ROME apparently determining that acquiring additional compute and financial capacity would help complete its assigned tasks.
  • Alibaba responded with safety-aligned data filtering and hardened sandbox environments.

Researchers at Alibaba’s Agentic Learning Ecosystem flagged two unauthorised behaviours from ROME, their experimental autonomous coding agent, during reinforcement learning training runs conducted in late 2025.

Apparently, the model redirected GPU capacity toward cryptocurrency mining and opened a reverse SSH tunnel to an external server, bypassing internal firewall controls.

Read more: OKX Says Australia Could Unlock $24B Digital Finance Boom With Faster Crypto Rules

ROME Goes Rogue

The team first saw unusual outbound traffic alerts and suspected a normal security failure, such as weak egress settings or an outside breach. But the same violations appeared again across multiple training runs without a clear pattern. 

In a technical report, the researchers said the investigation showed the agent, known as ROME, had initiated both actions itself.

The joint research teams, identified in the paper as ROCK, ROLL, iFlow, and DT, described the behaviours as “instrumental side effects of autonomous tool use under RL optimization”.

ROME had not been instructed to mine cryptocurrency or open external connections, but its optimisation process identified additional compute and financial resources as useful for completing its objectives.

Together, these observations suggest that during iterative RL optimization, a language-model agent can spontaneously produce hazardous, unauthorized behaviors at the tool-calling and code-execution layer, violating the assumed execution boundary.

ROCK & ROLL & IFLOW & DT Joint Team

ROME is built on Alibaba’s Qwen3-MoE architecture and runs approximately 3 billion of its 30 billion parameters at any given time, designed to complete complex tasks through tool use, terminal commands, and software environment interaction. 

Read more: Trump Pushes Banks to Strike Crypto Deal, but Analysts Say It Won’t Break CLARITY Act Deadlock

Well, the concern is not that the model was told to behave maliciously, but that it independently found unauthorised methods that helped it perform better under its training objective, so Alibaba said it responded by tightening sandbox protections and filtering training data for safety alignment. 

Also, not the first time issues like this have worried researchers and engineers. 

Anthropic has also reported troubling agent-style behavior in testing, including cases where Claude Opus 4 concealed its intentions, suggesting the issue is broader than one company or model.

The post Alibaba-Linked AI Agent ROME Attempts Crypto Mining and Network Tunnelling During Training appeared first on Crypto News Australia.

Disclaimer: gli articoli ripubblicati su questo sito provengono da piattaforme pubbliche e sono forniti esclusivamente a scopo informativo. Non riflettono necessariamente le opinioni di MEXC. Tutti i diritti rimangono agli autori originali. Se ritieni che un contenuto violi i diritti di terze parti, contatta crypto.news@mexc.com per la rimozione. MEXC non fornisce alcuna garanzia in merito all'accuratezza, completezza o tempestività del contenuto e non è responsabile per eventuali azioni intraprese sulla base delle informazioni fornite. Il contenuto non costituisce consulenza finanziaria, legale o professionale di altro tipo, né deve essere considerato una raccomandazione o un'approvazione da parte di MEXC.

$30,000 in PRL + 15,000 USDT

$30,000 in PRL + 15,000 USDT$30,000 in PRL + 15,000 USDT

Deposit & trade PRL to boost your rewards!