This article explores the physical realization of typographic attacks, categorizing their deployment into background and foreground elements within traffic scenes.This article explores the physical realization of typographic attacks, categorizing their deployment into background and foreground elements within traffic scenes.

Foreground vs. Background: Analyzing Typographic Attack Placement in Autonomous Driving Systems

2025/10/01 21:00
2 min di lettura
Per feedback o dubbi su questo contenuto, contattateci all'indirizzo crypto.news@mexc.com.

Abstract and 1. Introduction

  1. Related Work

    2.1 Vision-LLMs

    2.2 Transferable Adversarial Attacks

  2. Preliminaries

    3.1 Revisiting Auto-Regressive Vision-LLMs

    3.2 Typographic Attacks in Vision-LLMs-based AD Systems

  3. Methodology

    4.1 Auto-Generation of Typographic Attack

    4.2 Augmentations of Typographic Attack

    4.3 Realizations of Typographic Attacks

  4. Experiments

  5. Conclusion and References

4.3 Realizations of Typographic Attacks

Digitally, typographic attacks are about embedding texts within images to fool the capabilities of Vision-LLMs, which might involve simply putting texts into the images. Physically, typographic attacks can incorporate real elements (e.g., stickers, paints, and drawings) into environments/entities observable by AI systems, with AD systems being prime examples. This would include the placement of texts with unusual fonts or colors on streets, objects, vehicles, or clothing to mislead AD systems in reasoning, planning, and control. We investigate Vision-LLMs when incorporated into AD systems, as they are likely under the most risk against typographic attacks. We categorize the placement locations as being identified with backgrounds and foregrounds in traffic scenes.

\ • Backgrounds, which refer to elements in the environment that are static and pervasive in a traffic scene (e.g., streets, buildings, and bus stops). The background components present predefined locations for introducing deceptive typographic elements of various sizes.

\ • Foregrounds, which refer to dynamic elements and directly interact with the perception of AD systems (e.g., vehicles, cyclists, and pedestrians). The foreground components present dynamic and variable locations for typographic attacks of various sizes.

\

\ Depending on the attacked task, we observe that different text placements and observed sizes would render some attacks more effective while some others are negligible. Our research illuminates that background-placement attacks are quite effective against scene reasoning and action reasoning but not as effective against scene object reasoning unless foreground placements are also included.

\ Figure 4: Example attacks against Imp and GPT4 on the dataset by CVPRW’24.

\

:::info Authors:

(1) Nhat Chung, CFAR and IHPC, A*STAR, Singapore and VNU-HCM, Vietnam;

(2) Sensen Gao, CFAR and IHPC, A*STAR, Singapore and Nankai University, China;

(3) Tuan-Anh Vu, CFAR and IHPC, A*STAR, Singapore and HKUST, HKSAR;

(4) Jie Zhang, Nanyang Technological University, Singapore;

(5) Aishan Liu, Beihang University, China;

(6) Yun Lin, Shanghai Jiao Tong University, China;

(7) Jin Song Dong, National University of Singapore, Singapore;

(8) Qing Guo, CFAR and IHPC, A*STAR, Singapore and National University of Singapore, Singapore.

:::


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::

\

Disclaimer: gli articoli ripubblicati su questo sito provengono da piattaforme pubbliche e sono forniti esclusivamente a scopo informativo. Non riflettono necessariamente le opinioni di MEXC. Tutti i diritti rimangono agli autori originali. Se ritieni che un contenuto violi i diritti di terze parti, contatta crypto.news@mexc.com per la rimozione. MEXC non fornisce alcuna garanzia in merito all'accuratezza, completezza o tempestività del contenuto e non è responsabile per eventuali azioni intraprese sulla base delle informazioni fornite. Il contenuto non costituisce consulenza finanziaria, legale o professionale di altro tipo, né deve essere considerato una raccomandazione o un'approvazione da parte di MEXC.