How Vision Language Models Are Becoming Identity Forgers' Secret Weapon
Vision language models and generative AI tools have fundamentally transformed identity document forgery from a niche criminal skill into a mass-market threat, enabling attackers with minimal technical expertise to create visually convincing fake documents that bypass automated verification systems. A comprehensive new survey reveals that even the strongest publicly available multimodal AI detection systems fail to catch AI-generated identity documents more than 25% of the time under real-world security conditions, exposing a critical gap between how these systems are tested and how they actually perform in the field.
The shift has been dramatic and measurable. Between 2024 and 2026, an underground platform called OnlyFake sold more than 10,000 AI-generated passports and driver's licenses worldwide at roughly $15 each, with many successfully bypassing Know Your Customer (KYC) verification checks on major platforms. The operator was later arrested and pleaded guilty to the charges. This wasn't an isolated incident. The FBI and FinCEN issued formal alerts in 2024 confirming that criminals were actively using generative AI to manipulate or fabricate identity documents for bank fraud, check fraud, loan fraud, and impersonation.
Why Are Vision Language Models So Good at Forging Documents?
The core problem lies in how multimodal AI systems work. Vision language models, which combine image recognition with language understanding, can now synthesize entire identity documents field-by-field with remarkable fidelity. Unlike earlier forgery methods that relied on crude copy-paste techniques or obvious digital artifacts, these systems generate documents that look authentic at first glance because they understand both the visual layout and the semantic content of what they're creating.
The real-world impact has been staggering. In 2024, identity fraud cost approximately $27 billion across 18 million victims in the United States alone. AI-related cybercrime complaints exceeded 22,000 with losses over $800 million in 2025. Globally, synthetic identity fraud surpassed $35 billion annually. In India, a 2025 survey by Forrester and Experian found that 69% of organizations considered their KYC infrastructure inadequate against AI-generated identity documents.
One particularly troubling discovery emerged from the research: large multimodal models exhibit what researchers call "Script-Dependent Generative Instability" (SDGI), a recurring typographic failure mode when generating text in non-Latin scripts like Devanagari, Hangul, or Arabic. In practice, these systems excel at forging Latin-script documents such as those in English but sometimes stumble on documents in other scripts, which makes the threat landscape uneven.
How Are Detection Systems Falling Behind?
The detection methods used to catch forged documents have evolved significantly over the past decade, but they're consistently one step behind the attack methods. The progression tells the story: from 2015 to 2020, detection relied on global image classifiers that worked reasonably well in controlled lab settings but failed in the real world. Between 2020 and 2022, researchers shifted to forensic micro-artifact analysis, looking for telltale signs like moiré patterns or noise traces. From 2022 to 2024, as generative AI made traditional forensic cues unreliable, detection moved toward localization and semantic reasoning.
Today's approach relies on foundation models and few-shot learning, but the results are sobering. Zero-shot benchmarking on unseen synthesized identity cards shows that even the strongest publicly available models achieve Attack Presentation Classification Error Rate (APCER) values above 25% under security-oriented operating conditions. APCER is the share of attack samples that a detector wrongly accepts as genuine, so in plain language these systems miss more than one in four AI-generated fake documents when tested on documents they've never seen before.
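To make the APCER figure concrete, here is a minimal Python sketch of how the metric is typically computed from labeled detector decisions. The `Sample` class, the function names, and the toy counts are illustrative assumptions, not taken from the survey; the companion metric BPCER (genuine documents wrongly flagged) is included because the two are usually reported together.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    is_attack: bool          # ground truth: True for a forged or synthesized document
    flagged_as_attack: bool  # detector decision: True if the detector rejected it


def apcer(samples):
    """Share of attack samples that the detector accepted as genuine."""
    attacks = [s for s in samples if s.is_attack]
    if not attacks:
        raise ValueError("need at least one attack sample")
    missed = sum(1 for s in attacks if not s.flagged_as_attack)
    return missed / len(attacks)


def bpcer(samples):
    """Share of genuine (bona fide) samples that the detector wrongly flagged."""
    genuine = [s for s in samples if not s.is_attack]
    if not genuine:
        raise ValueError("need at least one genuine sample")
    false_alarms = sum(1 for s in genuine if s.flagged_as_attack)
    return false_alarms / len(genuine)


# Toy evaluation set (made-up counts, for illustration only):
# 8 forged documents, 2 of which slip past the detector,
# and 10 genuine documents, 1 of which is wrongly flagged.
samples = (
    [Sample(is_attack=True, flagged_as_attack=True)] * 6
    + [Sample(is_attack=True, flagged_as_attack=False)] * 2
    + [Sample(is_attack=False, flagged_as_attack=False)] * 9
    + [Sample(is_attack=False, flagged_as_attack=True)] * 1
)

print(f"APCER = {apcer(samples):.2f}")  # 2 of 8 attacks missed
print(f"BPCER = {bpcer(samples):.2f}")  # 1 of 10 genuine documents flagged
```

In this toy set, an APCER of 0.25 corresponds to the one-in-four miss rate described above.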
This gap exists because of what researchers call the "Reality Gap": the persistent mismatch between how identity verification systems are tested in academic benchmarks and how they actually perform when deployed in banks, border control agencies, and online platforms. Public datasets used to train and evaluate detection systems typically don't reflect the diversity of real-world attacks or the specific characteristics of documents generated by the latest multimodal models.
Steps to Strengthen Identity Verification Against AI-Generated Documents
- Move Beyond Visual Inspection: Organizations should not rely on visual inspection alone as a high-assurance signal for identity verification. The research emphasizes that secure verification system design must incorporate multiple layers of authentication beyond what the human eye or a single AI model can detect.
- Implement Forensically Grounded Detection: Detection systems need to be built on forensic principles that account for the specific failure modes of generative AI, including Script-Dependent Generative Instability in non-Latin scripts and the artifacts left by diffusion-based and GAN-based synthesis methods.
- Test Against Real-World Threat Models: Organizations should evaluate their detection systems not just on academic benchmarks but on synthesized documents created by the latest multimodal models, under security-oriented operating conditions that reflect actual deployment scenarios.
- Adopt Privacy-Preserving and Legally Accountable Systems: Future identity verification infrastructure should be designed with privacy and legal accountability as core principles, ensuring that detection methods don't create unnecessary data retention or expose sensitive biometric information.
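The recommendation to test under security-oriented operating conditions can be sketched in code. The example below assumes one common convention: fix the detection threshold so that only a small share of genuine documents is wrongly flagged, then report the miss rate (APCER) separately for each source of forged documents. The survey does not spell out its exact protocol, so the function names, the generator labels, and all scores here are hypothetical.

```python
def threshold_for_bpcer(genuine_scores, max_bpcer):
    """Lowest threshold that keeps the false-alarm rate on genuine documents
    at or below max_bpcer. Higher score means "more likely forged", and a
    document is flagged when its score is strictly above the threshold."""
    if not genuine_scores:
        raise ValueError("need at least one genuine score")
    n = len(genuine_scores)
    for t in sorted(set(genuine_scores)):
        flagged = sum(1 for s in genuine_scores if s > t)
        if flagged / n <= max_bpcer:
            return t
    # Unreachable: at the largest genuine score, nothing is flagged.
    raise AssertionError("no threshold found")


def apcer_by_source(attack_scores_by_source, threshold):
    """Miss rate per attack source at a fixed threshold."""
    return {
        source: sum(1 for s in scores if s <= threshold) / len(scores)
        for source, scores in attack_scores_by_source.items()
    }


# Made-up detector scores, for illustration only.
genuine = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.55, 0.70]
attacks = {
    "older_generator": [0.90, 0.80, 0.75, 0.60],   # hypothetical label
    "recent_generator": [0.65, 0.50, 0.45, 0.30],  # hypothetical label
}

t = threshold_for_bpcer(genuine, max_bpcer=0.10)
rates = apcer_by_source(attacks, t)
print(f"threshold = {t}")
for source, rate in rates.items():
    print(f"{source}: APCER = {rate:.2f}")
```

Reporting the miss rate per source rather than as a single pooled number is a deliberate choice in this sketch: a detector can look strong on older forgeries while missing most documents from a newer generator, which is the kind of gap a pooled benchmark score can hide.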
The research survey, which systematically analyzed identity document forgery and detection across three attack classes (physical presentation attacks, digital injection attacks, and generative AI-driven synthesis), represents the first unified treatment of these threats within a single framework. It covers publications from 2015 through 2026, documenting how both attack and defense methods have co-evolved.
Real-world incidents underscore the urgency. In 2025, a public demonstration in India showed that a generative AI model could produce highly realistic Aadhaar and PAN card replicas, exposing the limits of visual inspection-based verification. That same year, the North Korean hacking group Kimsuky reportedly used generative AI to forge South Korean military ID cards in spear-phishing campaigns against defense-sector personnel.
The fundamental challenge is that generative AI has democratized document forgery. Where creating a convincing fake identity document once required specialized technical skills, access to printing equipment, and knowledge of security features, it now requires only a text prompt and $15. Until detection systems catch up, and until organizations move beyond visual inspection as their primary defense, the threat will continue to grow.