600GB Leak Exposes China’s Great Firewall Blueprint: Source Code, Training Data, and Who Runs the Digital Cage

Post date: November 6, 2025 · Discovered: April 23, 2026 · 4 posts, 0 comments

A data leak estimated at 600GB, containing over 100,000 documents, has surfaced, detailing the internal workings of entities that manage China’s Great Firewall (GFW). The dump includes source code, operational data, and metadata, offering visibility into much of the GFW’s infrastructure.

The discussions highlight two technical details: a searchable Elasticsearch database containing 133,000 examples used to train an AI censorship model, and the system’s ability to flag subtle criticism rather than only matching keywords. The data suggests control spans a 'wide constellation of public-private partnerships, military-academic collaborations.'

Taken together, the material amounts to a broad operational map of Chinese digital control. The leaked metadata makes it possible to trace the organizational structure and the flow of decision-making authority across domestic entities.

Key Points

#1 The leak reveals technical enforcement mechanics.

The data offers a cross-section showing not just *what* is filtered, but *who* builds the censorship apparatus and *how* it functions.

#2 AI censorship uses advanced techniques.

The system moves beyond simple blacklists, using LLMs to flag nuanced content such as criticism related to pollution or fraud, and political satire.

#3 Organizational structure is exposed via metadata.

Leaked metadata allows tracing engineers and researchers across public-private partnerships and military-academic nodes.

#4 The leak follows a structured investigation narrative.

The exposé is structured in parts: data contents first, followed by a deep dive into technical architecture (e.g., detecting Psiphon/V2Ray), and concluding with geopolitical fallout.

Source Discussions (4)

This report was synthesized from the following Lemmy discussions, ranked by community score.

229 points · The Great Firewall: Massive data leak reveals the inner workings of China's censorship regime
[email protected] · 11 comments · 11/6/2025 · by Hotznplotzn · dti.domaintools.com

21 points · The Great Firewall: Massive data leak reveals the inner workings of China's censorship regime
[email protected] · 0 comments · 11/6/2025 · by Hotznplotzn · dti.domaintools.com

15 points · The Great Firewall: Massive data leak reveals the inner workings of China's censorship regime
[email protected] · 0 comments · 11/6/2025 · by floofloof · dti.domaintools.com

10 points · Leaked data exposes a Chinese AI censorship machine
[email protected] · 0 comments · 3/31/2025 · by Hotznplotzn · techcrunch.com