Insecure Deserialization: Remote Code Execution Through Object Manipulation

Serialization converts a live object — a user session, a configuration structure, a message payload — into a flat sequence of bytes or characters that can be stored, transmitted, or cached. Deserialization is the reverse: taking that flat representation and reconstructing the object in memory.

The problem arises when an application deserializes data it received from a user without verifying what it is being asked to reconstruct. In several common language runtimes, the deserialization process itself executes code. The attacker does not need to find a separate code execution path. The parsing step is the execution step.

Why Deserialization Can Execute Code

Most developers assume that parsing data is safe. They understand that running eval() on user input is dangerous, but expect that reconstructing an object from a structured binary format is equivalent to reading a file — inert, deterministic, without side effects.

That assumption is wrong for native serialization formats in Java, PHP, Python, Ruby, and .NET.

Each of these runtimes supports the ability for classes to define behavior that runs automatically during object reconstruction: cleanup logic, integrity checks, lazy initialization. These hooks — readObject() in Java, __wakeup() and __destruct() in PHP, __reduce__() in Python — exist for legitimate purposes. But when an attacker controls the serialized data, they control which classes are instantiated and what those hooks do when they fire.

Java Gadget Chains

Java's native deserialization mechanism calls ObjectInputStream.readObject() to reconstruct objects from a byte stream. Any class that implements java.io.Serializable can be deserialized, and any such class can override readObject() to execute arbitrary code during reconstruction.

The practical severity of Java deserialization vulnerabilities comes not from the application's own classes but from gadget chains — sequences of method calls through existing library classes that ultimately invoke Runtime.exec() or equivalent system command execution.

The Shape of a Gadget Chain

A gadget chain begins with an entry point: a class whose readObject() or hashCode() method is called automatically during deserialization. From there, each step in the chain calls a method on another object that the attacker controlled in the serialized data, until the chain terminates at a dangerous sink — typically Runtime.exec(), ProcessBuilder.start(), or a file write operation.

A simplified version of the Apache Commons Collections chain looks like this:

ObjectInputStream.readObject()
  → HashMap.readObject() calls hashCode() on keys
    → TiedMapEntry.hashCode() calls getValue()
      → LazyMap.get() invokes the transformer
        → InvokerTransformer.transform() calls Method.invoke()
          → Runtime.exec("cmd")

The application being exploited does not contain any of this logic. It contains a dependency on Apache Commons Collections and a network endpoint that deserializes user-supplied bytes. That is sufficient.

Identifying Java Deserialization in the Wild

Java serialized objects always begin with the same four bytes: AC ED 00 05. In contexts where binary data is base64-encoded — cookies, POST parameters, WebSocket payloads — this becomes rO0AB. When these signatures appear in network traffic or stored data that originates from user input, the endpoint deserializes Java objects.

Common locations:

Session cookies (JSESSIONID or custom session tokens in older J2EE applications)
RMI and IIOP service endpoints
JMX management interfaces
Custom binary protocols in enterprise middleware
HTTP POST bodies to endpoints that accept application/octet-stream

The presence of Java serialization bytes is enough to warrant testing. The exploitability depends on which libraries are on the classpath — but given the ubiquity of Commons Collections, Spring Framework, and Groovy in enterprise Java applications, exploitable gadget chains are more common than not.

PHP Object Injection

PHP's unserialize() function reconstructs PHP objects from a string representation. The format is human-readable:

O:4:"User":2:{s:8:"username";s:5:"alice";s:4:"role";s:4:"user";}

This represents a User object with two properties. The string fully controls what object is created: the class name, the property names, and the property values.

Magic Methods as Attack Vectors

PHP provides a set of magic methods that execute automatically during specific lifecycle events:

__wakeup() is called immediately after unserialize() reconstructs the object
__destruct() is called when the object is garbage collected, which typically happens at the end of the request
__toString() is called when the object is used in a string context
__call() is invoked when an inaccessible method is called on the object
__get() fires when an inaccessible property is read

An attacker who controls the serialized string controls which class is instantiated and what values its properties hold when these methods fire. If any class in the application (or any included library) implements a magic method that performs a dangerous operation using object properties — writing to a file, evaluating a string as code, making a network request — it becomes a gadget.

A Practical PHP Gadget Example

Consider a logging class that writes its logPath property to disk when garbage collected:

php

class FileLogger {
    public $logPath;
    public $logContent;
    
    public function __destruct() {
        file_put_contents($this->logPath, $this->logContent);
    }
}

If the application unserializes user input anywhere — a cookie, a POST parameter, a deserialized session — an attacker can supply:

O:10:"FileLogger":2:{s:7:"logPath";s:27:"/var/www/html/webshell.php";s:10:"logContent";s:29:"<?php system($_GET['cmd']); ?>";}

When the request ends and the object is garbage collected, __destruct() writes a webshell to a web-accessible path. The application never explicitly called FileLogger — the attacker's serialized string did.

This pattern generalizes. PHP applications and frameworks — Laravel, Symfony, WordPress with plugins — contain many classes with magic methods that operate on property values. The attacker does not need to write new code. They need to find a class that already does something dangerous and craft a serialized string that instantiates it with the right properties.

Python Pickle

Python's pickle module serializes and deserializes Python objects. Unlike JSON, pickle is not a data format — it is a stack-based bytecode language for reconstructing arbitrary Python objects.

The module's own documentation states: "The pickle module is not secure. Only unpickle data you trust."

The danger is the __reduce__ method. When present on a class, pickle calls it during serialization and uses its return value to reconstruct the object during deserialization. __reduce__ can return a tuple of (callable, args), and pickle will call callable(*args) to reconstruct the object.

A malicious pickle payload looks like this in Python:

python

import pickle, os
 
class Exploit:
    def __reduce__(self):
        return (os.system, ('id > /tmp/pwned',))
 
payload = pickle.dumps(Exploit())

When pickle.loads(payload) is called anywhere in the application, os.system('id > /tmp/pwned') executes immediately. No additional steps, no second-order evaluation. The command runs during parsing.

Applications that use pickle to cache objects in Redis, store session data, pass messages through Celery queues, or persist model state are all vulnerable if any attacker-controlled data can reach pickle.loads().

Other Affected Languages

.NET BinaryFormatter

The .NET BinaryFormatter class is a direct analog to Java's ObjectInputStream. It deserializes arbitrary types from a binary stream, executes type constructors during reconstruction, and has been the source of numerous critical vulnerabilities in Microsoft products. Microsoft deprecated BinaryFormatter in .NET 5 and removed it from .NET 7. Legacy applications using it remain vulnerable through gadget chains in the .NET Framework class library.

Ruby Marshal

Ruby's Marshal.load deserializes Ruby objects and calls marshal_load on any class that defines it. Gadget chains exploiting Gem::Requirement and related classes in the RubyGems standard library have been well-documented. Rails applications that store serialized objects in cookies using the Marshal format are exploitable if the cookie can be tampered with.

Node.js

Node.js does not have a native serialization format equivalent to Java's ObjectInputStream, but several common libraries introduce the same class of vulnerability. The node-serialize package executes JavaScript functions embedded in serialized JSON objects. Any library that evaluates serialized data as code, or that instantiates JavaScript objects by name from attacker-controlled strings, creates equivalent exposure.

Finding Deserialization Vulnerabilities

The first step is identifying where serialized data enters the application.

Check all cookies. Session cookies, remember-me tokens, and preference cookies are frequent carriers of serialized objects. Decode base64 values and look for known magic bytes. A cookie value of rO0ABXNy... is a Java serialized object.

Review HTTP request bodies. API endpoints that accept binary formats or non-JSON content types may deserialize objects. Look for application/x-java-serialized-object, application/octet-stream, and multipart/form-data endpoints that accept opaque binary fields.

Check for the PHPSESSID and viewstate patterns. PHP session data is sometimes stored with serialization, and ASP.NET ViewState in older applications is a base64-encoded serialized structure.

Look for deserialization libraries in dependency lists. The presence of commons-collections, spring-core, kryo, or similar libraries in a Java application's dependency tree increases the probability of exploitable gadget chains. For Python, search for pickle, marshal, shelve, or joblib.load calls.

Fuzz with known signatures. Replacing a base64-encoded value with a Java serialized object header (rO0AB) and observing an error that differs from normal error responses often indicates that deserialization is occurring.

Remediation

Eliminate Untrusted Deserialization

The most effective remediation is stopping deserialization of untrusted data entirely. Replace native serialization with JSON, Protocol Buffers, or MessagePack. Validate the schema of the data before processing it. JSON parsers do not instantiate arbitrary classes, do not call constructors, and do not execute lifecycle hooks.

Implement Class Allowlists

When native deserialization cannot be avoided, restrict which classes can be instantiated. Java's JEP 415 provides ObjectInputFilter for this purpose:

java

ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "com.example.app.SafeClass;!*"
);
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);

This allows only com.example.app.SafeClass to be deserialized. Any other class in the stream causes deserialization to fail. In PHP, unserialize() accepts an allowed_classes parameter with the same effect.

Sign and Verify Serialized Data

If the application serializes data that it later deserializes — for session state, caching, or inter-service communication — cryptographically sign the serialized bytes before storing or transmitting them. Verify the signature before deserialization. An attacker who cannot forge a valid signature cannot supply a malicious payload.

python

import hmac, hashlib, base64
 
def sign(data: bytes, secret: bytes) -> str:
    sig = hmac.new(secret, data, hashlib.sha256).digest()
    return base64.b64encode(sig).decode()
 
def verify_and_load(data: bytes, signature: str, secret: bytes) -> object:
    expected = sign(data, secret)
    if not hmac.compare_digest(expected, signature):
        raise ValueError("Invalid signature — refusing to deserialize")
    return pickle.loads(data)

This pattern does not fix the underlying vulnerability — a compromised secret invalidates it — but it is a practical defense for applications that cannot immediately migrate away from native serialization.

Sandboxed Deserialization

For applications that must deserialize untrusted data from multiple classes, run the deserialization in a sandboxed process with no access to the network, filesystem, or privileged system calls. The sandbox limits what any exploited gadget chain can accomplish. This is a mitigation layer, not a fix, but it significantly reduces the blast radius of a successful exploit.

The Fundamental Problem

Insecure deserialization is dangerous because it inverts the normal attack model. In most injection vulnerabilities, an attacker gets their payload into the application and finds a way to execute it. In deserialization vulnerabilities, execution happens during input processing itself. The application's own libraries become the attack surface.

The lesson for defensive code review is to treat all deserialization of external data as equivalent to eval() — because in these runtimes, it often is. The safeguard is the same: do not parse external data using mechanisms that execute code, and if you must, verify integrity and restrict what can be instantiated before the parsing begins.

Need your application's deserialization attack surface assessed? Get in touch.

Insecure Deserialization: Remote Code Execution Through Object Manipulation

Insecure Deserialization: Remote Code Execution Through Object Manipulation

Why Deserialization Can Execute Code

Java Gadget Chains

The Shape of a Gadget Chain

Identifying Java Deserialization in the Wild

PHP Object Injection

Magic Methods as Attack Vectors

A Practical PHP Gadget Example

Python Pickle

Other Affected Languages

.NET BinaryFormatter

Ruby Marshal

Node.js

Finding Deserialization Vulnerabilities

Remediation

Eliminate Untrusted Deserialization

Implement Class Allowlists

Sign and Verify Serialized Data

Sandboxed Deserialization

The Fundamental Problem

Get new knowledge base articles in your inbox

Need your application tested?

From the Knowledge Base

Server-Side Template Injection: From Template Engines to Code Execution

API Rate Limiting Bypass: When Throttling Fails

Broken Access Control: OWASP's #1 Web Application Vulnerability

Summary

Key Takeaways

Frequently Asked Questions