Insecure Deserialization: Remote Code Execution Through Object Manipulation
Serialization converts a live object — a user session, a configuration structure, a message payload — into a flat sequence of bytes or characters that can be stored, transmitted, or cached. Deserialization is the reverse: taking that flat representation and reconstructing the object in memory.
The problem arises when an application deserializes data it received from a user without verifying what it is being asked to reconstruct. In several common language runtimes, the deserialization process itself executes code. The attacker does not need to find a separate code execution path. The parsing step is the execution step.
Why Deserialization Can Execute Code
Most developers assume that parsing data is safe. They understand that running eval() on user input is dangerous, but expect that reconstructing an object from a structured binary format is equivalent to reading a file — inert, deterministic, without side effects.
That assumption is wrong for native serialization formats in Java, PHP, Python, Ruby, and .NET.
Each of these runtimes supports the ability for classes to define behavior that runs automatically during object reconstruction: cleanup logic, integrity checks, lazy initialization. These hooks — readObject() in Java, __wakeup() and __destruct() in PHP, __reduce__() in Python — exist for legitimate purposes. But when an attacker controls the serialized data, they control which classes are instantiated and what those hooks do when they fire.
Java Gadget Chains
Java's native deserialization mechanism calls ObjectInputStream.readObject() to reconstruct objects from a byte stream. Any class that implements java.io.Serializable can be deserialized, and any such class can override readObject() to execute arbitrary code during reconstruction.
The practical severity of Java deserialization vulnerabilities comes not from the application's own classes but from gadget chains — sequences of method calls through existing library classes that ultimately invoke Runtime.exec() or equivalent system command execution.
The Shape of a Gadget Chain
A gadget chain begins with an entry point: a class whose readObject() or hashCode() method is called automatically during deserialization. From there, each step in the chain calls a method on another object that the attacker controlled in the serialized data, until the chain terminates at a dangerous sink — typically Runtime.exec(), ProcessBuilder.start(), or a file write operation.
A simplified version of the Apache Commons Collections chain looks like this:
ObjectInputStream.readObject()
→ HashMap.readObject() calls hashCode() on keys
→ TiedMapEntry.hashCode() calls getValue()
→ LazyMap.get() invokes the transformer
→ InvokerTransformer.transform() calls Method.invoke()
→ Runtime.exec("cmd")
The application being exploited does not contain any of this logic. It contains a dependency on Apache Commons Collections and a network endpoint that deserializes user-supplied bytes. That is sufficient.
Identifying Java Deserialization in the Wild
Java serialized objects always begin with the same four bytes: AC ED 00 05. In contexts where binary data is base64-encoded — cookies, POST parameters, WebSocket payloads — this becomes rO0AB. When these signatures appear in network traffic or stored data that originates from user input, the endpoint deserializes Java objects.
Common locations:
- Session cookies (
JSESSIONIDor custom session tokens in older J2EE applications) - RMI and IIOP service endpoints
- JMX management interfaces
- Custom binary protocols in enterprise middleware
- HTTP POST bodies to endpoints that accept
application/octet-stream
The presence of Java serialization bytes is enough to warrant testing. The exploitability depends on which libraries are on the classpath — but given the ubiquity of Commons Collections, Spring Framework, and Groovy in enterprise Java applications, exploitable gadget chains are more common than not.
PHP Object Injection
PHP's unserialize() function reconstructs PHP objects from a string representation. The format is human-readable:
O:4:"User":2:{s:8:"username";s:5:"alice";s:4:"role";s:4:"user";}
This represents a User object with two properties. The string fully controls what object is created: the class name, the property names, and the property values.
Magic Methods as Attack Vectors
PHP provides a set of magic methods that execute automatically during specific lifecycle events:
__wakeup()is called immediately afterunserialize()reconstructs the object__destruct()is called when the object is garbage collected, which typically happens at the end of the request__toString()is called when the object is used in a string context__call()is invoked when an inaccessible method is called on the object__get()fires when an inaccessible property is read
An attacker who controls the serialized string controls which class is instantiated and what values its properties hold when these methods fire. If any class in the application (or any included library) implements a magic method that performs a dangerous operation using object properties — writing to a file, evaluating a string as code, making a network request — it becomes a gadget.
A Practical PHP Gadget Example
Consider a logging class that writes its logPath property to disk when garbage collected:
class FileLogger {
public $logPath;
public $logContent;
public function __destruct() {
file_put_contents($this->logPath, $this->logContent);
}
}If the application unserializes user input anywhere — a cookie, a POST parameter, a deserialized session — an attacker can supply:
O:10:"FileLogger":2:{s:7:"logPath";s:27:"/var/www/html/webshell.php";s:10:"logContent";s:29:"<?php system($_GET['cmd']); ?>";}
When the request ends and the object is garbage collected, __destruct() writes a webshell to a web-accessible path. The application never explicitly called FileLogger — the attacker's serialized string did.
This pattern generalizes. PHP applications and frameworks — Laravel, Symfony, WordPress with plugins — contain many classes with magic methods that operate on property values. The attacker does not need to write new code. They need to find a class that already does something dangerous and craft a serialized string that instantiates it with the right properties.
Python Pickle
Python's pickle module serializes and deserializes Python objects. Unlike JSON, pickle is not a data format — it is a stack-based bytecode language for reconstructing arbitrary Python objects.
The module's own documentation states: "The pickle module is not secure. Only unpickle data you trust."
The danger is the __reduce__ method. When present on a class, pickle calls it during serialization and uses its return value to reconstruct the object during deserialization. __reduce__ can return a tuple of (callable, args), and pickle will call callable(*args) to reconstruct the object.
A malicious pickle payload looks like this in Python:
import pickle, os
class Exploit:
def __reduce__(self):
return (os.system, ('id > /tmp/pwned',))
payload = pickle.dumps(Exploit())When pickle.loads(payload) is called anywhere in the application, os.system('id > /tmp/pwned') executes immediately. No additional steps, no second-order evaluation. The command runs during parsing.
Applications that use pickle to cache objects in Redis, store session data, pass messages through Celery queues, or persist model state are all vulnerable if any attacker-controlled data can reach pickle.loads().
Other Affected Languages
.NET BinaryFormatter
The .NET BinaryFormatter class is a direct analog to Java's ObjectInputStream. It deserializes arbitrary types from a binary stream, executes type constructors during reconstruction, and has been the source of numerous critical vulnerabilities in Microsoft products. Microsoft deprecated BinaryFormatter in .NET 5 and removed it from .NET 7. Legacy applications using it remain vulnerable through gadget chains in the .NET Framework class library.
Ruby Marshal
Ruby's Marshal.load deserializes Ruby objects and calls marshal_load on any class that defines it. Gadget chains exploiting Gem::Requirement and related classes in the RubyGems standard library have been well-documented. Rails applications that store serialized objects in cookies using the Marshal format are exploitable if the cookie can be tampered with.
Node.js
Node.js does not have a native serialization format equivalent to Java's ObjectInputStream, but several common libraries introduce the same class of vulnerability. The node-serialize package executes JavaScript functions embedded in serialized JSON objects. Any library that evaluates serialized data as code, or that instantiates JavaScript objects by name from attacker-controlled strings, creates equivalent exposure.
Finding Deserialization Vulnerabilities
The first step is identifying where serialized data enters the application.
Check all cookies. Session cookies, remember-me tokens, and preference cookies are frequent carriers of serialized objects. Decode base64 values and look for known magic bytes. A cookie value of rO0ABXNy... is a Java serialized object.
Review HTTP request bodies. API endpoints that accept binary formats or non-JSON content types may deserialize objects. Look for application/x-java-serialized-object, application/octet-stream, and multipart/form-data endpoints that accept opaque binary fields.
Check for the PHPSESSID and viewstate patterns. PHP session data is sometimes stored with serialization, and ASP.NET ViewState in older applications is a base64-encoded serialized structure.
Look for deserialization libraries in dependency lists. The presence of commons-collections, spring-core, kryo, or similar libraries in a Java application's dependency tree increases the probability of exploitable gadget chains. For Python, search for pickle, marshal, shelve, or joblib.load calls.
Fuzz with known signatures. Replacing a base64-encoded value with a Java serialized object header (rO0AB) and observing an error that differs from normal error responses often indicates that deserialization is occurring.
Remediation
Eliminate Untrusted Deserialization
The most effective remediation is stopping deserialization of untrusted data entirely. Replace native serialization with JSON, Protocol Buffers, or MessagePack. Validate the schema of the data before processing it. JSON parsers do not instantiate arbitrary classes, do not call constructors, and do not execute lifecycle hooks.
Implement Class Allowlists
When native deserialization cannot be avoided, restrict which classes can be instantiated. Java's JEP 415 provides ObjectInputFilter for this purpose:
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
"com.example.app.SafeClass;!*"
);
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);This allows only com.example.app.SafeClass to be deserialized. Any other class in the stream causes deserialization to fail. In PHP, unserialize() accepts an allowed_classes parameter with the same effect.
Sign and Verify Serialized Data
If the application serializes data that it later deserializes — for session state, caching, or inter-service communication — cryptographically sign the serialized bytes before storing or transmitting them. Verify the signature before deserialization. An attacker who cannot forge a valid signature cannot supply a malicious payload.
import hmac, hashlib, base64
def sign(data: bytes, secret: bytes) -> str:
sig = hmac.new(secret, data, hashlib.sha256).digest()
return base64.b64encode(sig).decode()
def verify_and_load(data: bytes, signature: str, secret: bytes) -> object:
expected = sign(data, secret)
if not hmac.compare_digest(expected, signature):
raise ValueError("Invalid signature — refusing to deserialize")
return pickle.loads(data)This pattern does not fix the underlying vulnerability — a compromised secret invalidates it — but it is a practical defense for applications that cannot immediately migrate away from native serialization.
Sandboxed Deserialization
For applications that must deserialize untrusted data from multiple classes, run the deserialization in a sandboxed process with no access to the network, filesystem, or privileged system calls. The sandbox limits what any exploited gadget chain can accomplish. This is a mitigation layer, not a fix, but it significantly reduces the blast radius of a successful exploit.
The Fundamental Problem
Insecure deserialization is dangerous because it inverts the normal attack model. In most injection vulnerabilities, an attacker gets their payload into the application and finds a way to execute it. In deserialization vulnerabilities, execution happens during input processing itself. The application's own libraries become the attack surface.
The lesson for defensive code review is to treat all deserialization of external data as equivalent to eval() — because in these runtimes, it often is. The safeguard is the same: do not parse external data using mechanisms that execute code, and if you must, verify integrity and restrict what can be instantiated before the parsing begins.
Need your application's deserialization attack surface assessed? Get in touch.