Skip to content
Fast-turnaround security assessments available — 10+ years development & security experienceGet started
vulnerabilityCWE-502OWASP A08:2021Typical severity: Critical

Insecure Deserialization: Remote Code Execution Through Object Manipulation

·10 min read

Insecure Deserialization: Remote Code Execution Through Object Manipulation

Serialization converts a live object — a user session, a configuration structure, a message payload — into a flat sequence of bytes or characters that can be stored, transmitted, or cached. Deserialization is the reverse: taking that flat representation and reconstructing the object in memory.

The problem arises when an application deserializes data it received from a user without verifying what it is being asked to reconstruct. In several common language runtimes, the deserialization process itself executes code. The attacker does not need to find a separate code execution path. The parsing step is the execution step.

Why Deserialization Can Execute Code

Most developers assume that parsing data is safe. They understand that running eval() on user input is dangerous, but expect that reconstructing an object from a structured binary format is equivalent to reading a file — inert, deterministic, without side effects.

That assumption is wrong for native serialization formats in Java, PHP, Python, Ruby, and .NET.

Each of these runtimes supports the ability for classes to define behavior that runs automatically during object reconstruction: cleanup logic, integrity checks, lazy initialization. These hooks — readObject() in Java, __wakeup() and __destruct() in PHP, __reduce__() in Python — exist for legitimate purposes. But when an attacker controls the serialized data, they control which classes are instantiated and what those hooks do when they fire.

Java Gadget Chains

Java's native deserialization mechanism calls ObjectInputStream.readObject() to reconstruct objects from a byte stream. Any class that implements java.io.Serializable can be deserialized, and any such class can override readObject() to execute arbitrary code during reconstruction.

The practical severity of Java deserialization vulnerabilities comes not from the application's own classes but from gadget chains — sequences of method calls through existing library classes that ultimately invoke Runtime.exec() or equivalent system command execution.

The Shape of a Gadget Chain

A gadget chain begins with an entry point: a class whose readObject() or hashCode() method is called automatically during deserialization. From there, each step in the chain calls a method on another object that the attacker controlled in the serialized data, until the chain terminates at a dangerous sink — typically Runtime.exec(), ProcessBuilder.start(), or a file write operation.

A simplified version of the Apache Commons Collections chain looks like this:

ObjectInputStream.readObject()
  → HashMap.readObject() calls hashCode() on keys
    → TiedMapEntry.hashCode() calls getValue()
      → LazyMap.get() invokes the transformer
        → InvokerTransformer.transform() calls Method.invoke()
          → Runtime.exec("cmd")

The application being exploited does not contain any of this logic. It contains a dependency on Apache Commons Collections and a network endpoint that deserializes user-supplied bytes. That is sufficient.

Identifying Java Deserialization in the Wild

Java serialized objects always begin with the same four bytes: AC ED 00 05. In contexts where binary data is base64-encoded — cookies, POST parameters, WebSocket payloads — this becomes rO0AB. When these signatures appear in network traffic or stored data that originates from user input, the endpoint deserializes Java objects.

Common locations:

  • Session cookies (JSESSIONID or custom session tokens in older J2EE applications)
  • RMI and IIOP service endpoints
  • JMX management interfaces
  • Custom binary protocols in enterprise middleware
  • HTTP POST bodies to endpoints that accept application/octet-stream

The presence of Java serialization bytes is enough to warrant testing. The exploitability depends on which libraries are on the classpath — but given the ubiquity of Commons Collections, Spring Framework, and Groovy in enterprise Java applications, exploitable gadget chains are more common than not.

PHP Object Injection

PHP's unserialize() function reconstructs PHP objects from a string representation. The format is human-readable:

O:4:"User":2:{s:8:"username";s:5:"alice";s:4:"role";s:4:"user";}

This represents a User object with two properties. The string fully controls what object is created: the class name, the property names, and the property values.

Magic Methods as Attack Vectors

PHP provides a set of magic methods that execute automatically during specific lifecycle events:

  • __wakeup() is called immediately after unserialize() reconstructs the object
  • __destruct() is called when the object is garbage collected, which typically happens at the end of the request
  • __toString() is called when the object is used in a string context
  • __call() is invoked when an inaccessible method is called on the object
  • __get() fires when an inaccessible property is read

An attacker who controls the serialized string controls which class is instantiated and what values its properties hold when these methods fire. If any class in the application (or any included library) implements a magic method that performs a dangerous operation using object properties — writing to a file, evaluating a string as code, making a network request — it becomes a gadget.

A Practical PHP Gadget Example

Consider a logging class that writes its logPath property to disk when garbage collected:

php
class FileLogger {
    public $logPath;
    public $logContent;
    
    public function __destruct() {
        file_put_contents($this->logPath, $this->logContent);
    }
}

If the application unserializes user input anywhere — a cookie, a POST parameter, a deserialized session — an attacker can supply:

O:10:"FileLogger":2:{s:7:"logPath";s:27:"/var/www/html/webshell.php";s:10:"logContent";s:29:"<?php system($_GET['cmd']); ?>";}

When the request ends and the object is garbage collected, __destruct() writes a webshell to a web-accessible path. The application never explicitly called FileLogger — the attacker's serialized string did.

This pattern generalizes. PHP applications and frameworks — Laravel, Symfony, WordPress with plugins — contain many classes with magic methods that operate on property values. The attacker does not need to write new code. They need to find a class that already does something dangerous and craft a serialized string that instantiates it with the right properties.

Python Pickle

Python's pickle module serializes and deserializes Python objects. Unlike JSON, pickle is not a data format — it is a stack-based bytecode language for reconstructing arbitrary Python objects.

The module's own documentation states: "The pickle module is not secure. Only unpickle data you trust."

The danger is the __reduce__ method. When present on a class, pickle calls it during serialization and uses its return value to reconstruct the object during deserialization. __reduce__ can return a tuple of (callable, args), and pickle will call callable(*args) to reconstruct the object.

A malicious pickle payload looks like this in Python:

python
import pickle, os
 
class Exploit:
    def __reduce__(self):
        return (os.system, ('id > /tmp/pwned',))
 
payload = pickle.dumps(Exploit())

When pickle.loads(payload) is called anywhere in the application, os.system('id > /tmp/pwned') executes immediately. No additional steps, no second-order evaluation. The command runs during parsing.

Applications that use pickle to cache objects in Redis, store session data, pass messages through Celery queues, or persist model state are all vulnerable if any attacker-controlled data can reach pickle.loads().

Other Affected Languages

.NET BinaryFormatter

The .NET BinaryFormatter class is a direct analog to Java's ObjectInputStream. It deserializes arbitrary types from a binary stream, executes type constructors during reconstruction, and has been the source of numerous critical vulnerabilities in Microsoft products. Microsoft deprecated BinaryFormatter in .NET 5 and removed it from .NET 7. Legacy applications using it remain vulnerable through gadget chains in the .NET Framework class library.

Ruby Marshal

Ruby's Marshal.load deserializes Ruby objects and calls marshal_load on any class that defines it. Gadget chains exploiting Gem::Requirement and related classes in the RubyGems standard library have been well-documented. Rails applications that store serialized objects in cookies using the Marshal format are exploitable if the cookie can be tampered with.

Node.js

Node.js does not have a native serialization format equivalent to Java's ObjectInputStream, but several common libraries introduce the same class of vulnerability. The node-serialize package executes JavaScript functions embedded in serialized JSON objects. Any library that evaluates serialized data as code, or that instantiates JavaScript objects by name from attacker-controlled strings, creates equivalent exposure.

Finding Deserialization Vulnerabilities

The first step is identifying where serialized data enters the application.

Check all cookies. Session cookies, remember-me tokens, and preference cookies are frequent carriers of serialized objects. Decode base64 values and look for known magic bytes. A cookie value of rO0ABXNy... is a Java serialized object.

Review HTTP request bodies. API endpoints that accept binary formats or non-JSON content types may deserialize objects. Look for application/x-java-serialized-object, application/octet-stream, and multipart/form-data endpoints that accept opaque binary fields.

Check for the PHPSESSID and viewstate patterns. PHP session data is sometimes stored with serialization, and ASP.NET ViewState in older applications is a base64-encoded serialized structure.

Look for deserialization libraries in dependency lists. The presence of commons-collections, spring-core, kryo, or similar libraries in a Java application's dependency tree increases the probability of exploitable gadget chains. For Python, search for pickle, marshal, shelve, or joblib.load calls.

Fuzz with known signatures. Replacing a base64-encoded value with a Java serialized object header (rO0AB) and observing an error that differs from normal error responses often indicates that deserialization is occurring.

Remediation

Eliminate Untrusted Deserialization

The most effective remediation is stopping deserialization of untrusted data entirely. Replace native serialization with JSON, Protocol Buffers, or MessagePack. Validate the schema of the data before processing it. JSON parsers do not instantiate arbitrary classes, do not call constructors, and do not execute lifecycle hooks.

Implement Class Allowlists

When native deserialization cannot be avoided, restrict which classes can be instantiated. Java's JEP 415 provides ObjectInputFilter for this purpose:

java
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "com.example.app.SafeClass;!*"
);
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);

This allows only com.example.app.SafeClass to be deserialized. Any other class in the stream causes deserialization to fail. In PHP, unserialize() accepts an allowed_classes parameter with the same effect.

Sign and Verify Serialized Data

If the application serializes data that it later deserializes — for session state, caching, or inter-service communication — cryptographically sign the serialized bytes before storing or transmitting them. Verify the signature before deserialization. An attacker who cannot forge a valid signature cannot supply a malicious payload.

python
import hmac, hashlib, base64
 
def sign(data: bytes, secret: bytes) -> str:
    sig = hmac.new(secret, data, hashlib.sha256).digest()
    return base64.b64encode(sig).decode()
 
def verify_and_load(data: bytes, signature: str, secret: bytes) -> object:
    expected = sign(data, secret)
    if not hmac.compare_digest(expected, signature):
        raise ValueError("Invalid signature — refusing to deserialize")
    return pickle.loads(data)

This pattern does not fix the underlying vulnerability — a compromised secret invalidates it — but it is a practical defense for applications that cannot immediately migrate away from native serialization.

Sandboxed Deserialization

For applications that must deserialize untrusted data from multiple classes, run the deserialization in a sandboxed process with no access to the network, filesystem, or privileged system calls. The sandbox limits what any exploited gadget chain can accomplish. This is a mitigation layer, not a fix, but it significantly reduces the blast radius of a successful exploit.

The Fundamental Problem

Insecure deserialization is dangerous because it inverts the normal attack model. In most injection vulnerabilities, an attacker gets their payload into the application and finds a way to execute it. In deserialization vulnerabilities, execution happens during input processing itself. The application's own libraries become the attack surface.

The lesson for defensive code review is to treat all deserialization of external data as equivalent to eval() — because in these runtimes, it often is. The safeguard is the same: do not parse external data using mechanisms that execute code, and if you must, verify integrity and restrict what can be instantiated before the parsing begins.

Need your application's deserialization attack surface assessed? Get in touch.

Need your application tested?

We find these vulnerabilities in real applications every day. Get a comprehensive security assessment with detailed remediation.

Request an Assessment

Summary

Insecure deserialization occurs when an application reconstructs objects from untrusted serialized data without validating what is being instantiated. In languages like Java and PHP, the deserialization process itself triggers method execution — no secondary injection step required — making this one of the few vulnerability classes where parsing user input is enough to achieve remote code execution.

Key Takeaways

  • 1Insecure deserialization occurs when applications reconstruct objects from untrusted data without validating class types before the deserialization process executes
  • 2Java deserialization gadget chains exploit existing library classes on the application's classpath — the application does not need to contain vulnerable code itself, only a vulnerable dependency
  • 3PHP object injection exploits magic methods like __wakeup, __destruct, and __toString that execute automatically during and after unserialize() calls
  • 4Python's pickle module is explicitly unsafe for untrusted input — any pickled object can contain a __reduce__ method that executes arbitrary system commands when loaded
  • 5Remediation requires eliminating deserialization of untrusted data, replacing it with safer formats like JSON, or implementing strict class allowlists before any deserialization occurs

Frequently Asked Questions

Insecure deserialization occurs when an application takes a serialized object — a structured binary or text representation of program state — from an untrusted source and reconstructs it into a live object without validating what it is being asked to create. The danger is that the deserialization process itself may execute code embedded in or triggered by the serialized data, before any application-level validation has the opportunity to run. Code execution happens during the act of parsing, not through a secondary step.

Java gadget chains exploit the readObject() method called during native deserialization. By crafting a serialized payload that instantiates specific classes from popular libraries on the application's classpath — such as Apache Commons Collections or Spring Framework — an attacker chains method invocations so that reconstructing the outer object triggers a sequence of calls that ultimately executes an operating system command. The application does not need to contain vulnerable code. It only needs to have a gadget library present as a dependency.

Look for serialized data in cookies, HTTP headers, POST bodies, and hidden form fields. Java serialized objects start with the magic bytes AC ED 00 05 in hex, or rO0AB when base64-encoded. PHP serialized strings begin with O: followed by a class name length and the class name. Cookies with names like JSESSIONID, rememberMe, or viewstate often carry serialized objects. Also inspect API endpoints that accept binary data or content types like application/x-java-serialized-object.

Yes. The pickle module is explicitly designed to reconstruct arbitrary Python objects, including those with __reduce__ methods that execute system commands. The Python documentation explicitly warns that pickle data from untrusted sources should never be loaded. Any application that deserializes pickle data from user-controlled input — cookies, API request bodies, message queue payloads — is potentially vulnerable to arbitrary code execution with the privileges of the process.

The safest remediation is to avoid deserializing data from untrusted sources entirely. Replace native serialization with JSON or XML and validate the schema independently of parsing. If native deserialization is unavoidable, implement a class allowlist that restricts which types can be instantiated before any object is reconstructed — in Java via JEP 415 deserialization filters, in PHP via the allowed_classes parameter to unserialize(). Cryptographically sign serialized data and verify the signature before deserializing. Never deserialize data that you cannot authenticate.