Cracking Java byte-code encryption

Why Java obfuscation schemes based on byte-code encryption won't work

May 9, 2003

Q: If I encrypt my .class files and use a custom classloader to load and decrypt them on the fly, will this prevent decompilation?

A: The problem of preventing Java byte-code decompilation is almost as old as the language itself. Despite a range of obfuscation tools available on the market, novice Java programmers continue to think of new and clever ways to protect their intellectual property. In this Java Q&A installment, I dispel some myths around an idea frequently rehashed in discussion forums.

The extreme ease with which Java .class files can be reconstructed into Java sources that closely resemble the originals has a lot to do with Java byte-code design goals and trade-offs. Among other things, Java byte code was designed for compactness, platform independence, network mobility, and ease of analysis by byte-code interpreters and JIT (just-in-time)/HotSpot dynamic compilers. Arguably, the compiled .class files express the programmer’s intent so clearly they could be easier to analyze than the original source code.

Several things can be done, if not to prevent decompilation completely, at least to make it more difficult. For example, as a post-compilation step you could massage the .class data to make the byte code either harder to read when decompiled or harder to decompile into valid Java code (or both). Techniques like performing extreme method name overloading work well for the former, and manipulating control flow to create control structures not possible to represent through Java syntax work well for the latter. The more successful commercial obfuscators use a mix of these and other techniques.
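To make the renaming idea concrete, here is a minimal sketch of what extreme method name overloading might look like if the result were written back as source. The classes InvoiceBefore and InvoiceAfter and their methods are hypothetical and exist only for illustration. Real obfuscators work on byte code and can go further than Java syntax allows, for example by overloading on return type alone.

```java
// Hypothetical "before" class: the names document the programmer's intent.
final class InvoiceBefore {
    static int countBytes(byte[] payload) { return payload.length; }
    static String formatTotal(int total) { return "total=" + total; }
    static boolean isBlank(String text) { return text.isEmpty(); }
}

// Hypothetical "after" class: every method and parameter is now called 'a'.
// The behavior is identical, but a decompiled listing carries no hints.
final class InvoiceAfter {
    static int a(byte[] a) { return a.length; }
    static String a(int a) { return "total=" + a; }
    static boolean a(String a) { return a.isEmpty(); }
}

final class OverloadingDemo {
    public static void main(String[] args) {
        // The compiler still picks the right overload from the argument types.
        System.out.println(InvoiceAfter.a(new byte[3]));
        System.out.println(InvoiceAfter.a(7));
        System.out.println(InvoiceAfter.a(""));
    }
}
```

The program still runs exactly as before, which is the point: only the human reader loses information.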

Unfortunately, both approaches must actually change the code the JVM will run, and many users are afraid (rightfully so) that this transformation may add new bugs to their applications. Furthermore, method and field renaming can cause reflection calls to stop working. Changing actual class and package names can break several other Java APIs (JNDI (Java Naming and Directory Interface), URL providers, etc.). In addition to altered names, if the association between class byte-code offsets and source line numbers is altered, recovering the original exception stack traces could become difficult.
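The reflection problem is easy to reproduce. The sketch below simulates it under stated assumptions: Billing and BillingObfuscated are hypothetical classes standing in for the same class before and after a renaming pass, and the lookup string "computeTotal" stands in for a method name hard-coded somewhere else in the application (a string the obfuscator did not rewrite).

```java
// Hypothetical class as the programmer wrote it.
final class Billing {
    public static int computeTotal() { return 42; }
}

// The same class after a renaming pass (simulated by hand here).
final class BillingObfuscated {
    public static int a() { return 42; }
}

final class ReflectionRenameDemo {
    /** Returns true if a public no-argument method with the given name exists. */
    static boolean canFind(Class<?> type, String methodName) {
        try {
            type.getMethod(methodName);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The string literal still says "computeTotal" in both cases.
        System.out.println(canFind(Billing.class, "computeTotal"));           // true
        System.out.println(canFind(BillingObfuscated.class, "computeTotal")); // false
    }
}
```

This is why obfuscators typically need an exclusion list for any name that is looked up by string at runtime.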

Then there is the option of obfuscating the original Java source code. But fundamentally this causes a similar set of problems.

Encrypt, not obfuscate?

Perhaps the above has made you think, “Well, what if instead of manipulating byte code I encrypt all my classes after compilation and decrypt them on the fly inside the JVM (which can be done with a custom classloader)? Then the JVM executes my original byte code and yet there is nothing to decompile or reverse engineer, right?”

Unfortunately, you would be wrong, both in thinking that you were the first to come up with this idea and in thinking that it actually works. And the reason has nothing to do with the strength of your encryption scheme.

A simple class encoder

To illustrate this idea, I implemented a sample application and a very trivial custom classloader to run it. The application consists of two short classes:

import my.secret.code.MySecretClass;

public class Main
{
    public static void main (final String [] args)
    {   
        System.out.println ("secret result = " + MySecretClass.mySecretAlgorithm ());
    }
} // End of class

package my.secret.code;
import java.util.Random;
public class MySecretClass
{
    /**
     * Guess what, the secret algorithm just uses a random number generator... 
     */
    public static int mySecretAlgorithm ()
    {
        return (int) s_random.nextInt ();
    }
    private static final Random s_random = new Random (System.currentTimeMillis ());
} // End of class

My aspiration is to hide the implementation of my.secret.code.MySecretClass by encrypting the relevant .class files and decrypting them on the fly at runtime. To that effect, I use the following tool (some details omitted; you can download the full source from Resources):

public class EncryptedClassLoader extends URLClassLoader
{    
    public static void main (final String [] args)
        throws Exception
    {        
        // Check the argument count first so that a missing argument produces
        // the usage message instead of an ArrayIndexOutOfBoundsException:
        if ((args.length >= 3) && "-run".equals (args [0]))
        {
            // Create a custom loader that will use the current loader as
            // delegation parent:
            final ClassLoader appLoader =
                new EncryptedClassLoader (EncryptedClassLoader.class.getClassLoader (),
                new File (args [1]));
            
            // Thread context loader must be adjusted as well:
            Thread.currentThread ().setContextClassLoader (appLoader);
            
            final Class app = appLoader.loadClass (args [2]);
            
            final Method appmain = app.getMethod ("main", new Class [] {String [].class});
            final String [] appargs = new String [args.length - 3];
            System.arraycopy (args, 3, appargs, 0, appargs.length);
            
            appmain.invoke (null, new Object [] {appargs});
        }
        else if ((args.length >= 3) && "-encrypt".equals (args [0]))
        {
            ... encrypt specified classes ...
        }
        else
            throw new IllegalArgumentException (USAGE);        
    }
    
    /**
     * Overrides java.lang.ClassLoader.loadClass() to change the usual parent-child
     * delegation rules just enough to be able to "snatch" application classes
     * from under system classloader's nose.
     */
    public Class loadClass (final String name, final boolean resolve)
        throws ClassNotFoundException
    {
        if (TRACE) System.out.println ("loadClass (" + name + ", " + resolve + ")");
        
        Class c = null;
        
        // First, check if this class has already been defined by this classloader
        // instance:
        c = findLoadedClass (name);
        
        if (c == null)
        {
            Class parentsVersion = null;
            try
            {
                // This is slightly unorthodox: do a trial load via the
                // parent loader and note whether the parent delegated or not;
                // what this accomplishes is proper delegation for all core
                // and extension classes without my having to filter on class name: 
                parentsVersion = getParent ().loadClass (name);
                
                if (parentsVersion.getClassLoader () != getParent ())
                    c = parentsVersion;
            }
            catch (ClassNotFoundException ignore) {}
            catch (ClassFormatError ignore) {}
            
            if (c == null)
            {
                try
                {
                    // OK, either 'c' was loaded by the system (not the bootstrap
                    // or extension) loader (in which case I want to ignore that
                    // definition) or the parent failed altogether; either way I
                    // attempt to define my own version:
                    c = findClass (name);
                }
                catch (ClassNotFoundException ignore)
                {
                    // If that failed, fall back on the parent's version
                    // [which could be null at this point]:
                    c = parentsVersion;
                }
            }
        }
        
        if (c == null)
            throw new ClassNotFoundException (name);
        
        if (resolve)
            resolveClass (c);
        
        return c;
    }
        
    /**
     * Overrides java.net.URLClassLoader.findClass() to be able to call
     * crypt() before defining a class.
     */
    protected Class findClass (final String name)
        throws ClassNotFoundException
    {
        if (TRACE) System.out.println ("findClass (" + name + ")");
        
        // .class files are not guaranteed to be loadable as resources;
        // but if Sun's code does it, so perhaps can mine...
        final String classResource = name.replace ('.', '/') + ".class";
        final URL classURL = getResource (classResource);
        
        if (classURL == null)
            throw new ClassNotFoundException (name);
        else
        {
            InputStream in = null;
            try
            {
                in = classURL.openStream ();
                final byte [] classBytes = readFully (in);
                
                // "decrypt":
                crypt (classBytes);
                if (TRACE) System.out.println ("decrypted [" + name + "]");
                
                return defineClass (name, classBytes, 0, classBytes.length);
            }
            catch (IOException ioe)
            {
                throw new ClassNotFoundException (name);
            }
            finally
            {
                if (in != null) try { in.close (); } catch (Exception ignore) {}
            }
        }
    }
    
    /**
     * This classloader is only capable of custom loading from a single directory. 
     */
    private EncryptedClassLoader (final ClassLoader parent, final File classpath)
        throws MalformedURLException
    {
        super (new URL [] {classpath.toURL ()}, parent);
        
        if (parent == null)
            throw new IllegalArgumentException ("EncryptedClassLoader" +
                " requires a non-null delegation parent");
    }
    
    /**
     * De/encrypts binary data in a given byte array. Calling the method again
     * reverses the encryption.
     */
    private static void crypt (final byte [] data)
    {
        for (int i = 8; i < data.length; ++ i) data [i] ^= 0x5A;
    }
    ... more helper methods ...
    
} // End of class

EncryptedClassLoader has two basic operations: encrypting a given set of classes in a given classpath directory and running a previously encrypted application. The encryption is very straightforward: it basically flips some bits of every byte in the binary class contents after the first eight bytes (the class file's magic number and version fields, which crypt() leaves untouched). (Yes, the good old XOR (exclusive OR) is almost no encryption at all, but bear with me. This is just an illustration.)
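The self-inverting property of crypt() is what lets the same method serve for both encryption and decryption. The following sketch copies the crypt() loop from the listing above and applies it to a small made-up byte array; the sample bytes are an assumption for illustration and do not form a valid class file beyond the leading magic number and version fields.

```java
import java.util.Arrays;

final class XorCryptDemo {
    // Same logic as crypt() in EncryptedClassLoader: skip the first 8 bytes
    // (magic number plus minor/major version) and XOR the rest with 0x5A.
    static void crypt(final byte[] data) {
        for (int i = 8; i < data.length; ++i) data[i] ^= 0x5A;
    }

    public static void main(String[] args) {
        // Made-up sample: 0xCAFEBABE, a version field, then arbitrary bytes.
        final byte[] original = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x30,
            0x00, 0x1D, 0x0A, 0x00
        };
        final byte[] work = original.clone();

        crypt(work); // "encrypt"
        System.out.println("header intact: "
            + Arrays.equals(Arrays.copyOf(work, 8), Arrays.copyOf(original, 8))); // true
        System.out.println("body changed:  " + !Arrays.equals(work, original));   // true

        crypt(work); // "decrypt"
        System.out.println("round trip ok: " + Arrays.equals(work, original));    // true
    }
}
```

Because XOR with a constant is its own inverse, no key management is involved at all, which is part of why this counts as illustration rather than encryption.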

Classloading by EncryptedClassLoader deserves a little more attention. My implementation subclasses java.net.URLClassLoader and overrides both loadClass() and findClass() to accomplish two goals. One is to bend the usual Java 2 classloader delegation rules and get a chance to load an encrypted class before the system classloader does it, and another is to invoke crypt() immediately before the call to defineClass() that otherwise happens inside URLClassLoader.findClass().

After compiling everything into the bin directory:

>javac -d bin src/*.java src/my/secret/code/*.java

I “encrypt” both Main and MySecretClass classes:

>java -cp bin EncryptedClassLoader -encrypt bin Main my.secret.code.MySecretClass
encrypted [Main.class]
encrypted [my\secret\code\MySecretClass.class]

These two classes in bin have now been replaced with encrypted versions. Launching Main directly fails with a ClassFormatError, so to run the original application I must go through EncryptedClassLoader:

>java -cp bin Main
Exception in thread "main" java.lang.ClassFormatError: Main (Illegal constant pool type)
        at java.lang.ClassLoader.defineClass0(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:502)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:250)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:54)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:193)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:186)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:299)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:265)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:255)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:315)
>java -cp bin EncryptedClassLoader -run bin Main
decrypted [Main]
decrypted [my.secret.code.MySecretClass]
secret result = 1362768201

Sure enough, running any decompiler (such as Jad) on encrypted classes does not work.

Time to add a sophisticated password protection scheme, wrap this into a native executable, and charge hundreds of dollars for a “software protection solution,” right? Of course not.

ClassLoader.defineClass(): The inevitable intercept point

All ClassLoaders have to deliver their class definitions to the JVM via one well-defined API point: the java.lang.ClassLoader.defineClass() method. The ClassLoader API has several overloads of this method, but all of them call into the defineClass(String, byte[], int, int, ProtectionDomain) method. It is a final method that calls into JVM native code after doing a few checks. It is important to understand that no classloader can avoid calling this method if it wants to create a new Class.

The defineClass() method is the only place where the magic of creating a Class object out of a flat byte array can take place. And guess what, the byte array must contain the unencrypted class definition in a well-documented format (see the class file format specification). Breaking the encryption scheme is now a simple matter of intercepting all calls to this method and decompiling all interesting classes to your heart’s desire (I mention another option, JVM Profiler Interface (JVMPI), later).

Doing this interception is not hard at all. In fact, breaking my own protection scheme takes less time than it took to implement it! First, I get the source for java.lang.ClassLoader for my Java 2 Platform, Standard Development Kit (J2SDK) and modify defineClass(String, byte[], int, int, ProtectionDomain) to have some additional class logging:

    ...
        c = defineClass0(name, b, off, len, protectionDomain);
        
        // Intercept classes defined by the system loader and its children:
        if (isAncestor (getSystemClassLoader ().getParent ()))
        {
            // Choose your own dump location here [use an absolute pathname]:
            final File parentDir = new File ("c:/TEMP/classes/");
            File dump = new File (parentDir,
                name.replace ('.', File.separatorChar) + "[" +
                getClass ().getName () + "@" + 
                Long.toHexString (System.identityHashCode (this)) + "].class");
                            
            dump.getParentFile ().mkdirs ();
            
            FileOutputStream out = null;
            try
            {
                out = new FileOutputStream (dump);                
                out.write (b, off, len);
            }
            catch (IOException ioe)
            {
                ioe.printStackTrace (System.out);
            }
            finally
            {
                if (out != null) try { out.close (); } catch (Exception ignore) {}
            }
        }
    ...

Note that the added lines are guarded by an if statement that filters for classes loaded by the system (-classpath) classloader and its descendants. Also, the logging occurs only if defineClass() does not fail. Finally, because it is not inconceivable that more than one ClassLoader instance might load a class, I disambiguate the results by embedding the classloader identity in the dumped filename.

The final step is to temporarily replace rt.jar used by my Java Runtime Environment (JRE) (note that it could be different from the one used by J2SDK) with one that contains my doctored java.lang.ClassLoader implementation. Or, you could use the -Xbootclasspath/p option.

I run the encrypted application again and voila, I have recovered all my unencrypted and thus easily decompilable .class definitions. And note I have not used any knowledge of EncryptedClassLoader inner workings to accomplish this.

Observe that if I did not want to instrument a system class, I could have used other options such as a custom JVMPI agent that handles JVMPI_EVENT_CLASS_LOAD_HOOK events.
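For readers on later JVMs, the same class-load hook is reachable from pure Java through the java.lang.instrument API (added in J2SE 5.0, so it postdates the setup described in this article). The sketch below is not the JVMPI agent mentioned above but a swapped-in equivalent under stated assumptions: the class names ClassDumpAgent and DumpingTransformer, the default directory "class-dump", and the use of the agent argument as the dump directory are all hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.ProtectionDomain;

// Agent entry point. In a real agent jar this class is conventionally
// declared public in its own source file and named in the manifest's
// Premain-Class attribute.
final class ClassDumpAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        // Assumption of this sketch: the agent argument is the dump directory.
        Path dir = Paths.get(agentArgs == null ? "class-dump" : agentArgs);
        inst.addTransformer(new DumpingTransformer(dir));
    }
}

// Writes every class definition it sees to disk and leaves it unmodified.
final class DumpingTransformer implements ClassFileTransformer {
    private final Path dumpDir;

    DumpingTransformer(Path dumpDir) {
        this.dumpDir = dumpDir;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // loader == null means the bootstrap loader; skip those classes.
        // Unlike the patched ClassLoader above, this simple filter also
        // lets through classes from any other non-bootstrap loader.
        if (loader != null && className != null) {
            try {
                // className uses '/' separators, e.g. my/secret/code/MySecretClass
                Path target = dumpDir.resolve(className + ".class");
                Files.createDirectories(target.getParent());
                Files.write(target, classfileBuffer);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return null; // null tells the JVM to keep the class bytes as they are
    }
}
```

Packaged into a jar (here hypothetically named dump-agent.jar) whose manifest contains Premain-Class: ClassDumpAgent, it would be attached with something like java -javaagent:dump-agent.jar=c:/TEMP/classes -cp bin EncryptedClassLoader -run bin Main. As with the patched ClassLoader, the agent needs no knowledge of how the custom classloader decrypts its input, because it sees the bytes only once they are handed to the JVM.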

Lessons learned

I hope you found this quick excursion into details of Java classloading interesting. An important point to realize is that some tools on the market promise solutions to Java’s easy reverse engineering problem through class encryption, and you should think twice before buying one. Until JVM architecture changes to, say, support class decoding inside native code, you will be better off with traditional obfuscators that perform byte-code transformations.

There is another, more useful side to such tricks as well: debugging Java classloading. Being able to get a load trace for a custom classloader could be invaluable, especially if you are trying to track down the cause of a classloader constraint violation (more on this in future Java Q&A posts). So, maybe Java was born to be a language for pure open source development after all? Of course, other architectures based on platform-neutral byte code (such as .Net) are equally prone to reverse engineering. I will leave you with this thought for now.

Vladimir Roubtsov has programmed in a variety of languages for more than 13 years, including Java since 1995. Currently, he develops enterprise software as a senior engineer for Trilogy in Austin, Texas.