
Decompile

Nymphoenix edited this page Apr 28, 2016 · 30 revisions

Decompiling

Cruiser divides the decompilation of Java bytecode into several parts:

  • Parameter and Local Variable Analysis
  • Arithmetic
  • Arrays
  • Accessing Constant Pool
  • Control Flow
  • Invoking Methods and Constructors
  • Throwing and Handling Exceptions
  • Synchronization
  • Annotations

Parameter and Local Variable Analysis

When parsing the method signature in method_info (the parameter table and return type), Cruiser counts the number of parameters (Np) and passes it to the decompiling module. Cruiser then retrieves the value (Nl) of the max_locals member of the corresponding Code attribute.
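The counting step can be sketched as follows. This is a minimal illustration, not Cruiser's actual code: the class `DescriptorParams` and its methods are hypothetical names, and the input is assumed to be a well-formed JVM method descriptor such as `(IJLjava/lang/String;[D)V`. The second method reflects the JVM rule that `long` and `double` parameters occupy two local variable slots and that instance methods keep `this` in slot 0, which is why Np and Nl are not directly comparable without it.

```java
import java.util.ArrayList;
import java.util.List;

class DescriptorParams {
    /** Splits the parameter part of a method descriptor into one entry per parameter. */
    static List<String> parameterTypes(String descriptor) {
        List<String> types = new ArrayList<>();
        int i = descriptor.indexOf('(') + 1;
        int end = descriptor.indexOf(')');
        while (i < end) {
            int start = i;
            while (descriptor.charAt(i) == '[') {
                i++;                                   // skip array dimensions
            }
            if (descriptor.charAt(i) == 'L') {
                i = descriptor.indexOf(';', i);        // object types end at ';'
            }
            i++;                                       // consume the base type char or ';'
            types.add(descriptor.substring(start, i));
        }
        return types;
    }

    /** Local variable slots taken by the parameters (long and double take two). */
    static int parameterSlots(List<String> types, boolean isStatic) {
        int slots = isStatic ? 0 : 1;                  // slot 0 holds `this` for instance methods
        for (String t : types) {
            slots += (t.equals("J") || t.equals("D")) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        List<String> types = parameterTypes("(IJLjava/lang/String;[D)V");
        System.out.println("Np = " + types.size());                    // Np = 4
        System.out.println("slots = " + parameterSlots(types, true));  // slots = 5
    }
}
```

Any local variable index at or above the parameter slot count and below Nl would then belong to a variable declared inside the method body.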

Context Free Grammar

To parse bytecode into tokens, we need to define a set of terminals, a set of non-terminals, a set of productions and a start symbol.

Mapping opcodes into symbols

Opcode                                            Symbol
From aconst_null(0x01) to aload_3(0x2d)           V
From iaload(0x2e) to saload(0x35)                 V'
From istore(0x36) to astore_3(0x4e)               A
From iastore(0x4f) to sastore(0x56)               A'
From iadd(0x60) to iinc(0x84)                     Σ
From lcmp(0x94) to if_acmpne(0xa6)                C
goto(0xa7), goto_w(0xc8)                          G
jsr(0xa8), jsr_w(0xc9)                            J
ret(0xa9)                                         R'
tableswitch(0xaa)                                 ST
lookupswitch(0xab)                                SL
From ireturn(0xac) to return(0xb1)                R
From getstatic(0xb2) to putfield(0xb5)            F
From invokevirtual(0xb6) to invokedynamic(0xba)   I
new(0xbb)                                         N
newarray(0xbc), anewarray(0xbd)                   NA
multianewarray(0xc5)                              NM
arraylength(0xbe)                                 L
athrow(0xbf)                                      T
checkcast(0xc0)                                   X
instanceof(0xc1)                                  CI
monitorenter(0xc2)                                Mi
monitorexit(0xc3)                                 Mo
wide(0xc4)                                        W
ifnull(0xc6), ifnonnull(0xc7)                     CO
Other                                             ε
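The table above can be expressed as a lookup function. The sketch below is only an illustration of the mapping: `OpcodeSymbols` and `symbolOf` are hypothetical names, and symbols are returned as plain strings for brevity (Σ and ε are written as Unicode escapes so the file compiles regardless of source encoding).

```java
class OpcodeSymbols {
    /** Returns the grammar symbol for a JVM opcode, following the opcode table. */
    static String symbolOf(int opcode) {
        if (opcode >= 0x01 && opcode <= 0x2d) return "V";        // constants and loads
        if (opcode >= 0x2e && opcode <= 0x35) return "V'";       // array loads
        if (opcode >= 0x36 && opcode <= 0x4e) return "A";        // stores
        if (opcode >= 0x4f && opcode <= 0x56) return "A'";       // array stores
        if (opcode >= 0x60 && opcode <= 0x84) return "\u03A3";   // arithmetic (Sigma)
        if (opcode >= 0x94 && opcode <= 0xa6) return "C";        // comparisons and branches
        if (opcode >= 0xac && opcode <= 0xb1) return "R";        // returns
        if (opcode >= 0xb2 && opcode <= 0xb5) return "F";        // field access
        if (opcode >= 0xb6 && opcode <= 0xba) return "I";        // invocations
        switch (opcode) {
            case 0xa7: case 0xc8: return "G";                    // goto, goto_w
            case 0xa8: case 0xc9: return "J";                    // jsr, jsr_w
            case 0xa9: return "R'";                              // ret
            case 0xaa: return "ST";                              // tableswitch
            case 0xab: return "SL";                              // lookupswitch
            case 0xbb: return "N";                               // new
            case 0xbc: case 0xbd: return "NA";                   // newarray, anewarray
            case 0xc5: return "NM";                              // multianewarray
            case 0xbe: return "L";                               // arraylength
            case 0xbf: return "T";                               // athrow
            case 0xc0: return "X";                               // checkcast
            case 0xc1: return "CI";                              // instanceof
            case 0xc2: return "Mi";                              // monitorenter
            case 0xc3: return "Mo";                              // monitorexit
            case 0xc4: return "W";                               // wide
            case 0xc6: case 0xc7: return "CO";                   // ifnull, ifnonnull
            default: return "\u03B5";                            // everything else (epsilon)
        }
    }

    public static void main(String[] args) {
        System.out.println(symbolOf(0x1b)); // iload_1 -> V
        System.out.println(symbolOf(0x3e)); // istore_3 -> A
    }
}
```

Opcodes outside the listed ranges, such as `nop` (0x00) or `pop` (0x57), fall through to ε.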

Productions

Given the code (E) retrieved from the Code attribute, E can derive:

E  → E R                 // return from the method
   | V"                  // solo value
   | V" A                // assignment
   | V" V" A'            // assignment of array entry
   | G E C'              // while loop
   | C' E E              // if-else
   | S                   // switch
   | T"                  // throw a Throwable
   | T'                  // handle exceptions
   | Mi E Mo             // synchronized block
   | ε                   // empty
   | W'                  // wide-prefixed instruction

V" → V                   // solo value
   | V" V" V'            // retrieve entry from array
   | V" L                // get length of the array
   | V" V" Σ             // arithmetic calculation
   | N I                 // new object
   | V NA                // new array
   | N'M                 // new multi-dimensional array
   | I'                  // invoke methods
   | F'                  // field access

C' → V V C               // compare integer values
   | V V C C             // compare long, float, or double values
   | V CO                // compare an object reference with null
   | V" CI ifeq          // instanceof

I' → V I                 // invoke instance methods without arguments
   | I                   // invoke static methods without arguments
   | V I'                // invoke methods with arguments

F' → V V F               // putfield
   | V F                 // putstatic, getfield
   | F                   // getstatic

N'M → V N'M              // recursive
    | NM

S  → V ST E'             // table switch
   | V SL E'             // lookup switch
E' → E' E                // expressions following switches

T" → V" T                // throw
T' → E Y                 // single try-catch block
   | T' Y                // try and multi-catch block
Y  → A E                 // catch block

F" → E F"                // single try-finally could result from this grammar
   | G Y F"              // add one catch block
   | G F"                // `javac` compiler used to output necessary `goto`
   | J R  A J T" A E R'  // finally block

W' → W iinc
   | W iload
   | W fload
   | W aload
   | W lload
   | W dload
   | W istore
   | W fstore
   | W astore
   | W lstore
   | W dstore
   | W ret
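To illustrate how a few of these productions could be applied, the sketch below reduces a token stream with a value stack. It covers only `V" → V`, `V" → V" V" Σ`, `V" → V" L`, and `E → V" A`, and it assumes every Σ token is a binary operator (so `ineg` and `iinc` are out of scope). The `ReduceSketch` and `Token` names, and the idea of attaching source-level text to each token, are hypothetical and not taken from Cruiser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class ReduceSketch {
    /** A grammar symbol paired with hypothetical source-level text (variable name or operator). */
    static final class Token {
        final String symbol;
        final String text;
        Token(String symbol, String text) { this.symbol = symbol; this.text = text; }
    }

    /** Reduces tokens to source-like statements using a stack of pending V" values. */
    static List<String> reduce(List<Token> tokens) {
        Deque<String> values = new ArrayDeque<>();
        List<String> statements = new ArrayList<>();
        for (Token t : tokens) {
            switch (t.symbol) {
                case "V":                                   // V" -> V
                    values.push(t.text);
                    break;
                case "\u03A3": {                            // V" -> V" V" Sigma (binary only)
                    String right = values.pop();
                    String left = values.pop();
                    values.push("(" + left + " " + t.text + " " + right + ")");
                    break;
                }
                case "L":                                   // V" -> V" L
                    values.push(values.pop() + ".length");
                    break;
                case "A":                                   // E -> V" A
                    statements.add(t.text + " = " + values.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unsupported symbol: " + t.symbol);
            }
        }
        while (!values.isEmpty()) {                         // E -> V" (solo values, oldest first)
            statements.add(values.pollLast());
        }
        return statements;
    }

    public static void main(String[] args) {
        // iload_1 iload_2 iadd istore_3 maps to V V Sigma A
        List<Token> tokens = Arrays.asList(
                new Token("V", "a"), new Token("V", "b"),
                new Token("\u03A3", "+"), new Token("A", "c"));
        System.out.println(reduce(tokens)); // [c = (a + b)]
    }
}
```

Because bytecode is a postfix encoding of expressions, each production with trailing operator symbols corresponds to popping operands from the stack and pushing the combined result.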