otx: Objective-C disassembly

Let’s say that you have a closed-source Objective-C program on your Mac that needs a minor modification. Maybe the company is slow to send the license file that you paid for, or it went out of business and the code will never get fixed.

I’m not very sophisticated at this sort of thing, but I’ve twiddled a few bits here and there in the past. It was usually simple stuff, like replacing a set of instructions with no-ops or altering a constant in a move or arithmetic instruction. It was always in regular C programs, though, so the standard GNU binutils like nm and objdump, together with xxd, were enough to get an idea of what was going on in the program. Unfortunately, these tools aren’t sufficient by themselves for an Objective-C program.

Here is an excerpt of plain objdump -d output for the binary that lives inside of GitX:

4018:       89 74 24 04             mov    %esi,0x4(%esp)
401c:       89 04 24                mov    %eax,(%esp)
401f:       e8 88 21 02 00          call   261ac <.objc_class_name_PBNSURLPathUserDefaultsTransfomer+0x3b0c>
4024:       83 c4 20                add    $0x20,%esp
4027:       31 c0                   xor    %eax,%eax
4029:       5b                      pop    %ebx
402a:       5e                      pop    %esi
402b:       c9                      leave
402c:       c3                      ret
402d:       55                      push   %ebp
402e:       89 e5                   mov    %esp,%ebp
4030:       83 ec 38                sub    $0x38,%esp
4033:       89 5d f4                mov    %ebx,0xfffffff4(%ebp)
4036:       89 75 f8                mov    %esi,0xfffffff8(%ebp)
4039:       8b 75 10                mov    0x10(%ebp),%esi
403c:       89 7d fc                mov    %edi,0xfffffffc(%ebp)
403f:       8b 3d 30 1b 02 00       mov    0x21b30,%edi
4045:       c7 44 24 10 00 00 00    movl   $0x0,0x10(%esp)

Here is the same range of bytes interpreted by otx (otx -e < /Applications/GitX.app/Contents/MacOS/GitX):

+100  00004018  89742404                movl        %esi,0x04(%esp)
+104  0000401c  890424                  movl        %eax,(%esp)
+107  0000401f  e888210200              calll       0x000261ac                    _objc_assign_strongCast
+112  00004024  83c420                  addl        $0x20,%esp
+115  00004027  31c0                    xorl        %eax,%eax
+117  00004029  5b                      popl        %ebx
+118  0000402a  5e                      popl        %esi
+119  0000402b  c9                      leave
+120  0000402c  c3                      ret

+(BOOL)[PBGitRepository isBareRepository:]
+0  0000402d  55                      pushl       %ebp
+1  0000402e  89e5                    movl        %esp,%ebp
+3  00004030  83ec38                  subl        $0x38,%esp
+6  00004033  895df4                  movl        %ebx,0xf4(%ebp)
+9  00004036  8975f8                  movl        %esi,0xf8(%ebp)
+12  00004039  8b7510                  movl        0x10(%ebp),%esi
+15  0000403c  897dfc                  movl        %edi,0xfc(%ebp)
+18  0000403f  8b3d301b0200            movl        0x00021b30,%edi               PBEasyPipe
+24  00004045  c744241000000000        movl        $0x00000000,0x10(%esp)

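As you can see, otx makes a big difference in terms of readability. It recovers the method names and boundaries from the binary and annotates calls and data references with symbol names whenever possible. Without those annotations a raw disassembly is tricky to read, because with dynamic dispatch nearly every method call goes through the runtime’s message-send routine (objc_msgSend) instead of a direct call to a named function.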
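The target address in the calll line of the listing can also be recovered by hand from the instruction bytes. Here is a minimal Python sketch, assuming the standard x86 encoding in which the e8 opcode is followed by a signed, little-endian 32-bit displacement measured from the address of the next instruction. The helper name call_target is made up for this illustration and is not part of otx.

```python
import struct

def call_target(address, raw):
    """Return the destination of an x86 near call (opcode e8) located at `address`."""
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 rel32 call")
    # The four bytes after the opcode are a signed little-endian displacement.
    (rel32,) = struct.unpack("<i", raw[1:5])
    # The displacement is relative to the address of the *next* instruction.
    return (address + len(raw) + rel32) & 0xFFFFFFFF

# Bytes of the call at 0000401f in the listing above.
print(hex(call_target(0x401F, bytes.fromhex("e888210200"))))  # 0x261ac
```

Because the displacement is relative, the same five bytes would point somewhere else if the instruction sat at a different address.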

So you just take the byte offsets of the instructions you want to replace and overwrite them with the appropriate opcodes in your hex editor. If you don’t know the encoding of the replacement instructions off the top of your head, you can write up a little bit of assembly, assemble it with as, and then disassemble the result. On x86/x86_64, 0x90 is the opcode for nop, which is the instruction that I use most often as filler. The filler is pretty important: the patch has to occupy exactly as many bytes as the code it replaces, since shifting the byte offsets of any of the other instructions would make your jump locations, and possibly your variable references, wrong.
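The same patch can be scripted instead of done by hand in a hex editor. What follows is only a sketch under stated assumptions: nop_out is a hypothetical helper, and file_offset is a position in the file on disk, which is not necessarily the address printed in the disassembly (the mapping depends on how the binary is laid out). The function checks that the bytes it is about to overwrite are the ones you expect, then writes the same number of 0x90 bytes so that nothing after the patch moves. The demo runs on a throwaway file holding three instructions copied from the listing, not on a real program.

```python
import os
import tempfile

NOP = 0x90  # single-byte no-op on x86/x86_64

def nop_out(path, file_offset, expected):
    """Overwrite len(expected) bytes at file_offset with NOPs; the file size is unchanged."""
    with open(path, "r+b") as f:
        f.seek(file_offset)
        current = f.read(len(expected))
        # Refuse to patch if the file does not contain what we think it does.
        if current != expected:
            raise ValueError(f"unexpected bytes at {file_offset:#x}: {current.hex()}")
        f.seek(file_offset)
        f.write(bytes([NOP]) * len(expected))

# Demo on a scratch file: movl, calll, addl from the listing (illustrative only).
original = bytes.fromhex("890424" "e888210200" "83c420")
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(original)
    scratch_path = tmp.name

nop_out(scratch_path, 3, bytes.fromhex("e888210200"))  # blank out the 5-byte call

with open(scratch_path, "rb") as f:
    patched = f.read()
os.remove(scratch_path)

print(patched.hex())
```

On a real binary it is safer to run this against a copy, so the original is still there if the patch turns out to be wrong.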