Make sure you read the previous post before you start on this one. We started analysing a piece of malware and looked at the basics of how to start static analysis. We'll pick up where we left off last time. If you look at the assembly code and scroll up and down, you will see a huge number of CALL statements, many JMP statements and plenty of loops. Yes, you could, as we discussed last time, Step Into each and every call, mark it and go on till you reach the end of the program. That's just fine. So for example, I did this for the next few CALL statements, just to get a clearer picture and see if this strategy works.
00411D9E . E8 DD0A0000 CALL aolsbm_1.00412880 ; Nothing here - intermediate function
00411DA7 . FF15 28014200 CALL DWORD PTR DS:[<&KERNEL32.GetStartupInfoW>] ; \-------------- Windows appearance at startup
00411DBC . FF15 2C014200 CALL DWORD PTR DS:[<&KERNEL32.HeapSetInformation>] ; --------------------- Heap memory
00411E0B > \E8 44050000 CALL aolsbm_1.00412354 ; ------------- Heap memory create
00411E1C > \E8 90410000 CALL aolsbm_1.00415FB1 ; --------------- Process and thread info gather
00411E27 . E8 42FFFFFF CALL aolsbm_1.00411D6E ; Some common function (Later)
00411E2D > \E8 824D0000 CALL aolsbm_1.00416BB4 ; ---------------- Some ntdll function
00411E35 . E8 9F0A0000 CALL aolsbm_1.004128D9 ; -------------------- Get file handles
00411E45 . 59 POP ECX ; All process setting up till here - ignore
00411E46 > \FF15 30014200 CALL DWORD PTR DS:[<&KERNEL32.GetCommandLineA>] ; Get Program's Command Line arguments
00411E51 . E8 BF750000 CALL aolsbm_1.00419415 ; Environment strings
00411E5B . E8 FA740000 CALL aolsbm_1.0041935A ; ------------------------ Get module file name
00411E6C > \E8 73720000 CALL aolsbm_1.004190E4 ; -------------- Getting user directories, startupinfo, env variables etc
00411E90 > \E8 F0710000 CALL aolsbm_1.00419085 ; --------------------- Some path parsing of the executable path again
Yeah, that'll do. I've stepped into each and every one of these (user-defined) calls, studied them briefly and decided whether to delve deeper into them or not. The question each time is: does this bring me closer to understanding what the malware does? If the answer is NO, just comment it and move on. All ok? Yeah ok, except it's going to take a huge amount of time to get to the end of the program. So it is probably a good idea to step back a bit and see if your dynamic analysis can help you move forward a little quicker.
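One thing worth knowing while you comment the listing above: for the direct calls (the ones whose bytes start with E8), the destination is just the address of the next instruction plus the signed 32-bit offset stored after the opcode. Here is a minimal Python sketch of that arithmetic, using bytes copied from the listing; the helper name `call_target` is made up for illustration.

```python
import struct

def call_target(address: int, raw_bytes: bytes) -> int:
    """Return the destination of a 5-byte E8 (CALL rel32) instruction."""
    if len(raw_bytes) != 5 or raw_bytes[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 rel32 CALL")
    # The offset is a little-endian signed 32-bit value, relative to the
    # address of the instruction that follows the CALL (address + 5).
    (rel32,) = struct.unpack("<i", raw_bytes[1:])
    return (address + 5 + rel32) & 0xFFFFFFFF

# 00411D9E . E8 DD0A0000 CALL aolsbm_1.00412880
print(hex(call_target(0x00411D9E, bytes.fromhex("E8DD0A0000"))))  # 0x412880
# 00411E27 . E8 42FFFFFF CALL aolsbm_1.00411D6E (negative offset, jumps backward)
print(hex(call_target(0x00411E27, bytes.fromhex("E842FFFFFF"))))  # 0x411d6e
```

The FF15 calls in the listing are different: they call through a pointer in the import table, so this formula does not apply to them.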
Now you know that the malware definitely connects out to the Internet and does some stuff there. So why not start up Wireshark and see what goes out as you Step Over and Step Into various bits of code? Right? Let's start Wireshark up then. What else? Well, we know that if a connection to the Internet is made, the program must use some Windows Sockets functions to do so, like 'connect', 'send', 'gethostbyname' and so on. We want to stop the program whenever these functions are used, which translates to 'we want to set a breakpoint'. So we need to find out where these functions are used in the program and break there. Hit Ctrl+N and search for the 'connect' function. If you look at the Ctrl+N window in Olly 1.10 and you're a newbie like me, you'll get confused because you won't see any 'connect' function there, and you'll spend time moaning about everything ;). That's because the import is listed by ordinal rather than by name: it shows up as WS2_32.#4. If you search in Olly 2.0x instead, you'll see a line which has WS2_32.connect in the comments section; the function name on that line is WS2_32.#4. So come back to Olly 1.10 (we'll primarily use this as it has more features), right click on the line for WS2_32.#4 and click 'Find references to import'. Promptly a box with the address 00403FD3 comes up, which means there is something at this address that calls the connect function. Right click on this line and set a breakpoint (Toggle breakpoint). Now whenever the code reaches address 00403FD3 it will stop, and you can analyze the function that called connect and work your way backward from there. Easy? No, not really if it is your first time, but logical, yes. It'll get easier the more you do. Let's go on.
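If you want to avoid hopping between two Olly versions just to decode labels like WS2_32.#4, a small lookup table helps. The Python sketch below is only an illustration: the function name `resolve_import` is made up, and the table is a partial list of ws2_32.dll ordinals as I know them (only #4 = connect is confirmed in this post), so double check the others against the DLL's export table on your own system.

```python
# Partial ordinal-to-name table for ws2_32.dll. Treat every entry except
# 4 (confirmed by Olly 2.0x above) as an assumption to verify yourself.
WS2_32_ORDINALS = {
    3: "closesocket",
    4: "connect",
    16: "recv",
    19: "send",
    23: "socket",
    52: "gethostbyname",
    115: "WSAStartup",
}

def resolve_import(label: str) -> str:
    """Turn a label such as 'WS2_32.#4' into 'WS2_32.connect' when known."""
    module, _, symbol = label.partition(".")
    if module.upper() == "WS2_32" and symbol.startswith("#") and symbol[1:].isdigit():
        name = WS2_32_ORDINALS.get(int(symbol[1:]))
        if name is not None:
            return f"{module}.{name}"
    return label  # unknown ordinal, other module, or already a name

print(resolve_import("WS2_32.#4"))  # WS2_32.connect
```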
So that's another rule learnt then: if you are sure that the malware MUST use certain Windows functions for a specific purpose, which you know because you've done dynamic analysis, read up on those functions on MSDN and set breakpoints on all of them. That'll narrow down the scope quite a bit. Let's quickly run through what we've done so far:
--- We understood how to navigate through code
--- We commented functions we didn't know anything about at the moment
--- We stepped into all those functions but soon realized that this way, though exhaustive, is extremely time-consuming
--- We re-visited our dynamic analysis learnings and identified functions that could definitely be used and set breakpoints on them
--- We started Wireshark so we could see what traffic is sent by the executable at every step
Not bad at all. Let's move on. Keep hitting F8 till you pass 00411E90. I'm saying this because I've analysed the code up to there and am quite sure that none of those calls directly affect the malware in any way; look at the comments I've made. If you want though, feel free to F7 into each of those until you are satisfied :). Well, now what? Let's try and run the program directly with F9. At some stage, though we don't know when, we must break at the 'connect' breakpoint we have set. So hit F9. We do break as predicted, but even before that we see a new window open up :). Now there's a big chance that this relates in some way to the malware, so we want to find out how that window appeared.
A good way, and probably the most intuitive way when you're starting off, is to just keep hitting F8 till you see the window pop up. Yes, there probably are more intelligent ways to solve this problem, but it'll do for now. So let's do just that: hit Ctrl+F2 and restart the program. You know that there is nothing interesting till 00411E90 for sure, so instead of hitting F8 till there, right click - Go to Expression - 411E90 - OK and jump there. You can also scroll down to that location if it's not too far. Once you're there hit F4, which makes the program 'Run to selection'. Once you reach 411E90 start hitting F8, as you don't know where the popup is going to come. You don't have to wait too long :).
Pause for a moment when you reach 411EAC and note this location down somewhere. Now hit F8 again. Boom!! There's your popup. What does this mean? It just means that there was something INSIDE the function that was called at 411EAC which caused a popup to appear. This means that the function CALL 00419085 is interesting and we need to know something more about it. So we set a breakpoint here by highlighting that line and pressing F2. Now let's Ctrl+F2 again and hit F9 this time, which effectively tells the program to run till it breaks. It does just that and halts at 411EAC. Since we want to know more about this CALL, we hit F7 and not F8. We are immediately taken to 00419085. Notice there isn't any popup yet; it is some place inside this function which does it. We need to F8 till there to find it. Repeat this process and you see the popup again at the address 40BBA5. Can you see something at 40BBA5 that makes the popup appear? No, it's another call. Put a breakpoint here, restart the program and reach 40BBA5. Now step into the call (F7) at this address (CALL 0040B48F). Once in this call, start hitting F8 again till you reach the address 0040B4F7. Pause a bit and look at the instruction here: a call to 'CreateThread'. Another system function, so let's look at what MSDN says.
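The routine above (F8 until the popup shows, note the call, restart, F7 into it, repeat) is really a small search over the call tree. Here is a toy Python model of it, just to make the loop explicit. The call tree is invented for illustration: `setup` and `helper` are made-up names, and `f_419085` and `f_40B48F` are simply labels borrowed from the addresses in this post.

```python
# Toy call tree: each function lists what it does, in order.
# "CALL:<name>" is a call; "EFFECT" stands for the visible popup.
CALL_TREE = {
    "entry": ["CALL:setup", "CALL:f_419085"],
    "setup": [],
    "f_419085": ["CALL:helper", "CALL:f_40B48F"],
    "helper": [],
    "f_40B48F": ["EFFECT"],
}

def has_effect(func):
    """What Step Over (F8) tells us: did the popup appear during this call?"""
    return any(
        step == "EFFECT" or (step.startswith("CALL:") and has_effect(step[5:]))
        for step in CALL_TREE[func]
    )

def drill_down(func="entry"):
    """Return the chain of functions leading to the side effect."""
    path = [func]
    for step in CALL_TREE[func]:
        if step == "EFFECT":
            return path                      # found the instruction itself
        callee = step[5:]
        if has_effect(callee):               # popup appeared: restart, F7 into it
            return path + drill_down(callee)
    return path

print(drill_down())  # ['entry', 'f_419085', 'f_40B48F']
```

Each level of the returned path corresponds to one "set a breakpoint, Ctrl+F2, F9, F7" round in the debugger.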
CreateThread - The CreateThread function creates a new thread for a process.
So we're starting something here, and most likely this thread causes the popup to appear. The third argument to this function is the address of the code this thread must execute. That argument is set up at the address 0040B4E7 by the instruction PUSH 0040D1AA. So this thread runs whatever there is at 40D1AA. Let's see what is there. Right click - Go to Expression - 40D1AA. The function ranges from 40D1AA to 40D28C (the RETN instruction marks the end of the function). It's this function which is creating the popup. Let's put a breakpoint at 40D1AA and see if the thread jumps here. So hit F2 while at 40D1AA and then Ctrl+F2 again. Go up to 0040B4F7 and F8 over the CreateThread call. Immediately you see the code jump to 40D1AA and stop. Yes!! Our understanding was correct. Let's F8 step by step now.
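Why is the start address the third argument if its PUSH comes before other pushes? Windows API functions use the stdcall convention, where arguments are pushed right to left, so the last PUSH before the CALL is the first argument. The Python sketch below maps a list of pushed values back to CreateThread's documented parameter names. Only the value 0x0040D1AA comes from this sample; the other pushed values and the helper name `map_stdcall_args` are hypothetical.

```python
# Parameter order as documented for CreateThread.
CREATE_THREAD_PARAMS = [
    "lpThreadAttributes", "dwStackSize", "lpStartAddress",
    "lpParameter", "dwCreationFlags", "lpThreadId",
]

def map_stdcall_args(pushes, param_names):
    """Map pushed values (in the order the PUSHes execute) to parameter names.

    stdcall pushes arguments right to left, so the first PUSH is the last
    parameter and the final PUSH before the CALL is the first parameter.
    """
    if len(pushes) != len(param_names):
        raise ValueError("number of pushes must match number of parameters")
    return dict(zip(param_names, reversed(pushes)))

# Hypothetical values, except 0x0040D1AA (PUSH 0040D1AA at 0040B4E7).
pushes = [0x0012FF70, 0, 0, 0x0040D1AA, 0, 0]
args = map_stdcall_args(pushes, CREATE_THREAD_PARAMS)
print(hex(args["lpStartAddress"]))  # 0x40d1aa
```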
You pass over 2 system functions here: ole32.CoInitialize and kernel32.GetModuleHandleA. I won't explain these here; you can get into the habit of having Google permanently open for MSDN ;). However, there is another call here, CALL 404A22 at address 40D1E5. Let's F7 into that, and you see it's another function which ranges from 404A22 to 404ABA. Just browse through it. Anything interesting? Aha, there is a call to the CreateWindowEx function with its 2nd argument (the window class name) as "IEEmbedded". Very interesting. Remember we found strings called IEEmbedded in dynamic analysis? Read up a little about CreateWindowEx and you will find that it creates a window of a specific size :). After a few more calls we're back in the previous function at address 40D1EA.
Go on reading. Now there's a ShowWindow call with 2 arguments: the first is the handle returned by CreateWindowEx and the second is the number 5. MSDN says that 5 (SW_SHOW) means display the window that was created. Right, step over ShowWindow. Yes!! The window appears. More F8 reveals that we are inside a loop consisting of the functions GetMessage, TranslateMessage and DispatchMessage. We don't want to remain in this loop now. We sort of know what it does: it handles messages for the window. That's good enough. Let's go back to the previous function and put a breakpoint at 40B511. Any location after the CreateThread call will do; we just want to get out of that thread now that we know what it does. Remove all the breakpoints except the one at 40B511 and hit F9. You should get a window popup and your code should halt at 40B511. Got it?
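Magic numbers like that 5 turn up all the time, so it helps to keep the documented constants handy. A small Python lookup for ShowWindow's second argument (nCmdShow) might look like this; the values are the standard SW_* constants from the Windows headers, and the function name `describe_show_command` is made up for illustration.

```python
# Standard SW_* values for ShowWindow's nCmdShow argument.
SHOW_WINDOW_COMMANDS = {
    0: "SW_HIDE",
    1: "SW_SHOWNORMAL",
    2: "SW_SHOWMINIMIZED",
    3: "SW_SHOWMAXIMIZED",
    4: "SW_SHOWNOACTIVATE",
    5: "SW_SHOW",
    6: "SW_MINIMIZE",
    7: "SW_SHOWMINNOACTIVE",
    8: "SW_SHOWNA",
    9: "SW_RESTORE",
    10: "SW_SHOWDEFAULT",
}

def describe_show_command(value: int) -> str:
    """Return the symbolic name for an nCmdShow value, if it is a known one."""
    return SHOW_WINDOW_COMMANDS.get(value, f"unknown ({value})")

print(describe_show_command(5))  # SW_SHOW
```

A malware author who wanted the window to stay invisible would more likely pass 0 (SW_HIDE), so the argument value is worth a glance whenever you see ShowWindow.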
So effectively, to dig out all the information about a particular call, we might have to go extremely deep into the code. You saw that just to get to the function which created a window we had to go 4 or 5 calls deep. It's the same methodology we have to follow for every single call that we're interested in. So to sum up what we have learnt so far:
--- Comment code a lot
--- Step Over calls you don't have a use for
--- Think of the actual behavior of the program as seen in dynamic analysis and break on specific functions
--- Look at the runtime behavior of the program and dig into CALL statements accordingly
--- Understand APIs better and set breakpoints accordingly
These are the basics of reverse engineering, really. Keep digging till you find what you want. In Part 3 we'll use these same concepts, move much faster and conclude our analysis of this piece of malware.