Friday, July 29, 2011

12.1 - Example - Static Malware Analysis

We looked at the behavior of a piece of malware last time and tried to obtain as much information as possible from it by simply running it and watching it interact with various systems. Many times you will not have the liberty to do this; you will have only the assembly listing of the malware and will have to deduce from it what the malware does. So in this post we will look at the same exe (aolsbm.1.exe) and analyze it statically. Let's go.

In case you missed it, you can download the malware from http://www.offensivecomputing.net. You will need to register there (free) and then search for the hash 5a2be07ad750bed86be65954fb9d7d21.
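
Before going further it is worth confirming that the file you downloaded really is the sample discussed here. The snippet below is a minimal sketch of that check in Python. It assumes the hash quoted above is an MD5 digest (it is 32 hex characters long, which fits MD5), and the function names are mine, not part of any tool mentioned in this post.

```python
import hashlib

# Hash quoted in the post; ASSUMED to be an MD5 digest (32 hex characters).
EXPECTED_MD5 = "5a2be07ad750bed86be65954fb9d7d21"


def md5_of_bytes(data: bytes) -> str:
    """Return the lowercase hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so the whole sample is never held in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """True when the file on disk has the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Calling matches_expected("aolsbm.1.exe") on the downloaded file should return True if it is the same sample. Do this before unpacking, since unpacking rewrites the file and changes its hash.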

We need a debugger to step through the code bit by bit and understand what is happening. For that we'll primarily use OllyDbg. However, to get a better view of function calls and loops, it's a good idea to also open the same binary in IDA Pro (the free version is fine) at the same time - the display is much nicer there. Before starting, familiarize yourself with OllyDbg as much as you can. There is no way you'll be comfortable right away, and it might take a week of playing with it regularly before you understand what all the terminology means, but hey, that's just fine. Try to understand each step before you move on, and don't get frustrated if you get stuck in the middle of all that assembly code. Just keep plugging away at it and you'll eventually get it. Enough sermons then, let's go :)

Load up aolsbm.1.exe in Olly using File - Open and do the same in IDA (use the default options). You immediately get a message saying that Olly's analysis may not be accurate and asking whether you want to continue. This is because it is difficult to analyze a packed executable; remember we talked about this last time? So we have to see if we can unpack it using some software. Remember that you collected disk and memory strings from the running process when you ran it? Have a look at the first few lines of either file. Do you see something like UPX0, UPX1 there? This may, just may, mean that a program called UPX was used to pack the executable. And luckily for us, UPX also has an unpacking switch. So let's download UPX (free) and try to unpack the executable using the command upx.exe -d aolsbm.1.exe. Immediately you get a new line mentioning the percentage to which it was packed and other information about the file. Close Olly and open the file again. No message this time, right? And Olly's analysis completed successfully. Remember though that we got lucky this time. Many malware writers (I've read) have their own custom packers and unpackers embedded in the malware itself. That makes it harder to find out how the malware was packed, and even harder to unpack it. Let's go on.
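
The "do you see UPX0, UPX1" check can also be done without eyeballing the strings dump. Here is a rough Python sketch that looks for those names near the start of the file. The UPX0 and UPX1 names come from the strings output above; the extra "UPX!" marker and the 4096-byte window are my own assumptions, and the function names are made up for illustration.

```python
from typing import List

# UPX0/UPX1 are the section names seen in the strings dump.
# "UPX!" is an extra marker added here as an assumption about UPX output.
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX!")


def find_upx_markers(data: bytes, header_window: int = 4096) -> List[str]:
    """Return the UPX markers found in the first header_window bytes."""
    head = data[:header_window]
    return [marker.decode("ascii") for marker in UPX_MARKERS if marker in head]


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic only: a malware author can rename or strip these markers."""
    return len(find_upx_markers(data)) > 0
```

You would feed it the raw bytes of the file, for example open("aolsbm.1.exe", "rb").read(). A hit only suggests UPX; a miss does not prove the file is unpacked.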

The entry point, the address at which the malware starts executing once it is loaded, is 00411F04. This is where you will find yourself every time the malware is loaded into Olly. Now, how do you proceed? There's a huge amount of code to look at, right? The ground rules for reversing are actually very simple:

a) Ignore what you do not want to analyze in depth = Step Over = F8
b) Dive into what you want to understand better = Step Into = F7

Effectively, the assembly listing that you see in front of you in Olly is a big list of functions (user defined and system) calling each other in a defined sequence. To understand what the malware is doing, you will need to understand in depth what some of those functions are doing. Yes, for a complete code reconstruction you would want to understand what each and every bit of code does, but trust me - that is extremely painful, not needed in the large majority of cases, and would take an unbelievably large amount of time. So at this early stage I am not going to try to understand every bit; I'll try to understand just enough to tell me what the malware is doing. Moving on then.

We talked about 'Step Into' and 'Step Over' earlier. Whenever you see a 'CALL' in assembly, it means a function is being called for some purpose. If it is a system function exported by some system DLL, you do not need to Step Into it. The behavior of those functions is documented and is not going to change, so there is nothing to be gained by studying them in depth. You can just look up the documentation of those functions on MSDN and find out what parameters they take and what values they return. Let's take an example now. The very first line is CALL aolsbm_1.0040194ac. This is a user defined function, so you may want to Step Into it and find out what it does. For now though, just press F8 till you reach the address 00411DA7, where you see another CALL instruction; this time it is CALL Kernel32.GetStartupInfoW. This is clearly a system function (its name starts with something other than aolsbm_1), so you do NOT need to Step Into it at all. The behavior of GetStartupInfoW is known and documented - there IS nothing to study here. So focus only on the user defined functions.
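
The user defined versus system rule above is simple enough to write down as code. This is just a sketch in Python of the rule of thumb, assuming CALL operands are written the way Olly shows them here, as module.target. The function name and the three labels are my own invention.

```python
def classify_call(operand: str, program_module: str = "aolsbm_1") -> str:
    """Label a CALL operand written as 'module.target'.

    Returns 'user-defined' when the module is the program itself,
    'system' when it is any other module, and 'unknown' when the
    operand has no module prefix (for example a register).
    """
    module, separator, _target = operand.partition(".")
    if not separator:
        return "unknown"
    if module.lower() == program_module.lower():
        return "user-defined"
    return "system"


# The two kinds of call seen so far in the listing.
print(classify_call("aolsbm_1.00412880"))
print(classify_call("Kernel32.GetStartupInfoW"))
```

The first call comes back as user-defined (a candidate for F7) and the second as system (look it up on MSDN and press F8).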

Now even in the 'User Defined' group, you do NOT need to analyze every single function in depth. Relieved? ;) The trick is knowing which ones to Step Into and which ones to just Step Over. For example, you'll remember we kept hitting F8 till we got to the Kernel32 function. This meant that we were not interested in any of the calls made before the Kernel32 function. So in this case we are saying that we are not interested in these 2 calls:

00411F04 ----- CALL aolsbm_1.0040194ac
00411D9E ----- CALL aolsbm_1.00412880

This assumption may or may not be correct. Instead of Stepping Over the functions, let's Step Into these 2 calls. Hit Ctrl+F2 to get back to the start of the program (hit Yes if you get a warning). Hit F7 on the first line, which will take you to the address 0040194ac (the destination of the call). Now study this code line by line and see if you can spot any system functions being called (like the Kernel32 function) in the body of THIS function. The body of this function ranges from 0040194ac to 004019546. In this body we can see 5 system functions - GetSystemTimeAsFileTime, GetCurrentProcessId, GetCurrentThreadId, GetTickCount and QueryPerformanceCounter. Go to MSDN and study what each of these 5 functions does. All five only read the time, timing counters, or the IDs of the current process and thread; none of them touches files, the registry or the network. Once you're through, you'll understand that this function (0040194ac) is not doing anything that is important from a malware analysis perspective. So we can Step Over it.

Let's repeat this for the 2nd call (00412880). Hit Ctrl+F2 again and restart the program. This time we do not need to Step Into (F7) the first function (we already did that, right?), so we press F8 till we reach the CALL 00412880 statement and then Step Into that call (F7). The body of this function ranges from 00412880 to 004128c4. Here we don't have any system functions to give us hints about what this function might do. So unless we're magicians or super gods in assembly programming, we really don't know. So simply mark a comment there and skip it. Huh? Yes, that might sound strange, but to be frank I don't think there is anything better you can do at such an early stage. Later in the program, when you see some function which looks more familiar, you can return and revisit this function if needed. As of now there is nothing to do, so ignore it. One thing though: you'll see that this function is called from numerous other places. You can find this out by clicking on the line which has the address 00412880 and looking at the middle pane on the left half of your screen. It will say something like:
------
Local calls from 0040F41C, 0040F58F, 0040F731, 0040F813, 004100E6, 00410938, 00410D9D, 00410E77, 00411D9E, 00411F74, 00412246, 00412415, 00413F31, 004150C2, 004151E3, 00415325, 004154C1, 00415640, 00415D42, 00415E89, 00416133, 0041617F, 004168A3, ...
------
So many calls means it's some very common function - otherwise it wouldn't be called so many times, right? So we can just record all that information and move on. I recommend you go to the end of the function using Ctrl+F9 as soon as you realize there is nothing useful for you there at that particular moment. This will take you to the last statement of that function. Hit F8 again and you're back just after the original CALL. Move up, comment the CALL and move forward. The comments are very useful - it's very easy to forget what you were doing when you're in the middle of such relatively unreadable code :)
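
If you paste that "Local calls from" line into a text file, counting the callers is easy to automate. Here is a small Python sketch of the idea. It assumes the line looks like the one shown above (8-digit hex addresses separated by commas, with a trailing "..." when Olly cuts the list short). The threshold of 10 callers is an arbitrary number I picked for illustration, not something Olly defines.

```python
import re
from typing import List, Tuple


def parse_local_calls(line: str) -> Tuple[List[str], bool]:
    """Return (caller addresses, truncated) from a 'Local calls from' line.

    truncated is True when the line ends with '...', meaning Olly did
    not list every caller and the real count is higher.
    """
    callers = re.findall(r"\b[0-9A-Fa-f]{8}\b", line)
    truncated = line.rstrip().endswith("...")
    return callers, truncated


def is_probably_common_helper(line: str, threshold: int = 10) -> bool:
    """Rule of thumb: many callers suggests a shared helper function."""
    callers, truncated = parse_local_calls(line)
    return truncated or len(callers) >= threshold


sample = "Local calls from 0040F41C, 0040F58F, 0040F731, ..."
print(parse_local_calls(sample))
```

A function with a long or truncated caller list is a good candidate for "comment it and come back later", which is exactly what we did with 00412880.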

All ok so far? Let's take a break, assimilate all that slowly, and come back for Part 2 of this little exercise in a while. I also recommend you use this opportunity to get familiar with Olly and its features; play around with it till you feel comfortable. In Part 2 we'll use these basics, and a few other small tips that I have learnt so far, to move forward a little more quickly. Bye for now.
