A program on Linux can be linked dynamically or statically. For simplicity's sake, I only considered C binaries. When a program is dynamically linked, the libraries are not included in the binary itself; the functions they export are resolved and called at runtime. In a statically linked binary, however, all the libraries that the binary needs to run are part of the binary itself. And if you are reversing something, that is a pain, because you don't know which part of the code belongs to the program and which part is library code. IDA detects a lot of it, but not all of it, and certainly not enough. So I decided to try and do something about it.
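To make the distinction concrete, here is a minimal sketch of one way to guess whether an ELF file is statically linked: dynamically linked executables normally carry a PT_INTERP program header naming the runtime loader, and fully static ones do not. This is only a heuristic and not part of my scripts; the function names `program_header_types` and `looks_statically_linked` are made up for this illustration, and shared objects or unusual binaries can fool the check.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def program_header_types(data):
    """Return the p_type of every program header in an ELF image (bytes)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little-endian
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    # p_type is the first 32-bit field of a program header in both classes
    return [
        struct.unpack_from(endian + "I", data, e_phoff + i * e_phentsize)[0]
        for i in range(e_phnum)
    ]


def looks_statically_linked(data):
    """Heuristic: no PT_INTERP header means no runtime loader is requested."""
    return PT_INTERP not in program_header_types(data)
```

You would feed it the raw bytes of a file, e.g. `looks_statically_linked(open(path, "rb").read())`, where `path` is whatever binary you are about to load into IDA.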
This little project came to mind mainly while playing the reversing challenges in CTFs. The files there tended to be massive (four-digit numbers of functions) and very difficult to solve (for me anyway :)). With statically linked binaries, I could never tell which code was library code, so I either could not complete those challenges or it took me a lot of time. I still can't complete many, but that's a separate story ;)
Anyway, TL;DR: I wrote a few simple IDAPython/Python scripts that basically compare the IDB of the binary being reversed against a whole lot of library code. The better you know exactly which libraries were used to build the binary, the more accurate this tool will be.
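The general idea of comparing a binary against library code can be sketched roughly as follows. This is an assumption-laden illustration, not the actual scripts: it assumes each function has already been reduced to a list of instruction mnemonics (operands dropped so that addresses do not matter), and the names `fingerprint`, `build_library_index` and `label_functions`, as well as the sample data, are hypothetical.

```python
import hashlib


def fingerprint(mnemonics):
    """Hash a function's mnemonic sequence into a comparable fingerprint."""
    return hashlib.sha256(" ".join(mnemonics).encode()).hexdigest()


def build_library_index(library_functions):
    """Map fingerprint -> list of library function names that produce it."""
    index = {}
    for name, mnemonics in library_functions.items():
        index.setdefault(fingerprint(mnemonics), []).append(name)
    return index


def label_functions(target_functions, index):
    """Return {address: [candidate names]} for target functions found in the index."""
    labels = {}
    for address, mnemonics in target_functions.items():
        fp = fingerprint(mnemonics)
        if fp in index:
            labels[address] = index[fp]
    return labels


# Made-up sample data: two "library" functions and two functions from a target binary.
library = {
    "strlen": ["push", "mov", "cmp", "jne", "ret"],
    "memcpy": ["push", "mov", "rep movsb", "pop", "ret"],
}
target = {
    0x401000: ["push", "mov", "cmp", "jne", "ret"],  # same shape as strlen
    0x401050: ["push", "call", "ret"],               # unknown, probably user code
}

matches = label_functions(target, build_library_index(library))
```

Exact hashing like this only matches when the library build is identical to the one linked into the binary, which is one reason why knowing the exact libraries matters so much for accuracy.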
It is certainly a start on a fairly complex problem IMO, and I hope that people more knowledgeable than me in this space can extend it and make it even more useful. At the very least, I hope it will show people what NOT to do while attempting to solve this problem :)
The code I wrote can be found here.
Hopefully, over time, I can make this even better or maybe find a better solution to this problem.