Tuesday, November 11, 2008

Trojanin' imported Python programs

Python programs use .py as their file extension. Python is an interpreted
language, but to speed up load times in programs that import lots of
libraries, it automatically creates a .pyc file: a platform-independent,
compiled bytecode version of the source.

So when you import a file into your code, Python will use file.pyc
instead of file.py if it exists in the same directory.

The Python engine compares the "last modified" time of the .py file against
a timestamp stored in the .pyc header to decide whether it can use the .pyc
or whether the source has changed and must be recompiled.
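
The check described above can be sketched in a few lines. This is only an illustration written for Python 3, not the interpreter's actual code: the function name pyc_looks_current and the sample values are made up here, and it assumes the timestamp sits in bytes 4-7 of the .pyc as a little-endian unsigned 32-bit integer.

```python
import struct

def pyc_looks_current(pyc_header: bytes, source_mtime: int) -> bool:
    """Mimic the freshness test: stored timestamp must equal the .py mtime."""
    if len(pyc_header) < 8:
        return False  # too short to hold a magic number plus a timestamp
    # Bytes 4-7 hold the source mtime as a little-endian unsigned 32-bit int.
    (stored_mtime,) = struct.unpack("<I", pyc_header[4:8])
    return stored_mtime == (int(source_mtime) & 0xFFFFFFFF)

# Hypothetical header: a 4-byte magic number followed by mtime 1226430962.
example_header = b"\x6d\xf2\x0d\x0a" + struct.pack("<I", 1226430962)
print(pyc_looks_current(example_header, 1226430962))      # same mtime
print(pyc_looks_current(example_header, 1226430962 + 1))  # source was touched
```

Note that nothing in this comparison looks at what the bytecode actually does, only at the two numbers.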

.pyc file header

> bytes 0-3 magic number
> bytes 4-7 timestamp (mtime of .py file)
> bytes 8-* marshalled code object


The first 4 bytes (the magic number) basically identify the Python version
that generated the .pyc, then we have the timestamp, and then the object
code, which is our program.
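
To make the layout concrete, here is a small sketch that splits a .pyc image into those three fields. It is written for Python 3 and the names LegacyPycHeader and parse_legacy_pyc are invented for this example. It covers only the 8-byte header described in this post; newer CPython releases (3.7 and later) use a longer 16-byte header, so this will not parse their files correctly.

```python
import struct
from collections import namedtuple
from datetime import datetime, timezone

LegacyPycHeader = namedtuple("LegacyPycHeader", "magic mtime payload")

def parse_legacy_pyc(data: bytes) -> LegacyPycHeader:
    """Split a .pyc image into magic number, timestamp and marshalled code."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes for magic number and timestamp")
    magic = data[0:4]                          # bytes 0-3
    (mtime,) = struct.unpack("<I", data[4:8])  # bytes 4-7, little-endian
    payload = data[8:]                         # bytes 8-*: marshalled code object
    return LegacyPycHeader(magic, mtime, payload)

# First nine bytes of the ORIGINAL dump shown later in this post.
sample = bytes.fromhex("6D F2 0D 0A F2 D9 19 49 63")
header = parse_legacy_pyc(sample)
print(header.magic.hex(), header.mtime)
print(datetime.fromtimestamp(header.mtime, tz=timezone.utc).date())
```

The decoded timestamp falls on 2008-11-11 (UTC), the date of this post, and the payload starts with the byte 0x63 ("c"), the marshal type code for a code object.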

Yeah, I know it's pretty fooking funny that these guys only check a
timestamp to know if they have a valid .pyc.

So, let's do an experiment.

Let's create a program called "lib.py" that will print the word "clean" when
executed.

And let's write caller.py, which will import lib and print "caller executed":


import lib
print "caller executed"

When you run caller using "python caller.py" you will see the following output:

clean
caller executed

And Python will create a precompiled version of lib.py called lib.pyc.

The next time we use caller.py, Python will import lib.pyc instead of lib.py.

So, let's have some fun: we will create a rogue lib to replace lib.pyc. In
this case our malignant lib will print "BAD BAD BAD" (Yeahhhhh, it's
extremely bad, don't be afraid).

And when we compare both .pyc files using a hex editor...





ORIGINAL:
00000000 6D F2 0D 0A F2 D9 19 49 63 00 00 00 00 00 00 00 00 01 00 00 m......Ic...........
00000014 00 40 00 00 00 73 09 00 00 00 64 00 00 47 48 64 01 00 53 28 .@...s....d..GHd..S(
00000028 02 00 00 00 74 05 00 00 00 63 6C 65 61 6E 4E 28 00 00 00 00 ....t....cleanN(....
0000003C 28 00 00 00 00 28 00 00 00 00 28 00 00 00 00 74 1E 00 00 00 (....(....(....t....
00000050 2F 64 6F 77 6E 6C 6F 61 64 73 2F 73 73 73 73 73 73 73 73 73 /downloads/sssssssss
00000064 2F 70 74 2F 6C 69 62 2E 70 79 74 01 00 00 00 3F 01 00 00 00 /pt/lib.pyt....?....
00000078 73 00 00 00 00 s....

MALIGNANT:
00000000 6D F2 0D 0A BD D9 19 49 63 00 00 00 00 00 00 00 00 01 00 00 m......Ic...........
00000014 00 40 00 00 00 73 09 00 00 00 64 00 00 47 48 64 01 00 53 28 .@...s....d..GHd..S(
00000028 02 00 00 00 73 0B 00 00 00 42 41 44 20 42 41 44 20 42 41 44 ....s....BAD BAD BAD
0000003C 4E 28 00 00 00 00 28 00 00 00 00 28 00 00 00 00 28 00 00 00 N(....(....(....(...
00000050 00 74 1E 00 00 00 2F 64 6F 77 6E 6C 6F 61 64 73 2F 73 73 73 .t..../downloads/sss
00000064 73 73 73 73 73 73 2F 70 74 2F 6C 69 62 2E 70 79 74 01 00 00 ssssss/pt/lib.pyt...
00000078 00 3F 01 00 00 00 73 00 00 00 00 .?....s....





So we need to modify the .pyc header to match the original timestamp. In
this case that's done by modifying the 5th byte (offset 4), changing BD to F2.
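
Instead of editing by hand, the same patch can be sketched in Python. This is a rough Python 3 illustration: the helper name copy_pyc_timestamp is made up, it assumes the 8-byte header layout described earlier, and it copies all four timestamp bytes rather than just the one that happens to differ here.

```python
def copy_pyc_timestamp(rogue_pyc: bytes, original_pyc: bytes) -> bytes:
    """Return rogue_pyc with bytes 4-7 replaced by those of original_pyc."""
    if len(rogue_pyc) < 8 or len(original_pyc) < 8:
        raise ValueError("both inputs need an 8-byte header")
    if rogue_pyc[0:4] != original_pyc[0:4]:
        raise ValueError("magic numbers differ: compiled by different Python versions")
    # Keep the rogue magic number and code, borrow only the timestamp.
    return rogue_pyc[0:4] + original_pyc[4:8] + rogue_pyc[8:]

# First nine bytes of the two dumps above (header plus the first payload byte).
original_head = bytes.fromhex("6D F2 0D 0A F2 D9 19 49 63")
malignant_head = bytes.fromhex("6D F2 0D 0A BD D9 19 49 63")
patched = copy_pyc_timestamp(malignant_head, original_head)
print(patched.hex())
```

On real files you would read both .pyc files in binary mode, pass their contents to the function, and write the result back over the rogue file.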

If we execute lib.py directly we will still get "clean" as output, but all
the programs importing lib will end up using our trojaned lib.pyc, which
will output "BAD BAD BAD".

Running "python caller.py", we get the following output:


BAD BAD BAD
caller executed

so...

If we have +w on the directory where the imported .py files live, or we
are on some crappy Windows box, we can place a trojan written in Python
and it will be executed each time the file is imported.
Some of you will think that this is extremely rare in a well-configured
environment, but in development environments it's much easier to find +w
directories with utility scripts, or with test automation scripts in QA
environments, and in those cases this kind of trick can easily be used to
impersonate some other user.
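
A quick way to see how exposed a box is would be to list which import directories the current user can write to. The sketch below is a rough Python 3 illustration with an invented function name, writable_import_dirs. It relies on os.access, which is only an approximation of real permissions (on Windows in particular it says little about directories), so treat the result as a hint.

```python
import os
import sys

def writable_import_dirs(paths=None):
    """Return the directories on the import path the current user can write to."""
    found = []
    for entry in (sys.path if paths is None else paths):
        directory = os.path.abspath(entry or os.curdir)  # "" means current directory
        if os.path.isdir(directory) and os.access(directory, os.W_OK):
            if directory not in found:
                found.append(directory)
    return found

for directory in writable_import_dirs():
    print(directory)
```

Any directory it prints is a place where a .pyc could be dropped or replaced by this user.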

The best thing is that if someone decides to inspect the .py file, the
source code will be intact, so it won't arouse suspicion.

Hope you like it. Any ideas or comments are appreciated.

Cheerz
