Continuing from my original post on unpacking UPX, this post is a deep-dive analysis of the unpacked sample and its capabilities.
We unpacked the original sample manually, and the resulting file has the SHA-256 hash 6FE502AC541FBAB0910427FA1D56148AEA17D93FD281990D0D6E9A21C751F387. For the purpose of this post, I have named this binary unpacked_sample. This hash was not known to VirusTotal (VT) when I wrote this article, and I was unable to find anything else online regarding it. So, for an initial overview of the binary, let’s take a look at the VT details for the packed sample. Multiple Anti-Virus (AV) vendors tag the detection as “GupBoot” or “Urelas”. This is not necessarily the malware family name; Karsten Hahn does an amazing job of describing malware naming conventions in the blog post “Malware Naming Hell” [1].
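The hash quoted above can be reproduced locally before any further work. The following is a minimal Python sketch, not the tooling used for this post; the function name sha256_of_file and the chunk size are my own choices for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the upper-case hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()
```

Calling sha256_of_file("unpacked_sample") should return the value quoted above if the unpacking produced an identical file.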
When I first began malware analysis, one of the basic mistakes I made was jumping straight into the sample without performing some initial analysis. It’s easy to get lost in the sea of assembly code, so we need specific goals and sections of code to analyze (and pivot from). So, I decided to run the unpacked sample through a free sandbox engine, CAPEv2 Sandbox [3]. Note that most of these sandbox engines will share the malware sample with VirusTotal. Do not use them if you are working on a sample and do not wish for it to be published to VT. Malware authors will often check VT and other open-source tools and will be alerted to the sample being submitted.
Strings
As part of my initial analysis, while the sample was being analyzed in the sandbox, I ran the following two commands:
strings unpacked_sample > strings.txt
floss.exe unpacked_sample > floss_strings.txt
Strings is the Sysinternals tool [4], and FLOSS is FireEye’s tool [5] for retrieving obfuscated strings (it can do much more, but that’s for some other day).
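For readers who want to see roughly what a strings-style tool does, here is a minimal Python sketch. It is an approximation under my own assumptions (printable ASCII only, a minimum length of 4, and UTF-16LE for wide strings); it is not how Strings or FLOSS are implemented, and it does not attempt any deobfuscation.

```python
import re


def extract_strings(blob: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII runs, then UTF-16LE runs, found in `blob`."""
    # A run of at least `min_len` printable ASCII bytes.
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as UTF-16LE (each followed by a zero byte),
    # which is how the wide (W) Windows APIs usually store text.
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_run.finditer(blob)]
    found += [m.group().decode("utf-16-le") for m in utf16_run.finditer(blob)]
    return found
```

Reading the sample with open("unpacked_sample", "rb").read() and passing the bytes to extract_strings gives a list comparable to strings.txt, although ordering and filtering will differ from the real tools.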
From the text files generated by the above two commands, I noticed the following interesting strings. Note that there are a lot of interesting API calls, and I’ll be documenting only the most important ones here. (We wouldn’t have been able to extract these strings if we hadn’t unpacked the sample in the previous blog post.)
As usual, potential IP addresses and domains have been defanged. I’ll group the strings by their potential functionality.
Potential IoCs
1.234.83[.]146
133.242.129[.]155
218.54.31[.]226
218.54.31[.]165
There is not much to say about these values yet. They appear in the strings output, so we treat them as potential IoCs. We still need to identify how these IP addresses are used by disassembling the code.
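Pulling candidate addresses out of a large strings dump by eye is error-prone, so a small filter helps. The sketch below is one possible approach, assuming the strings output has been read into a Python str; the names find_ipv4 and defang are hypothetical helpers of mine, not part of any tool mentioned in this post.

```python
import re

# Four dot-separated groups of 1-3 digits; octet range is validated below.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_ipv4(text: str) -> list[str]:
    """Return unique dotted-quad candidates whose octets are all <= 255."""
    seen = []
    for candidate in IPV4.findall(text):
        octets_ok = all(int(octet) <= 255 for octet in candidate.split("."))
        if octets_ok and candidate not in seen:
            seen.append(candidate)
    return seen


def defang(ip: str) -> str:
    """Replace the last dot with [.] so the address is not clickable."""
    head, _, tail = ip.rpartition(".")
    return f"{head}[.]{tail}"
```

A match here only means the string looks like an IPv4 address. Whether it is really used for network traffic still has to be confirmed in the disassembly.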
API Calls of Interest for Debugging
VirtualAlloc
TerminateProcess
GetCommandLineW
LoadResource
SizeofResource
FindResourceW
LoadBitmapW
LoadStringW
CreateFileA
WriteFile
SetEndOfFile
DeleteFileW
RegQueryValueExW
RegSetValueExW
ShellExecuteW
ShellExecuteA
I’ve grouped the API calls based on their utility. VirtualAlloc is always interesting since it allocates memory and returns a pointer to the starting address. We need to track the contents of this memory area during debugging to determine if something is being unpacked. LoadResource, SizeofResource, FindResourceW, LoadBitmapW and LoadStringW indicate that the resource section might contain some data that is loaded into memory. It might also be loading an image (which might contain code within it). Registry operations and shell operations are always of interest to us.
Potential Anti Debugging
SetUnhandledExceptionFilter
IsDebuggerPresent
GetTickCount
For details on why these are interesting to us, take a look at this amazing resource from a Black Hat presentation [6]. We need to identify how these are used and potentially patch the binary to allow debugging to proceed unhindered.
Potential Strings of Interest
unzip 0.15 Copyright 1998 Gilles Vollant
inflate 1.1.3 Copyright 1995-1998 Mark Adler
Agent_pea.exePK
f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\auxdata.cpp
f:\dd\vctools\vc7libs\ship\atlmfc\include\afxwin2.inl
f:\dd\vctools\vc7libs\ship\atlmfc\include\afxwin1.inl
golfset.ini
H:\PMS\_AUpdate\Update\bin\Release\GolfUpdate87.pdb
IDR_BINARY
Mentions of unzip and inflate indicate archived data. Coupled with the resource-based API calls, this gives us a quick place to look even before we debug or disassemble the code.
Resource Section
As soon as we open the binary in Resource Hacker [7], we notice a familiar string, “IDR_BINARY”. The resource starts with the bytes \x50\x4b\x03\x04, which is the file signature for a PKZIP file [8].
You can now save the IDR_BINARY resource to disk as “IDR_BINARY137.bin” and open it in 7-Zip, since we already know it’s an archive file based on the file signature.
If you try to extract the binary “Agent_pea.exe”, you will be prompted for a password, indicating that it is an encrypted archive. Also note that this binary name was identified in the strings analysis previously.
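The same checks can be scripted. The sketch below is a minimal example using Python’s standard zipfile module, assuming the resource has been saved to disk and read into memory as bytes. The function names are mine; the encryption test relies on bit 0 of the ZIP general purpose flag, which marks an encrypted member.

```python
import io
import zipfile

ZIP_LOCAL_HEADER_MAGIC = b"PK\x03\x04"


def looks_like_zip(blob: bytes) -> bool:
    """True if the data starts with the PKZIP local file header signature."""
    return blob.startswith(ZIP_LOCAL_HEADER_MAGIC)


def list_members(blob: bytes) -> list[tuple[str, int, bool]]:
    """Return (name, uncompressed_size, is_encrypted) for each member."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return [
            # Bit 0 of the general purpose flag is set for encrypted members.
            (info.filename, info.file_size, bool(info.flag_bits & 0x1))
            for info in archive.infolist()
        ]
```

Listing the members does not require the password, so this is a safe first step. The uncompressed size it reports is also useful later for matching the archive member against files dropped on disk.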
We need the encryption key now, or we need to let the code load the resource, decrypt it and then we can dump it from memory. Either way, we have clear objectives for disassembling and debugging. Note that the resource section contains two strings “Oplfjhrt”, “HJHDUIEFKDDDF” and we earlier noticed the API call LoadStringW. Also, I tried using these two strings as the password for the zip file and it did not work. I’m mentioning this since a lot of what we do here would be trial and error based on theories.
Automated Malware Analysis
Like I mentioned earlier, we are using CAPE for automated malware analysis. You can view the results of the analysis here: https://capesandbox.com/analysis/168492/. I’m not going to go into the complete details, but CAPE has shown why automated sandboxes are not enough for malware analysis. The automated analysis does not detect a crucial binary, quavb.exe, being dropped and executed. The analysis indicates that our unpacked_sample.exe has dropped another binary, vyjux.exe, on disk and also set the registry value “Run” to the path of this binary. It makes various interesting API calls related to printers and DeviceIoControl, which might need to be investigated further.
Let’s get back to our unpacked sample and try unpacking the encrypted archive.
Disassembly - unpacked_sample
We know that the resource (which is already loaded into memory as part of the binary) needs to be accessed by the code. If we disassemble the code in Ghidra [9] and search for the API call “LoadResource”, we arrive at the section below.
The API LockResource retrieves a pointer to the specified resource in memory. The highlighted function in the above image is the next point of interest, as we want to know what it does with the pointer returned from the previous API call. Right before the highlighted function is called, EBX contains the pointer to the encrypted archive resource and ESI contains the size of the encrypted archive in bytes.
FUN_0040a491 takes the size of the encrypted archive as an argument. By stepping back one level up in the function call, we can see what is being done with respect to the encrypted archive.
I have renamed some functions after a quick analysis of what they do. As you can see in the highlighted function names, the encrypted archive is first loaded, and then a file write to disk occurs with a generated file name that ends in .exe. Immediately after it is written to disk, ShellExecuteW is called for the file, thereby executing it. Without spending further time, we can reasonably assume that the file write is the executable extracted from the encrypted archive. We can later verify this by comparing the file size on disk and the file size within the encrypted archive (as seen in Figure 2).
Looking further down, you’ll notice the following section
The name of the extracted executable is added as a registry value under the HKCU hive, at Software\Microsoft\Windows NT\CurrentVersion\Windows\TrayKey. So, instead of spending time debugging and identifying the PKZIP password (or dumping from memory during debugging), we can identify this file name and directly obtain the extracted binary sample from disk.
Let’s look at the other API calls, since there are multiple WriteFile calls. A quick look at the first result shows the creation of a file named _uinsey.bat.
_uinsey.bat
The function calls the GetModuleFileNameA API with a handle value of 0x0, indicating that it is requesting the fully qualified path of the current executable. Once this path is obtained, the following code is written to disk in the file _uinsey.bat:
:Repeat
del "%s"
if exist "%s" goto Repeat
rmdir "%s"
del "%s"
%s is replaced by the path of the current executable.
Once it is written, ShellExecuteA is called with “Open” argument and the path to _uinsey.bat file. This would repeatedly attempt to delete the current executable. This is in line with our hypothesis that this is merely a dropper and does not persist as it is.
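To make the substitution concrete, here is a minimal Python sketch that renders the batch template recovered from the binary. It assumes, as described above, that every %s placeholder receives the same path. I have not confirmed that the rmdir line gets the same argument, and the example path and the "\n" line endings are assumptions for illustration only.

```python
# Template text as recovered from the binary; "\n" line endings are assumed.
BATCH_TEMPLATE = (
    ':Repeat\n'
    'del "%s"\n'
    'if exist "%s" goto Repeat\n'
    'rmdir "%s"\n'
    'del "%s"\n'
)


def render_cleanup_script(executable_path: str) -> str:
    """Fill every %s placeholder with the same path (an assumption)."""
    slots = BATCH_TEMPLATE.count("%s")
    return BATCH_TEMPLATE % ((executable_path,) * slots)
```

Rendering the template with a hypothetical path such as C:\Users\Admin\Desktop\unpacked_sample.exe shows the loop clearly: the script keeps retrying the delete until the executable is no longer present, which only succeeds once the process has exited.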
Debugging - unpacked_sample
I loaded a copy of the unpacked binary in x32dbg (the 32-bit build of x64dbg) and set breakpoints on the two anti-debug API calls we previously called out. It kept exiting after multiple tries, which means there might be something else happening before these two APIs are hit. I opened Process Hacker on the side and noticed the execution of another binary, "C:\Users\Admin\AppData\Local\Temp\quavb.exe", from the %TEMP% folder, which is 242 KB in size.
In the same folder, you will notice three files:
- reqyn.exe - bcc5e7861c281a5b1525025779090fc48e122a3dcd870cb60bc685adfc6f5f95 - Size 508 KB
- quavb.exe - f5cd30cd359760f746e81972ad4f87d3d4bbaf4b78e1291bf6ed9c454c279c7b - Size 242 KB
- golfinfo.ini - 8625da02187497527aabf19bed7c7ce5569bdc346a463a7c4b332c7dc118a112 - Size 1 KB
reqyn.exe
reqyn.exe is exactly the same size as our unpacked binary, 508 KB. Based on the execution pattern, the unpacked_sample binary copies itself to the %TEMP% directory, renames itself to reqyn.exe, and modifies the Run value in the registry for persistence:
C:\Users\Admin>reg query "HKCU\Software\Microsoft\Windows NT\CurrentVersion\Windows" /v Run
HKEY_CURRENT_USER\Software\Microsoft\Windows NT\CurrentVersion\Windows
Run REG_SZ C:\Users\Admin\AppData\Local\Temp\reqyn.exe
quavb.exe
Based on our previous analysis, we know that the encrypted archive is extracted and a .exe file is written to disk. The name of this binary file is present in the registry key mentioned in the disassembly section. So, to short-circuit our analysis, I checked the registry value, and the name matches this file, indicating that this is the file we want to analyze.
This saves us the time of manually decrypting the IDR_BINARY resource. Instead, we have the extracted executable file name (quavb) and a sample for further analysis.
golfinfo.ini
Based on the file extension and the size, I’m expecting some kind of config strings to be contained within it. However, opening the file in a text editor shows gibberish, so I opened it in a hex editor to view the raw bytes.
A quick look at the above view shows the byte 0xFF repeating. If we assume that this is XOR-encrypted data and that each 0xFF was originally 0x00 (a ciphertext-only attack, in cryptography terms), we can guess the key to be 0xFF, since 0x00 XOR key equals 0xFF only when the key is 0xFF.
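The same guess can be automated for other files. The sketch below is a minimal illustration of the reasoning, assuming a single-byte XOR key and assuming the most frequent plaintext byte is 0x00 (common in padded config data, but not guaranteed). The function name is hypothetical, and the input must be non-empty.

```python
from collections import Counter


def guess_single_byte_xor_key(ciphertext: bytes, assumed_plain_byte: int = 0x00) -> int:
    """Guess the key from the most frequent ciphertext byte.

    If the most common plaintext byte is `assumed_plain_byte`, then the most
    common ciphertext byte equals assumed_plain_byte XOR key.
    """
    most_common_byte, _count = Counter(ciphertext).most_common(1)[0]
    return most_common_byte ^ assumed_plain_byte
```

For text-heavy data, passing assumed_plain_byte=0x20 (a space) is a common alternative guess. Either way, the result is only a candidate and should be verified by checking that the decrypted output is readable.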
You can quickly decrypt the data using Python or one of the numerous XOR tools (XORSearch, CyberChef) available online.
key = 0xFF  # guessed single-byte XOR key
with open("golfinfo.ini", "rb") as f:
    data = f.read()
# XOR every byte with the key and print the result as text
for byte in data:
    print(chr(byte ^ key), end='')
The output is shown below.
As you can see, the decrypted data has some interesting strings, including the file name of another binary we previously called out and two IP addresses we previously obtained from the original unpacked binary.
My working theory is that this file was written by the original unpacked binary so that the other two binaries dropped alongside this file can read it as configuration data. Since quavb.exe is written to the disk and then launched by unpacked binary, maybe it reads this configuration file for further processing. This needs further analysis.
Summary
The original UPX-packed file unpacks itself and executes a PE file. This PE file copies itself to the %TEMP% directory, creates persistence by adding itself to the Run value, and drops a smaller binary in the %TEMP% folder. This smaller binary is executed after the file is written to disk. The analysis done in this post indicates that this file is primarily a dropper for the smaller binary file.
In a subsequent post, I’ll be analyzing the dropped file. The dropped file is compressed using PECompact and needs to be unpacked again before further analysis can proceed.
IoCs
1.234.83[.]146
133.242.129[.]155
218.54.31[.]226
218.54.31[.]165
Files
reqyn.exe - bcc5e7861c281a5b1525025779090fc48e122a3dcd870cb60bc685adfc6f5f95 - Size 508 KB
quavb.exe - f5cd30cd359760f746e81972ad4f87d3d4bbaf4b78e1291bf6ed9c454c279c7b - Size 242 KB
golfinfo.ini - 8625da02187497527aabf19bed7c7ce5569bdc346a463a7c4b332c7dc118a112 - Size 1 KB
PDB
H:\PMS\_AUpdate\Update\bin\Release\GolfUpdate87.pdb
References
[1] https://www.gdatasoftware.com/blog/2019/08/35146-taming-the-mess-of-av-detection-names
[2] https://www.virustotal.com/gui/file/6fe502ac541fbab0910427fa1d56148aea17d93fd281990d0d6e9a21c751f387/detection
[3] https://capesandbox.com/analysis/
[4] https://docs.microsoft.com/en-us/sysinternals/downloads/strings
[5] https://www.fireeye.com/blog/threat-research/2016/06/automatically-extracting-obfuscated-strings.html
[6] https://www.blackhat.com/presentations/bh-usa-07/Yason/Whitepaper/bh-usa-07-yason-WP.pdf
[7] http://www.angusj.com/resourcehacker/
[8] https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html
[9] https://ghidra-sre.org/