Excel 4 macro code obfuscation
November 16, 2021, by pcsxcetrasupport3
https://pcsxcetrasupport3.wordpress.com/2021/11/16/excel-4-macro-code-obfuscation/

This sample comes from a Twitter thread by Frost @fr0s7_ (https://twitter.com/fr0s7_/status/1458481326294769674) and appears to be “BazarLoader”.

Since this is an Xlsb file, I usually just open it in my Office 2010 Pro sandbox, convert it to Xlsm, and unzip it so I can view the contents as xml. The first thing I always do is take a quick look with a hex editor for anything of interest.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/binheader.png

As we can see from the first 2 bytes, “PK”, this is a zip file format. Once we unzip the file and navigate to the xl folder, we can verify that this is a binary file and that it also contains an Excel 4 macro folder named “macrosheets”.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/bin-xlfolder.png

If we look at the SharedStrings.bin file, we can see that the strings are stored in a Unicode format, and it is not easy to see where one string ends and the next begins.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/bin-sharedstr.png

Looking at sheet1.bin in the macrosheets folder, we can see it is not human readable. This is the point where I usually convert the file.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/sheet1-bin.png

Here, after the conversion, we can see we still have a “PK” file, but the data is clearly presented a little differently.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/xml-header.png

Once we unzip and navigate to the xl folder, it now looks a little different. The SharedStrings.xml file is also a little different. By the counts there are 34 indexed shared strings. Each appears to be a randomly generated string.
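One way to count and index the shared strings in the converted file is to walk the XML with a parser. The following is a minimal Python sketch, not the author's tool. It assumes the converted workbook keeps its strings in the standard `xl/sharedStrings.xml` part under the usual SpreadsheetML namespace, and the function names `index_shared_strings` and `shared_strings_from_workbook` are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard SpreadsheetML main namespace used by sharedStrings.xml.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def index_shared_strings(xml_bytes):
    """Return the shared strings in index order (index 0 first)."""
    root = ET.fromstring(xml_bytes)
    strings = []
    for si in root.findall("m:si", NS):
        # A plain string is a single <t>; rich text is split across <r><t> runs.
        parts = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        strings.append("".join(t.text or "" for t in parts))
    return strings


def shared_strings_from_workbook(path):
    """Read the shared strings part straight out of a zipped .xlsm/.xlsx."""
    with zipfile.ZipFile(path) as zf:
        # Match the part name case-insensitively.
        name = next(n for n in zf.namelist()
                    if n.lower() == "xl/sharedstrings.xml")
        return index_shared_strings(zf.read(name))
```

Calling `shared_strings_from_workbook` on the converted sample and checking `len()` of the result gives a figure to compare with the 34 strings counted above. Because each string is taken from its own `<si>` element, no delimiter character has to be chosen, so unusual characters inside the random strings cannot shift the index.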
https://pcsxcetrasupport3.files.wordpress.com/2021/11/xml-folder.png
https://pcsxcetrasupport3.files.wordpress.com/2021/11/xmlsharedstr.png
https://pcsxcetrasupport3.files.wordpress.com/2021/11/xmlsharedstr-2.png

I wrote a tool to aid in extracting and indexing the shared strings from the xml file. When I first parsed the shared strings, I ended up with index values 0-37 instead of 0-33. It turned out the tool stumbled because the rare character I was using to split on happened to occur in the random strings.

Here we see the xml version of the macro code. As with the shared strings, it is hard to see through all of the xml tags what is there, so I wrote a parser for those too.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/xmlsharedstr-3.png
https://pcsxcetrasupport3.files.wordpress.com/2021/11/macrosheet-1.png

This tool is designed to extract the values so we can better see what is happening without all of the xml tags. In this case some tags are left. Here we see what the values are.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/macrosheet-2.png
https://pcsxcetrasupport3.files.wordpress.com/2021/11/macrosheet-3_b.png

If we look at the values highlighted in green, we see that the code reads the string in cell ‘E11’, goes to the character at the given position, and takes that many characters: “MID(E11,12,1)”. Note that MID counts character positions from 1, as Mid does in vbs, so position 12 is the character that a viewer counting from 0 shows at index 11. So now we know the first char code was converted to “S”, the first extracted letter is “h”, the next letter is “e”, and the next 2 come from the same index and are “l”. Now we have the word “Shell” extracted.

This would be a pain to do by hand, but now that we understand how it works, what else is available to extract this data? The answer is “XLMMacroDeobfuscator” (https://github.com/DissectMalware/XLMMacroDeobfuscator).

https://pcsxcetrasupport3.files.wordpress.com/2021/11/decode-1.png

As we can see here, this tool does a great job of presenting us with the deobfuscated strings.
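To make the CHAR and MID technique described earlier concrete, here is a small Python sketch that resolves a chain of `CHAR(n)` and `MID(cell,start,count)` terms joined by `&`. The cell contents and the formula are hypothetical, built only so that the result spells “Shell”; they are not taken from the sample. Excel's documented MID counts the start position from 1, so a viewer that numbers characters from 0 shows the same character one index lower.

```python
import re


def xl_mid(text, start, count):
    """Excel-style MID: 'start' is a 1-based character position."""
    return text[start - 1:start - 1 + count]


# Matches either CHAR(<code>) or MID(<cell>,<start>,<count>).
TOKEN = re.compile(r"CHAR\((\d+)\)|MID\(([A-Z]+\d+),(\d+),(\d+)\)")


def resolve(formula, cells):
    """Concatenate the value of every CHAR/MID term found in the formula."""
    out = []
    for m in TOKEN.finditer(formula):
        if m.group(1) is not None:
            out.append(chr(int(m.group(1))))          # CHAR(83) -> "S"
        else:
            ref = m.group(2)
            start, count = int(m.group(3)), int(m.group(4))
            out.append(xl_mid(cells[ref], start, count))
    return "".join(out)


# Hypothetical cell value: 'h' sits at position 12, 'e' at 5, 'l' at 13.
cells = {"E11": "abcdefgijkmhl"}
formula = "CHAR(83)&MID(E11,12,1)&MID(E11,5,1)&MID(E11,13,1)&MID(E11,13,1)"
print(resolve(formula, cells))  # Shell
```

This only covers the two functions discussed here. A real macro sheet uses many more, which is where a full tool such as XLMMacroDeobfuscator earns its keep.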
The version I’m using here is from October 3rd 2021, before it was updated several more times. The version number stayed the same, so you need to verify by the install/file date. Using the latest version as of November 12th 2021, it only returned the eval result. Also notice that in the screenshot showing the data it is a “Partial Evaluation”, whereas in the updated version it is a “Full Evaluation”.

I have not looked at the byte format for the macro sheet data, but I have looked at the shared strings in the binary format. Due to the lack of information that I can find on the file format, let’s take a quick look at the data in this file as shown below. Notice the patterns.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/deobf-1.png

In the original sample that I wrote an extraction tool for, we can see that it is laid out slightly differently.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/ss-bin-1.png

Although the file in my original sample was labeled qut.xml, it was not an xml file at all. So you cannot count on a file name or extension for searches.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/ss-bin-2.png
https://pcsxcetrasupport3.files.wordpress.com/2021/11/file-1.png

And here is what it looks like in the hex editor.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/file-2.png

Let’s take a look at the format for this sample, then we will go back and look at the one from the beginning.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/file-3.png
https://pcsxcetrasupport3.files.wordpress.com/2021/11/byteresearch-1-b.png

We can see the first 3 bytes of the data appear to be a fixed header value. The next 4 bytes are the “Count”. If I understand correctly, it is the total number of times the strings are referenced. The next 4 bytes are the “Unique Count”. This should be the total number of distinct strings shown in the cells. Next it gets interesting. The first byte is always 0x13. Next we have 1 or 2 bytes (Unknown).
Perhaps it is a data type? It appears that it could be 1 or 2 bytes followed by a null byte, depending on the string. Next we have the length of the string as displayed in the cell. It uses at least 2 bytes, stored low byte first: for a string of only 1 char the bytes are 01 00, which reads as 0x0100 in file order or 0x0001 in reverse order. After that we have 2 null bytes. Then finally come the Unicode bytes for the string.

Now let’s go back to the first file that we extracted from this sample.

Notice how everything is aligned except the area in the red box.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/firstsst-b.png

If we look at the string under index 16, we see it is 531 characters long. 531 = 0x0213, and our value in the data is 0x1302, the same two bytes in reverse order.

https://pcsxcetrasupport3.files.wordpress.com/2021/11/firstsst-2.png
https://pcsxcetrasupport3.files.wordpress.com/2021/11/firstsst-3.png

Now everything lines up. Here we see the first byte 0x13, then 2 unknown bytes, then a null byte, then 2 bytes for the length, then a double null, and finally the start of our Unicode string values. So in this sample we have an extra 0x13 (the low byte of the length) in a place that will break the tool. At this point the tool will work on a few samples but will need a total rewrite based on this new information.

There have been plenty of samples that I have looked at where you did not even need to look at the VBA or macro code. All you needed to do was extract the shared strings to get the urls or paths used.

That is it for this one. I hope you learned as much from this as I did.

Links:
Link to Twitter thread: https://twitter.com/fr0s7_/status/1458481326294769674
Link to Sample on InQuest Labs: https://labs.inquest.net/dfi/sha256/5cd1c8b7425fcfd1d23acb3056262203b86174d87d6b8feb2087790694ea48b5
Link to Sample on Iris-H: https://iris-h.services/pages/report/580fa6d458157bf2c0ad5c6d5165a1a9d83fe70c#metadata
Link to XLMMacroDeobfuscator: https://github.com/DissectMalware/XLMMacroDeobfuscator
Link to my tools on GitHub: https://github.com/PCsXcetra/ToolsForParsingExcelData
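As a companion to the byte layout walked through above, here is a rough Python sketch of a reader for the binary shared strings data. It is not the author's tool, and it rests on assumptions the post does not confirm. It treats the 1 or 2 “unknown” bytes after each 0x13 as a variable-length record size (7 bits per byte, high bit set when another byte follows), the following null byte as a flags byte, and the 2 length bytes plus 2 null bytes as one 4-byte little-endian character count. That reading is consistent with the 531-character example (length bytes 13 02 00 00). The function names are made up for illustration.

```python
import struct


def read_varint(buf, pos):
    """Read a 7-bit-per-byte integer (assumed record-size encoding)."""
    value = 0
    for shift in range(0, 28, 7):          # at most 4 bytes
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if not b & 0x80:                   # high bit clear: last byte
            break
    return value, pos


def parse_shared_strings_bin(buf):
    """Return (count, unique_count, strings) from the observed layout."""
    pos = 3                                # skip the fixed 3-byte header
    total, unique = struct.unpack_from("<II", buf, pos)
    pos += 8
    strings = []
    while pos < len(buf) and buf[pos] == 0x13:
        pos += 1
        size, pos = read_varint(buf, pos)  # assumed: size of this record
        end = pos + size
        # buf[pos] is the (usually null) flags byte, then the char count.
        (cch,) = struct.unpack_from("<I", buf, pos + 1)
        start = pos + 5
        strings.append(buf[start:start + 2 * cch].decode("utf-16-le"))
        pos = end                          # jump ahead; never scan for 0x13
    return total, unique, strings
```

The design point is the last line of the loop. Because the reader jumps ahead by the record size instead of searching for the next 0x13, a 0x13 that turns up inside a length field (as with the 531-character string) or inside the string bytes cannot knock it out of alignment.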