# Recent Posts **threatresearch.ext.hp.com/pdf-malware-is-not-yet-dead/** [HP Threat Research Blog • PDF Malware Is Not Yet Dead](https://threatresearch.ext.hp.com/blog/) ## PDF Malware Is Not Yet Dead May 20, 2022 For the past decade, attackers have preferred to package malware in Microsoft Office file formats, particularly Word and Excel. In fact, in Q1 2022 nearly half (45%) of malware stopped by [HP Wolf Security used Office formats. The reasons are clear: users are familiar](https://www.hp.com/uk-en/security/endpoint-security-solutions.html) with these file types, the applications used to open them are ubiquitous, and they are suited to social engineering lures. In this post, we look at a malware campaign isolated by HP Wolf Security earlier this year that had an unusual infection chain. The malware arrived in a PDF document – a format attackers less commonly use to infect PCs – and relied on several tricks to evade detection, such as embedding malicious files, loading remotely-hosted exploits, and shellcode encryption. ----- Figure 1 – Alert timeline in HP Wolf Security Controller showing the malware being isolated. ## PDF Campaign Delivering Snake Keylogger A PDF document named “REMMITANCE INVOICE.pdf” was sent as an email attachment to a target. Since the document came from a risky vector – email, in this case – when the user opened it, HP Sure Click ran the file in an isolated micro virtual machine, preventing their system from being infected. After opening the document, Adobe Reader prompts the user to open a .docx file. The attackers sneakily named the Word document “has been verified. However PDF, Jpeg, xlsx, _.docx” to make it look as though the file name was part of the Adobe Reader prompt (Figure_ 2). ----- Figure 2 – PDF document prompting the user to open another document. ----- Analyzing the PDF file reveals that the .docx file is stored as an EmbeddedFile object. Investigators can quickly summarize the most important properties of a PDF document using [Didier Stevens’ pdfid script (Figure 3).](https://blog.didierstevens.com/programs/pdf-tools/) Figure 3 – PDFiD analysis of document. To analyze the EmbeddedFile, we can use another tool from Didier Stevens’ toolbox, pdf_parser. This script allows us to extract the file from the PDF document and save it to disk._ Figure 4 – Using pdf-parser to save embedded file to disk. ## Embedded Word Document ----- If we return to our PDF document and click on Open this file at the prompt, Microsoft Word opens. If Protected View is disabled, Word downloads a Rich Text Format (.rtf) file from a web server, which is then run in the context of the open document. Figure 5 – Word document contacting web server. Since Microsoft Word does not say which server it contacted, we can use Wireshark to record the network traffic and identify the HTTP stream that was created (Figure 6). ----- Figure 6 – HTTP GET request returning RTF file. Let’s switch back to the Word document to understand how it downloads the .rtf. Since it is an OOXML (Office Open XML) file, we can unzip its contents and look for URLs in the document using the command shown in Figure 7. Figure 7 – List of URLs in the Word document. ----- The highlighted URL caught our eye because it s not a legitimate domain found in Office documents. This URL is in the document.xml.rels file, which lists the document’s relationships. The relationship that caught our eye shows an external object linking and embedding (OLE) object being loaded from this URL (Figure 8). Figure 8 – XML document relationships. ## External OLE Object Connecting to this URL leads to a redirect and then downloads an RTF document called _f_document_shp.doc. To examine this document more closely, we can use_ _[rtfobj to check if it](https://github.com/decalage2/oletools/wiki/rtfobj)_ contains any OLE objects. Figure 9 – RTFObj output showing two OLE objects. Here there are two OLE objects we can save to disk using the same tool. As indicated in the [console output, both objects are not well-formed, meaning analyzing them with oletools could](http://www.decalage.info/python/oletools) lead to confusing results. To fix this, we can use _[foremost to reconstruct the malformed](http://foremost.sourceforge.net/)_ [objects. Then we can view basic information about the objects using oleid. This tells us the](https://github.com/decalage2/oletools/wiki/oleid) object relates to Microsoft Equation Editor, a feature in Word that is commonly exploited by attackers to run arbitrary code. ----- Figure 10 – Basic OLE information extracted with oleid. ## Encrypted Equation Editor Exploit [Examining the OLE object reveals shellcode that exploits the CVE-2017-11882 remote code](https://msrc.microsoft.com/update-guide/vulnerability/CVE-2017-11882) execution vulnerability in Equation Editor. There are many analyses of this vulnerability, so we won’t analyze it in detail. Instead we focus below on how the attacker encrypted the shellcode to evade detection. ----- Figure 11 – Shellcode that exploits CVE-2017-11882. The shellcode is stored in the OLENativeStream structure at the end of the object. We can [then run the shellcode in a debugger, looking for a call to GlobalLock. This function returns a](https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-globallock) pointer to the first byte of the memory block, a technique used by shellcode to locate itself in memory. Using this information, the shellcode jumps to a defined offset and runs a decryption routine. Figure 12 – Multiplication and addition part of decryption routine. The key is multiplied by a constant and added at each iteration. The ciphertext is then decrypted each time with an XOR operation. The decrypted data is more shellcode, which is executed afterwards. ----- Figure 13 – Decrypted shellcode presenting the payload URL. Without running it further, we see that the malware downloads an executable called fresh.exe [and runs it in the public user directory using ShellExecuteExW. The executable is Snake](https://docs.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-shellexecuteexw) [Keylogger, a family of information-stealing malware that we have written about before. We](https://threatresearch.ext.hp.com/the-many-skins-of-snake-keylogger/) can now extract indicators of compromise (IOCs) from this malware, for example using dynamic analysis. At this point, we have analyzed the complete infection chain and collected IOCs, which can now be used for threat hunts or building new detections. ## Conclusion While Office formats remain popular, this campaign shows how attackers are also using weaponized PDF documents to infect systems. Embedding files, loading remotely-hosted exploits and encrypting shellcode are just three techniques attackers use to run malware under the radar. The exploited vulnerability in this campaign (CVE-2017-11882) is over four years old, yet continues being used, suggesting the exploit remains effective for attackers. ## IOCs _REMMITANCE INVOICE.pdf_ 05dc0792a89e18f5485d9127d2063b343cfd2a5d497c9b5df91dc687f9a1341d _has been verified. however pdf, jpeg, xlsx, .docx_ 250d2cd13474133227c3199467a30f4e1e17de7c7c4190c4784e46ecf77e51fe ----- _f_document_shp.doc_ 165305d6744591b745661e93dc9feaea73ee0a8ce4dbe93fde8f76d0fc2f8c3f f_document_shp.doc_object_00001707.raw 297f318975256c22e5069d714dd42753b78b0a23e24266b9b67feb7352942962 Exploit shellcode f1794bfabeae40abc925a14f4e9158b92616269ed9bcf9aff95d1c19fa79352e _fresh.exe (Snake Keylogger)_ 20a3e59a047b8a05c7fd31b62ee57ed3510787a979a23ce1fde4996514fae803 External OLE reference URL hxxps://vtaurl[.]com/IHytw External OLE reference final URL hxxp://192.227.196[.]211/tea_shipping/f_document_shp.doc Snake Keylogger payload URL hxxp://192.227.196[.]211/FRESH/fresh.exe Snake Keylogger exfiltration via SMTP mail.saadzakhary[.]com:587 Tags [CVE-2017-11882](https://threatresearch.ext.hp.com/blog/?post-tag=CVE-2017-11882) [PDF](https://threatresearch.ext.hp.com/blog/?post-tag=PDF) [snake](https://threatresearch.ext.hp.com/blog/?post-tag=snake) -----