{
	"id": "ed73a420-0fb4-4587-9134-0178b7bdfa8a",
	"created_at": "2026-04-06T00:10:05.599021Z",
	"updated_at": "2026-04-10T13:12:03.641813Z",
	"deleted_at": null,
	"sha1_hash": "5e035edd2dda31dae34f6aaf26bd4aeae93c6013",
	"title": "An interesting Callisto YARA rule",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 850438,
	"plain_text": "An interesting Callisto YARA rule\r\nBy David Cannings\r\nPublished: 2024-06-26 · Archived: 2026-04-05 12:51:19 UTC\r\nYesterday Félix Aimé posted on X (formerly known as Twitter) a neat YARA rule to detect PDF documents.\r\nTweet by @felixaime\r\nThese documents relate to the Callisto group, which is attributed to the Russian FSB by various Western\r\nintelligence agencies.\r\nThe rule makes use of YARA’s hash module. But what is it hashing, and why does it work?\r\nThe rule\r\nA text version of the initial rule is provided below - I added the author and description for clarity.\r\nimport \"hash\"\r\nrule Calisto_PDF_streams {\r\n meta:\r\nhttps://edeca.net/post/2024-06-26-an-interesting-callisto-yara-rule\r\nPage 1 of 7\n\nauthor = \"Félix Aimé @felixaime\"\r\n description = \"Detects Callisto PDFs\"\r\n strings:\r\n $s1 = { 0A 73 74 72 65 61 6D 0A } // \u003c0x0a\u003estream\u003c0x0a\u003e\r\n $s2 = { 0A 65 6E 64 73 74 72 65 61 6D 0A } // \u003c0x0a\u003eendstream\u003c0x0a\u003e\r\n condition:\r\n uint32be(0) == 0x25504446 and\r\n for any i in (0..#s1) : (\r\n hash.md5(@s1[i]+8, @s2[i]-@s1[i]-8) == \"b9950253cf88305a57cb350deb31c07e\" or\r\n hash.md5(@s1[i]+8, @s2[i]-@s1[i]-8) == \"9388a4ee5de0e59595a1e76aeb9796d5\" or\r\n hash.md5(@s1[i]+8, @s2[i]-@s1[i]-8) == \"9aa92ca954147945e6100ce345351d4c\"\r\n )\r\n}\r\nMagic bytes\r\nThe first part of the rule uint32be(0) == 0x25504446 checks that the file starts with the magic bytes %PDF .\r\n00000000 25 50 44 46 |%PDF|\r\nThis is a useful optimisation in many circumstances as YARA can short-circuit and stop matching other\r\nconditions. See Florian Roth’s optimisation guide for more on this topic.\r\nHowever, the PDF magic does not need to appear at offset zero to open in most readers1. When generated by\r\ncurrent software the magic should normally appear at the start of the file, but an easy way to break the rule would\r\nbe to add random bytes at the start. 
When signaturing malicious documents it is often helpful to remove or tweak\r\nthis condition.\r\nThe for loop\r\nThe statement for any i in (0..#s1) will loop from zero to the number of matches (see below for a comment\r\non this).\r\nThe data for each stream in the document is surrounded by stream / endstream, which both have newlines.\r\nThis part of the condition ensures we check each stream.\r\nThe hash.md5 comparison\r\nThe final part of the rule compares chunks of the file to three specific hash values. The md5 function takes an\r\noffset and size.\r\nThe offset is @s1[i] + 8 - the @ symbol in YARA returns the offset for the match. This will be the offset of the\r\ncurrent $s1 string, which matches \\x0astream\\x0a. Eight bytes are added to skip the start of stream marker.\r\nThe size is slightly more complicated. @s2[i] is the offset of the current end of stream marker. The address of\r\nthe start of stream marker is subtracted, along with 8 additional bytes.\r\n\u003c0x0a\u003estream\u003c0x0a\u003edata data data\u003c0x0a\u003eendstream\u003c0x0a\u003e\r\n^ ^ ^\r\n|_ @s1 |_ @s1+8 |_ @s2\r\nThis provides the length of the data to hash. Note that if the offset of @s2[i] is less than @s1[i] the length\r\nwould be negative, which is not ideal. We fix this below.\r\nSo what is being signatured?\r\nIn order to test further it is necessary to find a file. Fortunately one is available on VirusTotal2 which hits the rule.\r\nThe condition can be modified to print the location of the match (it is necessary to add import \"console\" and\r\nimport \"math\" at the top of the rule for this to work correctly).\r\ncondition:\r\n uint32be(0) == 0x25504446 and\r\n for any i in (0.. 
math.max(math.min(#s1 - 1, #s2 - 1), 0) ) : (\r\n (hash.md5(@s1[i]+8, @s2[i]-@s1[i]-8) == \"b9950253cf88305a57cb350deb31c07e\" or\r\n hash.md5(@s1[i]+8, @s2[i]-@s1[i]-8) == \"9388a4ee5de0e59595a1e76aeb9796d5\" or\r\n hash.md5(@s1[i]+8, @s2[i]-@s1[i]-8) == \"9aa92ca954147945e6100ce345351d4c\") and\r\n console.log(@s1[i]) and console.log(hash.md5(@s1[i]+8, @s2[i]-@s1[i]-8))\r\n )\r\nRunning this prints the following:\r\n\u003e yara64 .\\rule.yar .\\4f90aed9138ae23a32c2afce3237a891b50a39d04e169e79dd98f29e156295b2.hostile\r\n179310\r\nb9950253cf88305a57cb350deb31c07e\r\nCalisto_PDF_streams .\\4f90aed9138ae23a32c2afce3237a891b50a39d04e169e79dd98f29e156295b2.hostile\r\nThe address of the $s1 match was printed, which tells us that the stream tag starts at offset 179,310\r\n(0x2BC6E).\r\nInspecting the file with the PDF template for 010 Editor reveals an object that starts at offset 0x2BB86:\r\n00000000 34 33 20 30 20 6f 62 6a 0a 3c 3c 0a 2f 54 79 70 |43 0 obj.\u003c\u003c./Typ|\r\n00000010 65 20 2f 58 4f 62 6a 65 63 74 0a 2f 53 75 62 74 |e /XObject./Subt|\r\n00000020 79 70 65 20 2f 49 6d 61 67 65 0a 2f 57 69 64 74 |ype /Image./Widt|\r\n00000030 68 20 39 36 0a 2f 48 65 69 67 68 74 20 39 36 0a |h 96./Height 96.|\r\n00000040 2f 43 6f 6c 6f 72 53 70 61 63 65 20 2f 44 65 76 |/ColorSpace /Dev|\r\n00000050 69 63 65 52 47 42 0a 2f 42 69 74 73 50 65 72 43 |iceRGB./BitsPerC|\r\nhttps://edeca.net/post/2024-06-26-an-interesting-callisto-yara-rule\r\nPage 3 of 7\n\n00000060 6f 6d 70 6f 6e 65 6e 74 20 38 0a 2f 46 69 6c 74 |omponent 8./Filt|\r\n00000070 65 72 20 2f 46 6c 61 74 65 44 65 63 6f 64 65 0a |er /FlateDecode.|\r\n00000080 2f 44 65 63 6f 64 65 50 61 72 6d 73 20 3c 3c 0a |/DecodeParms \u003c\u003c.|\r\n00000090 2f 50 72 65 64 69 63 74 6f 72 20 31 35 0a 2f 43 |/Predictor 15./C|\r\n000000a0 6f 6c 6f 72 73 20 33 0a 2f 42 69 74 73 50 65 72 |olors 3./BitsPer|\r\n000000b0 43 6f 6d 70 6f 6e 65 6e 74 20 38 0a 2f 43 6f 6c |Component 8./Col|\r\n000000c0 75 6d 6e 73 20 39 36 0a 3e 3e 0a 2f 
53 4d 61 73 |umns 96.\u003e\u003e./SMas|\r\n000000d0 6b 20 34 34 20 30 20 52 0a 2f 4c 65 6e 67 74 68 |k 44 0 R./Length|\r\n000000e0 20 33 33 31 32 0a 3e 3e 0a 73 74 72 65 61 6d 0a | 3312.\u003e\u003e.stream.|\r\nThis is a 96x96 pixel image and the raw data is in /FlateDecode format3.\r\nTo be absolutely certain we can copy the bytes between stream and endstream and hash them, confirming the\r\nresult matches the expected value.\r\nCyberChef used to hash the stream data\r\nThe data for object 43 can also be extracted from the PDF, which ironically yields a PDF icon used by the threat\r\nactor.\r\nWe now know that the YARA rule is hashing all stream data in the PDF, looking for a specific 96x96 pixel image\r\nwhich is the icon above.\r\nA second image matching a different MD5 hash can be extracted from another related document:\r\nPerformance and readability considerations\r\nA few potential improvements to the rule are described below. Some are practically essential (e.g. if deploying to a\r\nstreaming service like VirusTotal Live Hunt), others help with future maintenance.\r\nLoop iterations\r\nThe first observation is that the rule tries to loop too many times. This happens because indexes start at zero, but\r\nthe total count #s1 starts at one. We could quickly fix this by subtracting one.\r\nA second observation is that the count of #s1 and #s2 should be the same in a well-formed document.\r\nHowever, it is theoretically possible that \u003c0x0a\u003estream appears more than \u003c0x0a\u003eendstream (or vice versa). We\r\nshould cap to the smaller number.\r\nThe final observation is that total loop iterations are unbounded. A maximum should be applied here, for example\r\nthe first fifty streams in the file.\r\nThe math module can be used to fix all of these. For example, to hash a minimum of zero and a maximum of 50\r\nstreams:\r\nmath.max(math.min(math.min(#s1, #s2), 50) - 1, 0)\r\nReducing calls to hash.md5(..)\r\nTo minimise what we hash we can choose to call hash.md5 only when the length is within a set range. This\r\navoids hashing small objects or very large data which is unlikely to be relevant4.\r\nOne question I had when inspecting this rule is: are multiple calls to hash.md5(..) inefficient? Fortunately Wes\r\nShields (@wxs) confirmed the answer is no - multiple calls with the same input are cached5.\r\nAdding known data\r\nIn this example we know the image is 96x96 pixels. Therefore, adding more known good matches such as\r\n/Height 96 could help to reduce the number of files which need to be checked.\r\nReadability\r\nWes also noted that YARA supports string sets. The initial request for this feature was identical to our requirement\r\nhere.\r\nThe condition can be rewritten like so, ensuring the rule does not repeatedly contain references to the hashing\r\nfunction.\r\nfor any stream_md5 in (\r\n \"b9950253cf88305a57cb350deb31c07e\",\r\n \"9388a4ee5de0e59595a1e76aeb9796d5\",\r\n \"9aa92ca954147945e6100ce345351d4c\"):\r\n ( /* do stuff */ )\r\nOptimising the rule\r\nTaking the improvements suggested above we can update the rule like so:\r\nimport \"hash\"\r\nimport \"math\"\r\nrule Calisto_PDF_streams {\r\n meta:\r\n author = \"Félix Aimé @felixaime, updated by David Cannings @edeca\"\r\n description = \"Detects Callisto PDFs\"\r\n strings:\r\n $start = { 0A 73 74 72 65 61 6D 0A } // \u003c0x0a\u003estream\u003c0x0a\u003e\r\n $end = { 0A 65 6E 64 73 74 72 65 61 6D 0A } // \u003c0x0a\u003eendstream\u003c0x0a\u003e\r\n $known_1 = \"/Height 96\"\r\n $known_2 = \"/Height 35\"\r\n condition:\r\n uint32be(0) == 0x25504446 and\r\n any of ($known*) and\r\n for any stream_md5 in (\r\n \"b9950253cf88305a57cb350deb31c07e\",\r\n \"9388a4ee5de0e59595a1e76aeb9796d5\",\r\n \"9aa92ca954147945e6100ce345351d4c\"):\r\n (\r\n // Check the first 50 streams (maximum)\r\n for any i in (0.. math.max(math.min(math.min(#start, #end), 50) - 1, 0)) : (\r\n // Require a minimum of 1KiB\r\n @end[i] - @start[i] \u003e 1024 and\r\n // Hash a maximum of 20KiB\r\n @end[i] - @start[i] \u003c 20480 and\r\n // Match to a known MD5 hash\r\n hash.md5(@start[i] + 8, @end[i] - @start[i] - 8) == stream_md5\r\n )\r\n )\r\n}\r\nNot all of the suggested improvements will work for every rule. In this case, the updated rule matches the same\r\nsix files that can be obtained using a VirusTotal retrohunt for the original rule.\r\nNo sample files were found for one of the MD5 hashes. It is possible the additional checks will not detect some of\r\nthe files the original author had available.\r\nConclusion\r\nThis is a neat technique which - if employed sensibly - could be used to find interesting PDFs. There is plenty of\r\nstream data in a typical PDF ranging from text to images to fonts. Threat actors often reuse templates which yields\r\nplenty of opportunity for signatures.\r\nHowever, it is important to be aware of performance, lest you receive a dreaded email from somebody at\r\nVirusTotal saying your rule has been disabled 🙈.\r\nSource: https://edeca.net/post/2024-06-26-an-interesting-callisto-yara-rule",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"MISPGALAXY",
		"Malpedia"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://edeca.net/post/2024-06-26-an-interesting-callisto-yara-rule"
	],
	"report_names": [
		"2024-06-26-an-interesting-callisto-yara-rule"
	],
	"threat_actors": [
		{
			"id": "5dae3c71-8be1-4591-a2fb-b851ea6f083d",
			"created_at": "2022-10-25T16:07:23.432642Z",
			"updated_at": "2026-04-10T02:00:04.600341Z",
			"deleted_at": null,
			"main_name": "Callisto Group",
			"aliases": [],
			"source_name": "ETDA:Callisto Group",
			"tools": [
				"RCS Galileo"
			],
			"source_id": "ETDA",
			"reports": null
		},
		{
			"id": "79bd28a6-dc10-419b-bee7-25511ae9d3d4",
			"created_at": "2023-01-06T13:46:38.581534Z",
			"updated_at": "2026-04-10T02:00:03.029872Z",
			"deleted_at": null,
			"main_name": "Callisto",
			"aliases": [
				"BlueCharlie",
				"Star Blizzard",
				"TAG-53",
				"Blue Callisto",
				"TA446",
				"IRON FRONTIER",
				"UNC4057",
				"COLDRIVER",
				"SEABORGIUM",
				"GOSSAMER BEAR"
			],
			"source_name": "MISPGALAXY:Callisto",
			"tools": [],
			"source_id": "MISPGALAXY",
			"reports": null
		},
		{
			"id": "3aedca2f-6f6c-4470-af26-a46097d3eab5",
			"created_at": "2024-11-01T02:00:52.689773Z",
			"updated_at": "2026-04-10T02:00:05.396502Z",
			"deleted_at": null,
			"main_name": "Star Blizzard",
			"aliases": [
				"Star Blizzard",
				"SEABORGIUM",
				"Callisto Group",
				"TA446",
				"COLDRIVER"
			],
			"source_name": "MITRE:Star Blizzard",
			"tools": [
				"Spica"
			],
			"source_id": "MITRE",
			"reports": null
		},
		{
			"id": "2d06d270-acfd-4db8-83a8-4ff68b9b1ada",
			"created_at": "2022-10-25T16:07:23.477794Z",
			"updated_at": "2026-04-10T02:00:04.625004Z",
			"deleted_at": null,
			"main_name": "Cold River",
			"aliases": [
				"Blue Callisto",
				"BlueCharlie",
				"Calisto",
				"Cobalt Edgewater",
				"Gossamer Bear",
				"Grey Pro",
				"IRON FRONTIER",
				"Mythic Ursa",
				"Nahr Elbard",
				"Nahr el bared",
				"Seaborgium",
				"Star Blizzard",
				"TA446",
				"TAG-53",
				"UNC4057"
			],
			"source_name": "ETDA:Cold River",
			"tools": [
				"Agent Drable",
				"AgentDrable",
				"DNSpionage",
				"LOSTKEYS",
				"SPICA"
			],
			"source_id": "ETDA",
			"reports": null
		},
		{
			"id": "3a057a97-db21-4261-804b-4b071a03c124",
			"created_at": "2024-06-04T02:03:07.953282Z",
			"updated_at": "2026-04-10T02:00:03.813595Z",
			"deleted_at": null,
			"main_name": "IRON FRONTIER",
			"aliases": [
				"Blue Callisto",
				"BlueCharlie",
				"CALISTO",
				"COLDRIVER",
				"Callisto Group",
				"GOSSAMER BEAR",
				"SEABORGIUM",
				"Star Blizzard",
				"TA446"
			],
			"source_name": "Secureworks:IRON FRONTIER",
			"tools": [
				"Evilginx2",
				"Galileo RCS",
				"SPICA"
			],
			"source_id": "Secureworks",
			"reports": null
		},
		{
			"id": "61940e18-8f90-4ecc-bc06-416c54bc60f9",
			"created_at": "2022-10-25T16:07:23.659529Z",
			"updated_at": "2026-04-10T02:00:04.703976Z",
			"deleted_at": null,
			"main_name": "Gamaredon Group",
			"aliases": [
				"Actinium",
				"Aqua Blizzard",
				"Armageddon",
				"Blue Otso",
				"BlueAlpha",
				"Callisto",
				"DEV-0157",
				"G0047",
				"Iron Tilden",
				"Operation STEADY#URSA",
				"Primitive Bear",
				"SectorC08",
				"Shuckworm",
				"Trident Ursa",
				"UAC-0010",
				"UNC530",
				"Winterflounder"
			],
			"source_name": "ETDA:Gamaredon Group",
			"tools": [
				"Aversome infector",
				"BoneSpy",
				"DessertDown",
				"DilongTrash",
				"DinoTrain",
				"EvilGnome",
				"FRAUDROP",
				"Gamaredon",
				"GammaDrop",
				"GammaLoad",
				"GammaSteel",
				"Gussdoor",
				"ObfuBerry",
				"ObfuMerry",
				"PlainGnome",
				"PowerPunch",
				"Pteranodon",
				"Pterodo",
				"QuietSieve",
				"Remcos",
				"RemcosRAT",
				"Remote Manipulator System",
				"Remvio",
				"Resetter",
				"RuRAT",
				"SUBTLE-PAWS",
				"Socmer",
				"UltraVNC"
			],
			"source_id": "ETDA",
			"reports": null
		}
	],
	"ts_created_at": 1775434205,
	"ts_updated_at": 1775826723,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/5e035edd2dda31dae34f6aaf26bd4aeae93c6013.pdf",
		"text": "https://archive.orkl.eu/5e035edd2dda31dae34f6aaf26bd4aeae93c6013.txt",
		"img": "https://archive.orkl.eu/5e035edd2dda31dae34f6aaf26bd4aeae93c6013.jpg"
	}
}