{
	"id": "8e9054f1-2a03-48b3-8706-e2a9ae7257ee",
	"created_at": "2026-04-06T00:09:23.026259Z",
	"updated_at": "2026-04-10T03:24:23.497923Z",
	"deleted_at": null,
	"sha1_hash": "1791d2ba76643cf9bdbac3843208be89e444963a",
	"title": "Enterprise Scale Threat Hunting: C2 Beacon Detection with Unsupervised ML and KQL — Part 2",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 219864,
	"plain_text": "Enterprise Scale Threat Hunting: C2 Beacon Detection with\r\nUnsupervised ML and KQL — Part 2\r\nBy Mehmet Ergene\r\nPublished: 2023-12-12 · Archived: 2026-04-05 23:45:57 UTC\r\nContinuing with the same example, CS beacon with 15 minutes sleep and 25% jitter, we can calculate the below\r\nvalues for the beacon(these values can be calculated by analyzing the time deltas):\r\nMinBeaconSleep = 675s\r\nMaxBeaconSleep = 900s\r\nAvgBeaconSleep = 787.5s\r\nMinStdev(BeaconSleep) = 0s\r\nMaxStdev(BeaconSleep)= MaxBeaconSleep — AvgBeaconSleep =112.5s\r\nUsing the information above, we can calculate the approximate/exact jitter ratio. How? Well, it’s basic\r\nmathematics: we have an equation with several variables.\r\nCalculatedJitter = (MaxStdev(BeaconSleep) / AvgBeaconSleep ) * 100 = 14.2%\r\nThe reason for the calculated jitter being smaller than the configured jitter is the behavior of the Cobalt Strike\r\nBeacon. The jitter in Cobalt Strike shifts the average beacon sleep to the left of the configured sleep value. If this\r\nwere an Empire beacon, the calculated jitter would be 25%.\r\nTo make things more clear, we have some values that are configured in the beacon. On the other hand, we have the\r\ndata that is a result of the configuration. What we are doing here is that, since we don’t know the beacon\r\nconfiguration, we are analyzing the resulting data to verify it’s from a beacon or not(kind of reverse engineering).\r\nSince we are trying to perform verification, we can define what kind of beacon configuration we are trying to\r\nverify.\r\nDeveloping the KQL query\r\nIn order to detect beaconing, we can use firewall logs, proxy logs, process network connection logs, etc. The logs\r\nmust have at least below information:\r\nSource Username/Source HostName\r\nDestination IP/Destination Hostname\r\nDestination Port\r\nTimestamp\r\nWe can use requestURL or URLHostname information for proxy logs if we want. 
Using Source IP information is\r\nnot recommended because IP assignments during VPN connections are not cached in DHCP. Therefore, you can\r\nsee one device with several different IPs assigned to it, which will break the detection logic.\r\nhttps://mergene.medium.com/enterprise-scale-threat-hunting-network-beacon-detection-with-unsupervised-ml-and-kql-part-2-bff46cfc1e7e\r\nPage 1 of 4\n\nLogic\r\nFor each Source-Destination-Port pair:\r\n sort the Timestamp ascending\r\n calculate the time difference between consecutive Timestamps\r\n calculate the stdev, avg, min, and max of the time deltas\r\n calculate the jitter by using the stdev and avg time delta\r\n If jitter \u003c [threshold]\r\n display the details/generate an alert\r\nMin and max time delta values can be used to increase the fidelity if there are too many results (beacon time\r\ndelta must be between MinBeaconSleep and MaxBeaconSleep, but there might be some spikes).\r\nIf there are more than X users/computers connecting to the same destination, it’s more likely a\r\nnonmalicious beacon (for example, Windows Update).\r\nIf the logs have only the IP address of the destination, the results can be enriched in several ways, like\r\njoining other events that have IP and hostname info.\r\nWith KQL, we can put TimeGenerated, SentBytes, and ReceivedBytes into lists using make_set/make_list. Then,\r\nwe can sort the timestamp values by using array_sort_asc (using serialize and sort doesn’t work when the data\r\nis big). Since we have an array, we can use its length to apply some filtering:\r\nNow, we have one row for each source-destination pair, and all the connection timestamps are in the array. This\r\napproach makes the data size small, and small data size means faster processing.\r\nNext, we can use mv-apply on the array of timestamps, set_TimeGenerated, to run a subquery for each\r\nconnection pair. This approach also improves the query performance. 
In the subquery, we can perform the first\r\nstep of beacon analysis by using the JitterThreshold we just defined:\r\nNote that we are storing the query result in the BeaconCandidates variable. We will perform further analysis on\r\nthe stored results for fine-tuning.\r\nNow, we have all beacon candidates based on the thresholds defined. Next, we can filter out the beacons based on\r\nthe CompromisedDeviceCountMax threshold. If more devices/users than this threshold are beaconing to the same destination,\r\nthe beacons are most likely nonmalicious.\r\nWe put the results into a new variable, PotentialBeacons, for further analysis.\r\nNext, we will find connections that can’t be a beacon. We will use TimeDeltaList and list_SentBytes:\r\nWe used series_outliers to analyze the outliers. The function performs a custom Tukey’s analysis that tolerates some\r\noutliers in the data. In addition to that, if there are too many outliers, the connection can’t be a beacon.\r\nSince the device can be turned off or not connected to a network for a while, time deltas can have spikes, but these\r\nspikes shouldn’t happen a lot. That’s why we add the OutlierCountMax condition: to tolerate a limited number of spikes.\r\nAs we now have the PotentialBeacons, ImpossibleBeaconsByTimeDelta, and ImpossibleBeaconsBySentBytes,\r\nwe can finally get all real beacons by removing the ImpossibleBeaconsByTimeDelta and\r\nImpossibleBeaconsBySentBytes from PotentialBeacons:\r\nSample result (redacted) and how to read the data:\r\nI’ve developed queries for Palo Alto FW (Azure Sentinel), Sysmon (Azure Sentinel), and Microsoft Defender for\r\nEndpoint/Microsoft 365 Defender. 
You can find the queries in my GitHub repo. They run super-fast (90 million\r\nevents are analyzed in 20 seconds) and are able to detect beacons with high jitter, like 90%.\r\nHow to use the queries\r\nYou first need to define boundaries for the beacons you want to detect. Defining the boundaries based on the\r\nEmpire beacon behavior covers Cobalt Strike and others.\r\nHunting with the jitter only\r\nIn this scenario, we want to detect all beacons without filtering them based on the sleep interval. Just change the\r\nJitterThreshold and run the query.\r\nHunting with the jitter and sleep interval\r\nIn this scenario, we want to filter beacons based on the jitter and sleep interval thresholds.\r\nExample: Beacons that have at least a 15-minute (900s) sleep with 25% jitter\r\nJitterThreshold = 25\r\nTimeDeltaThresholdMin = 900 - (900*25/100) = 675s = 11 minutes, 15 seconds\r\nOptionally, we can set an upper boundary for the sleep interval:\r\nTimeDeltaThresholdMax = 900 + (900*25/100) = 1125s = 18 minutes, 45 seconds\r\nBased on these values, we can filter the results.\r\nP.S.: If you want to learn KQL, especially for Microsoft Sentinel or Microsoft 365 Defender, do\r\ncheck out my training website. Hope to see you there!\r\nConclusion\r\nDetecting C2 beacons is hard but not impossible (it just requires some knowledge of statistics). Beacons with a high jitter\r\nconfiguration like 100% are harder to detect, but detection is still possible if you have time to analyze the results.\r\nYou can automate the analysis of the results in several ways, like using Logic Apps, enriching the data with VT\r\nscore, using Jupyter Notebook, etc.\r\nAlthough the series ends here, I’ll cover a specific C2 scenario using the method I’ve explained. 
If you see a\r\nmistake or want to ask something about the method, send me a message on Twitter.\r\nHappy hunting!\r\nSource: https://mergene.medium.com/enterprise-scale-threat-hunting-network-beacon-detection-with-unsupervised-ml-and-kql-part-2-bff46cfc1e7e",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"references": [
		"https://mergene.medium.com/enterprise-scale-threat-hunting-network-beacon-detection-with-unsupervised-ml-and-kql-part-2-bff46cfc1e7e"
	],
	"report_names": [
		"enterprise-scale-threat-hunting-network-beacon-detection-with-unsupervised-ml-and-kql-part-2-bff46cfc1e7e"
	],
	"threat_actors": [
		{
			"id": "b740943a-da51-4133-855b-df29822531ea",
			"created_at": "2022-10-25T15:50:23.604126Z",
			"updated_at": "2026-04-10T02:00:05.259593Z",
			"deleted_at": null,
			"main_name": "Equation",
			"aliases": [
				"Equation"
			],
			"source_name": "MITRE:Equation",
			"tools": null,
			"source_id": "MITRE",
			"reports": null
		},
		{
			"id": "610a7295-3139-4f34-8cec-b3da40add480",
			"created_at": "2023-01-06T13:46:38.608142Z",
			"updated_at": "2026-04-10T02:00:03.03764Z",
			"deleted_at": null,
			"main_name": "Cobalt",
			"aliases": [
				"Cobalt Group",
				"Cobalt Gang",
				"GOLD KINGSWOOD",
				"COBALT SPIDER",
				"G0080",
				"Mule Libra"
			],
			"source_name": "MISPGALAXY:Cobalt",
			"tools": [],
			"source_id": "MISPGALAXY",
			"reports": null
		}
	],
	"ts_created_at": 1775434163,
	"ts_updated_at": 1775791463,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/1791d2ba76643cf9bdbac3843208be89e444963a.pdf",
		"text": "https://archive.orkl.eu/1791d2ba76643cf9bdbac3843208be89e444963a.txt",
		"img": "https://archive.orkl.eu/1791d2ba76643cf9bdbac3843208be89e444963a.jpg"
	}
}