{
	"id": "94ec58da-c644-41d1-94d5-12fa1785903b",
	"created_at": "2026-04-06T00:15:20.449577Z",
	"updated_at": "2026-04-10T13:11:21.984885Z",
	"deleted_at": null,
	"sha1_hash": "3a0069c6ada2455941df52878bd5f4b13242aa66",
	"title": "Analyzing conti-leaks without speaking russian — only methodology",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 1873961,
	"plain_text": "Analyzing conti-leaks without speaking russian — only methodology\r\nBy Arnaud Zobec\r\nPublished: 2022-02-28 · Archived: 2026-04-06 00:05:39 UTC\r\nIf you’re like me and you don’t speak Russian, and you have a Conti leak to analyze, here are some tricks for you.\r\nDisclaimer: I will not do an in-depth analysis of the files here. It’s just a blogpost to show the methodology in such a case. The audience for this blogpost can be students, or people interested in CTI without a big budget. This is NOT an analysis of Conti-leaks. This is NOT a TODO list for every case. It’s my methodology for JSON files.\r\nI will talk about how I modified the files to load them easily with Python, how I used some libraries to translate the text, and how I used other software, like Gephi, or command-line tools like egrep, to get information quickly.\r\nFirst look at the files\r\nWhen you first look at the files, they appear to be JSON. Awesome, we love JSON, it’s very easy to use.\r\nhttps://medium.com/@arnozobec/analyzing-conti-leaks-without-speaking-russian-only-methodology-f5aecc594d1b\r\nFirst look at the content of the json files contained in the leak.\r\nYou have several ways to load the files into Python, and I’ll show you two different methods below.\r\nFirst method: transform the files a little bit and load them via JSON libraries\r\n#To make one file\r\ncat *.json \u003e big.json\r\n#Remove the \\n after {\r\nsed -i -e ':a;N;$!ba;s/{\\n/{/g' big.json\r\n#Remove the \\n after the commas\r\nsed -i -e ':a;N;$!ba;s/,\\n/,/g' big.json\r\n#Remove the \\n after the quotes\r\nsed -i -e ':a;N;$!ba;s/\\\"\\n/\\\"/g' big.json\r\nYour file should now look like this:\r\nbig.json content\r\nBut you know, there is a WAY simpler trick if you use jq :).
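The three sed passes above are just newline surgery; here is a toy Python illustration of what they do to one pretty-printed record (the field names are made up for the example, not taken from the leak):

```python
import json

# A toy pretty-printed record, mimicking the layout of the leaked files
# (hypothetical field names, for illustration only)
pretty = '{\n"ts": "2020-06-21T22:00:00",\n"from": "a@q1",\n"body": "привет"\n}'

# Same effect as the three sed passes: drop the newline after "{",
# after each comma, and after each closing quote
one_line = pretty.replace('{\n', '{').replace(',\n', ',').replace('"\n', '"')

print(one_line)                      # the record now fits on one line
print(json.loads(one_line)['body'])  # and parses as valid JSON
```

After the passes, each record occupies exactly one line, which is what the loaders below rely on.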
That was just to force you to do a little bit of file manipulation with sed ;)\r\ncat *.json | jq -cr \u003e big.json\r\nIt will produce one line for each JSON object it can read.\r\nAnd now that I have a clean file, what I want to do is load every line into a list of dictionaries in Python (and print it for the example).\r\nimport json\r\n\r\nchatList = []\r\nwith open('onebig.json') as f:\r\n    for jsonObj in f:\r\n        _Dict = json.loads(jsonObj)\r\n        chatList.append(_Dict)\r\n\r\nfor line in chatList:\r\n    print(line['body'])  # print each body\r\nEasy peasy lemon squeezy\r\nRemember? I don’t speak Russian, but I want to read it, and I have no money to pay a professional translator. But my data is inside a Python dictionary, so I can do whatever I want with it.\r\nTranslation via python\r\nI use a free library called deep-translator (https://github.com/nidhaloff/deep-translator)\r\n(to install it: pip install -U deep-translator)\r\nWhat I will do is use the library on the \"body\" key of each JSON line, and translate it into English into a new key \"LANG-EN\".
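One practical note: chat logs repeat the same short messages constantly, and every translate call is a network round-trip, so it can be worth memoizing the translations. A sketch of the idea, using a stand-in function instead of the real GoogleTranslator call (which needs deep-translator installed and network access):

```python
from functools import lru_cache

calls = 0

def translate_stub(text):
    # Stand-in for GoogleTranslator(source='auto', target='en').translate;
    # counts how many times the "network" would actually be hit
    global calls
    calls += 1
    return f"EN:{text}"

@lru_cache(maxsize=None)
def translate_cached(text):
    # Repeated messages are answered from the cache, not re-translated
    return translate_stub(text)

for body in ["привет", "ок", "привет", "ок", "привет"]:
    translate_cached(body)

print(calls)  # only the 2 distinct messages reach the translator
```

Swap `translate_stub` for the real deep-translator call and the cache works the same way.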
And if the translation fails, I want the message to be \"Error during Translation\".\r\nAnd finally, I want to print the result of the line as a JSON line.\r\nimport json\r\nfrom deep_translator import GoogleTranslator\r\n\r\nchatList = []\r\nwith open('onebig.json') as f:\r\n    for jsonObj in f:\r\n        _Dict = json.loads(jsonObj)\r\n        chatList.append(_Dict)\r\n\r\nfor line in chatList:\r\n    try:\r\n        translation = GoogleTranslator(source='auto', target='en').translate(line[\"body\"])\r\n        line[\"LANG-EN\"] = translation\r\n    except Exception:\r\n        line[\"LANG-EN\"] = \"Error during Translation\"\r\n    print(json.dumps(line, ensure_ascii=False).encode('utf8').decode())\r\nAs you can see, I had to use ensure_ascii=False and encode('utf8') because I still want to print Russian characters.\r\nNow, your output should look like this:\r\noutput of the translation script in python\r\nSecond method: transform the files a little bit and load them via pandas\r\nI will transform the first big.json file a little bit, to make it one big JSON document.\r\nTo do it, I’ll put every JSON line into a JSON array:\r\n#Add a \",\" between \"}\" and \"{\"\r\nsed -i -e ':a;N;$!ba;s/}\\n{/},\\n{/g' big.json\r\nThen I add the character \"[\" at the beginning of the file and the character \"]\" at the end of the file.\r\nAnd now, I can load it into a pandas DataFrame very easily!\r\nimport pandas as pd\r\n\r\ndf = pd.read_json('big.json')\r\n#Yes, it's that easy\r\nAnd why use a pandas DataFrame? Well, we can sort it by date very easily, and export it to CSV for other tools that do not deal with JSON easily.\r\nimport pandas as pd\r\n\r\ndf = pd.read_json('big.json')\r\nsorted_df = df.sort_values(by=\"ts\")\r\nsorted_df.to_csv('onebig.csv', doublequote=True, quoting=1, escapechar=\"\\\\\")\r\nThe code above will create a file called \"onebig.csv\", sorted by date.\r\nonebig.csv output\r\nAnd now what?\r\nVisualisations: with Gephi\r\nGephi is an Open Graph Viz Platform - https://gephi.org/\r\nYou can use Gephi with a Yifan Hu layout to see the interactions between people, by applying a weighting to the links (for example).\r\nThe bigger the arrow, the bigger the weight of the link: the two people at each end of the arrow talk together often.\r\nWe can easily identify people of interest using Gephi with this methodology.\r\nOh, and you may want a graphics card to use it; it is quite resource-hungry.\r\nYifan Hu spatialisation using Gephi\r\nVisualisations: with Elasticsearch and Kibana\r\nWith a very simple configuration, you can load your data into an Elasticsearch/Kibana cluster, then read it, query it, etc.\r\n#content of /etc/logstash/conf.d/00-leak-analysis.conf\r\ninput {\r\n  # this is the actual live log file to monitor\r\n  file {\r\n    path =\u003e \"/myfolder/leak/*.json\"\r\n    type =\u003e \"leak\"\r\n    #codec =\u003e json\r\n    start_position =\u003e \"beginning\"\r\n  }\r\n}\r\nfilter {\r\n  if [type] == \"leak\" {\r\n    json {\r\n      source =\u003e \"message\"\r\n    }\r\n  }\r\n}\r\noutput {\r\n  if [type] == \"leak\" {\r\n    elasticsearch {\r\n      hosts =\u003e [\"localhost:9200\"]\r\n      index =\u003e \"leak-%{+yyyy-MM-dd}\"\r\n    }\r\n  }\r\n}\r\nread messages in Kibana\r\nThen, in Kibana, you can sort by user or search for specific things.\r\nTo go further:\r\nMaybe you want to quickly extract the URLs contained in the big.json file?\r\nQuick hint: use a regexp via egrep\r\negrep '(http|https):\\/\\/[a-zA-Z0-9.\\/?=_%\u0026:-]*' -o big.json \u003e url_output.txt\r\nAnd there you are. Oh, and you can use defang (a Python tool) on your file to read it safely!\r\n(to install defang: pip install defang)\r\ndefang -i url_output.txt -o url_output_defanged.txt\r\ndefanged URL observed in leak\r\nIt’s now your turn to be imaginative and read things inside this leak. Have fun :)\r\nSource: https://medium.com/@arnozobec/analyzing-conti-leaks-without-speaking-russian-only-methodology-f5aecc594d1b",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://medium.com/@arnozobec/analyzing-conti-leaks-without-speaking-russian-only-methodology-f5aecc594d1b"
	],
	"report_names": [
		"analyzing-conti-leaks-without-speaking-russian-only-methodology-f5aecc594d1b"
	],
	"threat_actors": [],
	"ts_created_at": 1775434520,
	"ts_updated_at": 1775826681,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/3a0069c6ada2455941df52878bd5f4b13242aa66.pdf",
		"text": "https://archive.orkl.eu/3a0069c6ada2455941df52878bd5f4b13242aa66.txt",
		"img": "https://archive.orkl.eu/3a0069c6ada2455941df52878bd5f4b13242aa66.jpg"
	}
}