{
	"id": "3832438e-fbc5-44c2-916f-2711011c26b6",
	"created_at": "2026-04-06T01:30:51.411779Z",
	"updated_at": "2026-04-10T03:21:36.953639Z",
	"deleted_at": null,
	"sha1_hash": "5f04bd9b45be75e08881c133ca46c8db8ca104f7",
	"title": "Introducing ROADtools - The Azure AD exploration framework",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 1132362,
	"plain_text": "Introducing ROADtools - The Azure AD exploration framework\r\nPublished: 2020-04-16 · Archived: 2026-04-06 01:04:12 UTC\r\nOver the past 1.5 years I’ve been doing quite a lot of exploration into Azure AD and how it works under the hood.\r\nAzure AD is getting more and more common in enterprises, and thus securing it is becoming a bigger topic.\r\nWhereas the traditional Windows Server Active Directory already has so much research and community tooling\r\navailable for it, Azure AD is in my opinion lagging behind in this aspect. In this post I’m introducing the\r\nROADtools framework and it’s first tool: ROADrecon. This framework was developed during my research and\r\nwill hopefully serve as both a useful tool and an extensible framework for anyone that wants to analyse Azure AD,\r\nwhether that is from a Red Team or a Blue Team perspective. This post is the first in part of a series in which I’ll\r\ndive into more aspects of Azure AD and the ROADtools framework. Both ROADtools and ROADrecon are free\r\nopen source tools and available on my GitHub. I also did a live stream of most things that are written here that you\r\ncan watch on YouTube.\r\nWhy this framework\r\nWhenever I find myself in a new network or researching a new topic, I want to know as much information as\r\npossible about it, in an easy to digest format. In Active Directory environments, information is relatively simple to\r\nquery using LDAP, and many tools exist that query this information and transform it into a format that’s easier to\r\nuse for humans. 
Back when I started writing tools, I wrote a simple tool ldapdomaindump that tried to save all the\r\ninformation it could gather offline, so that I could quickly answer questions like “oh which groups is this user in\r\nagain” or “do they have a group for system X that could be useful”.\r\nFast forward a few years and companies are often using Microsoft 365 and moving their things to Azure, where\r\nthere isn’t really a tool that gives you quick insight into an environment. The Azure portal simply requires too\r\nmany clicks to find what you’re looking for, and it can be disabled for anyone but admins. The various PowerShell\r\nmodules, .NET libraries and other official ways to query Azure AD have varying degrees of support for\r\ninformation they can give you, ways to authenticate and restrictions that can be applied to them. While\r\nresearching Azure AD I wanted to have a way to access all the possible information, using any authentication\r\nmethod (whether obtained legitimately or not) and have it available offline. Since none of the official methods\r\noffered this possibility, I quickly realized building a custom framework was the only way to achieve it. So I set\r\nmyself a few goals:\r\nProvide tooling for both Red teams and Blue teams to explore all Azure AD data in an accessible way.\r\nShow the wealth of information available to anyone with just 1 valid credential set – from the internet.\r\nImprove understanding of how Azure AD works and what the possibilities are.\r\nProvide a framework that people can build upon and extend for their own use-cases.\r\nI did learn a few things along the way from writing ldapdomaindump, which kept all information in memory until\r\nit had calculated all the recursive group memberships, at which point it would write it to disk. As one expects, this\r\nscales pretty badly in environments that have more than a few thousand users in them. 
I spent a lot of time thinking about\r\nhttps://dirkjanm.io/introducing-roadtools-and-roadrecon-azure-ad-exploration-framework/\r\nPage 1 of 12\n\nhow I wanted to do it (and even more writing the actual code), while ignoring all the ways one is supposed to\r\naccess Azure AD, so today is the first release of the Rogue Office 365 and Azure (active) Directory tools!\r\nROADrecon\r\nThe first (and likely most extensive) tool in this framework is ROADrecon. In short, this is what it does:\r\nUses an automatically generated metadata model to create an SQLAlchemy-backed database on disk.\r\nUses asynchronous HTTP calls in Python to dump all available information in the Azure AD graph to this\r\ndatabase.\r\nProvides plugins to query this database and output it to a useful format.\r\nProvides an extensive interface built in Angular that queries the offline database directly for its analysis.\r\nWhere to get the data\r\nSince Azure AD is a cloud service, there isn’t a way to reverse engineer how it works, or a central repository\r\nwhere all the data is stored that you can access. Since Azure AD is completely different from Windows Server AD,\r\nthere’s also no LDAP to query the directory. While researching Azure and looking through the requests in the\r\nAzure Portal, at some point I noticed that the portal was calling a different version of the Azure AD Graph, the\r\n1.61-internal version.\r\nThis internal version of the Azure AD graph exposes much more data than any of the official APIs that are offered\r\nby Microsoft. I talked about some of the interesting things that you can find in this API in my BlueHat Seattle talk\r\nlast year. Though one is probably not supposed to use this version, it is still available for any user. By default it is\r\npossible to query almost all the information about the directory as an authenticated user, even when the Azure portal\r\nis restricted.\r\nThe next question was how to store this data in a structured way locally. 
The API streams everything as JSON\r\nobjects, which is a useful format for transferring data but not really for storing and searching through data. So\r\nideally we’d have a database in which objects and their relationships are automatically stored and mapped. For\r\nthis ROADrecon uses the SQLAlchemy Object Relational Mapper (ORM). What this means is that ROADrecon\r\ndefines the structure of the objects and their relationships, and SQLAlchemy determines how it stores and\r\nretrieves those from the underlying database. To create the object structure, ROADrecon uses the OData metadata\r\ndefinition that the Azure AD graph exposes. This XML document defines all object types, their properties and\r\nrelationships in the directory.\r\nI wrote some quite ugly code which transforms this metadata XML (mostly) automatically into a neat and well-defined database structure, which for example looks like this:\r\nSQLAlchemy then creates the database for this model, which by default is an SQLite database, but PostgreSQL is\r\nalso supported (in my testing the performance difference was minimal but SQLite seemed slightly faster). The\r\nmain advantage of this is that it is really easy to query the data afterwards, without having to write any SQL\r\nqueries yourself.\r\nThis database model is actually not part of ROADrecon but of roadlib, the central library component of\r\nROADtools. 
The reason for this is if you would want to build an external tool that interfaces with the database\r\npopulated by ROADrecon you wouldn’t actually need to import ROADrecon yourself and all its dependencies.\r\nInstead you could import the library containing the database logic, which doesn’t depend on all the third party\r\ncode that ROADrecon used to transform and display data.\r\nDumping the data\r\nROADrecon uses a process consisting of 3 steps to dump and explore the data in Azure AD:\r\n1. Authenticate - using username/password, access token, device code flow, etc\r\n2. Dump the data to disk\r\n3. Explore the data or transform it into a useful format using plugins\r\nAuthenticating\r\nAuthenticating is the first step to start gathering data. ROADrecon offers quite some options to authenticate:\r\nusage: roadrecon auth [-h] [-u USERNAME] [-p PASSWORD] [-t TENANT] [-c CLIENT] [--as-app] [--device-code] [--ac\r\n [--refresh-token REFRESH_TOKEN] [-f TOKENFILE] [--tokens-stdout]\r\noptional arguments:\r\n -h, --help show this help message and exit\r\n -u USERNAME, --username USERNAME\r\n Username for authentication\r\n -p PASSWORD, --password PASSWORD\r\n Password (leave empty to prompt)\r\n -t TENANT, --tenant TENANT\r\n Tenant ID to auth to (leave blank for default tenant for account)\r\n -c CLIENT, --client CLIENT\r\n Client ID to use when authenticating. 
(Must be a public client from Microsoft with user_\r\n Default: Azure AD PowerShell module App ID\r\n --as-app Authenticate as App (requires password and client ID set)\r\n --device-code Authenticate using a device code\r\n --access-token ACCESS_TOKEN\r\n Access token (JWT)\r\n --refresh-token REFRESH_TOKEN\r\n Refresh token (JWT)\r\n -f TOKENFILE, --tokenfile TOKENFILE\r\n File to store the credentials (default: .roadtools_auth)\r\n --tokens-stdout Do not store tokens on disk, pipe to stdout instead\r\nThe most common ones that you will use are probably username + password authentication or device code\r\nauthentication. Username + password is the easiest, but does not support (by design) any form of MFA since it’s\r\nnon-interactive. If your account requires MFA you can use the device code flow, which will give you a code to\r\nenter in the browser. There are more options here that you shouldn’t need to use in most scenarios, but are for\r\nadvanced usage or if you want to use tokens that were obtained via different methods. I am planning to do a future\r\nblog on Azure AD authentication and the options available for red teamers. ROADrecon will by default pretend to\r\nbe the Azure AD PowerShell module and will thus inherit its permissions to access the internal version of the\r\nAzure AD graph. By default, ROADrecon will store the obtained authentication tokens on disk in a file called\r\n.roadtools_auth. Depending on the authentication method this file contains long-lived refresh tokens, which\r\nprevent you from having to sign in all the time. This file is also compatible with any (future) tools using roadlib as\r\nan authentication library. 
If you don’t want to store tokens on disk you can also output them to stdout, which allows\r\nyou to pipe them into the next command directly.\r\nGathering all the data\r\nThe second step is data gathering, which the roadrecon gather command does. This has a few simple options:\r\nusage: roadrecon gather [-h] [-d DATABASE] [-f TOKENFILE] [--tokens-stdin] [--mfa]\r\noptional arguments:\r\n -h, --help show this help message and exit\r\n -d DATABASE, --database DATABASE\r\n Database file. Can be the local database name for SQLite, or an SQLAlchemy compatible UR\r\n postgresql+psycopg2://dirkjan@/roadtools. Default: roadrecon.db\r\n -f TOKENFILE, --tokenfile TOKENFILE\r\n File to read credentials from obtained by roadrecon auth\r\n --tokens-stdin Read tokens from stdin instead of from disk\r\n --mfa Dump MFA details (requires use of a privileged account)\r\nBy default it will dump it into an SQLite database called roadrecon.db in the current directory. Using PostgreSQL\r\nrequires some additional setup and the installation of psycopg2. The options for tokens depend on the settings\r\nyou used in the authentication phase and are not needed if you didn’t change those. The only other option for now\r\nis whether you want to dump data on Multi Factor Authentication, such as which methods each user has set up.\r\nThis is the only privileged component of the data gathering and requires an account with membership of a role\r\nthat gives access to this information (such as Global Admin/Reader or Authentication Administrator).\r\nROADrecon will request all the data available in two phases. The first phase requests all users, groups, devices,\r\nroles, applications and service principals in parallel using the aiohttp Python library. While requesting these\r\nobjects is done in parallel, the Azure AD graph returns them in chunks of 100 entries and then includes a token to\r\nrequest the next page. 
This means that requesting the next 100 entries can only be performed after the result of the\r\nfirst 100 is returned, effectively still making this a serial process. Each object type is requested in parallel, but it\r\nwill still have to wait for the slowest parallel job to finish before continuing.\r\nIn the second phase all the relationships are queried, such as group memberships, application roles, directory role\r\nmembers and application/device owners. Because this is performed per individual group, there is a much larger\r\nnumber of parallel tasks here and thus the speed gains of using aiohttp become much larger. To limit the\r\nnumber of objects in memory, ROADrecon regularly flushes the database changes to disk (in chunks of ~1000\r\nchanges or new entries). This is not done asynchronously (yet) because in my testing the performance bottleneck\r\nseemed to be the HTTP requests rather than the database reads/writes.\r\nOverall this whole process is pretty fast and for sure much faster than dumping everything in serial before I\r\nrewrote it to async code. Dumping an Azure AD environment of around 5000 users will take about 100 seconds.\r\nFor really large environments that I’ve tested (~120k users) this will still take quite some time (about 2 hours)\r\nbecause of the number of objects that have to be requested in serial in the first phase of data gathering.\r\n(ROADtools) user@localhost:~/ROADtools$ roadrecon gather --mfa\r\nStarting data gathering phase 1 of 2 (collecting objects)\r\nStarting data gathering phase 2 of 2 (collecting properties and relationships)\r\nROADrecon gather executed in 7.11 seconds and issued 490 HTTP requests.\r\nExploring the data with the ROADrecon GUI\r\nNow that we have access to all the data locally on disk in the database, we can start exploring it and convert it to a\r\nformat that is easy to digest for humans. 
There are multiple options for this. ROADrecon is built with extensibility\r\nin mind, so it has a rudimentary plugin framework which allows for writing plugins that can take the data in the\r\ndatabase and output this into something useful. For really simple use cases, you don’t even need ROADrecon, but\r\nyou can write a few lines of code that do what you want. Here is an example of a simple tool that only\r\nrequires you to import the database definition from roadlib and then prints the names of all the users in the\r\ndatabase:\r\nfrom roadtools.roadlib.metadef.database import User\r\nimport roadtools.roadlib.metadef.database as database\r\nsession = database.get_session(database.init())\r\nfor user in session.query(User):\r\n    print(user.displayName)\r\nYou don’t need to write any code in most cases though, as ROADrecon already comes with some export plugins\r\nand a fully functional GUI. When running the roadrecon-gui or roadrecon gui commands, it will launch a\r\nlocal webserver through Flask which exposes a REST API that can be accessed by the single-page Angular\r\nJavaScript application.\r\nIt currently features:\r\nListing of users / devices / groups\r\nSingle-page directory role overview\r\nApplications overview\r\nService Principal details\r\nRole / OAuth2 permissions assignments\r\nMFA overview\r\nSome screenshots (or watch a live demo here):\r\nA recurring component of these listings is that the most important properties are displayed in a table, which\r\nsupports pagination and a quick filter option. If you want to know more details of an object or how it relates to\r\nother components, most of the objects are clickable. 
When clicked, more detailed information will be shown in a\r\npop-up.\r\nFor every object there is also the “raw” view, which displays all the available properties in a collapsible JSON\r\nstructure (these properties come directly from the Azure AD internal API).\r\nOne of my favourite views is the Directory Roles view, since this view gives a really quick overview of which\r\nusers or service accounts have a privileged role assigned. If you performed collection using a privileged account\r\n(Blue Team!) and collected MFA info, you can instantly see which accounts have MFA methods registered and\r\nwhich ones don’t.\r\nAnother one is the Application Roles page, which shows all the privileges that Service Principals have in, for\r\nexample, the Microsoft Graph and which users/groups are assigned to a role in applications.\r\nThere are some things still in development in the GUI and I plan to add more advanced filtering features later, but\r\nthe basics are there and overall it feels pretty snappy barring some loading times in large environments.\r\nROADrecon plugins - Parsing conditional access policies\r\nI already mentioned plugins and that the goal is to make it easy for others to also write their own plugins or tools\r\ninteracting with ROADrecon. An example plugin that I developed together with my colleague Adrien Raulot\r\nwhich has not made its way to the GUI yet is the conditional access policies plugin. As I discussed during my\r\nBlueHat talk, conditional access policies are not visible for regular users in the Azure Portal. The internal Azure\r\nAD API allows anyone to list them, but their raw format is full of GUIDs that have to be resolved manually. 
The\r\n“policies” plugin for ROADtools parses them into readable format and outputs them to a single static HTML page.\r\nSince Conditional Access policies are a pain to explore in Azure AD and require way too many clicks, this file is\r\none of my favourite methods of exploring them. From a Red Team perspective, Conditional Access Policies are\r\nthe most valuable resource to determine which applications have stricter access controls such as requiring MFA\r\nor a managed device.\r\nBloodHound - with a twist of cloud\r\nAnother plugin that has a lot of potential is the BloodHound plugin. This plugin reads the objects of the Azure AD\r\ntenant in the database and writes them into a (local) neo4j database containing BloodHound data. When using a\r\ncustom fork of the BloodHound interface, you can explore users, groups and roles visually, including links with\r\non-prem Active Directory users if it is a synchronized environment.\r\nThe BloodHound fork is still in an alpha version and will require some knowledge of Cypher to really get all the\r\ninformation out of it. I know that other people (such as Harmj0y and tifkin_) have also been working on an Azure\r\nAD supporting version of BloodHound, so my hope is that this can be developed further and maybe even merged\r\nback into the official BloodHound project.\r\nGetting the tools\r\nROADtools is available on GitHub under an MIT open source license. 
The easiest way to install is using PyPI;\r\nautomatic builds from Git are available in Azure Pipelines.\r\nThe fork of BloodHound is available at https://github.com/dirkjanm/BloodHound-AzureAD.\r\nI do also have a lot of stickers with the ROADtools logo (thanks for the design help Sanne!), which I’ll be handing\r\nout as soon as we can safely do conferences again!\r\nDefense\r\nIn my opinion enumeration is not an attack technique that blue teamers should focus their defense efforts on. The\r\nbest way to prevent unauthorized users from accessing this information is by having strict conditional access\r\npolicies which govern how and from where users are allowed to use their Azure AD credentials. That being said,\r\nthere is a setting in the deprecated MSOnline PowerShell module which prevents enumeration using the Azure\r\nAD graph, which is documented here. I haven’t personally looked into bypassing this or whether other functionality in\r\nAzure breaks if you enable this.\r\nSource: https://dirkjanm.io/introducing-roadtools-and-roadrecon-azure-ad-exploration-framework/",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"MITRE"
	],
	"references": [
		"https://dirkjanm.io/introducing-roadtools-and-roadrecon-azure-ad-exploration-framework/"
	],
	"report_names": [
		"introducing-roadtools-and-roadrecon-azure-ad-exploration-framework"
	],
	"threat_actors": [],
	"ts_created_at": 1775439051,
	"ts_updated_at": 1775791296,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/5f04bd9b45be75e08881c133ca46c8db8ca104f7.pdf",
		"text": "https://archive.orkl.eu/5f04bd9b45be75e08881c133ca46c8db8ca104f7.txt",
		"img": "https://archive.orkl.eu/5f04bd9b45be75e08881c133ca46c8db8ca104f7.jpg"
	}
}