On Dependency Cooldowns
https://blog.yossarian.net/2025/11/21/We-should-all-be-using-dependency-cooldowns
I read this blog post on the idea that we should use “dependency cooldowns” - the core concept is that by delaying when we install dependencies, we give ourselves more time to detect malicious releases. I think this is basically just not a very good idea. While it may be fine to say “don’t rush to install the latest version”, it’s unclear to me that this is a real solution to malicious dependency issues. I’m currently sick and have very little time to write up something proper, so I’ll just leave my notes here.
TL;DR: Dependency cooldowns are a free, easy, and incredibly effective way to mitigate the large majority of open source supply chain attacks.
I think that the only thing true here is that they are easy. They’re not free, and I’m extremely doubtful that they’re effective. The cost is that you’ll wait longer for bug fixes and longer for CVE patches (if you can afford to be late on those, which you often can’t). The doubts about effectiveness are justified further down in this post.
The key thing to observe is that the gap between (1) and (2) can be very large (weeks or months), while the gap between (2) and (5) is typically very small: hours or days. This means that, once the attacker has moved into the actual exploitation phase, their window of opportunity to cause damage is pretty limited.
I think the key observation is that this all involves very obvious, well-worn issues that have had solutions for decades:
- Credential compromise. Solutions exist for this already.
- Overly privileged constructs like package managers or services. Sandboxing has been a thing for a long time.
The timing is not particularly striking at all to me.
The author makes their case for why cooldowns are good:
A “cooldown” is exactly what it sounds like: a window of time between when a dependency is published and when it’s considered suitable for use. The dependency is public during this window, meaning that “supply chain security” vendors can work their magic while the rest of us wait any problems out.
They’re empirically effective, per above
I don’t think this is true at all. They’re empirically effective, perhaps, in the current world where targets are assumed to update quickly. Would they be effective in a world where targets are expected to update slowly? That’s the proposal, after all.
How do we deal with this one-week window when there’s a CVE that needs to be patched? As an attacker I can plant malicious code in a compromised package and then file a CVE against a previous version of the package, triggering faster updates. I feel like it’s just wrong to say that this is empirically effective - we lack the counterfactuals.
They’re incredibly easy to implement. Moreover, they’re literally free to implement in most cases: most people can use Dependabot’s functionality, Renovate’s functionality, or the functionality built directly into their package manager.
I suppose they’re relatively cheap, yes. But again, what about when a CVE is allocated?
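For concreteness, this is roughly what the cheap version looks like - a minimal renovate.json sketch using Renovate’s minimumReleaseAge option (option names and placement vary across Renovate versions, so treat this as illustrative rather than canonical):

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "minimumReleaseAge": "7 days"
}
```

Note that the same 7-day delay applies to a release that fixes a CVE unless you carve out an exception for it, which is exactly the tension here.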
I think we can probably say that instantly installing any software is a bit silly. That’s already the case with low reputation software as a whole. But I’m really unconvinced that this is the solution we should accept or consider “good” at all.
Let’s look at these examples.
xz
Detection happened because the system was slower than expected. This took 5 weeks. A delay probably wouldn’t have helped, and it’s also trivial to imagine the malicious maintainer issuing a CVE for a previous vulnerability in order to push the version out sooner.
In a world where the vulnerable version had been adopted more slowly, would fewer people have been impacted, or would it simply have taken longer to discover?
This was an extremely sophisticated multi-year attack in a service designed to give remote code execution capabilities so I’m just not sure what more we can do about these other than have more tests, code reviews, etc.
Ultralytics
This was a build-time attack involving a PyPI credential. Just running the build is sufficient for compromise. The attack worked by accessing local credentials.
Would delays have helped? Yep, definitely. But… how about not allowing a build-time executable to access credentials? Why can it simultaneously access the environment and the network? Why wouldn’t users be alerted to that?
Detection was effective here but I don’t understand why package managers give arbitrary code execution like this to begin with.
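To make the arbitrary-code-execution point concrete, here’s a deliberately benign sketch (placeholder package name, nothing malicious) of why install time is such a soft target: anything in a package’s setup.py runs with the installing user’s privileges the moment pip builds the source distribution.

```python
# setup.py - a benign demonstration, not an exploit: this code runs during
# `pip install` of a source distribution, before you have imported anything.
import os

from setuptools import setup

# Nothing in the packaging model stops this block from reading local credential
# files, environment variables, or making network requests. Here it only prints.
print("install-time code execution as user:", os.environ.get("USER", "unknown"))

setup(name="demo-package", version="0.1.0")
```

A real attack swaps that print for “read credentials, send them somewhere”, which is roughly what happened here.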
tj-actions
Once again, I wonder why a GitHub Action has arbitrary access. This is just GHA kind of sucking. Permissions are ridiculously coarse-grained, there’s virtually no monitoring, there are no egress network controls, etc. Why can an action do things without asking? One version of the action didn’t need access, the next did - how is it that users weren’t made aware of that by design? Why could a tag even point to a new commit, and why did that new commit not require users to update/acknowledge the change?
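As a sketch of the defaults I’d want instead (an assumed workflow with a placeholder SHA, not a drop-in fix for the tj-actions incident): explicit least-privilege permissions, and third-party actions pinned to a commit rather than a mutable tag.

```yaml
# Sketch: a workflow that grants nothing by default and pins the action by SHA.
name: ci
on: [pull_request]

# Read-only token for the whole workflow; grant more only per job, when needed.
permissions:
  contents: read

jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      # Pin to a full commit SHA (placeholder below) rather than a mutable tag,
      # so a retargeted tag cannot silently change the code that runs.
      - uses: tj-actions/changed-files@<full-commit-sha> # placeholder - resolve and review it yourself
```

SHA pinning doesn’t fix the coarse permission model, but it does answer the “why could a tag even point to a new commit” part: your workflow simply stops following the tag.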
chalk
This was another case of runtime-executing malware, which feels a bit more novel, but it’s again just “scrapes for secrets or mines crypto, sends data to an endpoint”.
Nx
And again, remote code execution of a package at build time that accessed secrets and used the network for exfil. Why is this allowed?
rspack
Can you guess? Yes, an install-time attack on a package.
num2words
Another of the somewhat rarer “it ran at execution time, not build time” cases. Why did the services running it have public internet access without any proxying? Sometimes that’s required, but it’s pretty rare.
Most services only talk to specific endpoints and need specific permissions unless they run queries on a user’s behalf. Still, at least this isn’t just another “build time package installation with ridiculous levels of access”.
Kong
Novel in that it used an image, but similar to num2words in that the answer for all production services is to assume remote code execution and harden accordingly. Separating network permissions from secrets-access permissions is how you solve this problem.
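A sketch of what that separation can look like in practice, assuming a Kubernetes deployment and a hypothetical “payments” service (names, ports, and CIDRs are placeholders): deny all egress by default and allowlist only the endpoints the service actually needs.

```yaml
# Sketch: default-deny egress for one workload, plus a short allowlist.
# A real policy also needs to allow DNS and whatever else the service
# legitimately depends on; everything below is a placeholder.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-egress-allowlist
spec:
  podSelector:
    matchLabels:
      app: payments
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.12.0/24 # internal database subnet (placeholder)
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - ipBlock:
            cidr: 203.0.113.10/32 # the one external API this service calls (placeholder)
      ports:
        - protocol: TCP
          port: 443
```

With something like this in place, a compromised dependency that scrapes secrets has nowhere to send them.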
web3.js
And another runtime attack based on a compromised package.
The author says this:
In the very small sample set above, 8/10 attacks had windows of opportunity of less than a week. Setting a cooldown of 7 days would have prevented the vast majority of these attacks from reaching end users (and causing knock-on attacks, which several of these were). Increasing the cooldown to 14 days would have prevented all but 1 of these attacks.
We can’t say this. In a world where cooldowns are the standard, we have no idea how attacks would have adapted. Nothing about cooldowns really makes life harder for an attacker; they’re a statistical mitigation where you hope you aren’t part of the unlucky group that got owned first, and one that relies on complex detection machinery that’s always playing catch-up.
Related to the above, it’s unfortunate that cooldowns aren’t baked directly into more packaging ecosystems
I just find this odd, because to me the striking thing is that package managers don’t bake sandboxing directly into their tooling.
Thoughts
So I guess my thinking is this. All of these attacks have a few things in common:
- Largely involve compromised API keys that could be exfiltrated and used elsewhere.
- Largely rely on weak permissions, often at build time.
There are a ton of solutions to these problems that feel much better than “everyone wait a week to update”.
- It should be possible to sign published artifacts, publish to a transparency log, and prove where and when an artifact was produced, so that anyone can verify when/where the publish happened. This prevents an attacker from stealing a long-lived token and publishing updates from elsewhere. This sort of attestation would make detection of attacks much easier.
- Sandboxing at build time. There is insufficient reason for a build tool/package manager to be able to read arbitrary files on disk or make arbitrary network requests. The model for this is well worn by browser extensions - a manifest with rights is published, when the manifest updates a user must ack it, and everything else is blocked. Even a more basic system of just running builds in a container would be better, and that’s what I did for cargo-sandbox when I built a POC (a minimal sketch of the container approach follows this list).
- Hardening of credentials in general. If you want to publish a package to crates.io or wherever else then it should be required for you to use non-phishable 2FA. Many attacks start with a compromised legitimate package - stopping that is well-worn territory: force unphishable 2FA, notify users when maintainers are changed for a repo.
- Perhaps we need better tooling or education, but most services in production environments run with total network access for very little reason. Unless you execute arbitrary requests on behalf of your users it should be very, very simple to firewall off the vast majority of the internet, and if you do execute arbitrary network requests for your users then you should already be isolating that code to begin with.
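To illustrate the “just run builds in a container” point from the list above, here’s a minimal docker-based sketch (this is not cargo-sandbox itself, just an assumed equivalent; the image, paths, and build command are placeholders). The build sees the source tree read-only plus one output directory, and has no network at all.

```bash
# Sketch: build a Rust project with no network and no view of host credentials.
# Assumes dependencies are already vendored and Cargo.lock is committed, since
# a no-network build cannot download anything - which is rather the point.
docker run --rm \
  --network=none \
  -v "$PWD":/src:ro \
  -v "$PWD/build-out":/out \
  -w /src \
  rust:1 \
  cargo build --offline --release --target-dir /out
```

The point isn’t docker specifically; it’s that a build script with no ambient credentials and no egress has very little to steal and nowhere to send it.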
There’s more than that, by far. I think it’s nuts that GitHub Actions doesn’t support egress network controls - that’s another thing that would be great, along with more patterns around workflow isolation for common CI jobs. But regardless, I’m not convinced at all that cooldowns are a good response when we’re still decades behind on things that we know solve real problems already.