I recently gave a brief talk that noted how Let’s Encrypt and cloud-based architectures encourage positive appsec behaviors. Check out the slides and this blog post for a sense of the main points. Shortly thereafter a slew of security and stability events related to HTTPS and cloud services (SHA-1, Cloudbleed, S3 outage) seemed to undercut this thesis. But perhaps only superficially so. Rather than glibly dismiss these points, let’s examine these events from the perspective of risk and engineering — in other words, how complex systems and software are designed and respond to feedback loops.
This post is a stroll through HTTPS and cloud services, following a path of questions and ideas that builders and breakers might use to evaluate security; leaving the blather of empty pronouncements behind. It’s about the importance of critical thinking and seeking the reasons why a decision comes about.
For more than a decade at least two major hurdles have blocked pervasive HTTPS: Certs and configuration. The first was (and remains) convincing sites to deploy HTTPS at all, tightly coupled with making deployment HTTPS-only instead of mixed with unencrypted HTTP. The second is getting HTTPS deployments to use strong TLS configurations, e.g. TLS 1.2 with default ciphers that support forward secrecy.
For apps that handle credit cards, PCI has been a crucial driver for adopting strong HTTPS. Having a requirement to use transport encryption, backed by financial consequences for failure, has been more successful than either asking nicely, raising awareness at security conferences, or shaming. As a consequence, I suspect the rate of HTTPS adoption has been far faster for in-scope PCI sites than others.
The SSL Labs project could also be a factor in HTTPS adoption. It distilled a comprehensive analysis of a site’s TLS configuration into a simple letter score. The publically-visible results could be used as a shaming tactic, but that’s a weaker strategy for motivating positive change. The failure of shaming, especially as it relates to HTTPS, is partly demonstrated by the too-typical disregard of browser security warnings. (Which is itself a design challenge, not a user failure.)
Importantly, SSL Labs provides an easy way for organizations to consistently monitor and evaluate their sites. This is a step towards providing help for migration to HTTPS-only sites. App owners still bear the burden of fixing errors and misconfigurations, but this tool made it easier to measure and track their progress towards strong TLS.
Where SSL Labs inspires behavioral change via metrics, the Let’s Encrypt project empowers behavioral change by addressing fundamental challenges faced by app owners.
Let’s Encrypt eases the resource burden of managing HTTPS endpoints. It removes the initial cost of certs (they’re free!) and reduces the ongoing maintenance cost of deploying, rotating, and handling certs by supporting automation with the ACME protocol. Even so, solving the TLS cert problem is orthogonal to solving the TLS configuration problem. A valid Let’s Encrypt cert might still be deployed to an HTTPS service that gets a bad grade from SSL Labs.
A cert signed with SHA-1, for example, will lower its SSL Labs grade. SHA-1 has been known weak for years and discouraged from use, specifically for digital signatures. Having certs that are both free and easy to rotate (i.e. easy to obtain and deploy new ones) makes it easier for sites to migrate off deprecated versions. The ability to react quickly to change, whether security-related or not, is a sign of a mature organization. Automation as made possible via Let’s Encrypt is a great way to improve that ability.
The recent work that demonstrated a SHA-1 collision is commendable, but it shouldn’t be the sudden reason you decided to stop using it. If such proof of danger is your sole deciding factor, you shouldn’t be using (or supporting) Flash or most Android phones.
Facebook explained their trade-offs along the way to hardening their TLS configuration and deprecating SHA-1. It was an engineering-driven security decision that evaluated solutions and chose among conflicting optimizations — all informed by measures of risk. Engineering is the key word in this paragraph; it’s how systems get built. Writing down a simple requirement and prototyping something on a single system with a few dozen users is far removed from delivering a service to hundreds of millions of people. WhatsApp’s crypto design fell into a similar discussion of risk-based engineering. Another example of evaluating risk and threat models is this excellent article on messaging app security and privacy.
Companies like Cloudflare take a step beyond SSL Labs and Let’s Encrypt by offering a service to handle both certs and configuration for sites. They pioneered techniques like Keyless SSL in response to their distinctive threat model of handling private keys for multiple entities.
If you look at the Cloudbleed report and immediately think a service like that should be ditched, it’s important to question the reasoning behind such a risk assessment. Rather than make organizations suffer through the burden of building and maintaining HTTPS, they can have a service the establishes a strong default. Adoption of HTTPS is slow enough, and fraught with error, that services like this make sense for many site owners.
Compare this with heartbleed, which also affected TLS sites, could be more targeted, and exposed private keys (among other sensitive data). The cleanup was long, laborious, and haphazard. Cloudbleed had significant potential exposure, although its discovery and remediation likely led to a lower realized risk than heartbleed.
If you’re saying move away from services like that, what in practice are you saying to move towards? Self-hosted systems in a rack in an electrical closet? Systems that will likely degrade over time and, even more likely, never be upgraded to TLS 1.3? That seems ill-advised.
Does the recent Amazon S3 outage raise concern for cloud-based systems? Not to a great degree. Or, at least, not in a new way. If your site was negatively impacted by the downtime, a good use of that time might have been exploring ways to architect fail-over systems or revisit failure modes and service degradation decisions. Sometimes it’s fine to explicitly accept certain failure modes. That’s what engineering and business do against constraints of resource and budget.
So, let’s leave a few exercises for the reader, a few open-ended questions on threat modeling and engineering.
Flash has been long rebuked as both a performance hog and security weakness. Like SHA-1, the infosec community has voiced this warning for years. There have even been one or two (maybe more) demonstrated exploits against it. It persists. It’s embedded in Chrome, which you can interpret as a paternalistic effort to sandbox it or (more cynically) an effort to ensure YouTube videos and ad revenue aren’t impacted by an exodus from the plugin — or perhaps somewhere in between.
Browsers have had impactful vulns, many of which carry significant risk and impact as measured by the annual $50K+ rewards from Pwn2Own competitions. The minuscule number of browser vendors carries risk beyond just vulns, affecting influence on standards and protections for privacy. Yet more browsers doesn’t necessarily equate to better security models within browsers.
On the topic of decentralization, how much is good, how much is bad? DNS recommendations go back and forth. We’ve seen huge DDoS against providers, taking out swaths of sites. We’ll see more. But is shifting DNS the right solution, or a matter that misses the underlying threat or cause of such attacks? How much of IoT is new or different (scale?) compared to the swarms of SQL Slammer and Windows XP botnets of yesteryear’s smaller internet population?
Approaching these with ideas around resilience, isolation, authn/authz models, or feedback loops are (just a few) traits of a builder. As much as they might be for a breaker executing attack models against them.
Approaching these by explaining design flaws and identifying implementation errors are (just a few) traits of a breaker. As much as they might be for a builder designing controls and barriers to disrupt attacks against them.
Approaching these by dismissing complexity, designing systems no one would (or could) use, or highlighting irrelevant flaws is often just blather. Infosec has its share of vacuous or overly-ambiguous phrases like military-grade encryption, perfectly secure, artificial intelligence (yeah, I know, blah blah blah Skynet), use-Tor-use-Signal. There’s a place for mockery and snark. This isn’t concern trolling, which is preoccupied with how things are said. This is about the understanding behind what is said — the risk calculation, the threat model, the constraints.
I believe in supporting people to self-identity along the spectrum of builder and breaker rather than pin them to narrow roles. (A principle applicable to many more important subjects as well.) This about the intellectual reward of tackling challenges faced by builders and breakers alike, and leaving behind the blather of uninformed opinions and empty solutions.
I’ll close with this observation from Carl Sagan (from his book, The Demon-Haunted World): “It is far better to grasp the universe as it really is than to persist in delusion, however satisfying and reassuring.”
Our application universe consists of systems and data and users, each in different orbits. Security should contribute to the gravity that binds them together, not the black hole that tears them apart. Engineering sees the universe as it really is; shed the delusion that one appsec solution in a vacuum is always universal.