I’m not a fan of dissecting complex data breaches when we have little to go on. In this case we know more than usual, thanks to the details in the complaint filed by the FBI.

I want to be very clear that this post isn’t about blaming anyone, and we have only the most basic information on what happened. The only person we know deserves blame here is the attacker.

As many people know, Capital One makes heavy use of Amazon Web Services. We know AWS was involved in the attack because the federal complaint specifically mentions S3. But this wasn’t a public S3 bucket.

Again, all from the filed complaint:

  • The attacker discovered a server (likely an instance – it had an IAM role) with a misconfigured firewall. It presumably had a software vulnerability or was vulnerable due to a credential exposure.
  • The attacker compromised the server and extracted its IAM role credentials. These ephemeral credentials allow AWS API calls; AWS rotates them automatically, and they are much more secure than static credentials. But with persistent access to the server, an attacker can obviously pull fresh credentials as needed. (A sketch of how those credentials are handed to an instance appears after this list.)
  • Those credentials (for an IAM role with ‘WAF’ in the name) allowed listing S3 buckets and granted read access to at least some of them. This is how the attacker exfiltrated the files.
  • Some of the buckets (maybe even all) were apparently encrypted, and a lot of the data in the files (which included credit card applications) was encrypted or tokenized. But the impact was still severe.
  • The attacker exfiltrated the data and then discussed it in Slack and on social media.
  • Someone in contact with the attacker saw that information, including attack details posted on GitHub, and reported it to Capital One through the company’s reporting program.
  • Capital One immediately involved the FBI and very quickly closed the misconfigurations. They also began their own investigation.
  • They were able to determine exactly what happened very quickly, likely through CloudTrail logs. Those logs would contain the API calls made with that IAM role from that server (which are easy to find), and from there they could trace back the associated IP addresses. There are many other details in the complaint on how they found the attacker, and it looks like Capital One did quite a bit of the investigation themselves.

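The complaint doesn’t say exactly how the credentials were pulled, but on EC2 an instance’s role credentials are served by the local instance metadata service, and any process that can reach that endpoint from the instance gets the same temporary keys the instance itself uses. Here’s a minimal sketch of that mechanism (nothing specific to Capital One’s environment, and it assumes the IMDSv1-style access that was standard in 2019):

```python
# Minimal sketch: how EC2 serves an instance's role credentials locally.
# This is the documented instance metadata endpoint; it only answers from
# on the instance itself.
import json
import urllib.request

METADATA = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

def instance_role_credentials():
    """Return the temporary credentials attached to this instance's role."""
    role_name = urllib.request.urlopen(METADATA, timeout=2).read().decode()
    creds_raw = urllib.request.urlopen(METADATA + role_name, timeout=2).read()
    creds = json.loads(creds_raw)
    # AWS rotates these automatically (AccessKeyId, SecretAccessKey, Token,
    # Expiration), which is why role credentials beat static keys.
    return creds

if __name__ == "__main__":
    print(instance_role_credentials()["Expiration"])
```

Whether the attacker hit this endpoint directly or proxied requests through the vulnerable server isn’t spelled out in the complaint; the point is that compromising the instance is enough to get working AWS credentials.
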
So misconfigured firewall (Security Group?) > compromised instance > IAM role credential extraction > bucket enumeration > data exfiltration. Followed by a rapid response and public notification.
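
Nothing public tells us how that ‘WAF’ role’s permissions were written, but the enumeration step is exactly the kind of thing you can audit ahead of time. Here’s a hedged sketch using the IAM policy simulator to ask whether a given role can list and read buckets (the role ARN below is made up):

```python
# A sketch of auditing which S3 actions a role is allowed, using the IAM
# policy simulator API. The role ARN is hypothetical.
import boto3

iam = boto3.client("iam")

ROLE_ARN = "arn:aws:iam::123456789012:role/example-waf-role"  # hypothetical

response = iam.simulate_principal_policy(
    PolicySourceArn=ROLE_ARN,
    ActionNames=["s3:ListAllMyBuckets", "s3:ListBucket", "s3:GetObject"],
)

for result in response["EvaluationResults"]:
    print(result["EvalActionName"], "->", result["EvalDecision"])
```

If a role attached to an internet-facing box comes back “allowed” for bucket enumeration, that’s worth a conversation before an attacker has it for you.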

As a side note, it looks like the attacker may have been a former AWS employee, but nothing indicates that was a factor in the breach.

People will say the cloud failed here, but we saw breaches like this long before the cloud was a thing. Containment and investigation seem to have actually run far faster than would have been possible on traditional infrastructure. For example, Capital One didn’t need to worry about the attacker turning off local logging, since CloudTrail records API activity on the AWS side regardless of what happens on the instance. Normally we hear about these incidents months or years later, but in this case we went from discovery to arrest and public disclosure in around two weeks.
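
For a sense of what that looks like in practice, here’s a minimal sketch of the kind of CloudTrail query an investigation might start with: pull the recent API calls made with a specific access key and see which IPs they came from. The key ID is a placeholder, and this event-history lookup only covers management events from the last 90 days; object-level S3 reads show up only if data-event logging is enabled (or a full trail is queried separately).

```python
# Sketch: list recent API calls made with a given access key and the source
# IPs they came from, using CloudTrail event history.
from datetime import datetime, timedelta
import json
import boto3

cloudtrail = boto3.client("cloudtrail")

ACCESS_KEY_ID = "ASIAEXAMPLEEXAMPLE"  # placeholder for the role's temporary key

paginator = cloudtrail.get_paginator("lookup_events")
pages = paginator.paginate(
    LookupAttributes=[
        {"AttributeKey": "AccessKeyId", "AttributeValue": ACCESS_KEY_ID}
    ],
    StartTime=datetime.utcnow() - timedelta(days=14),
    EndTime=datetime.utcnow(),
)

for page in pages:
    for event in page["Events"]:
        detail = json.loads(event["CloudTrailEvent"])
        print(event["EventTime"], event["EventName"], detail.get("sourceIPAddress"))
```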

I hope that someday Capital One will be able to talk about the details publicly so the rest of us can learn. No matter how good you are, mistakes happen. The hardest problem in security is solving simple problems at scale. Because simple doesn’t scale, and what we do is damn hard to get right every single time.
