Defense in Depth
Thoughts after a Hacker News post on a Docker vulnerability.
The Article
I read Hacker News every day. A story on home server security made the front page recently and got me thinking about Tealok security and defense in depth in computing. This primarily came about due to the (excellent) Hacker News discussion of the article.
The story is pretty simple, and you don’t need to read the entire original article to get it. Here’s a summary:
- The author starts up a Docker container running Postgres.
- Docker binds published ports to all network interfaces by default, so the container is reachable from the internet.
- An attacker leverages the default password in Postgres to get RCE within the container.
- The attacker installs Kinsing malware to extract value from the exploited server.
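The exposure in the second step usually comes down to how a port is published. Docker's `-p` flag takes the form `[host_ip:]host_port:container_port`, and when the host IP is left out the port is bound on all interfaces. Here is a minimal sketch that flags such mappings. The helper name `binds_all_interfaces` is hypothetical and not part of any Docker tooling, and the sketch deliberately ignores IPv6 addresses and port ranges.

```python
def binds_all_interfaces(mapping: str) -> bool:
    """Return True if a Docker-style port mapping listens on every interface.

    Handles "host_port:container_port" and
    "host_ip:host_port:container_port", with an optional "/tcp" or "/udp"
    suffix. IPv6 addresses and port ranges are out of scope for this sketch.
    """
    spec = mapping.split("/", 1)[0]  # drop the protocol suffix, if any
    parts = spec.split(":")
    if len(parts) == 2:
        # No host IP given: the port is bound on all interfaces.
        return True
    if len(parts) == 3:
        # An explicit wildcard (or, conservatively, an empty host IP)
        # is treated as "all interfaces".
        return parts[0] in ("", "0.0.0.0")
    raise ValueError(f"unsupported port mapping: {mapping!r}")


for mapping in ("5432:5432", "127.0.0.1:5432:5432"):
    status = "exposed on all interfaces" if binds_all_interfaces(mapping) else "restricted"
    print(f"{mapping}: {status}")
```

Binding to `127.0.0.1` keeps the database reachable from the host itself while hiding it from the rest of the network, which is usually what someone running a database for a local app wants.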
One may be tempted to just say this is a “skills issue” on the part of the original author. If they’d known better, they wouldn’t have exposed the database container and there’d be no story here. I think that’s a valid take up to a point - the world is full of dangers, and experts are experts in part because they avoid dangers. But that misses important lessons.
We want tools to get safer by default.
We want the self-hosting equivalent of SawStop.
Tealok
At its heart, Tealok is, in my mind, a set of technologies that make running your own cloud simpler and safer. I see the priorities falling roughly in this order:
- Don’t lose customer data.
- Don’t let customer data be exfiltrated.
- Don’t let customer hardware be abused or exploited.
- Don’t let customer hardware get damaged.
- Don’t let the customer lose control of their hardware.
- Don’t let the customer’s services stay offline.
- Don’t require the customer to make choices they are unqualified to make.
- Don’t put maintenance work on the customer.
These are pretty off-the-cuff, and likely will not survive first contact with the enemy. Still, I think the list is a useful model for creating a defense-in-depth strategy. The value comes from the tension between each successive point.
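One way to picture that tension is as an ordered list where, whenever two goals conflict, the higher-ranked one wins. The following is a minimal sketch of that idea only. The names `Priority` and `goal_to_protect` are made up for illustration and do not describe how Tealok is actually implemented.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Lower value = higher priority, following the order of the list above.
    NO_DATA_LOSS = 1
    NO_EXFILTRATION = 2
    NO_HARDWARE_ABUSE = 3
    NO_HARDWARE_DAMAGE = 4
    CUSTOMER_KEEPS_CONTROL = 5
    SERVICES_ONLINE = 6
    NO_UNQUALIFIED_CHOICES = 7
    NO_MAINTENANCE_BURDEN = 8


def goal_to_protect(a: Priority, b: Priority) -> Priority:
    """When two goals conflict, protect the one that ranks higher."""
    return min(a, b)


# Forced to pick between leaking data and losing it, protect against loss.
print(goal_to_protect(Priority.NO_EXFILTRATION, Priority.NO_DATA_LOSS).name)
```

A strict ordering like this is a simplification. As the sections below argue, the real decision also depends on how large each cost is, not only on its rank.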
Don’t lose customer data
If we have to choose between letting data get exfiltrated, and losing customer data, let it get exfiltrated.
We should almost never have to make this choice.
I also want to acknowledge that some data for some customers is of such a nature that they’d rather I chose the other way. For example, some customers have data that could be used to blackmail them, arrest them, or get them killed. They’d likely prefer that data be lost. However, I as the system builder have no way to know if any given piece of data would be categorized that way (unless I ask the customer). So, as a general rule, we’re going to go with “don’t lose data” as priority number 1.
There is, to me, no other sensible
Don’t let customer data be exfiltrated
I’d rather customer hardware get damaged and their system go offline than have their data exfiltrated.
I don’t know how valuable their data is. I can guess that their hardware is worth less than $1000. I don’t really know for sure. I don’t know how expensive system downtime is. It could be very large. That said, it’s easy for me to imagine situations in which data exfiltration has a nearly unlimited cost to the user. I have a much harder time imagining unlimited costs from downtime.
Don’t let customer hardware be abused or exploited
By “abuse” and “exploitation” I primarily mean that someone other than the customer is using the customer’s network, compute, or storage. For the sake of this bullet point, I’m setting aside access to the customer’s data. This is, admittedly, rare. If you have arbitrary remote code execution, you probably have data exfiltration ability. Probably, but not always. The ’not always’ is what makes this a separate point.
This point is in tension with hardware damage, customers losing control, and services staying online.
The hardware damage tension is hard. Without data exfiltration, the actual “damage” an attacker does to the system is roughly the value of the electricity/heat/wear-and-tear plus the value of the lost capability of the system. This is the “my services are 90% slower because an attacker is using all my compute” scenario. If the hardware costs $1000 and an attacker only costs $20 in extra power bill, wear-and-tear, and lost capability, then letting hardware be damaged to avoid abuse is a bad trade.
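The arithmetic in that example can be written out. This is a minimal sketch using the illustrative figures above ($1000 of hardware, $20 of abuse). The function name is hypothetical, and comparing two single dollar amounts is a simplification for illustration, not a Tealok policy.

```python
def sacrificing_hardware_is_worth_it(hardware_cost: float, abuse_cost: float) -> bool:
    """Return True only if the abuse costs more than the hardware it would destroy."""
    return abuse_cost > hardware_cost


# With the figures from the text, destroying the hardware is the worse option.
print(sacrificing_hardware_is_worth_it(hardware_cost=1000, abuse_cost=20))  # False
```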
Just because the priority is in this order doesn’t mean I’m happy fully destroying hardware to prevent abuse.
It just means, in the extreme, if I can choose between the two, I choose to avoid abuse and take hardware damage.
I choose this because hardware damage is a bounded cost. Provided we don’t lose data (priority #1), we should be able to get the customer’s services back online, even though it costs money. The ’losing control’ point is a fine line - system abuse is a form of losing control -