A blog about the development of a general-purpose solution for mitigating cold-boot attacks on Full-Disk-Encryption solutions.

Controlling the uncontrollable cache

I think I've found a solution to the last significant challenge, which I described in the blog entry "Lack of cache control": the uncertainty about whether any data in the CPU cache has been flushed out into RAM. Such a flush could be triggered by CPU instructions like wbinvd and clflush (thanks, haxwell), or even by external events like signals on a CPU pin (although that is just my personal speculation). Note that invd also empties the cache, but architecturally it discards modified data instead of writing it back to RAM.

I've previously suggested this approach to minimize this risk:
One way to minimize the impact of "unintentional" cache flushes (unintentional from our point of view) is to repeat the cache freezing procedure periodically in order to reverse the effects of "unintentional" cache flushes (wbinvd).
My new idea would eliminate this risk altogether. However, I haven't yet verified whether it can actually be implemented, so keep this in mind while reading the next paragraphs. It is also important to understand the difference between physical/linear and virtual memory addresses; if you don't know what they are, you should read this before you read on.

The idea is actually quite simple: keep the data in the cache at physical/linear addresses which aren't backed by RAM on the system. This would guarantee that the data can't leave the CPU cache, even if a cache flush is triggered (presumably a general-protection fault, #GP, would be raised instead).
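
As a rough illustration of what "not backed by RAM" means in practice, here is a minimal C sketch that scans a memory map for the first 4 KiB page that no RAM range covers. It assumes a 32-bit physical address space without PAE. The ram_range struct and both function names are hypothetical, and in reality the map would have to come from the firmware. Note also that a hole in RAM is not necessarily unused: device (MMIO) and firmware regions live in such holes too, so real code would have to exclude those as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096ULL
#define ADDR_LIMIT (1ULL << 32) /* assumption: 32-bit physical addresses, no PAE */

/* Hypothetical memory-map entry: RAM occupies [base, base + length). */
struct ram_range {
    uint64_t base;
    uint64_t length;
};

/* True if the 4 KiB page starting at `page` overlaps any RAM range. */
static bool page_is_backed(const struct ram_range *map, size_t n, uint64_t page)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t start = map[i].base;
        uint64_t end = map[i].base + map[i].length;
        if (page < end && page + PAGE_BYTES > start)
            return true;
    }
    return false;
}

/*
 * Scan upward from address 0 for the first page that no RAM range covers.
 * Returns true and stores the page address in *out, or returns false if
 * RAM covers the entire 32-bit address space.
 * Caveat: this only consults RAM ranges; it does not know about MMIO or
 * firmware regions, which a real implementation must also avoid.
 */
static bool find_unbacked_page(const struct ram_range *map, size_t n, uint64_t *out)
{
    for (uint64_t page = 0; page < ADDR_LIMIT; page += PAGE_BYTES) {
        if (!page_is_backed(map, n, page)) {
            *out = page;
            return true;
        }
    }
    return false;
}
```

With 2 GiB of RAM reported as a single range starting at 0, find_unbacked_page yields 0x80000000. With a map that covers the whole 4 GiB, it finds nothing.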

What I haven't verified is whether this scenario can actually be set up. The setup procedure might look something like this:
  1. load the data into CPU registers
  2. overwrite the data in RAM
  3. change the virtual-to-linear mapping for that virtual address to a non-existent physical/linear address (by modifying the appropriate page-table entry)
  4. switch the CPU cache into frozen mode
  5. move the data from the CPU registers into the CPU cache
What I don't know yet is whether CPUs will throw faults in step 3 (or maybe even step 4) and thus render this idea impossible to implement. After all, this doesn't quite conform to a CPU's understanding of "correct" memory management. It'll be a few days before I have time to actually implement and verify this idea - I'll let you know.
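
To make step 3 a bit more concrete, here is a minimal C sketch of the bit manipulation involved, assuming classic 32-bit paging without PAE, where a page-table entry keeps its flags in the low 12 bits and the page frame address in the high 20 bits. The names pte_retarget and pte_frame are hypothetical. The sketch only computes the new entry value: it does not touch a real page table, and it says nothing about whether the CPU accepts such a mapping, which is exactly the open question.

```c
#include <stdint.h>

#define PTE_FLAGS_MASK 0x00000FFFu /* low 12 bits: present, writable, ... */
#define PTE_FRAME_MASK 0xFFFFF000u /* high 20 bits: 4 KiB-aligned frame address */

/* Extract the physical frame address a page-table entry points at. */
static uint32_t pte_frame(uint32_t pte)
{
    return pte & PTE_FRAME_MASK;
}

/*
 * Build a new entry that keeps the old entry's flags but points at
 * new_phys (rounded down to a page boundary).
 * In a real kernel the result would be written into the page table and
 * the stale TLB entry invalidated afterwards (e.g. with invlpg); both
 * require ring 0 and are deliberately left out of this sketch.
 */
static uint32_t pte_retarget(uint32_t pte, uint32_t new_phys)
{
    return (new_phys & PTE_FRAME_MASK) | (pte & PTE_FLAGS_MASK);
}
```

For example, retargeting the entry 0x00123067 to the physical address 0x80000000 gives 0x80000067: the frame changes while the flag bits stay as they were.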

One last note: obviously, if a machine has a full 4GB of RAM, then there are no invalid 32-bit linear addresses left - PAE might be a possibility, but that's a problem for much later.