The cleaning lady and write amplification

Imagine you’re running a cafeteria. This is the real world and your cafeteria has a finite number of plates, say 200 for the entire cafeteria. Your cafeteria is open for dinner and over the course of the night you may serve a total of 1000 people. Guests outnumber plates 5-to-1; thankfully they don’t all eat at once.

In actuality the write happens more like this: a new block is allocated, valid data is copied to the new block (including the data you wish to write), and the old block is sent for cleaning and emerges completely wiped. The old block is then added to the pool of empty blocks. As the controller needs them, blocks are pulled from this pool and used, and the blocks they replace are cleaned and recycled back into it.

While average latencies are very low, the max latencies are around 350x higher.

They are still low compared to a mechanical hard disk, but what’s going on to make the max latency so high? All of the cleaning and reorganization I’ve been talking about. It rarely makes a noticeable impact on performance (hence the ultra low average latencies), but this is an example of it happening.

In the diagram above we see another angle on what happens when a write comes in. A free block is used (when available) for the incoming write. That’s not the only write that happens, however: eventually you have to perform some garbage collection so you don’t run out of free blocks. The block with the most invalid data is selected for cleaning; its valid data is copied to another block, after which the previous block is erased and added to the free block pool. In the diagram above you’ll see the size of our write request on the left, but on the very right you’ll see how much data was actually written when you take into account garbage collection. This inequality is called write amplification.
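To make the cleaning step concrete, here is a minimal sketch of it in Python. It is a toy model, not how any real controller's firmware works: the `Block` class, the page counts, and the `garbage_collect` function are all made up for illustration, and the only policy it implements is the one described above (pick the block with the most invalid data, copy what is still valid, erase, return the block to the free pool).

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False: blocks are compared by identity, not contents
class Block:
    valid: int = 0    # pages holding live data (hypothetical page counts)
    invalid: int = 0  # pages holding stale data


def garbage_collect(in_use, free_pool):
    """Clean the in-use block with the most invalid pages.

    Returns the number of valid pages that had to be copied, i.e. the
    extra writes the host never asked for.
    """
    victim = max(in_use, key=lambda b: b.invalid)
    target = free_pool.pop()        # take an empty block from the pool
    target.valid = victim.valid     # copy the still-valid pages across
    copied = victim.valid
    victim.valid = 0                # erase the old block...
    victim.invalid = 0
    in_use.remove(victim)
    in_use.append(target)
    free_pool.append(victim)        # ...and recycle it into the free pool
    return copied


# Toy example: two in-use blocks, one free block, and a 1-page host write.
in_use = [Block(valid=3, invalid=1), Block(valid=1, invalid=3)]
free_pool = [Block()]
host_pages = 1
copied = garbage_collect(in_use, free_pool)
flash_pages = host_pages + copied
print("host asked for", host_pages, "page(s); flash wrote", flash_pages)
```

In this toy case the host asked for one page but the flash saw two page writes, because one valid page had to be moved before the dirtiest block could be erased.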

The write amplification factor is the amount of data the SSD controller has to write in relation to the amount of data that the host controller wants to write. A write amplification factor of 1 is perfect: it means you wanted to write 1MB and the SSD’s controller wrote 1MB. A write amplification factor greater than 1 isn’t desirable, but it’s an unfortunate fact of life. The higher your write amplification, the quicker your drive will die and the lower its performance will be. Write amplification, bad.
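As a quick worked example, the factor is just a ratio. The sketch below assumes you already know both byte counts; the function name is made up for illustration and the 3MB figure is hypothetical, not a measurement from any drive.

```python
def write_amplification(host_bytes, flash_bytes):
    """Ratio of data the SSD actually wrote to data the host asked to write."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return flash_bytes / host_bytes


MB = 1024 * 1024

# The perfect case from the text: 1MB requested, 1MB written.
ideal = write_amplification(1 * MB, 1 * MB)

# Hypothetical case: 1MB requested, but garbage collection pushed
# the total written to flash up to 3MB.
amplified = write_amplification(1 * MB, 3 * MB)

print(ideal, amplified)
```

The first call gives a factor of 1, the second a factor of 3: for every megabyte the host wanted written, the flash absorbed three.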

My first question is this. Is it possible to analyse a program while you’re using it, to see whether it is primarily doing sequential or random writes? Since there seems to be a quite clear difference between the Intel X25m 80gb and the OCZ vertex 120gb, which are the natural entry-level drives here, where the Intel works better for random access, the vertex for sequential, it would be very useful to know which I would make best use of.

Second question: does anyone know whether lightroom in particular is based around random or sequential writes? I know that a LR catalog is always radically fragmented, which suggests presumably that it is based around random writes, but that’s just an uninformed guess. It does have a cache function, which produces files in the region of 3-5mb in size–are they likely to be sequential?

Third question: with photoshop, is it specifically as a scratch disk that the intel x25m underperforms? Or does photoshop do other sequential writes, besides those to the scratch disk? I ask because if it only doesn’t work as a scratch disk, then that’s not a big problem–anyone using this in a PC is likely to have a decent regular HDD for data anyway, so the scratch disk can just be sent there. In fact, I’ve been using a vertex 120gb, with a samsung spinpoint f3 500gb on my PC, and I found that with the scratch disk on the samsung I got better Retouch Artists results (only by about half a second, but that’s out of 14 seconds, so still fairly significant).

Just to report back, since writing the previous comment I have bought both drives, vertex and intel (the original vertex 128gb, and the intel g2 x25m). While the Intel does perform better in benchmarks, the difference in general usage is barely noticeable. Except when using lightroom 3, when the intel is considerably slower than the vertex. I’m using a canon 550d, which produces 18mpx pictures. When viewing a catalogue for the first time (without any pre-created previews), the intel takes on average about 20s to produce a full scale 1:1 preview. This is infuriating. The vertex takes about 8s. Bear in mind that I’ve got 4gb of 1333mhz ram, intel i7 q720 processor, ati 5470 mobility radeon graphics. So it’s not the most powerful laptop in the world, but it’s no slouch either. I can only conclude that when LR3 makes previews it does large sequential writes, and that the considerable performance advantage of the vertex on this metric alone suddenly becomes very important. With which in mind, I’m now going to sell the Intel and buy a vertex 2e, which will give the best of both worlds. But I’m sure there are lots of photographers out there wondering about this like I was, so hopefully this will help.

• ogreinside – Monday, December 14, 2009

After spending all weekend reading this article, 2 previous in the trilogy, and all the comments, I wanted to post my thanks for all of your hard work. I’ve been ignoring SSDs for a while as I wanted to see them mature first. I am in the market for a new Alienware desktop, but as the wife is letting me purchase only on our Dell charge account, I have a limited selection and budget.

I was settled on everything except the disks. They are offering the Samsung 256SSD, which I believe is the Samsung PM800 drive. The cost is exactly double that of the WD VelociRaptor 300 GB. So naturally I have done a ton of research for this final choice. After exploring your results here, and reading comments, I am definitely not getting their Samsung SSD. I would love to grab an Intel G2 or OCZ Indilinx, but that means real cash now, and we simply can’t do that yet. The charge account gives us room to pay it off at 12-month no-interest.

So at this point I can get a 2x WD VR in raid 0 to hold me over for a year or so when I can replace (or add) a good SSD. My problem is that I have seen my share of issues with raid 0 on an ICH controller on two different Dell machines (boot issues, unsure of performance gain). In fact, using the same drives/machine, I saw better random read performance (512K) on a single drive than the ICH raid, and 4k wasn’t far behind. I’m thinking I may stick to a single WD VR for now, but I really want to believe raid0 would be better.