When learning more about computers and how they work, you will occasionally run across something that does not seem to make sense. With that in mind, does emptying disk space actually speed computers up? Today’s SuperUser Q&A post has the answer to a puzzled reader’s question.
Today’s Question & Answer session comes to us courtesy of SuperUser—a subdivision of Stack Exchange, a community-driven grouping of Q&A web sites.
SuperUser reader Remi.b wants to know why emptying disk space seems to speed up a computer:
“Why does emptying disk space speed up computers?”
It does not, at least not on its own. This is a really common myth. It is common because filling up your hard drive often happens at the same time as other things that traditionally could slow down your computer (A). SSD performance does tend to degrade as the drive fills, but this is a relatively new issue, unique to SSDs, and is not really noticeable for casual users. Generally, low free disk space is just a red herring.
For example, things like:
1. File fragmentation. File fragmentation is an issue (B), but lack of free space, while definitely one of many contributing factors, is not the only cause of it. Some key points here:
- The chances of a file being fragmented are not directly related to the amount of free space left on the drive. They are related to the size of the largest contiguous block of free space on the drive (i.e. “holes” of free space), which the amount of free space happens to put an upper bound on. They are also related to how the file system handles file allocation (more below). Consider: A drive that is 95% full with all the free space in one single contiguous block has 0% chance of fragmenting a new file that fits in that block (C) (and the chance of fragmenting an appended file is independent of the free space). A drive that is 5% full but with data spread evenly over the drive has a very high chance of fragmentation.
- Keep in mind that file fragmentation only affects performance when the fragmented files are being accessed. Consider: You have a nice, defragmented drive that still has lots of free “holes” in it. A common scenario. Everything is running smoothly. Eventually, though, you get to a point where there are no more large blocks of free space remaining. You download a huge movie, and the file ends up being severely fragmented. This will not slow down your computer. All of your application files and such that were previously fine will not suddenly become fragmented. This may make the movie take longer to load (although typical movie bit rates are so low compared to hard drive read rates that it will most likely be unnoticeable), and it may affect I/O-bound performance while the movie is loading, but other than that, nothing changes.
- While file fragmentation is certainly an issue, oftentimes the effects are mitigated by OS and hardware level buffering and caching. Delayed writes, read-ahead, strategies like the prefetcher in Windows, etc., all help reduce the effects of fragmentation. You generally do not actually experience significant impact until the fragmentation becomes severe (I would even venture to say that as long as your swap file is not fragmented, you will probably never notice).
2. Search indexing is another example. Say that you have automatic indexing turned on and an OS that does not handle this gracefully. As you save more and more indexable content to your computer (documents and such), indexing may take longer and longer and may start to have an effect on the perceived speed of your computer while it is running, both in I/O and CPU usage. This is not related to free space; it is related to the amount of indexable content you have. However, running out of free space goes hand in hand with storing more content, hence a false connection is drawn.
3. Anti-virus software (similar to the search indexing example). Say that you have anti-virus software set up to do background scanning of your drive. As you have more and more scannable content, the scan takes more I/O and CPU resources, possibly interfering with your work. Again, this is related to the amount of scannable content you have. More content often equals less free space, but the lack of free space is not the cause.
4. Installed software. Say that you have a lot of software installed that loads when your computer boots, thus slowing down start-up times. This slowdown happens because lots of software is being loaded. However, installed software takes up hard drive space. Therefore, hard drive free space decreases at the same time that this happens, and again a false connection can be readily made.
5. Many other examples along these lines which, when taken together, appear to closely associate lack of free space with lower performance.
The above illustrates another reason that this is such a common myth: While the lack of free space is not a direct cause of slowdown, uninstalling various applications, removing indexed or scanned content, etc. sometimes (but not always; outside the scope of this answer) increases performance again for reasons unrelated to the amount of free space remaining. But this also naturally frees up hard drive space. Therefore, again, an apparent (but false) connection between “more free space” and a “faster computer” can be made.
Consider: If you have a machine running slowly due to lots of installed software, etc., and you clone your hard drive (exactly) to a larger hard drive and then expand your partitions to gain more free space, the machine will not magically speed up. The same software loads, the same files are still fragmented in the same ways, and the same search indexer still runs; nothing changes despite having more free space.
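The fragmentation point from item 1 can be made concrete with a small simulation. This is only a sketch under stated assumptions: the drive is modeled as a list of used/free flags, and the allocator (`fragments_needed`, a hypothetical helper, not any real file system's algorithm) uses one hole when one is big enough and otherwise fills the largest holes first.

```python
def free_extents(disk):
    """Return (start, length) runs of free blocks; disk[i] is True when block i is used."""
    extents, start = [], None
    for i, used in enumerate(disk):
        if not used and start is None:
            start = i                      # a free run begins here
        elif used and start is not None:
            extents.append((start, i - start))
            start = None                   # the free run ended
    if start is not None:                  # free run reaching the end of the drive
        extents.append((start, len(disk) - start))
    return extents


def fragments_needed(disk, file_blocks):
    """Count the pieces a new file is split into by a naive largest-hole-first allocator.

    Returns None when the file does not fit at all.
    """
    holes = sorted((length for _, length in free_extents(disk)), reverse=True)
    if sum(holes) < file_blocks:
        return None
    pieces, remaining = 0, file_blocks
    for length in holes:
        if remaining <= 0:
            break
        remaining -= length
        pieces += 1
    return pieces


# 95% full, but all free space sits in one contiguous block at the end.
nearly_full = [True] * 950 + [False] * 50

# 5% full, but the data is spread evenly: every 20th block is in use.
mostly_empty = [i % 20 == 0 for i in range(1000)]

print(fragments_needed(nearly_full, 40))   # one piece: the file fits in the single hole
print(fragments_needed(mostly_empty, 40))  # several pieces: no hole is longer than 19 blocks
```

In this toy model the nearly full drive stores the 40-block file in one piece, while the nearly empty drive has to split it, because what matters is the size of the largest hole and not the total amount of free space.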
“Does it have something to do with searching for memory space to save things?”
No. It does not. There are two very important things worth noting here:
1. Your hard drive does not search around to find places to put things. Your hard drive is stupid. It is nothing. It is a big block of addressed storage that blindly puts things where your OS tells it to and reads whatever is asked of it. Modern drives have sophisticated caching and buffering mechanisms designed around predicting what the OS is going to ask for based on the experience we have gained over time (some drives are even aware of the file system that is on them), but essentially, think of your drive as just a big dumb brick of storage with occasional bonus performance features.
2. Your operating system does not search for places to put things, either. There is no searching. Much effort has gone into solving this problem as it is critical to file system performance. The way that data is actually organized on your drive is determined by your file system. Examples include FAT32 (old DOS and Windows PCs), NTFS (later editions of Windows), HFS+ (Mac), ext4 (some Linux systems), and many others. Even the concepts of a “file” and a “directory” are merely products of typical file systems; hard drives know nothing about the mysterious beasts called files. Details are outside the scope of this answer. But essentially, all common file systems have ways of tracking where the available space is on a drive so that a search for free space is, under normal circumstances (i.e. file systems in good health), unnecessary. Examples:
- NTFS has a master file table, which includes special files such as $Bitmap, and plenty of metadata describing the drive. Essentially it keeps track of where the next free blocks are so that new files can be written directly to free blocks without having to scan the drive every time.
- Another example: Ext4 has what is called the bitmap allocator, an improvement over ext2 and ext3 that basically helps it directly determine where free blocks are instead of scanning the list of free blocks. Ext4 also supports delayed allocation, that is, buffering of data in RAM by the OS before writing it out to the drive in order to make better decisions about where to put it to reduce fragmentation.
- Many other examples.
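As a rough illustration of the idea behind these structures, here is a toy free-space tracker. It is a sketch only: `FreeSpaceBitmap` and its `next_hint` field are hypothetical names, and real file systems such as NTFS and ext4 use far more elaborate structures. The point it shows is that allocation consults a small piece of metadata and never reads the data blocks themselves.

```python
class FreeSpaceBitmap:
    """Toy free-space tracker: one flag per block, kept separately from the data."""

    def __init__(self, total_blocks):
        self.used = bytearray(total_blocks)  # 0 = free, 1 = used
        self.next_hint = 0                   # block right after the last allocation

    def allocate(self, count):
        """Reserve `count` free blocks and return their block numbers.

        Only the bitmap is consulted; no data block on the "drive" is read.
        """
        if count <= 0:
            return []
        total = len(self.used)
        blocks = []
        for offset in range(total):
            if len(blocks) == count:
                break
            block = (self.next_hint + offset) % total  # start at the hint, wrap around
            if not self.used[block]:
                blocks.append(block)
        if len(blocks) < count:
            raise OSError("not enough free blocks")
        for block in blocks:
            self.used[block] = 1
        self.next_hint = (blocks[-1] + 1) % total
        return blocks

    def free(self, blocks):
        """Mark blocks as free again (deleting a file)."""
        for block in blocks:
            self.used[block] = 0


bitmap = FreeSpaceBitmap(16)
file_a = bitmap.allocate(4)    # blocks 0-3
file_b = bitmap.allocate(4)    # blocks 4-7
bitmap.free(file_a)            # deleting file A leaves a hole at the start
file_c = bitmap.allocate(10)   # takes blocks 8-15, then wraps into the hole
print(file_a, file_b, file_c)
```

Note that the last file simply ends up in two separate runs of blocks. Nothing is moved out of the way to keep it contiguous; the allocator takes whatever the bitmap says is free.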
“Or with moving things around to make a long enough continuous space to save something?”
No. This does not happen, at least not with any file system I am aware of. Files just end up fragmented.
The process of “moving things around to make up a long enough contiguous space for saving something” is called defragmenting. This does not happen when files are written. This happens when you run your disk defragmenter. On newer editions of Windows, at least, this happens automatically on a schedule, but it is never triggered by writing a file.
Being able to avoid moving things around like this is key to file system performance, and is why fragmentation happens and why defragmentation exists as a separate step.
“How much empty space should I leave free on a hard disk?”
This is a trickier question to answer (and this answer has already turned into a small book).
Rules of thumb:
1. For all types of drives:
- Most importantly, leave enough free space for you to use your computer effectively. If you are running out of space to work, you will want a bigger drive.
- Many disk defragmentation tools require a minimum amount of free space (I think the one with Windows requires 15 percent, worst case) to work in. They use this free space to temporarily hold fragmented files as other things are rearranged.
- Leave space for other OS functions. For example, if your machine does not have a lot of physical RAM, and you have virtual memory enabled with a dynamically sized page file, you will want to leave enough space for the page file’s maximum size. Or if you have a laptop that you put into hibernation mode, you will need enough free space for the hibernation state file. Things like that.
2. SSD-specific:
- For optimum reliability (and to a lesser extent, performance), SSDs require some free space, which, without going into too much detail, they use for spreading data around the drive to avoid constantly writing to the same place (which wears them out). This concept of leaving free space is called over-provisioning. It is important, but in many SSDs, mandatory over-provisioned space already exists. That is, the drives often have a few dozen more GB than they report to the OS. Lower-end drives often require you to manually leave unpartitioned space, but for drives with mandatory OP, you do not need to leave any free space. An important thing to note here is that over-provisioned space is often only taken from unpartitioned space. So if your partition takes up your entire drive and you leave some free space on it, that does not always count. Many times, manual over-provisioning requires you to shrink your partition to be smaller than the size of the drive. Check your SSD’s user manual for details. TRIM, garbage collection, and such have effects as well, but those are outside the scope of this answer.
Personally, I usually grab a bigger drive when I have about 20-25 percent free space remaining. This is not related to performance; it is just that when I get to that point, I expect that I will probably be running out of space for data soon and it is time to get a bigger drive.
More important than watching free space is making sure scheduled defragmentation is enabled where appropriate (not on SSDs) so that you never get to the point where it becomes dire enough to affect you.
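If you want to see where a drive stands against rules of thumb like these, a few lines of Python are enough. This is a minimal sketch: `summarize` and `free_space_report` are hypothetical helper names, and the 15 percent default is only the approximate defragmenter figure quoted above, not an official requirement.

```python
import shutil


def summarize(total_bytes, free_bytes, min_free_fraction=0.15):
    """Return the free fraction and whether it is below the chosen threshold.

    The 0.15 default is an assumption taken from the rough figure in the text.
    """
    free_fraction = free_bytes / total_bytes if total_bytes else 0.0
    return {
        "free_fraction": free_fraction,
        "below_threshold": free_fraction < min_free_fraction,
    }


def free_space_report(path, min_free_fraction=0.15):
    """Check the volume that contains `path` using shutil.disk_usage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return summarize(usage.total, usage.free, min_free_fraction)


report = free_space_report(".")
print(f"{report['free_fraction']:.0%} free, below threshold: {report['below_threshold']}")
```

This only reports how much room is left for working space, defragmentation, page files, and so on. As discussed above, the number by itself says nothing about how fast the machine is.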
There is one last thing worth mentioning. One of the other answers here mentioned that SATA’s half-duplex mode prevents reading and writing at the same time. While true, this is greatly oversimplified and is mostly unrelated to the performance issues being discussed here. What this means, simply, is that data cannot be transferred in both directions on the wire at the same time. However, SATA has a fairly complex specification involving tiny maximum block sizes (about 8 kB per block on the wire, I think), read and write operation queues, etc., and does not preclude writes to buffers happening while reads are in progress, interleaved operations, etc.
Any blocking that occurs would be due to competing for physical resources, usually mitigated by plenty of cache. The duplex mode of SATA is almost entirely irrelevant here.
(A) “Slow down” is a broad term. Here I use it to refer to things that are either I/O-bound (i.e. if your computer is sitting there crunching numbers, the contents of the hard drive have no impact) or CPU-bound and competing with tangentially related things that have high CPU usage (e.g. anti-virus software scanning tons of files).
(B) SSDs are affected by fragmentation in that sequential access speeds are generally faster than random access, despite SSDs not facing the same limitations as a mechanical device (even then, lack of fragmentation does not guarantee sequential access due to wear leveling, etc.). However, in virtually every general use scenario, this is a non-issue. Performance differences due to fragmentation on SSDs are typically negligible for things like loading applications, booting the computer, etc.
(C) Assuming a sane file system that is not fragmenting files on purpose.
Make sure to read through the rest of the lively discussion at SuperUser via the link below!
Have something to add to the explanation? Sound off in the comments. Want to read more answers from other tech-savvy Stack Exchange users? Check out the full discussion thread here.