Can hard drive defragmentation help my server perform better?


I always hear that it is necessary to "defragment the hard disk", but I have never really understood what that operation does internally.

Some questions I have, for example:

  • How can software perform this operation on the hardware?

  • And why don't Linux file systems need to be defragmented?

  • If the disk on my server, where I run applications in PHP and Python, is defragmented, do I gain performance?

  • Out of curiosity: isn't this question beyond the scope of the site? PS: I haven't downvoted; this question can be answered and is very interesting.

  • Well, guys, if it’s outside the scope, just close it. We don’t need to make a scene ;)

  • @durtto we do not have a Super User in Portuguese.

  • @Wallacemaxters shouldn't we suggest creating one, as happened with SOpt? Because until then, the only resource is the hardware community.

  • @Diegofelipe someone already tried to create a SUpt. It didn't work :(

  • I'm not sure this will noticeably improve applications like PHP or Python, but it will probably improve the use of the storage itself, i.e. reading or writing data (nothing very noticeable); you will only feel a difference if the drive was badly fragmented before. The question seems to me to be at the limit of the scope, but the problem is that it is extremely broad and an answer may or may not be correct depending on the environment. However, I did not vote.

  • Thank you for your honesty, @Guilhermenascimento ;)

  • Relevant discussion on Meta: http://meta.pt.stackoverflow.com/q/2546


1 answer



On a hard drive, yes, you will gain by defragmenting. Note that this is a logical issue, nothing to do with the hardware; it depends only on how the operating system works, specifically its file system.

On a solid-state device you will gain nothing (maybe a little, but it is negligible and you will pay too high a price, since this type of storage wears out when you write a lot, and defragmentation does a lot of writing). Even more so with NVRAM.

Is the gain huge for applications written in PHP, Python or other languages? In general, no, but there may be some cases; only testing will tell. Obviously the gains appear in programs that access the disk a lot and within certain patterns. Accessing the disk little, or making accesses that are essentially random, will make little or no difference.

The quality of the defragmenter also counts.

Why it fragments

Summarizing and simplifying: the file systems of the main operating systems usually work as a linked list called a file allocation table. Files are divided into pages (clusters). In general these pages are placed in sequence on the disk, right next to each other.

But what happens when the file grows? There is probably no room, since another file has likely already been saved right after it.

The new page will be placed far away, after other existing files. That is why the pages are allocated as a linked list and not as a fixed sequential vector: the file system needs these independent nodes to have this flexibility.

And what happens if you remove a part or all of the file?

Parts of the disk become free (they are not actually erased), and at some point these pages need to be reused; an entire file may fit there, or it may not. So even creating a whole new file from scratch can fragment it: if the file does not fit in that free space, the rest has to be placed elsewhere. In theory a file could end up with every one of its pages alternating with a page of another file.
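
To make the idea concrete, here is a minimal, purely illustrative Python sketch (not a real file system; all names and sizes are invented) that models the disk as a set of clusters and a FAT-like table of "next cluster" links. Growing a.txt after b.txt was written right behind it leaves a.txt fragmented:

    # Toy model of a FAT-like allocation: each file is a linked list of clusters.
    # Purely illustrative; names and sizes are invented for the example.

    DISK_CLUSTERS = 16
    fat = [None] * DISK_CLUSTERS      # fat[i] = next cluster of the same file, or None
    free = set(range(DISK_CLUSTERS))  # clusters not yet used
    files = {}                        # file name -> first cluster

    def allocate(name, n_clusters):
        """Create or extend a file by n_clusters, always taking the lowest free cluster."""
        prev = None
        if name in files:                     # walk to the file's last cluster
            prev = files[name]
            while fat[prev] is not None:
                prev = fat[prev]
        for _ in range(n_clusters):
            cluster = min(free)               # first free cluster, wherever it is
            free.remove(cluster)
            if prev is None:
                files[name] = cluster
            else:
                fat[prev] = cluster
            prev = cluster

    def layout(name):
        """Return the clusters of a file in order."""
        chain, c = [], files[name]
        while c is not None:
            chain.append(c)
            c = fat[c]
        return chain

    allocate("a.txt", 3)   # a.txt takes clusters 0, 1, 2
    allocate("b.txt", 3)   # b.txt takes clusters 3, 4, 5, right after a.txt
    allocate("a.txt", 2)   # a.txt grows: its new clusters land at 6, 7
    print(layout("a.txt")) # [0, 1, 2, 6, 7] -- no longer contiguous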

I have not even mentioned concurrent writes to two files, which may, in some situations, create natural fragmentation. In general, operating systems avoid this with pre-allocation, a technique that avoids fragmentation even without concurrency, and which can also be done manually when you know the file tends to grow and benefits from not fragmenting (a sketch follows the figure below).

[image: Fragmentation]
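
As an illustration of manual pre-allocation, the sketch below reserves space up front for a file that is expected to grow, so the file system can try to keep its clusters contiguous. It assumes a Linux system where os.posix_fallocate is available, and the file name is just an example:

    import os

    # Reserve 100 MiB up front for a file that we know will grow.
    # Assumes Linux; os.posix_fallocate is not available everywhere.
    SIZE = 100 * 1024 * 1024

    fd = os.open("growing.log", os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, SIZE)   # ask the file system to reserve the space now
    finally:
        os.close(fd)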

How Defragmentation Works

Defragmentation means making the linked list look like a sequential vector, that is, joining all the pages in the file's natural order. Read how the linked list works in the link above.

Obviously in this process it is likely that several pages will have to be temporarily relocated to another position to give sequential space to the file being defragmented.
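
A toy sketch of the idea in Python: the "disk" is a list of slots, each holding a (file, page) pair or None for free space, and defragmenting just means putting every file's pages back in natural, contiguous order. A real defragmenter does this in place, using the free space as a scratch area, which this simplification ignores:

    # Toy defragmentation: rearrange a disk (list of slots) so each file's
    # pages end up contiguous and in order. Purely illustrative.

    def defragment(disk):
        """disk: list of (file_name, page_number) tuples, or None for free slots."""
        used = [slot for slot in disk if slot is not None]
        # "Natural order": group pages by file, then by page number within the file.
        used.sort(key=lambda slot: (slot[0], slot[1]))
        free_slots = len(disk) - len(used)
        return used + [None] * free_slots   # files first, free space at the end

    # a.txt is split around b.txt, exactly the situation described above.
    fragmented = [("a.txt", 0), ("a.txt", 1), ("b.txt", 0),
                  ("b.txt", 1), ("a.txt", 2), None, None]
    print(defragment(fragmented))
    # [('a.txt', 0), ('a.txt', 1), ('a.txt', 2), ('b.txt', 0), ('b.txt', 1), None, None]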

Some defragmenters can do this more intelligently, for example placing the most frequently used files at the outer edge of the disk, where access is much faster. The circumference of the outer tracks is much larger than that of the tracks near the center, so they hold more data, and since the rotation speed is obviously the same for both, one complete turn of the disk reads much more data at the edge, making reading (or writing, of course) faster (in the sense of handling more data in the same time, not of finishing a single operation earlier).

Sequential access to the data is faster in these cases, since the access happens naturally as the disk spins. If the head has to keep seeking where the other parts are, a lot of time is wasted without any read or write being performed. When the file is fragmented, access that in theory would be sequential ends up being random.

Remember that magnetic tape or paper tape was totally sequential (it did not fragment, but it also had no flexibility), the HDD is semi-random (random on one axis and sequential on the other), and the SSD is 100% random (so defragmentation is not necessary on it; it copes fine with that layout).
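
If you want to feel this difference yourself, a rough sketch like the one below times sequential versus random reads of the same file. The file name is hypothetical (create any large file first), and the operating system's cache can easily mask the effect, so treat the numbers only as an indication; on an HDD the random pattern is usually much slower, while on an SSD the gap mostly disappears:

    import os, random, time

    # Rough comparison of sequential vs. random reads on the same file.
    # Results vary wildly with hardware and caching.
    PATH = "testfile.bin"            # hypothetical test file, create it first
    BLOCK = 4096
    SIZE = os.path.getsize(PATH)
    blocks = SIZE // BLOCK

    def timed_read(offsets):
        start = time.perf_counter()
        with open(PATH, "rb") as f:
            for off in offsets:
                f.seek(off)
                f.read(BLOCK)
        return time.perf_counter() - start

    sequential = [i * BLOCK for i in range(blocks)]
    shuffled = sequential[:]
    random.shuffle(shuffled)

    print("sequential:", timed_read(sequential))
    print("random:    ", timed_read(shuffled))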

Linux

There is a myth that Linux does not need to defragment.

First, which file system are we talking about? Linux can use several (actually Windows can too, but nobody does). Is it ext3? Then yes, you do have to defragment.

The positive side is that it has a very good algorithm that organizes the writing process better, trying to coordinate the order used, so fragmentation is small; but it does fragment.

The downside is that to defragment it you need to unmount the volume (and even then it is not that simple to defragment). Also, files can be placed closer to the center of the disk without immediate need, making access to them slower.

There may be delays in allocation because of the algorithm; this decreases fragmentation but makes access slower, besides increasing the potential for data loss.

Things are better in ext4, which makes defragmentation easier and has more refined techniques to prevent it from happening, such as extent mapping of clusters and better pre-allocation facilities.
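
To check how fragmented a given file actually is on ext3/ext4, you can ask the file system instead of guessing. The sketch below assumes a Linux machine with the filefrag tool from e2fsprogs installed, and the path is just an example; on ext4, e4defrag can then defragment a file or mount point online, and a file reported with a single extent is already contiguous:

    import subprocess

    # Quick check of how many extents (fragments) a file occupies.
    # Assumes Linux with the e2fsprogs "filefrag" tool installed;
    # the path below is only an example.
    path = "/var/www/app/data.db"

    result = subprocess.run(["filefrag", path], capture_output=True, text=True)
    print(result.stdout.strip())   # e.g. "/var/www/app/data.db: 1 extent found"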

If a file system does not need defragmentation, how does it solve the issues I described above: a file that grows or shrinks, or that needs to fit where another file was before?

There is only one solution: copying everything to another location, and you always need to have room for a full copy of the largest existing file plus the size being added. That would be tragic for performance. There is no miracle, there is no free lunch.

There are specific file systems that actually do this, but they are used when it is known that the usage patterns mentioned above never occur, or occur only in a controlled way.

There are others that defragment during the writing process, spreading the work across operations. Obviously the cost of writing a file ends up higher than expected and the time becomes nondeterministic. Others can do it on their own when the disk is idle.

Everything is a tradeoff.

