What is RAID and why would I use it?
This entry was posted on May 27, 2017
You may have seen someone on a forum mention RAID 0 when bragging about their system without really knowing what it was about. RAID is an acronym that stands for Redundant Array of Inexpensive Disks (the "I" is also commonly read as Independent). At the heart of it, RAID combines multiple hard drives, solid state drives, or other storage media to add performance, to make a system more resilient to drive failure, or sometimes both. There are a few different types of RAID, and I think once I explain each type you will have a better understanding of what RAID is really good for.
RAID 0
The first type I am going to go over is RAID 0. For a long time this was very popular, but it has slowly declined in popularity as SSDs have become more common. In this kind of RAID you stripe the data across two or more drives. What I mean by this is that if you have two drives, you put one part of a file on hard drive A and the next part on hard drive B. This is fantastic for speed because instead of waiting for one drive to read the entire file, you split the load across two drives and can theoretically get it done in half the time. This comes at a great cost, though: if one of your two drives fails, every file is missing half of its pieces, so all of your data is gone and it likely isn't recoverable. As you add more drives, RAID 0 makes less and less sense, because every drive you add increases the chance of total data loss. It only takes one of your many drives failing to send the entire array up in smoke.
RAID 0 has become less popular because SSDs are so fast that RAIDed hard drives generally can't match even a single SSD's speed and responsiveness, and very few consumers can justify the cost of more than one SSD for a negligible real-world performance boost. Going twice as fast when your response time is already nearly instant does not make a huge difference for the end user.
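To make the striping idea and the growing risk concrete, here is a rough Python sketch. The function names, the 4-byte chunk size, and the 3% per-drive failure chance are all made up for illustration, and the risk calculation assumes drives fail independently of each other, which real drives do not always do.

```python
def stripe(data: bytes, num_drives: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin across drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives


def raid0_failure_probability(per_drive_failure: float, num_drives: int) -> float:
    """Chance that at least one drive fails, assuming independent failures."""
    return 1 - (1 - per_drive_failure) ** num_drives


# Two drives: chunks alternate between drive A (index 0) and drive B (index 1).
print(stripe(b"ABCDEFGHIJKLMNOP", num_drives=2))

# Hypothetical 3% chance of any single drive failing over some period.
for n in (1, 2, 4, 8):
    print(n, "drives:", round(raid0_failure_probability(0.03, n), 4))
```

With two drives, drive A ends up holding the first and third chunks and drive B the second and fourth, so losing either drive leaves holes in every file larger than one chunk.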
RAID 1
The second type I'll explain is RAID 1, which uses mirroring. As the name states, it mirrors the data from half of the drives onto the other half. So if you have two drives, hard drive A and hard drive B are copies of each other. You can gain read speed from this, because with copies of your data on two drives, a read can be served by whichever drive responds first, or separate reads can be split between the drives. On the flip side, write speed is limited by the slowest drive, because every write has to land on every copy. One of the biggest advantages of this layout is that the array can keep functioning with up to half of your drives failed, as long as at least one copy of each mirrored drive survives. In a RAID 1 the usable capacity is half of the total raw capacity, since each drive has a double of itself. So if you have four 4 TB drives, you will only have 8 TB to work with. This array type does not make much sense if you have a large number of drives, because you lose so much capacity. It is primarily used when the data is very important and can't be lost.
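The "up to half of your drives" figure depends on which drives fail. Here is a small Python sketch of that idea, assuming the drives are paired off as (0, 1), (2, 3), and so on, with each pair holding identical copies. The pairing scheme and the function name are assumptions for illustration and do not describe any particular RAID controller.

```python
def mirror_survives(failed_drives: set[int], num_drives: int) -> bool:
    """Return True if every mirrored pair still has at least one working drive.

    Assumes drives are paired as (0, 1), (2, 3), ... (a hypothetical layout).
    """
    if num_drives % 2 != 0:
        raise ValueError("this sketch assumes an even number of drives")
    for first in range(0, num_drives, 2):
        if first in failed_drives and first + 1 in failed_drives:
            return False  # both copies of this pair are gone
    return True


# Four drives, two failures in different pairs: the data is still there.
print(mirror_survives({0, 2}, num_drives=4))

# Four drives, two failures in the same pair: that half of the data is lost.
print(mirror_survives({0, 1}, num_drives=4))
```

So losing half of the drives is survivable only in the lucky case where no pair loses both of its members.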
RAID 5 / RAID 6
The last ones I will talk about are RAID 5 and RAID 6. These are very similar, so I have bundled them together. Instead of purely striping or purely mirroring, you do something in between using parity. What this means is that you stripe the data across all of the drives, but you also calculate parity information from that data and spread it across the drives as well. If any one drive fails, the missing data can be recalculated from the data and parity on the drives that are still working. This means you get most of the speed benefits of RAID 0 while also having some tolerance for drive failure. With RAID 5 you give up one drive's worth of capacity and can lose one drive without losing data. With RAID 6 you give up two drives' worth of capacity, but you can lose two drives and still have all your data. At this point most people do not use RAID 5 for a large array, because it is dangerous to only tolerate a single failure: rebuilding onto a replacement drive means reading every remaining drive in full, and the chance of a second drive going bad mid-rebuild is too high. Most people with five or more drives have moved on to RAID 6.
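Single-drive parity of the RAID 5 kind is commonly computed with XOR, which is what makes the rebuild possible. Here is a minimal Python sketch of that trick. It keeps the parity on one fixed "drive" for simplicity, whereas real RAID 5 rotates parity across all of the drives, and the block contents and names are made up for illustration. RAID 6 needs a second, independent parity calculation, which this sketch does not cover.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives holding one block each (illustrative contents).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_blocks)

# Pretend the middle drive died: XOR the survivors with the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

This works because XORing a value with itself cancels out, so combining the surviving blocks with the parity leaves exactly the missing block. It also shows why one parity block can only cover one missing drive.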
To summarize:
RAID 0 (stripe): fastest reads and writes, no capacity lost, total data loss if any one drive fails
RAID 1 (mirror): fast reads and slower writes, half the capacity, can lose up to half of your drives with no data loss as long as one copy of everything survives
RAID 5/6 (parity): only slightly slower reads and writes than RAID 0, lose one drive of capacity and survive one drive failure in RAID 5, lose two drives of capacity and survive two drive failures in RAID 6
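The capacity trade-offs above boil down to simple arithmetic. Here is a quick Python sketch, assuming every drive in the array is the same size. The function name is made up for illustration, and it does not check the minimum drive counts each level needs.

```python
def usable_capacity_tb(level: int, num_drives: int, drive_tb: float) -> float:
    """Usable space for the RAID levels covered here, with equal-sized drives."""
    if level == 0:
        drives_lost = 0                # striping keeps every byte usable
    elif level == 1:
        drives_lost = num_drives / 2   # half of the drives hold copies
    elif level == 5:
        drives_lost = 1                # one drive's worth of parity
    elif level == 6:
        drives_lost = 2                # two drives' worth of parity
    else:
        raise ValueError(f"RAID level {level} is not covered in this sketch")
    return (num_drives - drives_lost) * drive_tb


# Four 4 TB drives under each layout.
for level in (0, 1, 5, 6):
    print(f"RAID {level}:", usable_capacity_tb(level, 4, 4.0), "TB")
```

With four 4 TB drives this gives 16 TB for RAID 0, 8 TB for RAID 1, 12 TB for RAID 5, and 8 TB for RAID 6.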