RAID Levels Explained in 5 Minutes

Server RAID Explanation

Linux Server RAID

So what is RAID and what has it done for me lately? RAID stands for Redundant Array of Inexpensive Disks and was first proposed in an article published by the University of California Berkeley in 1987. The article proposed that several smaller disks could be combined into an array of disk drives that could provide redundancy as well as increased performance over a single large drive. 

Particularly on a production Linux server, RAID can protect critical data by mirroring it across several disks, or increase overall disk performance by dividing read/write operations across several hard drives.

There are several different types of RAID arrays and you should consider your requirements carefully before selecting a specific one for your Linux server. 

RAID can be implemented in hardware or in software. Hardware RAID uses specifically designed disk controllers that manage, read and write to your RAID array independently of the operating system. With software RAID, the array is managed by the Linux operating system itself using specific kernel modules and some additional tools.

Whether you choose to implement hardware RAID or software RAID is a matter of personal choice. I have used both and have found hardware RAID to be more agreeable, but this is of course my personal opinion. 

Software RAID is often cheaper to implement than hardware RAID, more so when using SCSI disks. That said, many modern SATA motherboards now have built-in RAID controllers that make implementing Linux server RAID easier than ever before.

As previously mentioned there are several different levels of RAID and each one offers slightly different levels of redundancy (multiple copies of the same data) and performance. Let's have a look at the most commonly used RAID configurations, what they offer and what their drawbacks might be.

Linux Server RAID - RAID Levels

Linear Mode

Linear mode is best defined as two or more disks combined into one logical drive. The disks are joined together in the eyes of the operating system to form one disk and are written to in a linear fashion: disk 1 fills up first, then disk 2 and so on.

There is no redundancy in linear mode: if a disk fails, the data on that disk is lost. You may be able to recover data from the other disks in the array, but there is no guarantee; it depends entirely on how the data was written.

Read/write performance will not increase markedly for single reads and writes. However, you may see a performance boost if more than one user is accessing the array and the files they need are on different disks.
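To make the "fill disk 1, then disk 2" behaviour concrete, here is a minimal Python sketch of how a logical block number could be mapped to a physical disk in linear mode. The function name and the disk sizes are made up for illustration; a real implementation works on device sectors rather than a Python list.

```python
def locate_linear(block, disk_sizes):
    """Map a logical block number to (disk index, block offset on that disk)."""
    if block < 0:
        raise ValueError("block number must not be negative")
    for disk, size in enumerate(disk_sizes):
        if block < size:
            return disk, block
        block -= size  # skip past this disk and carry on with the next one
    raise ValueError("block is beyond the end of the array")

# Hypothetical array: three disks of 100, 200 and 50 blocks.
sizes = [100, 200, 50]
print(locate_linear(0, sizes))    # (0, 0)
print(locate_linear(150, sizes))  # (1, 50)
print(locate_linear(300, sizes))  # (2, 0)
```

Because neighbouring blocks almost always live on the same disk, a single reader gains nothing, which matches the performance note above.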

RAID-0

RAID-0 is often referred to as striping because of the way data is written to the disks. If you were writing an 8k file to an array of two disks, for example, 4k would be written to one disk and 4k to the other.

When adopted on your Linux server, RAID-0 can provide a substantial increase in disk performance because read/write operations are divided between the two disks, which the operating system sees as one drive. RAID-0 does not provide any redundancy: if one disk fails, your data is lost, because half of your file is written on one disk and half on the other.
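The 8k example can be sketched in a few lines of Python. This is only an illustration of round-robin striping: the 4 KiB chunk size and the function names are assumptions chosen to match the example, and real arrays let you pick the chunk size.

```python
def locate_striped(chunk, num_disks):
    """Map a logical chunk number to (disk index, chunk offset on that disk)."""
    return chunk % num_disks, chunk // num_disks

def stripe(data, num_disks, chunk_size):
    """Split data into chunks and deal them out round-robin across the disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        disk, _ = locate_striped(start // chunk_size, num_disks)
        disks[disk] += data[start:start + chunk_size]
    return disks

# An 8 KiB file on two disks, with an assumed chunk size of 4 KiB.
disks = stripe(bytes(8192), num_disks=2, chunk_size=4096)
print([len(d) for d in disks])  # [4096, 4096]
```

Each disk ends up holding only part of the file, which is why losing either disk loses the file.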

Raid 0

Advantages:

  • Performance boost for read and write operations
  • No space is wasted, as the entire capacity of each disk is used to store unique data

Disadvantages:

  • There is no redundancy/duplication of data. If one of the disks fails, all of the data is lost.

RAID-1

RAID-1 is the most common form of Linux server RAID. RAID-1 maintains an exact mirror of the information contained on one disk on the other disks in the array. 

Although it can be used with many disks, it is common to use RAID-1 with only two, working on the premise that only one disk will fail at a time. The obvious advantage of RAID-1 is how easily the array can be reconstructed after a single disk failure.

With software RAID, write performance on RAID-1 can be worse than on a single device, as the additional copies of the same data all have to cross your bus. With a hardware RAID arrangement, the extra copies are generated by the RAID controller itself, thus sidestepping this bottleneck.

Otherwise, RAID-1 performs much like a single disk, although reads can be faster because they can be served by any disk in the mirror. Those who wish to safeguard their data against accidental loss will probably consider RAID-1 to be the prime candidate.
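As a rough illustration of mirroring, the toy Python class below writes every block to all disks and reads from the first disk that is still working. The class and method names are invented for this sketch, and the "disks" are plain dictionaries, so treat it as a model of the idea rather than of any real driver.

```python
class Mirror:
    """Toy RAID-1 array: every write goes to all working disks."""

    def __init__(self, num_disks=2):
        self.disks = [dict() for _ in range(num_disks)]  # block number -> data
        self.failed = set()

    def write(self, block, data):
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data  # one extra copy per mirror disk

    def fail(self, index):
        """Simulate a disk failure: its contents are gone."""
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block):
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise OSError("all mirrors have failed")

array = Mirror()
array.write(0, b"critical data")
array.fail(0)
print(array.read(0))  # b'critical data'
```

The loop in write() is the extra work described above: with software RAID the host sends every copy itself.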

Raid 1

Advantages:

  • Data can be recovered in case of disk failure
  • Increased performance for read operation

Disadvantages:

  • Slow write performance
  • Space is wasted by duplicating data, which increases the cost per unit of storage

RAID-4

RAID-4 is not widely used, as it has a few drawbacks when it comes to performance. However, it does offer redundancy and can survive the failure of one disk in the array.

RAID-4 can be used on 3 or more disks. One of the disks is used to store parity information, and the other disks have the data written to them in a striped fashion similar to RAID-0.

The main drawback to RAID-4 is that parity information is stored on one disk, and this information has to be updated every time any of the data disks is written to. This causes the parity disk to become the bottleneck, and a very fast disk is required for RAID-4 to perform at its best.
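Parity in RAID-4 (and RAID-5) is a bytewise XOR of the data blocks in a stripe, which is what makes a lost block recoverable. The following Python sketch shows the idea with three made-up 4-byte data blocks; the helper name is an assumption for this example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]  # one stripe across three data disks
parity = xor_blocks(data)            # what the parity disk would store

# The disk holding b"BBBB" fails: rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # b'BBBB'
```

Changing any single data block means the parity block has to be recomputed and rewritten too, so every write in the array touches the parity disk.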

Raid 4

Advantages:

  • Efficient data redundancy in terms of cost per unit of storage
  • Performance boost for read operations due to data striping

Disadvantages:

  • Write operation is slow
  • If the dedicated parity disk fails, data redundancy is lost

RAID-5

RAID-5 can be used on 3 or more disks and is useful for those who wish to combine a large number of disks and still retain redundancy. A RAID-5 array can survive the loss of one disk and can be compared to RAID-4, except that instead of being stored on one disk, the parity information is spread out across all of the disks in the array, thus avoiding the bottleneck of a dedicated parity disk.

The write speed of the array is higher than that of a single disk though still not at the same level as RAID-0 due to the necessity of writing parity information. Reads perform very comparably to RAID-0.
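The sketch below shows one simple way parity could rotate across the disks from stripe to stripe. The rotation rule used here is an assumption for illustration only; real implementations offer several different layouts.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row per stripe, marking data chunks (D0, D1, ...) and parity (P)."""
    rows = []
    chunk = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and moves left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{chunk}")
                chunk += 1
        rows.append(row)
    return rows

for row in raid5_layout(num_disks=3, num_stripes=3):
    print(row)
# ['D0', 'D1', 'P']
# ['D2', 'P', 'D3']
# ['P', 'D4', 'D5']
```

Since the "P" column moves with every stripe, parity updates are shared by all disks instead of queuing up on one.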

Raid 5

Advantages:

  • All the advantages of RAID 4 plus increased write speed and better data redundancy

Disadvantages:

  • Can only handle up to a single disk failure

RAID 6

RAID 6 uses two parity blocks per stripe to achieve better data redundancy than RAID 5. This raises the fault tolerance to two drive failures in the array.

The two parity blocks of each stripe are stored on different disks, and their positions rotate across the array. RAID 6 is a very practical choice for maintaining high-availability systems.

Raid 6

Advantages:

  • Better data redundancy. Can handle up to 2 failed drives

Disadvantages:

  • Large parity overhead

RAID 10 (RAID 1+0)


RAID 10 combines RAID 1 and RAID 0 by striping data across mirrored pairs, the opposite layering order to RAID 0+1. It is sometimes called a “nested” or “hybrid” RAID. This is a “best of both worlds” approach because it has the fast performance of RAID 0 and the redundancy of RAID 1.

In this setup, multiple RAID 1 mirrors are striped together as in RAID 0. It is used where very high disk performance (greater than RAID 5 or 6) is required along with redundancy.
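Here is a small Python sketch of that layering, assuming two-way mirrors and an even number of disks. The function name and the disk numbering are made up for the example.

```python
def raid10_targets(chunk, num_disks):
    """Return the two disks that hold a given logical chunk."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch assumes an even number of disks, at least 4")
    pairs = num_disks // 2   # number of RAID 1 mirrors
    pair = chunk % pairs     # RAID 0 step: round-robin over the mirrors
    return [2 * pair, 2 * pair + 1]  # RAID 1 step: both disks in the mirror

for chunk in range(4):
    print(chunk, raid10_targets(chunk, num_disks=4))
# 0 [0, 1]
# 1 [2, 3]
# 2 [0, 1]
# 3 [2, 3]
```

In this model the array survives one failed disk in every pair, but losing both disks of the same pair loses that pair's chunks.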

Raid 10

Advantages:

  • Very fast performance
  • Redundancy and fault tolerance

Disadvantages:

  • The cost per unit of storage is high since data is mirrored
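To compare the space overhead of the levels covered above, a rough usable-capacity calculation might look like the following. It assumes all disks are the same size, that RAID 1 mirrors one disk's worth of data onto every disk, and that RAID 10 uses two-way mirrors; the function name and the 4 x 2 TB example are illustrative only.

```python
def usable_capacity(level, num_disks, disk_size):
    """Usable space for an array of equal-size disks, in the units of disk_size."""
    data_disks = {
        "linear": num_disks,    # no redundancy, all space holds data
        "0": num_disks,         # no redundancy, all space holds data
        "1": 1,                 # every disk holds the same copy
        "4": num_disks - 1,     # one disk's worth of parity
        "5": num_disks - 1,     # one disk's worth of parity, spread out
        "6": num_disks - 2,     # two disks' worth of parity
        "10": num_disks // 2,   # assumed two-way mirrors
    }
    return data_disks[level] * disk_size

for level in ("0", "1", "5", "6", "10"):
    print(f"RAID {level}: {usable_capacity(level, 4, 2)} TB usable of 8 TB raw")
# RAID 0: 8 TB usable of 8 TB raw
# RAID 1: 2 TB usable of 8 TB raw
# RAID 5: 6 TB usable of 8 TB raw
# RAID 6: 4 TB usable of 8 TB raw
# RAID 10: 4 TB usable of 8 TB raw
```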

Conclusion

Understanding the RAID levels is crucial for developing a storage infrastructure that meets the needs of the organization. RAID can protect against disk failures and provide fast performance. However, it does not protect against data corruption, nor does it provide any security capabilities.

© LinuxFault. All rights reserved. Developed by Jago Desain