RAID originally stood for Redundant Array of Inexpensive Disks, but it is now usually expanded as Redundant Array of Independent Disks. Small disks used to be very expensive, whereas today a much larger disk can be bought cheaply. A RAID combines a series of disks into a single logical disk set.
Understanding RAID setups in Linux
A RAID is made up of a set of disk drives, called a RAID array or RAID set. At least two disks are connected to a RAID controller to form a logical volume, and more drives can be placed in the same group. A given set of disks can use only one RAID level. RAID can improve server performance, although performance differs from level to level. Depending on the level, it also protects our data through fault tolerance and high availability.
This series is titled "Using RAID in Linux". It is divided into nine parts covering the following topics:
Part 1: Introduction to RAID, concepts and RAID levels
Part 2: How to set up RAID 0 (striping) in Linux
Part 3: How to set up RAID 1 (mirroring) in Linux
Part 4: How to set up RAID 5 (striping with distributed parity) in Linux
Part 5: How to set up RAID 6 (striping with dual distributed parity) in Linux
Part 6: How to set up RAID 10 or 0+1 (nested) in Linux
Part 7: Growing an existing RAID array and removing failed disks
Part 8: Recovering (rebuilding) a failed drive in a RAID
Part 9: Managing RAID in Linux
This is Part 1 of the nine-part series. It introduces the concept of RAID and the RAID levels, which you need to understand before building a RAID in Linux.
Software RAID and hardware RAID
Software RAID has lower performance because it uses the host's resources. The RAID software must be loaded before data can be read from a software RAID volume, so the operating system has to boot first and then load the RAID software. Software RAID needs no dedicated hardware, so it requires no extra investment.
Hardware RAID offers high performance. It uses a dedicated RAID controller, physically built as a PCI Express card, and does not use the host's resources. The controller has NVRAM cache for reads and writes. If power fails while the cache is in use during a RAID rebuild, a backup battery keeps the cache contents intact. For large-scale use it is a very expensive investment.
A hardware RAID card looks like this:
Important concepts of RAID
Parity is the method used during a RAID rebuild to regenerate lost content from saved parity information. RAID 5 and RAID 6 are based on parity.
Striping slices data and spreads the slices across multiple disks. No single disk holds the complete data. If we use two disks, each disk stores half of our data.
Mirroring is used in RAID 1 and RAID 10. Mirroring keeps a copy of the data automatically. In RAID 1, the same content is saved to a second disk.
A hot spare is a standby drive in our server that can automatically take the place of a failed drive. If any drive in the array fails, the hot spare is brought in automatically and the RAID is rebuilt onto it.
A block (chunk) is the smallest unit of data that the RAID controller reads or writes at a time; the minimum is 4KB. By choosing the block size appropriately, we can improve I/O performance.
There are several RAID levels. Here we list only the ones most used in real environments.
RAID0 = striping
RAID1 = mirroring
RAID5 = single distributed parity
RAID6 = double distributed parity
RAID10 = mirroring + striping (nested RAID)
On most Linux distributions, RAID is managed with a package called mdadm. Let's first get to know each RAID level.
RAID 0 / Striping
Striping performs well. In RAID 0 (striping), data is written to the disks in slices: half of the content goes to one disk and the other half is written to the other disk.
Suppose we have two disk drives. If we write the data "TECMINT" to the logical volume, "T" is saved on the first disk, "E" on the second disk, "C" on the first disk, "M" on the second disk, and so on in a round-robin cycle. (LCTT note: in practice data is never sliced byte by byte; it is sliced in blocks.)
In this setup, if any one drive fails we lose our data, because a disk holding only half of the data cannot be used to rebuild the RAID. In terms of write speed and performance, however, RAID 0 is very good. Creating a RAID 0 (striping) array requires at least two disks. If your data is valuable, do not use this RAID level.
No capacity is lost in RAID 0.
Zero fault tolerance.
High read and write performance.
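The round-robin placement described above can be sketched in a few lines of Python. This is only an illustration, not how mdadm is implemented: the function names stripe and unstripe are made up for this example, and the default chunk size of one byte is chosen purely to reproduce the "TECMINT" walkthrough (a real array slices data in much larger blocks).

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 1) -> list:
    """Distribute data across num_disks in round-robin chunks (toy RAID 0)."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    disks = [[] for _ in range(num_disks)]
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, then back to disk 0, ...
        disks[index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble the data by reading the chunks back in round-robin order."""
    chunks = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


disks = stripe(b"TECMINT", num_disks=2)
print(disks[0])          # chunks held by the first disk
print(disks[1])          # chunks held by the second disk
print(unstripe(disks))   # both disks together give the data back
```

Note that unstripe needs every disk: drop either list and the remaining chunks are only every other piece of the data, which is why a single drive failure destroys a RAID 0 array.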
RAID 1 / Mirroring
Mirroring also performs well. It keeps an identical copy of our data. Suppose we have two 2TB hard drives, 4TB in total. Once the RAID controller mirrors them into one logical drive, we see that logical drive as only 2TB.
When we save data, it is written to both 2TB drives at the same time. Creating a RAID 1 (mirrored) array requires a minimum of two drives. If a disk fails, we can restore the RAID by replacing it with a new disk. When any one disk in a RAID 1 fails, we can still get the data from the other disk, because it holds the same content. So there is zero data loss.
Half of the total capacity is lost.
Full fault tolerance.
Rebuilds are faster.
Write performance is slower.
Read performance is better.
Suitable for the operating system and for small databases.
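As a rough illustration of the behaviour described above, the toy class below writes every block to all member disks, keeps serving reads after one disk has failed, and rebuilds a replacement by copying from a survivor. The class name Mirror and its methods are hypothetical and exist only for this sketch; each "disk" is modelled as a Python dictionary rather than a real device.

```python
class Mirror:
    """Toy RAID 1 set: every write goes to all healthy member disks."""

    def __init__(self, num_disks: int = 2):
        if num_disks < 2:
            raise ValueError("RAID 1 needs at least two disks")
        # Each disk maps block number -> data; None marks a failed disk.
        self.disks = [{} for _ in range(num_disks)]

    def write(self, block: int, data: bytes) -> None:
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy mirror can answer, since they all hold the same data.
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("all mirrors have failed")

    def fail(self, index: int) -> None:
        self.disks[index] = None

    def replace(self, index: int) -> None:
        """Rebuild a replacement disk by copying from a surviving mirror."""
        source = next(disk for disk in self.disks if disk is not None)
        self.disks[index] = dict(source)


mirror = Mirror()
mirror.write(0, b"TECMINT")
mirror.fail(0)           # the first disk dies
print(mirror.read(0))    # data is still served by the second disk
mirror.replace(0)        # a new disk is rebuilt from the survivor
```

The rebuild is a plain copy with no parity to recompute, which is one reason rebuilding a mirror is fast.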
RAID 5 / Distributed parity
RAID 5 is used mostly at the enterprise level. It works by distributed parity. Parity information is used to reconstruct data: the lost content is rebuilt from the information remaining on the healthy drives. This protects our data when a drive fails.
Suppose we have four drives. If one drive fails and we replace it, the data can be rebuilt onto the replacement drive from parity. Parity information is stored across all four drives: with four 1TB drives, each drive holds 256GB of parity and leaves the other 768GB for user data. RAID 5 keeps working normally after a single drive failure, but if more than one drive fails, data is lost.
Read speed is very good.
Write speed is average, and slow if we do not use a hardware RAID controller.
Rebuilds from the parity information held on all drives.
Full fault tolerance against a single drive failure.
One disk's worth of space is used for parity.
Suitable for file servers, web servers, and very important backups.
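The article does not say how the parity is computed. In RAID 5 it is the bitwise XOR of the data blocks in a stripe, which is what makes rebuilding possible: XOR-ing the surviving blocks with the parity block gives back the missing one. The sketch below shows this for a single stripe of three data blocks plus one parity block, as in the four-drive example. The helper name xor_blocks is made up for this illustration, and the rotation of parity across drives is deliberately left out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a four-drive array: three data blocks and one parity block.
# The last data block is zero-padded so that all blocks have the same length.
data_blocks = [b"TEC", b"MIN", b"T\x00\x00"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second block, then rebuild it
# from the two surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'MIN'
```

If two blocks of the same stripe are missing, the single XOR equation has two unknowns and cannot be solved, which matches the statement that RAID 5 survives only one drive failure.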
RAID 6 / Dual distributed parity
RAID 6 is similar to RAID 5 but keeps two sets of distributed parity. It is mostly used in large arrays. It needs at least four drives, and even if two drives fail, the data can still be rebuilt once the failed drives are replaced.
It is slower than RAID 5 because every write has to update two sets of parity across the drives. With a hardware RAID controller, speed is average. If we have six 1TB drives, the capacity of four drives is used for data and the capacity of two drives is used for parity.
Read performance is very good.
Write performance is poor if we do not use a hardware RAID controller.
Rebuilds from the two sets of parity.
Full fault tolerance against up to two drive failures.
Two disks' worth of space is used for parity.
Suitable for large arrays.
Used for backups, video streaming, and other large-scale purposes.
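The capacity figures quoted so far (nothing lost in RAID 0, half lost in a two-disk RAID 1, one disk's worth in RAID 5, two disks' worth in RAID 6) can be checked with a small calculation. The function below is a sketch that assumes all disks are the same size. The function name, the minimum disk counts for RAID 5 and RAID 10, and the even-disk requirement for RAID 10 are simplifying assumptions made for this example, not limits taken from mdadm.

```python
def usable_capacity(level: int, num_disks: int, disk_size_tb: float) -> float:
    """Return usable capacity in TB for equal-sized disks at a RAID level."""
    minimum_disks = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}  # assumed minimums
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")
    if level == 0:
        return num_disks * disk_size_tb        # no redundancy, no loss
    if level == 1:
        return disk_size_tb                    # every disk holds the same copy
    if level == 5:
        return (num_disks - 1) * disk_size_tb  # one disk's worth of parity
    if level == 6:
        return (num_disks - 2) * disk_size_tb  # two disks' worth of parity
    # RAID 10: a stripe over two-disk mirrors, so half the raw space is usable.
    if num_disks % 2:
        raise ValueError("this sketch assumes an even number of disks for RAID 10")
    return num_disks / 2 * disk_size_tb


print(usable_capacity(1, 2, 2))  # two 2TB drives mirrored
print(usable_capacity(5, 4, 1))  # four 1TB drives in RAID 5
print(usable_capacity(6, 6, 1))  # six 1TB drives in RAID 6
```

For the RAID 5 example with four 1TB drives this gives 3TB usable, which agrees with the per-drive split of 768GB data and 256GB parity given earlier.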
RAID 10 / Mirroring + striping
RAID 10 is also written 1+0; the reverse nesting is 0+1 (RAID 01). Both do the work of mirroring and striping. RAID 10 mirrors first and then stripes, while RAID 01 stripes first and then mirrors. RAID 10 is better than RAID 01.
Suppose we have four drives. When we write data to the logical volume, it is saved across all four drives using both mirroring and striping.
If we write the data "TECMINT" to a RAID 10, it is saved in the following manner. First, "T" is written to two disks at the same time, "E" is likewise written to two disks, and so on, so every piece of data is written to two disks. Each piece of data is thereby copied to a second disk.
At the same time, the data is distributed in RAID 0 fashion: "T" is written to the first pair of disks and "E" to the second pair. Then "C" goes to the first pair and "M" to the second pair.
Good read and write performance.
Half of the total capacity is lost.
Fast rebuilds from the copied data.
Often used for database storage because of its high performance and high availability.
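The "TECMINT" walkthrough for RAID 10 can be reproduced with a short sketch: each chunk is assigned to a mirror pair in round-robin order (the RAID 0 step) and then written to both disks of that pair (the RAID 1 step). The function name raid10_layout and the one-byte chunk size are assumptions made to match the example, not a description of how mdadm lays out data on disk.

```python
def raid10_layout(data: bytes, num_pairs: int = 2, chunk_size: int = 1) -> list:
    """Return the chunks held by each disk in a stripe over mirrored pairs."""
    disks = [[] for _ in range(num_pairs * 2)]
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]
        pair = index % num_pairs            # RAID 0 step: pick the mirror pair
        disks[2 * pair].append(chunk)       # RAID 1 step: write the chunk to
        disks[2 * pair + 1].append(chunk)   # both members of that pair
    return disks


for number, contents in enumerate(raid10_layout(b"TECMINT")):
    print(f"disk {number}: {b''.join(contents).decode()}")
```

Disks 0 and 1 end up with identical contents, as do disks 2 and 3, so losing one disk from each pair still leaves a complete copy of the data.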
In this article we have seen what RAID is and which RAID levels are most used in real environments. I hope you have learned from what is written above. A basic understanding of RAID is a must before building one, and the material above should be enough to cover those basics.
In the next article, I'll show you how to set up and create the various RAID levels, grow a RAID group (array), and troubleshoot drives.