RAID (redundant array of inexpensive disks) is a special disk configuration in which multiple disk drives form a single logical unit. This configuration allows files to span multiple disk devices. Depending on the level chosen, RAID technology trades off reliability, performance, and cost against one another. Generally, there are six RAID levels, 0 through 5. Only three of these levels, levels 0, 1, and 5, are significant for database systems.
RAID can be hardware or software based. Hardware-based RAID is more costly (because you have to buy additional disk controllers), but it usually performs better. Software-based RAID is usually supported by the operating system. Windows operating systems provide RAID levels 0, 1, and 5. RAID technology affects the following features:
- Fault tolerance
- Performance
The benefits and disadvantages of each RAID level in relation to these two features are explained next.
RAID is built on three methods: disk striping, mirroring, and parity. These three methods correspond to RAID levels 0, 1, and 5, respectively. Mirroring and parity protect against hard disk failure and the accompanying data loss. Striping on its own offers no such protection.
RAID 0 (Disk Striping)
RAID 0 specifies disk striping without parity. Using RAID 0, the data is written across several disk drives so that the drives can be accessed in parallel, which speeds up both read and write operations. For this reason, RAID 0 is the fastest RAID configuration. The disadvantage of disk striping is that it does not offer fault tolerance at all. This means that if one disk fails, all the data on that array becomes inaccessible.
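The way striping spreads data over the drives can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the stripe unit is a single block, blocks are assigned to drives round-robin, and the function name `locate_block` is hypothetical. Real controllers use configurable stripe sizes.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes a stripe unit of one block and round-robin placement.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


# Six consecutive logical blocks on a three-disk array land on
# disks 0, 1, 2, 0, 1, 2, so neighboring blocks can be accessed in parallel.
layout = [locate_block(block, 3) for block in range(6)]
```

Because every drive holds a slice of every large file, the sketch also makes the fault-tolerance problem visible: losing any one drive removes every third block.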
RAID 1 (Mirroring)
Mirroring uses the space on a second disk drive to maintain a duplicate copy of all files. Therefore, RAID 1, which specifies disk mirroring, protects data against media failure by maintaining a copy of the database (or a part of it) on another disk. If a drive is lost with mirroring in place, its files can be restored by replacing the failed drive and copying the data back from the surviving mirror. The hardware configurations of mirroring are more expensive, but they provide additional speed. (Also, hardware configurations of mirroring implement some caching options that provide better throughput.) The advantage of the Windows solution for mirroring is that it can be configured to mirror disk partitions, while the hardware solutions are usually implemented on the entire disk.
Compared with RAID 0, RAID 1 is slower for write operations, but its reliability is higher. Also, RAID 1 costs much more than RAID 0 because every disk drive must be duplicated. It can sustain at least one failed drive and may be able to survive the failure of up to half the drives in the set of mirrored disks, provided that no drive fails together with its own mirror, without forcing the system administrator to shut down the server and recover from backup. (RAID 1 is the best-performing RAID option when fault tolerance is required.)
Mirroring also affects the performance of read and write operations. When mirroring is used, write operations are slower, because each write costs two disk I/O operations, one to the original and one to the mirrored disk drive. On the other hand, mirroring increases the performance of read operations, because the system can read from either disk drive, depending on which one is less busy at the time.
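The read and write behavior of a mirrored pair can be sketched as follows. The class `MirroredPair` is hypothetical, the disks are modeled as in-memory dictionaries, and "less busy" is approximated by a running count of I/O operations per disk, which is an assumption made purely for illustration.

```python
class MirroredPair:
    """Toy model of two mirrored disks (not a real driver)."""

    def __init__(self):
        self.disks = [{}, {}]    # block number -> data, one dict per disk
        self.io_count = [0, 0]   # I/O operations issued to each disk

    def write(self, block, data):
        # Every logical write costs two physical writes, one per disk.
        for i in (0, 1):
            self.disks[i][block] = data
            self.io_count[i] += 1

    def read(self, block):
        # A read is served by whichever disk has handled fewer I/Os so far.
        i = 0 if self.io_count[0] <= self.io_count[1] else 1
        self.io_count[i] += 1
        return self.disks[i][block]


pair = MirroredPair()
pair.write(7, b"row")    # two physical I/Os
first = pair.read(7)     # served by disk 0
second = pair.read(7)    # served by disk 1
```

In this model one write and two reads leave each disk with two I/O operations: the write load is doubled, while the read load is shared.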
RAID 5 (Parity)
Parity is implemented by calculating recovery information about data written to disk and writing that parity information on the other drives that form the RAID array. If a drive fails, a new drive is inserted into the RAID array, and the data of the failed drive is regenerated from the parity information and the data stored on the remaining drives. The advantage of parity is that you need the capacity of only one additional disk drive to protect any number of existing disk drives.
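The text does not say how the recovery information is calculated. A common choice, assumed here, is a bytewise XOR across the data blocks of a stripe. The following sketch (the helper name `xor_blocks` and the sample bytes are made up for illustration) shows that XOR parity is enough to regenerate any single missing block.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks on three drives, parity on a fourth.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails. XOR-ing the surviving
# data blocks with the parity block regenerates the lost block.
recovered = xor_blocks([data[0], data[2], parity])
```

The same calculation fails if two blocks of the stripe are missing, which is why a parity array of this kind tolerates only one failed drive.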
The disadvantages of parity concern performance and fault tolerance. Calculating and writing parity requires additional disk I/O operations. (Read I/O operation costs are the same for mirroring and parity.) Also, using parity, you can sustain only one failed drive before the array must be taken offline and recovery from backup media must be performed. Because of the parity overhead, each write operation under RAID 5 requires four disk I/O operations (reading the old data and the old parity, then writing the new data and the new parity), whereas RAID 0 requires only one operation and RAID 1 two operations.
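The count of four I/O operations can be traced in a small sketch of a single-block update. It assumes XOR parity and the usual read-modify-write sequence; the function `raid5_small_write` and its I/O counter are hypothetical and stand in for real disk accesses.

```python
def raid5_small_write(old_data, old_parity, new_data):
    """Return (new_parity, io_count) for updating one block in a stripe.

    Assumes XOR parity; the counter marks where a disk access would occur.
    """
    ios = 0
    ios += 1  # 1. read the old data block
    ios += 1  # 2. read the old parity block
    # Remove the old data from the parity and add the new data.
    new_parity = bytes(
        p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data)
    )
    ios += 1  # 3. write the new data block
    ios += 1  # 4. write the new parity block
    return new_parity, ios


# Stripe with two data blocks; only the first one is updated.
other_block = b"\x10"
old_block = b"\x01"
old_parity = bytes(a ^ b for a, b in zip(other_block, old_block))
new_parity, ios = raid5_small_write(old_block, old_parity, b"\x05")
```

Note that the untouched data block is never read: the new parity is derived from the old parity alone, which keeps the cost at four I/O operations regardless of how many drives the array contains.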