What is cloud computing data redundancy?


5

What is data redundancy in the context of cloud computing, and why does it matter for the service?

  • 2

    Redundancy, in general, means having more than one of the same thing in order to withstand failures. If you use five disks in RAID 1, all five hold the same data, so if one fails (or even four) you still have the data. Servers commonly have more than one power supply, so that if one fails the other keeps the machine running. The same applies to basically everything. Data redundancy means having the same data on multiple disks, on multiple machines, in different regions, so that the data survives and stays available even if a hurricane hits one of the data centers.

  • 1

    @Inkeliz Post this as an answer.

1 answer

6


Redundancy is not unique to computing, let alone to cloud computing. In fact, the word means exactly the same thing in everyday Portuguese.

In Portuguese, expressions such as "go out outside", "climb up", "equal halves" or "countries of the world" (...) are considered redundant (the name for this is vicious pleonasm), because they repeat themselves and say the same thing twice.

Computing in general is full of "redundancies". The difference is that in computing this is usually considered beneficial, whereas in the Portuguese language it is a mistake.


GENERAL

Redundancy exists to keep a device or system available at all times, without interruptions or data loss. It means having at least two "things", so that the system keeps operating smoothly even if one of them stops working.

You can create/utilize redundancies in multiple places, such as...

Power supply (PSU):

You can have two power supplies, which lets the server stay available even if one of them dies. The server goes down only if both stop working, and the more supplies there are, the less likely that is. As far as I know, this is not used in personal computers.
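To put a number on "less likely the more supplies there are", here is a minimal sketch. It assumes each power supply fails independently with the same probability, which is a simplification (a shared power outage would take out all of them at once). The function name and the 1% figure are made up for illustration.

```python
def probability_all_fail(p_single: float, units: int) -> float:
    """Chance that every one of `units` independent units fails at the same time."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # Independent failures multiply: p * p * ... (units times).
    return p_single ** units


# Hypothetical 1% failure chance per power supply over some period.
for n in (1, 2, 3):
    print(n, probability_all_fail(0.01, n))
```

Each extra supply multiplies the chance of a total outage by the single-unit failure probability, which is why going from one to two helps far more than going from three to four.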

HDD/SSD:

This is perhaps the most important and the most widely used. Using more than one disk gives you better performance and/or greater reliability. RAID is nothing more than the way the set of disks is used together; after all, you could also use them individually.

The most common RAID levels, in my opinion, are RAID 0, RAID 1 and RAID 5. I will summarize how they work. RAID 0 improves performance but reduces reliability: it splits the data across the disks, so if one dies everything is lost. RAID 1 keeps the same data on every disk, so if one dies nothing is affected; however, performance is not the priority, and because all disks hold the same data their capacities do not add up. RAID 5 is more complex: it splits the data but reserves the equivalent of one disk (1/3 of the space with three disks) for parity, which acts as a kind of "backup". It tolerates the failure of one disk, but if two disks die the data is lost.
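The single-disk tolerance of RAID 5 rests on parity. Below is a minimal sketch of the idea, assuming three equally sized data blocks and plain XOR parity. Real controllers also rotate the parity across the disks and work on stripes, which is not shown here, and the function name is only illustrative.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized byte blocks together, column by column."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]  # blocks held by three data disks
parity = xor_blocks(data)           # extra block stored as parity

# Simulate losing the second disk and rebuilding it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # True
```

With two blocks missing there is only one parity equation for two unknowns, so nothing can be rebuilt. That is why a second failed disk is fatal.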

You can also combine RAID levels, as in RAID 10 (RAID 1 + RAID 0), or run several independent arrays, for example two disks in RAID 1 and two others in RAID 0.

There is also the BBU, a battery dedicated to the disk cache (actually the cache of the disk controller). If power is lost, the contents of the cache are preserved and written out once the disks are working again.

CPU:

You may have two processors in the same server, but that does not mean the server tolerates a processor failure! Few platforms can survive one processor dying; Intel and AMD processors, as far as I know, are not fault tolerant in this way. If you have two Intel processors and one of them burns out, the server stops working, regardless of whether the other one is still fine. Only IBM processors (such as the POWER8) appear to be able to withstand this, running on the remaining processor.

Intel, in turn, has "redundant cores", or at least holds a patent for them. If this is actually in use, some Intel processors have spare cores that can take over when a core fails or that help reduce failures of the "original" core. This is built into the processor itself, though; it is not something you configure.

RAM:

Some Intel processors support a memory mirroring mode ("Mirrored Mode RAM"), which works much like RAID 1. It duplicates the RAM contents within the same host, i.e. there is a "RAID 1" per channel. So if you install 4x 16 GB, you get 32 GB usable rather than 64 GB, assuming two modules per channel, obviously.

RAM is volatile; it does not keep data for long. So the only advantage of this feature is that the machine keeps running even if one module per channel fails, provided the other module on that channel is healthy. If all the modules on the same channel fail, then besides very bad luck you will get a lot of blue screens.

IBM has something called RAIM, which follows the same logic and the same idea as RAID, only for RAM. I do not know much about it, though; information about IBM hardware is rather obscure to me.


The list does not end there, but I cannot cover everything.

You can also have generators and batteries to survive a power outage. Stack Overflow itself had a problem during a hurricane in 2012, unless I am mistaken.

Besides that, there is the network. After all, what is the point of keeping everything running if there is no Internet connection? A good form of redundancy is to use different providers and different routes, precisely to withstand the loss of one of the connections. That way, even if a fiber is cut somewhere along the way, another connection can carry the traffic.


CLOUD

Now add up everything above, or at least the main items, and combine it with the "cloud" offerings available from AWS, Rackspace, Azure, Google Cloud and so on.

The idea is the same, only with more resources.

Multiple locations/"machines":

Even in a single location without any redundancy, problems are already uncommon. Now imagine using RAID and multiple PSUs in several different countries. You can go through several hurricanes, tsunamis (...) and it is very unlikely that they destroy all the data.

Having "cloned" servers in several regions keeps a website, for example, available at all times with as little downtime as possible. You can balance traffic using DNS so that if a server is unavailable or unstable, requests go to a server in another location.

"Migration":

One of the differences with the "cloud" is that you do not own a specific server, unlike with a dedicated one. This makes it easier to migrate between servers. Imagine you have 5 "machines" in 5 different locations and one of them stops working, who knows why. You can simply destroy the problematic instance and create a new one by copying what you had on the others, as simple as that.

Meanwhile, if you have a dedicated server for a website and the motherboard fails, you have to wait for a human to diagnose the problem, remove the board, replace it... In the cloud, as long as you have another instance "as a reference", you can migrate without trouble. I had a problem like that at Rackspace: one of the Rackspace database servers died, I do not know why, but the website kept working normally and within a few minutes support moved it to another server. Black magic. Then both came back, and there was no impact before, during or after.


When to use redundancy?

Whenever you do not want to lose data. For a database, for example, the ideal is disk redundancy, backups stored in several places and servers around the world. This helps ensure that the data is available whenever it is needed and that none of it is lost, which achieves both goals.

If you really want the service to be always available, use every redundancy you can, everywhere you can. Imagine you own a website and want it to be 100% online, even if a server stops working. If you have several instances/machines, the website keeps working because DNS already points to another server or the load balancer has already been adjusted to use another instance. Likewise, if you have MySQL replicas and one MySQL server goes down, the website keeps working normally, with all its data, as if nothing had happened, thanks to all the redundancy.
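The failover described above, where traffic moves to another instance when one stops responding, boils down to "use the first server that passes a health check". Here is a minimal sketch of that selection logic. The host names are hypothetical, and the health check is simulated with a set of "down" servers; a real setup would rely on the health checks of the DNS provider or load balancer instead.

```python
from typing import Callable, Optional


def pick_server(servers: list[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first server that passes the health check, or None."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


# Hypothetical host names; the first one is simulated as being down.
servers = ["sa-east.example.com", "us-east.example.com", "eu-west.example.com"]
down = {"sa-east.example.com"}

chosen = pick_server(servers, lambda s: s not in down)
print(chosen)  # us-east.example.com
```

If every server is down the function returns None, which is exactly the case that redundancy across regions is meant to make unlikely.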

What’s the downside of using redundancy?

In general, cost. You pay two, three or four times over for exactly what you already have. You do not gain performance; you only gain reliability.

RAID 1 makes this cost obvious: with 4 SSD/HDD disks in RAID 1, you have exactly the space of a single disk. If you have four 1 TB disks, only 1 TB is available for use.
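The capacity cost can be compared across levels with a small calculator. This is a sketch under stated assumptions: all disks are the same size, RAID 1 mirrors every disk (as in the example above), RAID 5 gives up one disk of space to parity, and RAID 10 uses two-way mirrors. The function name and level labels are my own.

```python
def usable_capacity_tb(level: str, disks: int, disk_tb: float) -> float:
    """Usable space of an array of identical disks, in TB."""
    if level == "RAID0":
        if disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disks * disk_tb            # striping: capacities add up
    if level == "RAID1":
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_tb                    # every disk holds the same data
    if level == "RAID5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_tb      # one disk of space goes to parity
    if level == "RAID10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks / 2 * disk_tb        # striped set of two-way mirrors
    raise ValueError(f"unknown level: {level}")


for level in ("RAID0", "RAID1", "RAID5", "RAID10"):
    print(level, usable_capacity_tb(level, disks=4, disk_tb=1.0))
```

With four 1 TB disks this gives 4 TB, 1 TB, 3 TB and 2 TB respectively, so the four-disk RAID 1 pays for 4 TB and uses 1 TB.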

Now extend this to several identical servers around the world. If a single instance/server is already expensive, having two, three or four identical ones is even more so.

There is also the performance problem. Writing a large volume of data to one disk can already be slow; now imagine writing it to several disks. And that is before counting the network traffic needed to keep the data synchronized between servers. There is a negative impact on performance. I am not saying it rules out RAID 1 or "mirrored RAM", for example, but it does exist. For raw performance (everything else aside), RAID 0 (or even no RAID) is much better than RAID 1 in most situations.

When not to use redundancy?

When data is not important or when having high availability is not a priority.

In the database example above, you can use RAID 1 to avoid data loss. However, if the database makes heavy use of temporary tables on disk, why use RAID 1 for the temporary folder? Another option is to use RAID 0 just for the temporary folder. Reads and writes would be much faster, and you do not need to preserve that temporary data.

However, this means that if the RAID 0 array fails, the server becomes unavailable. The database data, which lives on the RAID 1 disks, will still be safe, or at least that is the expectation.

I think the point is clear: you skip redundancy when you are willing to sacrifice service availability.

Of course, if the database server in that scenario has a replica, or is itself a slave, this is not a problem. If the RAID 0 array on one server dies, everything is still fine, because the replicas take its place.


OPINION: If you are using the "cloud", I would suggest having at least replicas in other locations with the same provider, and backups with another provider. The most critical point, in my opinion, is the database: nothing there can be lost, so invest as much as you can in it. In my opinion, a website/software being offline for a few hours (or even days) is not a serious problem compared to losing data from the database.

Remember to make backups. RAID and replicas ARE NOT BACKUPS!
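Why mirrors and replicas do not count as backups can be shown with a toy model. This is only a sketch: `MirroredStore` is a made-up class that applies every write and delete to all of its copies, which is roughly how RAID 1 or a database replica behaves, and the "backup" is simply a point-in-time copy kept apart from it.

```python
import copy


class MirroredStore:
    """Toy store that applies every write and delete to all of its replicas."""

    def __init__(self, replicas: int) -> None:
        self.replicas: list[dict[str, str]] = [{} for _ in range(replicas)]

    def put(self, key: str, value: str) -> None:
        for replica in self.replicas:
            replica[key] = value

    def delete(self, key: str) -> None:
        for replica in self.replicas:
            replica.pop(key, None)


store = MirroredStore(replicas=3)
store.put("invoice-1", "paid")

# Point-in-time copy kept somewhere else: this is the backup.
backup = copy.deepcopy(store.replicas[0])

# An accidental delete is faithfully mirrored to every replica.
store.delete("invoice-1")

print(any("invoice-1" in r for r in store.replicas))  # False
print("invoice-1" in backup)                          # True
```

Redundancy protects against a component failing; it copies mistakes, corruption and deletions just as reliably as it copies good data. Only the separate copy still has the record.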
