What is the difference between vertical and horizontal scaling?


Viewed 13,327 times

24

I've recently been reading about cloud computing, databases, and so on. In several of these topics I keep running into the term scaling. That leaves me with the following questions:

  • What is the difference between vertical and horizontal scaling?
  • Are there other types of scaling?
  • 2

    See: http://stackoverflow.com/questions/11707879/difference-between-scaling-horizontally-and-vertically-for-databases

3 answers

34


Scaling up

Vertical scaling means adding more memory capacity (main and/or mass storage) and more processing power. In other words, it means buying more powerful hardware to handle the load.

In some cases, simply creating more processes/threads is what does the scaling up, since the hardware already supports the increase. Splitting the database across multiple storage devices can also count as vertical scaling, since it is still scaling on the same machine.
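As a minimal sketch of "just create more threads", assuming a hypothetical `handle_request` function standing in for real work, the worker count becomes the only knob you turn. The names `handle_request` and `serve` are made up for illustration. Note that in CPython, CPU-bound work would usually need a process pool rather than threads to actually use more cores.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(n: int) -> int:
    """Hypothetical stand-in for one unit of work (one request)."""
    return n * n


def serve(requests: list[int], workers: int) -> list[int]:
    # 'workers' is the scaling knob: raising it makes the same machine
    # process more requests at once, as far as the hardware allows.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(handle_request, requests))


results = serve(list(range(8)), workers=4)
print(results)
```

The same program with `workers=1` gives the same results, only with less concurrency, which is why this kind of scaling is often a configuration change rather than a redesign.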

The investment is basically in hardware: buy more processor, memory and storage, and the machine's capacity goes up.

In some cases it is more a matter of a simple configuration change to take advantage of what the existing hardware already supports.

It can also mean optimizing the application so that it performs better and serves more requests than before.

Scaling out

Horizontal scaling means adding more computers to do the job. Of course, taken together they add more processing capacity and memory too.

Scaling horizontally is much more complex from both a management and a programming point of view, even though there are tools to make it easier. It is not just a matter of adding computers: they need to communicate with each other consistently and correctly. This is considered one of the most difficult problems to solve in computing.

Surprising as it may seem, it can be cheaper than vertical scaling, at least in the cost of acquiring the infrastructure, since you can buy simpler, more common hardware that is usually cheaper thanks to economies of scale. Of course, the cost of management and development can change the total cost.

Besides the cases where vertical scaling cannot meet the need (after all, it has a limit that horizontal scaling in theory does not have), horizontal scaling has the advantage of being more tolerant of failures, or at least of making it easier to bring operation back after a failure.
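To illustrate the fault-tolerance point, here is a small sketch under simple assumptions: several interchangeable nodes, and a caller that tries the next node when one fails. `Node` and `call_with_failover` are hypothetical names invented for this example, not part of any library.

```python
class Node:
    """Hypothetical server node; 'healthy' simulates whether it is up."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def call_with_failover(nodes: list[Node], request: str) -> str:
    """Try each node in turn and return the first successful response."""
    last_error = None
    for node in nodes:
        try:
            return node.handle(request)
        except ConnectionError as exc:
            last_error = exc  # remember the failure, try the next node
    raise RuntimeError("all nodes failed") from last_error


cluster = [Node("a", healthy=False), Node("b"), Node("c")]
print(call_with_failover(cluster, "GET /"))
```

With a single machine there is no "next node" to try, which is the practical difference the paragraph above describes.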

Differences

Any minimally structured database can do both types of scaling. Vertical scaling needs no specific property, except when it involves splitting data across multiple storage devices or allowing multiple threads of execution, which is not simple to do with certain data models. Horizontal scaling needs mechanisms that allow it and, if possible, make it easier. It doesn't matter whether the database is relational or not, or whether it uses SQL or not.

I don't know of any other types; I'm not even sure any are possible. There are variations of these two forms, and for horizontal scaling in particular there are many strategies and techniques. You can also do hybrid scaling.

Applications that truly need horizontal scaling are rare, at least for the sake of scale itself. It may be useful because of the greater reliability of having more than one node serving requests, but not because more resources are needed. The bulk of the need comes from high-demand web applications or very specific kinds of processing.

There is a related question, even if it does not look like it at first, showing that scaling something horizontally may seem like the solution, but it does not always solve the problem or compensate for the difficulty inherent in it.

  • Thanks for the clear explanation @bigown

7

Scaling horizontally means you add more machines to your resource pool, while scaling vertically means you add more power (CPU, RAM) to an existing machine.

In a database, horizontal scaling is often based on partitioning the data, i.e., each node contains only part of the data.
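A minimal sketch of that partitioning idea, assuming plain hash-modulo routing and in-memory dictionaries standing in for nodes. `ShardedStore` is a hypothetical class written for this example; real systems such as Cassandra or MongoDB use their own, more elaborate schemes.

```python
import hashlib


class ShardedStore:
    """Toy key-value store where each 'node' holds only part of the data."""

    def __init__(self, node_count: int) -> None:
        # Each dict stands in for a separate machine.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key: str) -> int:
        # Use a stable hash so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key: str, value) -> None:
        self.nodes[self._node_for(key)][key] = value

    def get(self, key: str):
        return self.nodes[self._node_for(key)].get(key)


store = ShardedStore(node_count=3)
for i in range(100):
    store.put(f"user:{i}", i)

print([len(node) for node in store.nodes])  # how the keys were spread
print(store.get("user:42"))
```

One known weakness of plain modulo routing is that changing the number of nodes remaps most keys, which is why consistent hashing is a common alternative when nodes are added or removed often.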

With vertical scaling, the data resides on a single node and scaling is done through multi-core, that is, by spreading the load across the CPU and RAM resources of that one machine.

With horizontal scaling it is often easier to scale dynamically by adding more machines, whereas vertical scaling is limited to the capacity of a single machine. Scaling beyond that capacity often involves downtime and comes with an upper limit.

Examples of horizontal scaling: Cassandra, MongoDB.

Example of vertical scaling: MySQL on Amazon RDS (the cloud version of MySQL). It provides an easy way to scale vertically from smaller to larger machines. This process often involves downtime.

In-memory data grids such as GigaSpaces XAP, Coherence, etc. are often optimized for both horizontal and vertical scaling, simply because they are not bound to disk: horizontal scaling through partitioning, and vertical scaling through multi-core support.

SOURCE

5

Just for context: what creates the need for some kind of scaling is that the growth of Internet access has caused large social networks, search engines, and other systems to receive large volumes of data. This data is valuable because it is used as a source of information for strategic decision-making and data mining.

In horizontal scaling, several computers or virtual machines run the same application and the user load is distributed among them. This way, when an update is needed, the whole system does not have to go offline, only one machine at a time.

In vertical scaling, instead of assigning multiple machines to the job, a single machine is used and, when necessary, money is invested in improving it, for example in a higher-capacity hard disk, a faster processor, or a faster connection to support the high number of simultaneous accesses. This is due to the centralized nature of relational databases, which makes it necessary for them to be on a single machine. The problem with this type of scaling is the cost-benefit ratio: once good performance has been reached, further investment in hardware yields little improvement in system performance. To alleviate these problems there are techniques that extend the use of relational databases, such as sharding, denormalization and distributed caching. However, these techniques only attempt to compensate for the limitations of these databases when it comes to horizontal scaling.
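The horizontal setup described in this answer (load spread over several machines, updates applied one machine at a time) can be sketched roughly as follows. This is only an illustration under simple assumptions: round-robin routing and a hypothetical `LoadBalancer` class whose names are invented here.

```python
class LoadBalancer:
    """Toy round-robin balancer that supports one-at-a-time updates."""

    def __init__(self, machines: list[str]) -> None:
        self.machines = list(machines)
        self.in_service = set(machines)
        self._next = 0  # position of the next machine to try

    def route(self) -> str:
        """Return the next machine that is currently in service."""
        for _ in range(len(self.machines)):
            machine = self.machines[self._next % len(self.machines)]
            self._next += 1
            if machine in self.in_service:
                return machine
        raise RuntimeError("no machine in service")

    def rolling_update(self, update) -> None:
        """Take each machine out, update it, and put it back, in turn."""
        for machine in self.machines:
            self.in_service.discard(machine)  # stop sending it traffic
            update(machine)                   # the rest keep serving
            self.in_service.add(machine)


lb = LoadBalancer(["m1", "m2", "m3"])
print([lb.route() for _ in range(4)])
lb.rolling_update(lambda m: print(f"updating {m}, traffic goes to {lb.route()}"))
```

While one machine is being updated, `route` skips it and the others absorb the traffic, so the system as a whole stays online.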

