What is a Mysql Transaction for?

Asked

Viewed 10,782 times

40

I was studying some examples of Mysql with PHP and came across as following code snippet:

try {
    $db->beginTransaction();

    $db->query('query 1');
    $db->query('query 2');
    $db->query('query 3');

    $db->commit();
} catch (Exception $e) {
    $db->rollback();
}

From what I could tell from researching the methods, it’s called Transaction, but I couldn’t understand why this is used. I would like to know how it works, and some examples of real situations where this could be useful.

5 answers

44


TL;DR

SQL databases in general, not only Mysql, are transactional, that is, they allow you to perform a sequence of operations as an indivisible block in order to ensure data integrity in a concurrent access environment.

The Problem

In the example quoted in the question, imagine that the #1, #2 and #3 queries are operations that affect the database and we don’t use a transaction to control them. Let’s use as an example an e-commerce:

  1. Updates customer delivery data
  2. Enter a new record of the purchase made, checking if you have stock
  3. Debit the stock of products

Now imagine two customers trying to finish their purchases in this fictional e-commerce. The server receives two requests almost simultaneously and starts processing the requests in the above sequence. The two orders are being processed simultaneously in different threads.

Imagine also that both customer A and customer B selected a product that has only one unit in stock. We can end the next execution line:

  1. Thread A updates customer data A (step #1), checks the stock inserts the purchase record (step #2)
  2. Thread A is locked and B is executed
  3. Thread B updates customer data B (step #1), checks stock inserts purchase record (step #2)
  4. Thread B updates the stock, which is now zeroed
  5. Thread B is locked and A is executed
  6. Thread A updates the stock, which is now negative!

Note that although the code checks the stock the order of execution makes the check is not guaranteed in the next step.

The Solution

Transactional databases use the concept ACID:

  • Atomicity: a transaction is a sequence of operations indivisible, or is executed as a whole, or all is undone.
  • Consistency: at the end of the transaction, the status of the data shall be consistent.
  • Isolation: although some systems allow breaking the isolation, in general an ongoing transaction cannot be accessed by other transactions so as to avoid reading an inconsistent state, a "dirt".
  • Durability: in case of success (commit) the persistence of the data must be guaranteed

To ensure these concepts, in general, databases use locks when simultaneous access to the same data structure occurs. That is, if someone is already messing with the dice, the others have to wait for him to finish and wait their turn in line.

In practice

By using transactional databases, we can take advantage of this management control by the Relational Database Management Systems (DBMS).

Including the concept of ACID transaction in the previous example, let’s see how the execution:

  1. Thread A starts a transaction, updates customer A data (step #1), checks stock inserts purchase record (step #2)
  2. Thread A is locked and B is executed
  3. Thread B starts a transaction, but when trying to update client B data it is blocked because A’s transaction is not over
  4. Thread A updates the stock, which is now zeroed, and does commit in the transaction.
  5. Thread B is unlocked and runs
  6. Thread B updates customer data B (step #1), checks the stock and returns an error as it does not find the product available
  7. Thread B executes a rollback to undo the changes you have already made

The end result is as if only thread A had run and B never existed.

Not everything is a bed of roses

There are some problems inherent to ACID transactions, being the performance the biggest of them.

While it is important to ensure data integrity, for many systems where availability is the most critical factor, a model that blocks simultaneous access becomes unviable.

This is one of the main factors for the emergence and adoption of various nontransactional database systems and Nosql.

The important thing is to understand that the use of transactions has a cost and sometimes this can be too high. One of the most common representations of trade-off data persistence is as follows (withdrawal of this article):

Trade-off entre propriedades de bancos de dados

The graph demonstrates that consistency, availability and partitioning (scaling the database at several nodes) are resources that affect each other. You just can’t have the best of the three, according to the theorem of CAP.

Relational databases generally sacrifice partitioning for consistency and availability, while some Nosql systems sacrifice data consistency.

14

Transactions guarantee atomicity to a set of operations, that is, or all of them will succeed, or all of them will fail. A classic example is a bank transfer, where two changes are required in your records: 1) reduce the balance of account A by X; 2) increase the balance of account B by X. If the first operation is executed, and then the system presents a failure, your database will be inconsistent: there is X minus on account A, but account B continues with the same balance. Reversing operations does not help much because the reverse case may occur.

In its example, DBMS ensures that - in its persistent storage - or all query 1, query 2 and query 3 will successfully complete (commit), or the bank shall remain in the same state as before the query 1 get started (rollback). Note that a rollback explicit is not necessary for this guarantee to be met - if the system fails before the commit, when it is restarted the DBMS will take care to undo the partial actions. Likewise, if the system fails soon after the commit - before the persistence of the rest of the data is assured - the DBMS after the reboot will complete the partial actions as recorded in its journal (Journal).

Another benefit of the transactions, as pointed out by @Ernandes, is to promote the isolation access through concurrent processes. If a process has initiated a sequence of darlings, and another process wants to access the bank, it would not be correct for it to access it in the [provisional] state in which the first process left it. Not only would the bank be [potentially] inconsistent, but there is no guarantee that those changes will actually be made (as the first process may give rollback). Still, changes by process 2 could affect process operations 1.

There are various levels of isolation, which depend on DBMS and other factors:

  • read uncommited - one transaction can read the writings of another before the commit;
  • read commited - a transaction can only read data that has passed through commit. However, two or more successive [equal] readings may return different data (if other transactions occur during the course of the first);
  • repeatable read - In addition to the above feature, it is ensured that every line read once during the course of the transaction remains the same during subsequent readings. Still, it is possible that one query returns a different number of different lines and/or lines (if the competing transaction added a new line, for example);
    • Note: this is the standard level of Innodb in Mysql.
  • serializable - the above phenomenon does not occur, every equal reading will have an equal result (alternatively: the system will not allow a commit if there is a collision between two competing transactions).
  • 2

    Excellent! gives a more academic view on the subject, posted more practical, but your answer is also very valid!

  • 1

    @mgibsonbr "Note that an explicit rollback is not necessary for this guarantee to be met - if the system fails before the commit, when it is restarted the DBMS will take care to undo the partial actions." Not really necessary? mysql tries to do the reversal if something fails?

  • 2

    @Jorgeb. I refer to a "catastrophic" failure such as running out of power and turning the machine off. In this case, restarting the bank will do the reversal automatically. But in the normal course of the program, it is good practice to put an explicit rollback (even to avoid "junk" in the BD logs).

  • 1

    @mgibsonbr If we have 2 transactions to be treated at the same time, in the first we are writing in table x, in the second if we try to write in table x at the same time, what happens?

  • 1

    @Jorgeb. That’s a subject quite a lot complex, it would be better to open a new question to that. Personally, I don’t know what happens, but I suspect that both writings will be successful, only the order of the same is undefined (i.e. one can overwrite the data of the other, if moving on the same line).

13

Mysql when using some more recent storage plugin (Innodb, Xtradb) has support for transações, basically, commit and rollback, but what are they for?

When starting a transaction, you basically tell mysql to mark the current state of the database, after making some queries that change the database (UPDATE, INSERT, DELETE) the results are only available in your session, and have not persisted in the database yet, when using the commit command is that you will be persisting the changes, if something goes wrong halfway, you can make a rollback and leave the bank exactly as it was when started.

Example:

SELECT * FROM EXEMPLO;
------
id    nome
1     Foo

START TRANSACTION;
INSERT INTO EXEMPLO (id, nome) VALUES (2, 'Bar');

# resultado da select para sua sessão:
SELECT * FROM EXEMPLO;
------
id    nome
1     Foo  
2     Bar

# resultado da select para outras sessões
SELECT * FROM EXEMPLO;
------
id    nome
1     Foo  

# Persistindo, caso Commit

COMMIT;

# resultado geral
SELECT * FROM EXEMPLO;
------
id    nome
1     Foo
2     Bar

# Caso Rollback

ROLLBACK;

# resultado geral
SELECT * FROM EXEMPLO;
------
id    nome
1     Foo
  • Just to reinforce, transaction control serves well also for the case of several followed INSERTS that depend on each other. For example, a Class that when persisted will create record in other tables. If in the middle of the process, an INSERT fails, the programmer can call the rollback and undo everything, all the data before "entered" did not exist.

4

Transaction (Transaction) in MYSQL nothing more than an execution control, that is, if the requests are satisfied, the same is accepted, otherwise the database undoes all changes associated with that request.

This way you make the confirmation (Commit) and keeps actions if it is executed properly, but if an error occurs, it undoes the action (Rollback).

A Rollback can be generated by an invalid entry, internal server error, crash, reboot, shutdown during the transaction. It can be defined within a MYSQL, this way being more secure, because the communication of the commit and/or the rollback will be made directly in the database and not through a connector of the database in another language. Placing the commit inside a precedent with proper transaction control prevents duplicate entries if validated correctly.

Banks, financiers and the like use this concept to ensure the integrity of information. When a transaction control is made we need to scale which data can be lost if a Rollback, there are data that after registering with a query (Select) can be validated and reused, thus reducing the procedure in a new attempt to insert this information. That way they’d be executed Commit and Rollback for each of these items and always revalidating the information.

Basic reference https://dev.mysql.com/doc/refman/5.0/en/commit.html

Always use Innodb, if another type can change it in the form below

ALTER TABLE clientes TYPE = InnoDB;
ALTER TABLE clientes ENGINE = InnoDB;

To commit and rollback inside the database, set after Begin

BEGIN

DECLARE exit handler for sqlexception
  BEGIN
  ROLLBACK;
END;

DECLARE exit handler for sqlwarning
 BEGIN
 ROLLBACK;
END;

START TRANSACTION;

[Seu codigo MYSQL]

COMMIT;

END

This way your transaction will be 100% controlled by the base and more secure.

4

A transaction server to ensure a set of querys is executed.

Imagine a transaction between current accounts, take money from one account and deposit into the other, You will have to:

Subtract account value A and Add to Account B

If at the time you subtract the value of the account A the server stops, you will have removed the value of A, but there will be added this value in the account B

The transaction serves precisely to prevent this, it will only complete the operation if all that is requested within the transaction is carried out successfully.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.