Performance issues using UPDATE

I am building an import application that reads the lines of a txt file and uses them to update two tables. It works, but the files usually have more than 5,000 lines, and with my current method I cannot finish updating both tables. How could I optimize this code?

foreach ($file as $line):
    $row = explode(';', $line);

    $prodcode = $row[0];
    $prodcurrentinv = $row[3];
    $prodprice = number_format(intval($row[5]), 2, '.', '') . '00';
    $prodretailprice = $prodprice;

    $query = $pdo->prepare('UPDATE isc_products SET prodcurrentinv = :prodcurrentinv, prodretailprice = :prodretailprice, prodprice = :prodprice WHERE prodcode = :prodcode LIMIT 1');
    $query->bindParam(':prodcurrentinv', $prodcurrentinv, PDO::PARAM_STR, 11);
    $query->bindParam(':prodretailprice', $prodretailprice, PDO::PARAM_STR, 11);
    $query->bindParam(':prodprice', $prodprice, PDO::PARAM_STR, 11);
    $query->bindParam(':prodcode', $prodcode, PDO::PARAM_STR, 12);
    $query->execute();

    $query = $pdo->prepare('UPDATE isc_product_variation_combinations SET vcstock = :vcstock WHERE vcsku = :vcsku LIMIT 1');
    $query->bindParam(':vcstock', $prodcurrentinv, PDO::PARAM_STR, 11);
    $query->bindParam(':vcsku', $prodcode, PDO::PARAM_STR, 12);
    $query->execute();
endforeach;
  • Why isn't it working? What's the error?

  • There is no error. The problem is the performance of the updates: since there are more than 5,000 lines, the code runs for about 2 minutes, then stops and ends up not updating everything.

  • Is there no possibility of using an ETL tool for this operation?

  • No. I think I would need to merge these two UPDATEs and use a better method to iterate over the lines of the file.

  • Why recreate the prepared statement on each loop iteration? The whole point of prepared statements is to speed up parsing by preparing the query only once (see the sketch below).
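
A minimal sketch of what that last comment suggests, reusing the question's tables and file layout (assumption-level code, not tested against the real schema):

// Prepare both statements once, outside the loop.
$products = $pdo->prepare(
    'UPDATE isc_products
        SET prodcurrentinv = :prodcurrentinv,
            prodretailprice = :prodretailprice,
            prodprice = :prodprice
      WHERE prodcode = :prodcode LIMIT 1'
);
$variations = $pdo->prepare(
    'UPDATE isc_product_variation_combinations
        SET vcstock = :vcstock
      WHERE vcsku = :vcsku LIMIT 1'
);

// Only bind and execute inside the loop; each query is parsed once.
foreach ($file as $line) {
    $row = explode(';', $line);
    $prodprice = number_format(intval($row[5]), 2, '.', '') . '00';

    $products->execute(array(
        ':prodcurrentinv'  => $row[3],
        ':prodretailprice' => $prodprice,
        ':prodprice'       => $prodprice,
        ':prodcode'        => $row[0],
    ));
    $variations->execute(array(
        ':vcstock' => $row[3],
        ':vcsku'   => $row[0],
    ));
}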

3 answers

You should consider doing the operations in bulk (batch), as discussed in this answer in English.

Basically, you create a temporary table and populate it via bulk load. Once that is done, you update your table with a JOIN against this temporary table.

From there, you should search for how to do bulk operations in MySQL.

Bulk operations should always be considered when the volume of data is very large. This is often how companies integrate systems that generate immense amounts of data.

It usually works like this: you assemble a text file with the data, with the columns separated by tabs, for example. Once that's done, you put the file somewhere the database server can access. With specific commands you can make the database load this file and insert the data into a table. Since I don't work with MySQL specifically, I won't be able to help you beyond that; you will need to see how this works in MySQL.

Anyway, the concept is general: it exists in Oracle, SQL Server, MySQL, etc.
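
A rough sketch of that idea through PDO, assuming the file layout from the question (fields separated by ';', with the code, stock and price at positions 0, 3 and 5) and a hypothetical file path:

// Staging table; column sizes assumed from the question's bindParam calls
$pdo->exec("CREATE TEMPORARY TABLE tmp_import (
    prodcode       VARCHAR(12) PRIMARY KEY,
    prodcurrentinv VARCHAR(11),
    prodprice      DECIMAL(10, 2)
)");

// Bulk-load the file; the @skip variables discard the unused columns.
// (Requires LOAD DATA LOCAL to be enabled on the connection.)
$pdo->exec("LOAD DATA LOCAL INFILE '/tmp/import.txt'
    INTO TABLE tmp_import
    FIELDS TERMINATED BY ';'
    LINES TERMINATED BY '\\n'
    (prodcode, @skip1, @skip2, prodcurrentinv, @skip4, prodprice)");

// Update each real table with a single JOIN
$pdo->exec("UPDATE isc_products p
    JOIN tmp_import t ON t.prodcode = p.prodcode
    SET p.prodcurrentinv  = t.prodcurrentinv,
        p.prodprice       = t.prodprice,
        p.prodretailprice = t.prodprice");

$pdo->exec("UPDATE isc_product_variation_combinations v
    JOIN tmp_import t ON t.prodcode = v.vcsku
    SET v.vcstock = t.prodcurrentinv");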

  • You're being modest :) Although it doesn't give the step-by-step details, your answer presents the best solution to the problem! I took the liberty of cutting the apologies.

  • Thanks @bfavaretto. I'm glad I could help. My apologies were precisely for not posting a step-by-step solution. :-)

You can try updating multiple records per query using CASE.

For example, instead of:

UPDATE tabela SET valor = 123 WHERE indice = 1;
UPDATE tabela SET valor = 456 WHERE indice = 2;
UPDATE tabela SET valor = 789 WHERE indice = 3;

You can use:

UPDATE tabela SET valor = CASE indice
    WHEN 1 THEN 123
    WHEN 2 THEN 456
    WHEN 3 THEN 789
END
WHERE indice IN (1, 2, 3)

(The WHERE clause matters: without it, any row whose indice is not listed in the CASE would have valor set to NULL.)

Following this format, you can assemble your script like this:

// prepare the data: keep only the code, stock and price of each line
foreach ($file as &$line) {
    $row = explode(';', $line);
    $line = array(
        $row[0],
        $row[3],
        number_format(intval($row[5]), 2, '.', '') . '00'
    );
}
unset($line); // break the reference left by the foreach

// build the WHEN/THEN pairs once, quoting every value against SQL injection
$stockWhens = '';
$priceWhens = '';
$codes = array();
foreach ($file as $line) {
    $code = $pdo->quote($line[0]);
    $codes[] = $code;
    $stockWhens .= ' WHEN ' . $code . ' THEN ' . $pdo->quote($line[1]);
    $priceWhens .= ' WHEN ' . $code . ' THEN ' . $pdo->quote($line[2]);
}
$in = implode(',', $codes);

// build the first query; note the separate assignments for prodprice and
// prodretailprice (SET a = b = ... would be parsed as a comparison, not
// as two assignments)
$query = 'UPDATE `isc_products` SET '
       . '`prodcurrentinv` = CASE `prodcode`' . $stockWhens . ' END, '
       . '`prodprice` = CASE `prodcode`' . $priceWhens . ' END, '
       . '`prodretailprice` = CASE `prodcode`' . $priceWhens . ' END'
       . ' WHERE `prodcode` IN (' . $in . ')';

// run it
$pdo->exec($query);

// build and run the other query
$query = 'UPDATE `isc_product_variation_combinations`'
       . ' SET `vcstock` = CASE `vcsku`' . $stockWhens . ' END'
       . ' WHERE `vcsku` IN (' . $in . ')';
$pdo->exec($query);

Note that it is not practical to use prepared statements in this case, so make sure to sanitize everything that goes into the query to avoid SQL injection (the code above uses $pdo->quote() for that).

You should have the database do all of this in a single query.

Currently you are using a foreach to run one query per line. Depending on the number of executions, this really can get slow and end up resulting in a timeout.

As a general tip: do everything you can in a single query and avoid running multiple queries; the more queries you run, the worse your system will perform.

But if you still believe it is necessary to run these 5,000 queries for things to work, reflect a little: there may be a problem in the logic of your system, so try to improve it.

You can also accumulate all your queries in a variable inside the foreach and run them only at the end, although even that will not be the ideal solution.

LIMIT in the foreach

Try putting a limit on your foreach.

$i = 0;
foreach ($file as $line) {
    if ($i <= 100) {
        // run the query
    }
    $i++;
}
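
If the goal is to actually process the file 100 lines at a time, here is a sketch using PHP's standard array_chunk() (the batch size of 100 is arbitrary):

foreach (array_chunk($file, 100) as $batch) {
    // each $batch holds up to 100 lines; run the queries for this
    // batch here, or combine them into a single query as shown above
    foreach ($batch as $line) {
        // run the query for this line
    }
}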

LIMIT

Try using LIMIT so that you can restrict each run to 1,000 records, for example, and run again until all of your records have been updated according to the conditions.

set_time_limit()

You can also try changing the script's execution time with the set_time_limit() function. The default run time of a script is 30 seconds; you can call set_time_limit(0) to indicate that there is no time limit.
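
For example, at the top of the import script:

set_time_limit(0); // 0 = no execution time limit for this script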

  • Yes. The problem is that I have to go through all 5,000 lines of the file and update the isc_product_variation_combinations and isc_products tables wherever the product code of the current line appears.

  • I updated the answer; maybe LIMIT will help you with this task and avoid the timeout.

  • But I am already using LIMIT 1 in this case, because each loop iteration updates based on a single line of the file, which is what complicates things. How could I update the two tables in the same query?

  • @Raphaelcastro Yes, the problem is these 5,000 queries. I updated the answer; try using set_time_limit().

  • The server is already configured with a long execution time, but it still ends up overloading.

  • Wouldn't a limit on your foreach solve it? Have it process 100 at a time, for example.

  • How can I process 100 at a time? And how could I join these two UPDATEs into a single query, running both updates per loop? I think the explode() also ends up consuming a lot of performance.
