MySQL performance with InnoDB on a large table

Viewed 4,278 times

5

I currently have a table with about 6 million records that receives a large volume of I/O operations, so when designing the project I chose InnoDB instead of MyISAM in MySQL; after all, locking would be per row rather than per table.

But I have a big problem: the majority of the queries made against this table filter by a date period (datetime). Because of that, I tried to partition the table, but I came up against this InnoDB limitation.

What do you suggest to improve the performance of these queries, considering that I have severe hardware limitations?

Below is the structure of the table.

  CREATE TABLE `sensores` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `equipamento_id` int(11) NOT NULL,
    `data_hora` datetime DEFAULT NULL,
    `valor_primario` float(10,6) DEFAULT NULL,
    `valor_secundario` float(10,6) DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `fk_sensor_equipamento_idx` (`equipamento_id`),
    KEY `data_hora` (`data_hora`),
    CONSTRAINT `fk_sensor_equipamento_idx` FOREIGN KEY (`equipamento_id`) REFERENCES `equipamento` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
  ) ENGINE=InnoDB AUTO_INCREMENT=3515782247 DEFAULT CHARSET=utf8;

The workload consists of numerous "sensors" writing equipment readings into this table every 15 seconds.

Most queries made against it are similar to:

SELECT * FROM sensores WHERE data_hora BETWEEN ? AND ?
  • Can you give more detail about the structure of this table and what kind of updates, inserts and deletes it usually receives? What kind of queries are run? Without this, it is difficult to give useful answers.

  • InnoDB’s great asset is TRANSACTIONS; if you don’t use them, you don’t really need to choose InnoDB. MyISAM is much faster for reads.

  • @Victor, sorry for the lack of information; I edited the question and added the table structure.

  • @Havenard, true, but I also have a lot of writes, and their response time matters. For the business it is difficult to say which operation is more important.

4 answers

5

A simple SELECT like this should be no big deal for MySQL to run, even on a table with 6 million records.

You should, however, make sure that the columns involved in the condition are indexed, in this case the column data_hora, so that MySQL can perform a binary search and run far more efficiently.

See if performance improves after creating the following index:

CREATE INDEX `data_hora` ON `sensores` (`data_hora`);
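To confirm the index is actually being used, you can inspect the execution plan. A sketch, using the table and column names from the question; the date literals are placeholders:

```sql
-- Check whether MySQL chooses the data_hora index for the range scan.
EXPLAIN
SELECT * FROM sensores
WHERE data_hora BETWEEN '2014-01-01 00:00:00' AND '2014-01-31 23:59:59';
-- In the output, "key" should show data_hora and "type" should be
-- "range"; "ALL" would mean a full table scan over all 6 million rows.
```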
  • If the index is created on a datetime column, would it create an index entry for every "second"? Do you know how MySQL would behave?

  • @Mauroalexandre It creates index entries for the values that are actually in the table; there is no "for every second" or any other time unit. Indexes are generally stored as B-trees.

  • The index is a kind of meta-table containing pointers to each record, ordered by the specified column. Because they are in order, MySQL can perform a binary search on the table.

  • Another thing you can do is increase InnoDB’s cache; this causes more data, possibly even the whole table, to be kept in RAM, ready for fast lookups. If I’m not mistaken this is done by adjusting innodb_buffer_pool_size in the my.cnf file. However, it is essential that your server has enough memory for MySQL.

  • @Mauroalexandre If I insert three dates into a table with an indexed date column, one in 1950, one in 2000 and one in 2050, the index will contain only 3 entries, not a few billion. So relax about that.

  • @Victor, right, I understand. When I say one per second, it relates to the behavior of my sensors: suppose I have a record at 01/29/2014 00:00:01, another one second later at 01/29/2014 00:00:02, and so on. In that case I would have one index entry per second, you see? I don’t know to what extent creating an index on this column would be worthwhile.

  • @Havenard, thanks for the tip. There’s really no way around it, I’ll have to do a hardware upgrade. Thank you!

  • @Mauroalexandre The index size is bounded by the number of records. The primary index (on the primary key) already does this; this secondary index would never be larger than the primary one.

  • 1

    But that is the point. You look at the table and see that the records are already naturally ordered by date, but you have a brain capable of spotting that pattern; MySQL does not. It isn’t smart, it won’t notice the pattern no matter how obvious it is. You have to teach it, and the way to do that is by creating an index. Without the index, MySQL will blindly read every record in the table looking for the ones within the specified period.


4

Short explanation

  • Tune InnoDB so the table can stay in memory
  • Tune InnoDB to sync changes every 1 second instead of constantly
  • Reshape your table: remove unnecessary indexes, or add new ones

Settings I recommend you don’t forget to tune are innodb_buffer_pool_size (to keep the data in RAM and reduce I/O), innodb_flush_method (to prevent the OS from duplicating the cache; requires testing) and innodb_flush_log_at_trx_commit. Others can be seen in the references at the end of this answer.
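An illustrative my.cnf fragment with those three settings. The values are assumptions for the sake of example; size the buffer pool to your actual RAM and test before adopting:

```ini
[mysqld]
# Keep the working set in RAM; often ~70% of a dedicated server's memory.
innodb_buffer_pool_size = 4G
# O_DIRECT avoids double-buffering between InnoDB and the OS page cache (Linux).
innodb_flush_method = O_DIRECT
# 2 = flush the log to disk about once per second instead of at every commit.
# Trades up to ~1 second of transactions on a crash for far less I/O.
innodb_flush_log_at_trx_commit = 2
```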

Long explanation

Partitioning shouldn’t help much in your case. Since your problem is I/O, the way to improve is to use an SSD instead of an HDD, or else to minimize disk access.

Just switching to an SSD will make it fast, but still not fast enough. You are better off having enough memory and configuring your MySQL/MariaDB so that the entire table stays in RAM, and limiting the database to writing changes to disk at intervals of no less than one second, because even with the database fully in memory that synchronization still has to happen.

As for the MyISAM engine, it tends to perform worse than InnoDB when updates and writes are heavy. The MEMORY engine may be useful in some specific cases, but it should be a last resort, and not infrequently a well-configured InnoDB can be nearly as efficient as MEMORY.

I know you may have limited hardware, but it will be hard to optimize without at least enough memory. In that situation, the best I can do is recommend what is in the following paragraph.

As for reshaping your table: if you usually only modify recent records, it is useful to create two tables and periodically move rows from the recent table to the historical one. I do this with tables that hold much more data than yours and it works great. Of course, this only helps if you don’t UPDATE old data. When well planned, this split is more efficient than partitioning and makes caching easier.
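A minimal sketch of that hot/cold split, assuming a 30-day cutoff; sensores_historico is a hypothetical archive table name:

```sql
-- Archive table with the same columns and indexes as the hot table
-- (note: CREATE TABLE ... LIKE does not copy foreign key constraints).
CREATE TABLE sensores_historico LIKE sensores;

-- Periodically move rows older than the cutoff, inside one transaction.
START TRANSACTION;
INSERT INTO sensores_historico
  SELECT * FROM sensores
  WHERE data_hora < NOW() - INTERVAL 30 DAY;
DELETE FROM sensores
  WHERE data_hora < NOW() - INTERVAL 30 DAY;
COMMIT;
```

Run this in a low-traffic window (or in smaller date-bounded batches) so the DELETE does not hold locks on the hot table for too long.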

References you should read

  1. http://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb.html
  2. http://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html
  3. http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/
  4. https://blogs.oracle.com/MySQL/entry/comparing_innodb_to_myisam_performance
  • Thank you for your contribution. Unfortunately the MEMORY engine is not an option at the moment, but I thought of using it for a secondary table storing the latest records, which are in fact the most accessed. Would you consider that a good option?

  • Mauro, tuning the database so the InnoDB table stays in memory is different from declaring the table’s engine as MEMORY.

  • @Mauroalexandre I edited the answer and made the settings explicit, with links to reliable, detailed references on optimizing InnoDB for heavy reads and writes. Even though splitting into two tables (one recent, one old) is an option, tuning InnoDB to fit in memory should already give acceptable performance and avoids the extra complication of keeping two tables in sync.

1

  • One thing you can do is follow @Havenard’s suggestion, if the table does not suffer constant modifications.

  • Another is to fetch only what you really need in your query, no SELECT * FROM, and make sure the column you filter on is NOT NULL.

  • You can also paginate the search, because you probably won’t need to view hundreds of thousands of records at once.

  • Along with all this you can index the column, but check the type of your date column, because performance can vary by type (date, datetime or timestamp).

  • Removing "*" did not change performance, I believe because the number of columns is insignificant. But indexing is important, and it worries me: if I index the date column, in DATETIME format, would it create an index entry for every "second"? This worries me because the index grows in proportion to the number of values.
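The pagination idea above can be sketched with keyset pagination, which stays cheap on large tables where big OFFSETs get slow; the date literals and page size are placeholders:

```sql
-- First page of the period: only the needed columns, at most 1000 rows.
SELECT id, equipamento_id, data_hora, valor_primario
FROM sensores
WHERE data_hora BETWEEN '2014-01-01' AND '2014-02-01'
ORDER BY data_hora, id
LIMIT 1000;

-- Next page: resume after the last (data_hora, id) pair of the previous page,
-- instead of re-scanning skipped rows with OFFSET.
SELECT id, equipamento_id, data_hora, valor_primario
FROM sensores
WHERE data_hora BETWEEN '2014-01-01' AND '2014-02-01'
  AND (data_hora, id) > ('2014-01-03 12:00:00', 123456)
ORDER BY data_hora, id
LIMIT 1000;
```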

1

Not that this answer exactly answers your question. But, given that inserted records are never changed or deleted, and that your searches are time-based, the focus of partitioning should be on time.

A very simple way of partitioning by time is to create tables per time period: something like sensores_11_2013, sensores_12_2013, sensores_01_2014, and so on.
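A sketch of that per-month scheme; the table names follow the pattern suggested above, and the report query is an assumption about how the application would read across months:

```sql
-- One table per month, same columns and indexes as the original
-- (CREATE TABLE ... LIKE does not copy foreign key constraints).
CREATE TABLE sensores_01_2014 LIKE sensores;
CREATE TABLE sensores_02_2014 LIKE sensores;

-- The application writes to the current month's table; a report spanning
-- two months unions the tables that cover the requested period.
SELECT * FROM sensores_01_2014 WHERE data_hora BETWEEN ? AND ?
UNION ALL
SELECT * FROM sensores_02_2014 WHERE data_hora BETWEEN ? AND ?;
```

The cost of this approach is exactly what the comment below points out: queries must know which monthly tables to address, so it suits applications that can generate those table names programmatically.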

  • Indeed, I had considered this possibility: archiving the data of previous periods and leaving only the latest in the main table. However, the application generates reports over past periods, which would force me to make constant changes to the application’s queries, something impracticable for the business.
