I'm having problems listing and searching a data table with more than 200 thousand records.
From what I've read, the ideal for the search system would be to create FULLTEXT indexes on the fields that will be searched.
So I did it this way:
EDITED: I changed the tables, creating indexes as suggested in the comments.
I also read that, for counting the records for pagination, it would be better to use this function:
SELECT FOUND_ROWS();
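As I understand it, that function is meant to be paired with SQL_CALC_FOUND_ROWS: run the main query with a LIMIT, then ask FOUND_ROWS() for the total the query would have returned without the LIMIT. A minimal sketch (the table and column names follow the query further down; '123' is a placeholder user code):

SELECT SQL_CALC_FOUND_ROWS `nome`
FROM `empresas`
WHERE `useradmcod` = '123'
LIMIT 0, 200;

SELECT FOUND_ROWS();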
I haven't done the search part yet, but listing the records by user is already slow.
I have an administrative area where there will be several users responsible for data entry. The system then has to display on screen, with pagination, all the records created by a given user.
So I did it this way:
public function AdmListEmpresas($link, $usercod, $pagina = 1, $ItensPorPagina = 100) {
    // Determine the record offset to start from, according to the current page.
    $inicio = ($pagina - 1) * $ItensPorPagina;
    $sql = 'SELECT SQL_NO_CACHE SQL_CALC_FOUND_ROWS
                `empresas`.`nome`,
                `usuarios`.`nome` AS `usuarionome`,
                `equipes`.`nome` AS `equipenome`,
                `setores01`.`titulo` AS `setor01titulo`,
                `setores02`.`titulo` AS `setor02titulo`,
                `setores03`.`titulo` AS `setor03titulo`,
                `empresas`.`telprincipal`,
                `empresas`.`email`,
                `empresas`.`website`,
                `empresas`.`plantype`,
                `empresas`.`datavencimento`,
                `empresas`.`datacadastro`,
                `empresas`.`ativoinfo`
            FROM
                `' . BDtables['BD_EMPRESAS'] . '` `empresas`
                LEFT JOIN `' . BDtables['BD_USERS'] . '` AS `usuarios` ON `empresas`.`useradmcod` = `usuarios`.`cod`
                LEFT JOIN `' . BDtables['BD_USERS'] . '` AS `equipes` ON `empresas`.`codequiperesp` = `equipes`.`cod`
                LEFT JOIN `' . BDtables['BD_SETORES'] . '` AS `setores01` ON `setores01`.`cod` = `empresas`.`codsetor1`
                LEFT JOIN `' . BDtables['BD_SETORES'] . '` AS `setores02` ON `setores02`.`cod` = `empresas`.`codsetor2`
                LEFT JOIN `' . BDtables['BD_SETORES'] . '` AS `setores03` ON `setores03`.`cod` = `empresas`.`codsetor3`
            WHERE
                (`empresas`.`ativouser` = "1" AND
                 `empresas`.`ativosys` = "1" AND
                 `empresas`.`useradmcod` = "' . $usercod . '") OR
                (`empresas`.`codequiperesp` = "' . $usercod . '") OR
                (`empresas`.`usersfunccods` LIKE "%' . $usercod . '%")
            ORDER BY
                `empresas`.`useradmcod`
            LIMIT ' . $inicio . ', ' . $ItensPorPagina . ';';
    // FOUND_ROWS() must run on the same connection, right after the main query.
    $sqlcount = 'SELECT FOUND_ROWS() AS "rowcount";';

    $result  = $this->conn->query($sql);
    $numrows = $this->conn->query($sqlcount);
    $numrows = $numrows->fetch_array(MYSQLI_ASSOC);

    if ($result->num_rows >= 1) {
        while ($row = $result->fetch_array(MYSQLI_ASSOC)) {
            $listaarray[] = [
                'empresa'        => $row['nome'],
                'usuarionome'    => $row['usuarionome'],
                'equipenome'     => $row['equipenome'],
                'setor01titulo'  => $row['setor01titulo'],
                'setor02titulo'  => $row['setor02titulo'],
                'setor03titulo'  => $row['setor03titulo'],
                'telprincipal'   => $row['telprincipal'],
                'email'          => $row['email'],
                'website'        => $row['website'],
                'plantype'       => $row['plantype'],
                'datavencimento' => $row['datavencimento'],
                'datacadastro'   => $row['datacadastro'],
                'ativoinfo'      => $row['ativoinfo']
            ];
        }
    }
    if (empty($listaarray)) { $listaarray = false; }

    // Pagination array.
    $paginacaoarray = $this->Paginacao($link, $numrows['rowcount'], $pagina, $ItensPorPagina);
    return array('paginacao' => $paginacaoarray, 'lista' => $listaarray);
}
The table BD_EMPRESAS holds all the companies, with more than 200 thousand records.
The table BD_USERS may also contain more than 200 thousand records, and the company listing has to look up each user's name by ID.
The table BD_SETORES is small, about 300 records; each company has up to 3 sectors, and the query fetches each sector's name by its ID.
In the WHERE clause, the query takes the logged-in user's code and checks it against 3 fields of the table, since the user can be the administrator, part of the system team, or an employee of the company. It then finds all the companies that user participates in, returning on average around 5 thousand records, paginated 200 at a time. (I had no problems with the pagination itself, only with the time it takes to return the records.)
Still in the WHERE clause, the fields ativouser and ativosys only fetch records that are active. They are ENUM columns storing 0 for disabled and 1 for activated, so the list returns only the enabled companies; a super administrator, however, will also have to list the disabled records, which can further increase the number of displayed records.
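For reference, a flag column like that is typically declared along these lines (an illustrative sketch only; the question's actual CREATE TABLE is not shown, and the table name here is made up):

CREATE TABLE `empresas_exemplo` (
  `cod`       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `ativouser` ENUM('0','1') NOT NULL DEFAULT '1',  -- 0 = disabled, 1 = activated
  `ativosys`  ENUM('0','1') NOT NULL DEFAULT '1'
);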
All of this is stored in an array and sent to another part of the PHP code, which displays the results in a template compiled by Smarty.
Could someone who has worked with huge tables help me find a better solution?
EDITED 01:
As suggested by @bfavaretto, below is the result of EXPLAIN:
EDITED 02: I ran some tests here. The query as it stands, with a LIMIT paging through 200 records at a time, takes 7 seconds per query.
I realized that SQL_CALC_FOUND_ROWS weighs heavily on the query; by itself, without the JOINs, it can take even longer than the query with the JOINs.
If I keep the JOINs but remove SQL_CALC_FOUND_ROWS, it drops from 7 seconds to 4 or 5 seconds.
If I remove both SQL_CALC_FOUND_ROWS and the JOINs, the query takes less than 1 second, which would be perfect.
The problem is that I need the JOINs to display the results on screen. Now I need to find a way to replace the JOINs and still display the same results.
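One way to drop SQL_CALC_FOUND_ROWS, sketched below on the assumption that `empresas` stands in for BDtables['BD_EMPRESAS'] and '123' is a placeholder user code, is to get the total from a separate, leaner COUNT(*) query. Since the LEFT JOINs only pull display columns (and, per the comments, join on the users' primary key), they should not change the row count and can be left out of the count entirely:

SELECT COUNT(*) AS `rowcount`
FROM `empresas`
WHERE (`ativouser` = '1' AND `ativosys` = '1' AND `useradmcod` = '123')
   OR (`codequiperesp` = '123')
   OR (`usersfunccods` LIKE '%123%');

The paged query would then keep its JOINs and LIMIT but lose the SQL_CALC_FOUND_ROWS modifier.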
Having 200,000 records is far from gigantic. But MySQL has the Profiler and EXPLAIN; they exist precisely to tell you what the bottleneck is and what is problematic. Moreover, if ativouser, ativosys, useradmcod and codequiperesp are in the WHERE and compared directly, you can create an index with all of them. Indexes use RAM; if there isn't enough, that's where it will hurt. You can also create keys between the useradmcod of both tables to help the JOIN. Anyway, the main thing is to find out what is taking so long. The fulltext index only affects MATCH AGAINST, as far as I know. – Inkeliz
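For reference, indexes along the lines this comment suggests could be created roughly like this (the index names are made up, and `empresas` stands in for the table behind BDtables['BD_EMPRESAS']):

ALTER TABLE `empresas`
  ADD INDEX `idx_ativo_admin` (`ativouser`, `ativosys`, `useradmcod`),
  ADD INDEX `idx_equiperesp` (`codequiperesp`);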
usercod, I imagine, is a primary key... used with the LIKE operator? Apart from that, just create indexes on the fields that are compared, both in the WHERE and in the JOINs. – Rovann Linhalis
Put an EXPLAIN before the SELECT of your query and edit the question with the result. You should have room for optimization in your tables. – bfavaretto
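For illustration, the keyword simply goes in front of the existing SELECT; here is a trimmed-down version of the question's query (table names stand in for the BDtables[...] constants, and '123' is a placeholder user code):

EXPLAIN SELECT `empresas`.`nome`, `usuarios`.`nome` AS `usuarionome`
FROM `empresas`
LEFT JOIN `usuarios` ON `empresas`.`useradmcod` = `usuarios`.`cod`
WHERE `empresas`.`useradmcod` = '123';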
@Inkeliz I thought 200 thousand records would be heavy, given how long it is taking: there are more than 200 thousand companies and more than 200 thousand users, and the companies' SELECT joins to the users table, which I believe can also slow things down. I used EXPLAIN and didn't understand the returned result at all, and I couldn't find anything online on how to use the Profiler. I created the index on the fields you suggested, but nothing changed. I'm studying some way to avoid using the JOINs to see if it gets faster. – Fernando VR
@Rovannlinhalis usercod is the primary key in the users table, but in the companies table usersfunccods crams the codes of all the users who are employees of that company into a text field, so the search uses LIKE to also check whether the logged-in user is among the employees of a given company. I created the indexes for the fields used in the JOINs, but nothing changed at runtime. I think I must be doing something wrong. – Fernando VR
@bfavaretto I edited the question and added the result of EXPLAIN. I don't understand how it works or what should be done with these results; I had never used this feature before. – Fernando VR
Have you put an index on every column? That is not recommended. – Renato Junior
@Renatosilva No, I didn't index every column. I had 37 columns and 13 indexes; two were UNIQUE, just to prevent duplicate data in the table, and I ended up keeping only 1 of those to reduce the number of indexes. Six were FULLTEXT indexes that I use in the frontend search; I cut those down to 2, one of which covers the 4 fields used for word searches. And there are 3 indexes covering 6 more fields for the JOINs. When I run the listing I use USE INDEX to pick which indexes are used for each query. – Fernando VR
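For illustration, the USE INDEX hint mentioned here goes right after the table name in the FROM clause (the index name and user code below are made up):

SELECT `nome`, `telprincipal`
FROM `empresas` USE INDEX (`idx_ativo_admin`)
WHERE `ativouser` = '1' AND `ativosys` = '1' AND `useradmcod` = '123';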