2
I need to perform a query to know a total of operators I have. However the table has almost 18 million records, and it is taking a long time to return almost 3 minutes, and this is because I am performing direct test in the bank.
The problem will be bigger when this table is complete, more than 150 million, yes it will take an eternity, will have some way to carry out this query faster?
SELECT *, COUNT(operadora) AS total_op
FROM telefones
GROUP BY operadora
Table structure
CREATE TABLE IF NOT EXISTS `telefones` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`numero` varchar(30) NOT NULL,
`ddd` int(3) NOT NULL,
`operadora` varchar(15) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=19606524 ;
--
-- Extraindo dados da tabela `telefones`
--
INSERT INTO `telefones` (`id`, `numero`, `ddd`, `operadora`) VALUES
(31, '11920160030', 11, 'OI - Celular'),
(32, '11920160031', 11, 'TIM - Celular'),
(33, '11920160032', 11, 'Claro - Celular'),
(34, '11920160033', 11, 'VIVO - Celular'),
(35, '11920160034', 11, 'Claro - Celular'),
(36, '11920160035', 11, 'Claro - Celular'),
(37, '11920160036', 11, 'Claro - Celular'),
(38, '11920160037', 11, 'Claro - Celular'),
(39, '11920160038', 11, 'Claro - Celular'),
(40, '11920160039', 11, 'Claro - Celular'),
(41, '11920160040', 11, 'Claro - Celular'),
(42, '11920160041', 11, 'Claro - Celular'),
(43, '11920160042', 11, 'Claro - Celular'),
(44, '11920160043', 11, 'Claro - Celular'),
(45, '11920160044', 11, 'Claro - Celular'),
(46, '11920160045', 11, 'Claro - Celular'),
(47, '11920160046', 11, 'Claro - Celular'),
(48, '11920160047', 11, 'Claro - Celular'),
(49, '11920160048', 11, 'Claro - Celular'),
(50, '11920160049', 11, 'Claro - Celular'),
(51, '11920160050', 11, 'Claro - Celular'),
(52, '11920160051', 11, 'Claro - Celular'),
(53, '11920160052', 11, 'Claro - Celular'),
(54, '11920160053', 11, 'Claro - Celular'),
(55, '11920160054', 11, 'Claro - Celular'),
(56, '11920160055', 11, 'Claro - Celular'),
(57, '11920160056', 11, 'Claro - Celular'),
(58, '11920160057', 11, 'Claro - Celular'),
(59, '11920160058', 11, 'Claro - Celular'),
(60, '11920160059', 11, 'Claro - Celular'),
(61, '11920160060', 11, 'Claro - Celular'),
(62, '11920160061', 11, 'Claro - Celular'),
(63, '11920160062', 11, 'Claro - Celular'),
(64, '11920160063', 11, 'Claro - Celular'),
(65, '11920160064', 11, 'Claro - Celular'),
(66, '11920160065', 11, 'Claro - Celular'),
(67, '11920160066', 11, 'Claro - Celular'),
(68, '11920160067', 11, 'Claro - Celular'),
(69, '11920160068', 11, 'Claro - Celular'),
(70, '11920160069', 11, 'Claro - Celular'),
(71, '11920160070', 11, 'Claro - Celular'),
(72, '11920160071', 11, 'Claro - Celular'),
(73, '11920160072', 11, 'Claro - Celular'),
(74, '11920160073', 11, 'Claro - Celular'),
(75, '11920160074', 11, 'Claro - Celular'),
(76, '11920160075', 11, 'Claro - Celular'),
(77, '11920160076', 11, 'Claro - Celular'),
(78, '11920160077', 11, 'Claro - Celular'),
(79, '11920160078', 11, 'Claro - Celular'),
(80, '11920160079', 11, 'Claro - Celular'),
(81, '11920160080', 11, 'Claro - Celular'),
(82, '11920160081', 11, 'Claro - Celular'),
(83, '11920160082', 11, 'Claro - Celular'),
(84, '11920160083', 11, 'Claro - Celular'),
(85, '11920160084', 11, 'Claro - Celular'),
(86, '11920160085', 11, 'Claro - Celular'),
(87, '11920160086', 11, 'Claro - Celular'),
(88, '11920160087', 11, 'Claro - Celular'),
(89, '11920160088', 11, 'Claro - Celular'),
(90, '11920160089', 11, 'Claro - Celular');
Is there any index by the column
operadora
?– Artur Trapp
@Arturtrapp No , this direct with the name
– Marcos Paulo
'Cause you’re grouping around
operadora
? Also, try creating an index if you don’t have it, I’ve had occasions where I was able to make queries that would take three minutes to go down to less than eight seconds– Artur Trapp
Why
select *, count()
? It wasn’t just Count?– Jéf Bueno
@Arturtrapp - The Customer imports everything from a TXT, and plays on the bench, and so everything is grouped together! If I change to include Indice, I think in the import it takes a long time, will double!
– Marcos Paulo
@Arturtrapp The File reaches him in this way - 11999999999;Nextel - Cellular
– Marcos Paulo
@jbueno I need to list the name of operators and the amount of records in each one. I can do this without having to set a variable to it?
– Marcos Paulo
@jbueno without the
GROUP by
it will add up everything, if you do grouped it will show a line for each operator.– Guilherme Nascimento
@Marcospaulo I think I understand what you need, I really can’t test 150 million, but generally engines like innoDB have an internal cache, also have myISAM that they say have better performance because it is more simple. But I can’t say much.
– Guilherme Nascimento
@Marcospaulo, what is the impediment for you to put an input in the carrier column? Note that Inserts will lose a little performance, but that will depend a lot on the way you are doing it. If you do an SQL transaction the difference will be much smaller.
– jlHertel
@Guilhermenascimento, what killed me was this amount, I’ll see what I can, the worst I’ll have to count by DDD too, then it weighs even more!
– Marcos Paulo
@jlHertel I will try to change here, the import system and another programmer, he created a program to play directly in the BD, I think I will have to create a tab in the system where will be imported the TXT, but have almost 500 megas TXT
– Marcos Paulo
@Marcospaulo, the size of the TXT does not matter. The important thing is to ensure that the data are of the smallest possible type and are correctly indexed. An alternative solution that you can try is to create one more column with the operator using a numerical value, because the numbers are much smaller and much better at the time of indexing.
– jlHertel