How do you set zero or null in an SQL query of a date range, for those whose value does not exist?

Asked

Viewed 1,627 times

5

Formulating the question was a little difficult, but I will try to be as explicit as possible using an example to put the question.

There are the following data in my table:

inserir a descrição da imagem aqui

The aim is to carry out a consultation sql (the DBMS used is Mysql) in the date range 2014-09-01 until 2014-09-10 and have the following result:

inserir a descrição da imagem aqui

What I already have:

To be honest, I have no idea how to do this, so far I have solved this in the control of my application, an algorithm like:

  1. Select the data in the DB, by means of a between of the desired dates and store in a list;
  2. For each day (or time) existing in the desired date range, store in an auxiliary variable (an instance of Calendar in Java);
  3. Check if In the list obtained in point 1 there is an element with this date, if there is not, add an element in the resulting list with this date, but zero in the quantity field;

Is there any way to do this already in consultation on sql? The way I do it now is not the most efficient.

  • 1

    Cold, you can create a table with periods (only with dates), and make a left join with the table you already use.

  • @Wakim, if I work with long intervals of dates, I will have to have a huge table in parallel just to save dates?

  • You only need one record per day. In the worst case, 3650 lines for 10 years. You will have to analyze if the performance gain is worth it, you can even use an index.

  • Eh... but we would be considering only days, my realization is statistical data with a considerable accuracy : every 10 seconds. A query, in my view If much more efficient, we would only consume resources when necessary and not occupy physical memory permanently and have to "join" always. But if no other possibility arises, I will think about using something like that. Thanks @Wakim

3 answers

4

You can use a generator of dates not to need intermediate tables:

SELECT
   DATE_ADD( '2014-01-01', INTERVAL d1+d2*10+d3*100 DAY ) AS diaDoAno
FROM
   ( SELECT 0 AS d1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 ) t1,
   ( SELECT 0 AS d2 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 ) t2,
   ( SELECT 0 AS d3 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 ) t3
WHERE
   YEAR( DATE_ADD( '2014-01-01', INTERVAL d1+d2*10+d3*100 DAY ) ) = 2014

The query looks great, but the performance is excellent.

Functioning:

  1. Each subquery generates the numbers from 0 to 9, except the last one, which generates numbers from 0 to 3;
  2. the formula d1+d2*10+d3*100 transforms the outputs of the 3 Subqueries in a number of 0 to 399
  3. with DATE_ADD( '2014-01-01', INTERVAL d1+d2*10+d3*100 DAY ) we turn this number into a date.
  4. To limit it to the days of the year in question, we use the same formula that generates the date, to ensure that we do not exceed the interval of one year (remembering that the number of days varies in leap years).


How to use:

As described in other answers, just use a JOIN with the desired table.

SELECT
   DATE_ADD( '2014-01-01', INTERVAL d1+d2*10+d3*100 DAY ) AS dia_do_ano.
   contador,
   quantidade
FROM
   ( SELECT 0 AS d1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 ) t1,
   ( SELECT 0 AS d2 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 ) t2,
   ( SELECT 0 AS d3 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 ) t3
LEFT JOIN
    sua_tabela ON dia_do_ano = data_tempo
WHERE
    YEAR( DATE_ADD( '2014-01-01', INTERVAL d1+d2*10+d3*100 DAY ) ) = 2014

1

Cold. As Wakin suggested and as I was writing.

Make a schedule of dates, and from this one LEFT JOIN with your table.

Here are two links with suggestions for creating this table:

Generate Table Dates 1

Generate Table Dates 2

For the values null, just add in your COUNT the verification IFNULL() and your problem will be solved.

  • Thank you @Fernando A.W. for the answer/suggestion, but, as I told Wakin, I would have to have a table (similar to time dimension in a D.W) which would probably not be so efficient anymore. If there really is no other way, I will consider this way.

  • I get it, it depends a lot on your architecture and your need. But if it’s an indexed table and everything, it wouldn’t be too many records to link to, of course, it will depend too much on the range you’d be looking for. But in parallel to this, you can use hints performance to help, follow an example Hint Performance. It’s a tip as well. But I think it’s a good solution.

1

You can create a PROCEDURE creating a temporary table before executing the query, for example:

DELIMITER //
CREATE PROCEDURE TempDateTable(data_ini DATE, data_fim DATE)

BEGIN
    DECLARE v_curdate DATE;
    SET @v_curdate = data_ini;
    START TRANSACTION;
    DROP TABLE IF EXISTS tempdatetable;

    CREATE TEMPORARY TABLE tempdatetable  (
        data_tempo DATE NOT NULL PRIMARY KEY
    );

  WHILE @v_curdate <= data_fim DO
    INSERT INTO `TempDateTable`(`data_tempo`) VALUES (@v_curdate);
    SET @v_curdate = ADDDATE(@v_curdate, INTERVAL 1 DAY);
  END WHILE;
  COMMIT;
END//
DELIMITER ;

Dice

--------------------------------------
| data_tempo | contador | quantidade |
--------------------------------------
| 2014-09-01 | CDR_SUC  | 256        |
| 2014-09-04 | CDR_SUC  | 258        |
| 2014-09-05 | CDR_SUC  | 195        |
| 2014-09-06 | CDR_SUC  | 100        |
| 2014-09-10 | CDR_SUC  | 317        |
--------------------------------------

Running Procedure

CALL `databasetest`.`TempDateTable`('2014-09-01', '2014-09-10');

Performing the consultation

SELECT tmp.data_tempo, test.contador, IFNULL(test.quantidade,0) 
FROM tempdatetable as tmp 
    LEFT JOIN  testedados as  test ON DATE(test.data_tempo) = DATE(tmp.data_tempo);

Upshot:

--------------------------------------
| data_tempo | contador | quantidade |
--------------------------------------
| 2014-09-01 | CDR_SUC  | 256        |
| 2014-09-02 | NULL     | 0          |
| 2014-09-03 | NULL     | 0          |
| 2014-09-04 | CDR_SUC  | 258        |
| 2014-09-05 | CDR_SUC  | 195        |
| 2014-09-06 | CDR_SUC  | 100        |
| 2014-09-07 | NULL     | 0          |
| 2014-09-08 | NULL     | 0          |
| 2014-09-09 | NULL     | 0          |
| 2014-09-10 | CDR_SUC  | 317        |
--------------------------------------
  • Thank you @Kaduamaral for the answer, but I believe that, as I commented with Fernando A.W. in his reply, I would always be creating a dimension of time (as in a D.W.). And the most important thing for me would be to have an auxiliary dimension, that is, it is always created when a query is made. For each query I have a table created with time (in order of 10 seconds) with a few records...

  • And if I’m not mistaken the result there is a little different from my neh friend... kkkkk. Just to note that the accountants can diversify.

  • The difference you talk about is the countryside contador? But it is quiet, if there is another way to do this, it is beyond my knowledge with SQL. I mean, that I know only in algorithm.

  • Because it is @Kaduamaral, but I already have this option as backup :D.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.