1
I have an E-learning system and I need to create a report to return to me how many students have passed, failed, students who do not access more than 30 days, students who are taking the course and those who have not accessed yet.
The tables that were used are
`company` Tabela de empresas;
`company_place` Tabela de centro de treinamentos;
`user` Tabela de usuários;
`user_course` Tabela que armazena a inscrição de usuários nos cursos;
`user_course_lesson` Registra as aulas assistidas pelo usuário.
The rules are as follows:
For approved students: must contain date of completion and the field approved
like 1;
For failed students: must contain date of completion and the field approved
like 0;
For dropout students: without completion date, the field approved
as 0, and the last update in the table user_course_lesson
must be more than 30 days.
For students who have not accessed yet: There should be no classroom displayed in the table user_course_lesson
For students with the course in progress: total records with no completion date and with the field approved
as 0, less dropouts and students who did not access.
The SQL I’m using is the following
SELECT
/* Nome da empresa */
c.`name`,
/* Alunos aprovados */
SUM( case when uc.`approved` = "1" AND uc.`finished_at` IS NOT NULL then 1 else 0 end ) as approved,
/* Alunos reprovados */
SUM( case when uc.`approved` = "0" AND uc.`finished_at` IS NOT NULL then 1 else 0 end ) as reproved,
/* Alunos desistentes */
SUM( case when uc.`approved` = "0" AND uc.`finished_at` IS NULL AND (
SELECT
MAX( case when `updated_at` IS NULL then `created_at` else `updated_at` end ) as "last_access"
FROM `user_course_lesson` as cl
WHERE cl.`user_course_id` = uc.`id`
ORDER BY `last_access` DESC
LIMIT 1
) <= DATE_ADD(now(), INTERVAL(-30) DAY) then 1 else 0 end ) as dropout,
/* Alunos que não acessaram ainda */
SUM( case when uc.`approved` = "0" AND uc.`finished_at` IS NULL AND (
SELECT
COUNT( cl.`id` ) as "access"
FROM `user_course_lesson` as cl
WHERE cl.`user_course_id` = uc.`id`
) = 0 then 1 else 0 end ) as not_access,
/* Alunos com o curso em andamento */
SUM( case when uc.`approved` = "0" AND uc.`finished_at` IS NULL then 1 else 0 end ) -
SUM( case when uc.`approved` = "0" AND uc.`finished_at` IS NULL AND (
SELECT
MAX( case when `updated_at` IS NULL then `created_at` else `updated_at` end ) as "last_access"
FROM `user_course_lesson` as cl
WHERE cl.`user_course_id` = uc.`id`
ORDER BY `last_access` DESC
LIMIT 1
) <= DATE_ADD(now(), INTERVAL(-30) DAY) then 1 else 0 end ) -
SUM( case when uc.`approved` = "0" AND uc.`finished_at` IS NULL AND (
SELECT
COUNT( cl.`id` ) as "access"
FROM `user_course_lesson` as cl
WHERE cl.`user_course_id` = uc.`id`
) = 0 then 1 else 0 end ) as pending,
count( uc.`id` ) as `total`
FROM `company` as c
INNER JOIN `company_place` as p
ON p.`company_id` = c.`id`
INNER JOIN `user` as u
ON u.`place_id` = p.`id`
INNER JOIN `user_course` as uc
ON uc.`user_id` = u.`id`;
The problem
SQL repeats many queries, and the time to calculate with few records is already high, around 0.056 seconds, with the increase of records this time will increase exponentially.
Ideally the queries in the table user_course_lesson
did not repeat themselves, or had another way to write this SQL with Joins or another form of relationship.
I didn’t go into the query itself, but by her face it might be better to consider pre-processing these statistics from time to time, and storing them in some sort of cache (for example, in another table).
– bfavaretto
I have considered this, but the maintenance cost it had in the old version of the same project was very high. Constantly had to check the accounts, not to mention that the project has several registered companies, and access of about 15000 students daily per company. I really believed that keeping it in another table would be the solution, but I suffered too much to do it. It was much given in one table, and little reliability. I was thinking about creating some view or something, but I don’t know if the performance gain would be very advantageous.
– marcusagm
I need to improve the little that is the query, because you can store in the memcache later and not have to run it every time. Another detail is that the project is constantly audited, so the concern with data reliability.
– marcusagm