Calculate the average of a field in a specific year within a period

1

I was studying queries in MySQL using the Employees demo database when I decided to query the salaries table to produce a report of how much the company spent per year on wages.

Table salaries:

emp_no     int(11)  PK
salary     int(11)
from_date  date     PK
to_date    date

from_date is the date an employee started receiving a given salary and to_date is the date they stopped receiving it (because they were fired or got a raise).

Assuming an employee received a certain salary from 2000 until 2005, how do I tell the database that this salary should be included in the calculation for every year between those two?

The query I wrote, which seems to return the result I expect, is (in part) this:

SELECT 1985 AS ano, AVG(salary) AS media FROM salaries 
WHERE 1985 BETWEEN YEAR(from_date) AND YEAR(to_date)
UNION
SELECT 1986 AS ano, AVG(salary) AS media FROM salaries
WHERE 1986 BETWEEN YEAR(from_date) AND YEAR(to_date)
UNION
...

And so on, repeating that up to the last year.

Obviously this code is not good: it repeats the same SELECT with the same condition (only the year changes) many times, and in this database each of these SELECTs on this table takes 1.5 seconds.

Does anyone know a more efficient way to write this query?

2 answers

1


Surely you have noticed that all your queries follow the same pattern:

SELECT "ANO", AVG(salary) as media
  FROM salaries
 WHERE "ANO" BETWEEN YEAR(from_date) AND YEAR(to_date)

Creating a generic form for a range of years is easy and requires only an auxiliary table or sub-query containing the desired range of years. This matters because it ensures that the result set includes every year in the report, whether or not the database holds any information for that year.
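As a rough sketch of the auxiliary-table option, a small table of years could be created once and reused in any report. The table name anos and its column ano are illustrative choices, not part of the Employees database:

```sql
-- Hypothetical helper table: one row per year that should appear in the report.
CREATE TABLE anos (
    ano SMALLINT NOT NULL PRIMARY KEY
);

-- Fill it with the range of years of interest (2000 to 2005 here).
INSERT INTO anos (ano)
VALUES (2000), (2001), (2002), (2003), (2004), (2005);

-- The table can now stand in wherever a list of years is needed.
SELECT ano FROM anos ORDER BY ano;
```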

A generic form, using a sub-query to build the range of years, is the following:

SELECT ANOS.ANO, AVG(salary)
  FROM 
  ( 
         SELECT 2000 AS ANO UNION ALL
         SELECT 2001 AS ANO UNION ALL
         SELECT 2002 AS ANO UNION ALL
         SELECT 2003 AS ANO UNION ALL
         SELECT 2004 AS ANO UNION ALL
         SELECT 2005 AS ANO 
  ) ANOS
  LEFT JOIN salaries Sal
    ON ANOS.ANO BETWEEN YEAR(from_date) AND YEAR(to_date)
 GROUP BY ANO
 ORDER BY 1

Note that the GROUP BY ANO clause is required. Unlike your queries, where the average was calculated individually for each year, this form selects all the matching records for all the years at once. The GROUP BY clause groups the records by a value they share, in this case the year, and applies the AVG function to each of these groups.
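If the server is MySQL 8.0 or later, which is an assumption about your setup, the list of years does not have to be typed out by hand: a recursive common table expression can generate it. A minimal sketch, reusing the 2000 to 2005 range from the query above:

```sql
-- Requires MySQL 8.0+ (recursive CTEs are not available in earlier versions).
WITH RECURSIVE anos (ano) AS (
    SELECT 2000                 -- first year of the report
    UNION ALL
    SELECT ano + 1
      FROM anos
     WHERE ano < 2005           -- last year of the report
)
SELECT anos.ano, AVG(sal.salary) AS media
  FROM anos
  LEFT JOIN salaries sal
    ON anos.ano BETWEEN YEAR(sal.from_date) AND YEAR(sal.to_date)
 GROUP BY anos.ano
 ORDER BY anos.ano;
```

Changing the report period then only means editing the two year literals.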

Now, a very important detail. The previous solution will most likely not return the right result. Your problem is not simple to solve, and the answer depends, among other things, on how the information is stored in the database:

  • Does the value in the salaries table represent an annual, monthly or weekly salary?
  • How often are wages paid? Per month, per week?

To get an idea, imagine the employee João, who worked at the company from January 2000 to December 2000 and was paid monthly. He therefore received 12 salary payments during his time in the company.

  • Initially his annual salary was 30'000$.
  • 9 months after starting, his salary increased to 60'000$.
  • The company therefore paid 9 x 2'500$ + 3 x 5'000$, for a total of 37'500$.

The above solution does not take this change into account when calculating the average. It simply treats the two amounts (30'000$ and 60'000$) as if they belonged to two different employees, which inflates the average: for João alone it would report 45'000$ instead of the 37'500$ actually paid.
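The effect can be reproduced on a tiny scale. The sketch below uses a hypothetical temporary table, salaries_demo, holding only João's two rows; the exact dates (a raise on 1 October 2000, with each to_date equal to the next from_date) are assumptions made for the illustration:

```sql
-- Hypothetical demo table with the same columns as salaries.
CREATE TEMPORARY TABLE salaries_demo (
    emp_no    INT  NOT NULL,
    salary    INT  NOT NULL,
    from_date DATE NOT NULL,
    to_date   DATE NOT NULL,
    PRIMARY KEY (emp_no, from_date)
);

-- João: 9 months at 30'000$ per year, then 3 months at 60'000$ per year.
INSERT INTO salaries_demo (emp_no, salary, from_date, to_date) VALUES
    (1, 30000, '2000-01-01', '2000-10-01'),
    (1, 60000, '2000-10-01', '2001-01-01');

-- Same shape as the per-year query: both rows match the year 2000,
-- so AVG treats them as two independent salaries.
SELECT 2000 AS ano, AVG(salary) AS media
  FROM salaries_demo
 WHERE 2000 BETWEEN YEAR(from_date) AND YEAR(to_date);
```

The query averages 30000 and 60000, giving 45000, even though only one employee was on the payroll.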

To determine the total wage expenditure, you would first have to calculate the individual cost of each employee, taking into account how often wages are paid and any change of salary during the year, and only then calculate the average.
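One possible way to approach this, sketched under several assumptions: the salary column holds an annual amount, a salary period runs from from_date up to (but not including) to_date, and the cost is spread evenly over the days of the year. The user variable @ano and the aliases gasto and por_empregado are names invented for this example:

```sql
SET @ano = 2000;  -- year to report on (illustrative value)

SELECT @ano AS ano, AVG(gasto) AS media
  FROM (
        -- Cost of each employee in the year: every salary row contributes
        -- in proportion to the number of days it was in force that year.
        SELECT emp_no,
               SUM(salary
                   * DATEDIFF(LEAST(to_date, MAKEDATE(@ano + 1, 1)),
                              GREATEST(from_date, MAKEDATE(@ano, 1)))
                   / DATEDIFF(MAKEDATE(@ano + 1, 1), MAKEDATE(@ano, 1))
                  ) AS gasto
          FROM salaries
         -- Keep only the salary periods that overlap the year.
         WHERE from_date < MAKEDATE(@ano + 1, 1)
           AND to_date   > MAKEDATE(@ano, 1)
         GROUP BY emp_no
       ) por_empregado;
```

LEAST and GREATEST clip each salary period to the boundaries of the year, so a row that starts before or ends after the year only counts for the days inside it. For an employee like João this gives a figure close to 37'500$; it is not exact because days are counted instead of months. Replacing AVG with SUM in the outer query would give the total spent on wages in that year.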

  • Thanks, that helped a lot. I hadn’t thought about the fact that wages can change in the middle of the year; now I’m going to look for a way to write this query taking that into account.

0

From what I understand, you need to know the average salary of an employee year by year. If so, you can use the GROUP BY clause.

Example:

SELECT YEAR(from_date) AS ano
     , AVG(salary)     AS media 
  FROM salaries 
 WHERE emp_no = ...
 GROUP BY YEAR(from_date)

