Normalization of Mysql tables

Asked

Viewed 755 times

4

I’m creating a system for employee registration, but I’d like to start in a standard way;

tb_funcionario (id_func, id_funcao, id_depto, nome_func, data_nasc, end_fun, dt_admissao, dt_demissao)

tb_funcao (id_func,id_depto,nome_funcao)

tb_depto (id_depto, nome_depto)

What it gives to improve or add in such a way that it stays within the standards. , grateful for the collaboration.

  • It could better describe how the system will work that helps identify how the relationships between the tables should be. As it is seems appropriate.

2 answers

4


Before talking about normalization, some things to be observed are:

  • In the normalization of names do not be afraid to write enough, it is much better something like: id_funcio_external instead of id_fun_ex. Plus easier to understand won’t make your database heavier.

  • Use the most appropriate data types, thus:

    • Ids: use an integer data type, possibly the type SMALLINT is more than enough but you can choose some other that fits too (Link: Whole Types).
    • Dates: To use types Date because you want to keep only the date and so you can use functions to do Selects easily.
    • Text Fields: In text fields the size can be variable (as in function name) choose to use varchar instead of char for the char stores all defined space and the varchar as required (Link: Varchar and Char)
  • I also recommend that on the table tb_employee you enter a field for RG and/or CPF to define it as unique key to ensure that there is no duplicate employee. Although it is something rare two people may have exactly the same name.

Normalization problems -> First Normal Form:

  • Your table tb_function and tb_employee possess the id_depto, this can generate unnecessary data duplicity or even incorrect information.

  • Primary Keys: Set the fields id_func, id_func and id_depto as Primary Key.

  • Foreigns Keys: In the fields id_func,id_depto table tb_function define how Foreign Key and make the right relationship.

For more information I recommend: Standardisation in BD

4

A hint using what you posted:

tb_funcionario (id_func, id_funcao, id_depto, nome_func, data_nasc, end_fun, dt_admissao, dt_demissao)

tb_funcao (id_func,id_depto,nome_funcao)

tb_depto (id_depto, nome_depto)

Would look like this:

person(
  id unique,
  datebirth,
  person_name,
  person_surname
)

person_addresses(
  id unique,
  person_id,
  country_code,
  postal_code,
  province_code,
  city,
  address
  address_add
)

person_roles(
  id unique,
  role_id,
  person_id
  date_ini (data de início ou admissão)
  date_end (data de término ou demissão)
)

roles(
  id unique
  title
)

department(
  id unique
  title
)

department_roles(
  id unique
  role_id
)

Below, I will explain just a few reasons of the above modeling, without covering the types of data (integer, var, date, etc).

The first step is to standardize nomenclature and, for this, the recommended is to follow an internationalized standard using the English language, as it is the most widespread in the world.

As commented by Giovani, avoid abbreviating the column names. Describe them clearly.

Removing redundancies from table names and column names, for example, using "tb_" for tables is unnecessary as they are already a table. The end_fun column is somewhat confusing, remembering something like "end Function", "end of function?".. But then again, it seems to be "employee’s address".

If it is an address, then how to do if the employee has more than one address? At this point the modeling is in a cast. You will have future problems and a lot of work to remodel to multiple addresses. Note that in the table "person_addresses" has a column for country code. So many think, "but the system will only be in my country because I need country code?". Once again we have fallen into the question of globalisation. Think of an immigrant official, a Bolivian, an Argentine, a Colombian, and that’s very common in Brazil. So the company needs to register this employee but he does not have a fixed address in Brazil but obviously has a fixed address in his home country. If the system does not allow international addresses, then you will have a system limitation.

Why did I change it to "person"? So that the structure is useful for other situations and not be tied only the registration of "employees". This same table can be used for clients, for example, and this is also the reason for removing other columns like id_depto, id_funcao, etc, and organizing them in relationship tables.

In "person_roles" is where does the relationships between the person id and the function id. This table also contains date of start/admission and end/dismissal. The dates could also be in another table, but I find it unnecessary. You can have cases where the same person enters and leaves a department and returns to the same office and department. You must therefore allow multiple entries for the same person.

Prefer something more generic because the end date of a job does not always mean resignation. The person/employee can be transferred or promoted or simply finished some internship, died, fell ill, pregnancy, retired, finally. The interesting thing would be to create another table to make the relationship of reasons of termination, if you want to have more control.

Also note that the "person_roles" table allows the same person to hold multiple positions. And this is very common in many companies to have people who work in 2 or more departments performing sometimes the same or totally different functions.

To quote an example, a polyglot executive. It is of the administrative department whose main function is of trader. During the time you’re not traveling around the world, you’re in the office playing a translator or whatever. This is a mere example of a multi-function employee.

On the proposed modeling, how to know which departments or departments a particular employee belongs to? Through the relationship in the table "department_roles" we obtain the information.

Note that the tables are independent of each other. The important thing is the relationship tables. This is what makes the system flexible and reusable.

*The nomenclatures, such as the proposed modeling is merely a suggestion. Doesn’t mean it should be exactly as presented. Just try to understand how to organize tables and create relationships more dynamically. Always think about reusing the structure and flexibility of the system.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.