Best way to structure history table with large amount of data

Question

Best way to structure history table with large amount of data

Asked 9 years, 2 months ago

Viewed 842 times

3

Functional requirement

I own devices. Each device roughly has its unique identifier, an IP address, and a type.

I have a routine of pings for every device that has an ip address. This routine is nothing more than a C#executable, which runs every 3 minutes and tries to drip such IP address from the device.

The result of ping i need to store in the database as well as the date of the check (independent of the result of the ping).

Technical Part:

Supposing that my process of ping and bank structuring be ready from the day 01/06/2016, I need to do two things:

Daily extraction (which is a fed table)
Real-time extraction (last 24 hours)

Both must return to me the same thing:

Devices that have been unavailable for more than 24 hours.
Devices that have been unavailable for more than 7 days.

Unavailable means the device which, when dripping, did not respond to the ping.

Available means the device which, upon being dripped, ping successfully.

What I have today and works very badly:

A table historico_disponibilidade, with the following structure:

create table historico_disponibilidade (id_dispositivo number, respondeu number, data date);

This table has a large amount of data (today it has 60 million, but the trend is to grow exponentially)

The questions:

How to achieve such goals without bumping into problems of slowness in queries?
How to create a table(s) structure that is prepared to receive millions/billions of records within my corporate universe?

I worked on a system where a single table had 40gigas of data (BD Oracle), the Selects on it was as fast as a smaller table, because it used the correct indexes. The structure of this table is good, you did not put the indexes, but if the id_device is the key and you want to get a range of dates there then you will have performance problems. If in your environment you can simulate such a large table in homologation is worth it for you to have the experience of how it worked.

– Ricardo

2016/05/24 at 15:03
@Ricardo, where do you think the index should exist? Yes, I need to pick up by a range of dates and one by id_device as well.

– Marllon Nasser

2016/05/24 at 15:16
So, in the log tables that I used we only had an index on the date of the record, nothing else, the index is a new table with that column, it takes space, and there are many, many tables. The selects in the log were always run with a date to use the index. In maintenance was run about 10 logs a day, it was not much, I think it will depend on how much space you can have and how many queries you saw want, I think an index on the date would solve in your case. This data will be displayed on screen for user?

– Ricardo

2016/05/24 at 15:57

1 answer

Browser other questions tagged database oracle modeling optimizing

You are not signed in. Login or sign up in order to post.

by Maniero • **444,682** points · Answer 1 · 2016-05-24T14:43:28+00:00

The problem does not seem to be exponential or polynomial. There are some that are, there would not be much to do. If there is an exponential or polynomial implementation, you have to solve it, but the question does not make it clear how it is being done, it seems to me that it is not even the case.

The volume of rows in a table should affect very little the performance of darlings, as long as it has the necessary indexes and the queries themselves do not do very weird things.

Even if it does, the solution would be to erase the data you no longer need. If you need to keep the data for historical purposes only and no longer need to keep queries in the old data (at least not often) the solution is to have an auxiliary table that would function as a dead file. There would be a transfer from the normal table to the dead file. Apparently you only need to keep the last 7 days. I don’t think I need to say that this transfer can be automated. It is possible to use ranges of data to partition the tables, but I don’t know how this works on Oracle.

All the good databases in the market, especially Oracle, are prepared to work with billions of lines without problems. If you have a specific problem in a query and you’re not finding a solution (probably with content) so post a specific question.

Note that the table is so simple that it has nothing to simplify. Unless you have hidden things in the question.

I wonder if the whole routine is wrong. It seems strange to me to do what is being done, I think is the right solution to the wrong problem, but I will not get into the merit.