How to save space in audit log?

Asked

Viewed 201 times

4

I was creating a system that provides me data of my users according to the date I request, from X to Y for example. And the only way I could find for that was to create a table that way:

date with the date that was inserted in DATE format

name name of the saved, for example Browser

value the value entered, for example Google Chrome

ip with the user’s IP

At the time the user enters I check if there is a Row on the date of TODAY with the name BROWSER if there is no it inserts, if there is it does nothing.

Well that way he’ll insert the browser of the users who accessed my site and then I can select, create charts etc

The problem with that would be space, wouldn’t that take up a lot of space? Taking into account for example an average of 5-10,000 visits a day, and taking into account that I don’t want to capture just the browser. How could I solve this problem?

  • 1

    And how about putting the structure since I didn’t understand what kinds of fields put in your definition...

2 answers

6


The simple answer to this is: it does not. And even if it does, if you have this need you must do it on your own. You don’t have a problem to solve, just a matter to solve.

You have log access to website connected? All website He usually does. Nobody complains about a lack of space for him. There is all this information you want and probably much more redundant that will take up much more space.

If you lack space buy more. If you can’t buy, stop doing what is taking up space. There is no miracle.

Of course there are amazing solutions but probably not worth the complexity, besides being questionable whether it will give good result, disk space is absurdly cheaper and without risks.

For some time you came to accept the other answer, perhaps because you found something good in it but not quite. Maybe you didn’t understand what I meant. I’ll try to make it clear.

The proposal of the other answer is to gather two separate items of information into one line. This is really good because it eliminates some repetitions. Only if you are to do this then you must treat each access as a line. That is, its structure has to be completely different from what it is doing. And the way you’re doing it could be necessary for some reason only you know. No one can say what’s best Without knowing everything they need.

The problem of this answer is that if it is to create a line for each access it probably does not make sense to create a database for this, this line already exists in the log from the HTTP server. Just consult it and do the statistics you want. Duplicating efforts is wasting space. In fact, other people have already done this for you and there are hundreds or thousands of free commercial programs that generate quite complete statistics.

  • One thing that could reduce space would be to do some 'cron-job' to delete old data, so it would always keep the same size. Well, this is what I do on my site, there’s a table with 1.4GB, before there was 89GB, but it was a lot of useless, old data. Now is erased what has more than 1 week. I think 'is the way'.

  • When I finished the project and analyzed it I was "can only be wrong" I see that a way for this would be to use an external server just for this,

  • I thought about it, to make my system generate auromatic reports and save them and then erase that useless data

  • @bigown what information the http log saves?

  • @user3163662 all that you want it to have and that are available on an HTTP server. What you cannot put on Kig, can’t put it anywhere else.

  • Could you for example know the duration of the customer on the page? has informative matters? I don’t know much about this subject

  • It has how to make an estimate, but it is very flawed. For this you can try, but it is also not guaranteed to make the JS go informing the server.

Show 2 more comments

1

Some things can be changed to save space, but it’s not that simple.

First, you say you have name and value.

Wouldn’t it be better to create a column for each fixed value?

   date    |   name    |     value     |      ip
2014/12/27 | BROWSER   | GOOGLE CHROME | 127.0.0.1
2014/12/27 | HTTP_REF  | GOOGLE.COM    | 127.0.0.1

In this case there would be a duplication, of date and ip that could be solved if it were:

 date      |   browser       |     HTTP_REF     |      ip
2014/12/27 | GOOGLE CHROME   |    GOOGLE.COM    | 127.0.0.1

That way it would save some lines and duplicate data.

Why not use INT?

If the name is entered by you, why not use INT?

0 -> BROWSER
1 -> HTTP_REF

   date    | name  |     value     |      ip
2014/12/27 | 0     | GOOGLE CHROME | 127.0.0.1
2014/12/27 | 1     | GOOGLE.COM    | 127.0.0.1

Cron-Job to erase old data

If you save daily, create a Cronjob at 00:00 and with it make all the necessary averages and erase the data, in an old TRUNCATE. So it would convert 10thousand lines into one containing all the averages and data already calculated.

  • Show, had not thought about it, is little but on a large scale becomes much.. I had a question, that way I presented taking into consideration the insertion of values like COUNTRY (where it came from) and BROWSER, would have to create a query that shows me which browsers most used in a given COUNTRY?

  • You can use "SELECT Value, Count(Value) Value FROM Visitas WHERE ip IN (SELECT ip FROM table WHERE Name = 'Country' and Value = 'BRAZIL') and Name = 'Browser' GROUP BY Value" If this post was useful, mark it as useful.:)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.