Save Captured Data and Executed Commands with Python

I need to use Python to run several commands on a system while simultaneously saving everything that is done into a dataframe. There will be hundreds of commands per minute, for several hours, daily.

Given that, how would you deal with the risk of problems during execution that terminate the application abruptly (e.g. power outage, computer crash, miscellaneous bugs, etc.)?

Please correct me if I'm wrong, but I believe that when I add data to a Pandas dataframe, however large the dataframe is, it is still just a variable during execution, and if there is a problem the variable is totally lost along with its data? Or is it stored somewhere else?

I thought about saving to Excel... but in this situation, with this data volume and processing speed, would it be feasible to append each executed command to an Excel file immediately, saving the file on every new piece of information added?

Or would saving to a SQL Server database be more appropriate? I think that volume of data would overload the network daily, and there may be better alternatives.

So, does anyone know a better alternative?

  • What would these commands be saved for? A string that was typed in a cmd?

  • Leandro, exactly. Strings from a cmd-like system, both captured and sent, would be saved.

1 answer

Yes, you are correct: the dataframe is just an object in memory, and if the program stops for any reason, all data is lost. Furthermore, the size of the dataframe is limited by the available memory. Although computers today typically have several GB of memory, which can accommodate a good-sized dataset, depending on the amount of data you need to save in each transaction, it may run out.

This is a typical case where a database is a good solution. You don't even have to worry about setting up a database server or having a dedicated machine for it: SQLite, which comes embedded in Python, works very well in such a situation, without having to set up anything.

You can then ensure that every transaction is saved to the DB independently: if something happens, all the data will still be there when you restart the application.

If you need to read the data into a dataframe to operate on it (as long as it fits into memory), this is relatively easy too.
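A minimal sketch of that read-back step, assuming a hypothetical table named `comandos` (the table, columns, and sample rows below are illustrative, not from the question):

```python
import sqlite3
import pandas as pd

# An in-memory database just for this sketch; in practice you would
# connect to the same file your capture application writes to.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comandos (momento TEXT, comando TEXT)")
conn.executemany(
    "INSERT INTO comandos VALUES (?, ?)",
    [("10:00", "dir"), ("10:01", "cd logs")],
)
conn.commit()

# pandas can load a SQL query result straight into a DataFrame.
df = pd.read_sql("SELECT * FROM comandos", conn)
```

From there, `df` is an ordinary DataFrame and all the usual Pandas operations apply.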

If you want an independent application that analyzes the already-saved data while another application saves the "commands" to the database, then it may be better to use MySQL or PostgreSQL: one of the limitations of SQLite is that only one program can use the DB at a time.

Since you didn't give an example of the data you want to save and of the analysis you want to do later, I can't give a complete example. But simply, at the point in your code where you would create a new line in the dataframe, you issue a command like db.execute("""INSERT INTO nome_da_tabela VALUES (?, ?, ?, ?)""", (dado1, dado2, dado3, dado4)), and done: the values are persisted (you only need to issue a command to create the table with the proper structure beforehand, of course).
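Putting those pieces together, a minimal sketch might look like this (the file name, table name `comandos`, and its columns are illustrative assumptions, since the question doesn't describe the data):

```python
import sqlite3

# Single file on disk; created automatically if it doesn't exist.
conn = sqlite3.connect("comandos.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS comandos "
    "(momento TEXT, comando TEXT, saida TEXT)"
)

def registrar(momento, comando, saida):
    # One INSERT plus one commit per command: everything committed
    # so far survives even if the program dies abruptly afterwards.
    conn.execute(
        "INSERT INTO comandos VALUES (?, ?, ?)",
        (momento, comando, saida),
    )
    conn.commit()

# Called wherever the original code would have appended to the dataframe:
registrar("2024-01-01 10:00:00", "dir", "...")
```

Committing on every insert is the safest option against abrupt termination; if the per-command overhead matters at hundreds of commands per minute, one common compromise is to commit in small batches instead.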

SQLite databases also have the advantage of being contained in a single file, so if you want to copy the database to another machine for analysis, just send that file.

Just read a tutorial or watch a video about using SQLite in Python before you start; what you want is actually quite simple.

  • Thank you very much for the explanations and suggestions. Where I will use this, they use Microsoft SQL Server for their databases; is it possible to do this with it too? Can I save this SQL database on the computer itself, not necessarily on a server? (That way I reduce problems with connection drops and network slowness.) Sorry for the layman's question, but if it is saved on the computer, would it be more feasible/faster in database format, or as .xls (Excel), or .csv?

  • It is possible to use Microsoft SQL Server, and it is possible to save on the computer itself; in that case, use SQLite as suggested. The Excel format is quite unsuitable for this, as it would require the entire file to be rewritten with each new data line. CSV could serve, but has no advantage over a database, not even over SQLite.
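To illustrate the difference mentioned above: unlike an .xlsx file, a CSV can at least be appended to line by line without rewriting the whole file (file name and row contents below are illustrative):

```python
import csv

# Append mode: only the new line is written, the rest of the
# file is untouched. Still no transactions or crash safety,
# which is why a database remains the better choice here.
with open("comandos.csv", "a", newline="") as f:
    escritor = csv.writer(f)
    escritor.writerow(["10:00", "dir"])
```

Even so, a half-written line after a crash can corrupt the tail of a CSV, whereas SQLite's commits are atomic.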
