Good practices when working with file processing


I have an ASP.NET MVC web application in C# and received a new requirement: read a text file with thousands of lines, where each line contains a set of data that will be used to insert into and update the database.

My question is which best practices to adopt for this development. From my limited experience, I know that processing such a large volume of data takes a long time, sometimes more than an hour, and ends up causing a timeout. I believe simply increasing the timeout is not the best solution.

I also need to show the user the processing status, preferably in real time. What are my options?

  • You can consider bulk operations, as discussed in this answer: http://answall.com/a/9344/3084. Bulk operations are provided by the database itself, and their goal is exactly this: loading large volumes of data.

  • If I understood correctly, bulk loading would let me save the text file's data into a database table. In my case, the "columns" of each row are not separated by a character or tab (FIELDTERMINATOR), which seems to be a prerequisite for bulk loading to work. The files follow their own layout, and I am not at liberty to change it. I would first have to edit row by row, adding something to separate those columns. And what about validating each field? I could validate the file beforehand, but the performance question remains.
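Since the file has no field delimiters, one common approach is to parse each line by fixed character positions before any database work. A minimal sketch, assuming a hypothetical three-field layout (the field names, offsets, and widths below are illustrative, not the asker's real layout):

```csharp
using System;
using System.Globalization;

public class ImportRow
{
    public string Code;
    public string Name;
    public decimal Amount;
}

public static class FixedWidthParser
{
    // Offsets and widths are assumptions for this example; adjust them
    // to match the real file layout.
    public static ImportRow ParseLine(string line)
    {
        return new ImportRow
        {
            Code   = line.Substring(0, 10).Trim(),
            Name   = line.Substring(10, 40).Trim(),
            Amount = decimal.Parse(line.Substring(50, 12).Trim(),
                                   CultureInfo.InvariantCulture)
        };
    }
}
```

Parsing (and validating) in memory this way avoids rewriting the source file just to add separators.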

2 answers

2

From my limited experience, I know that processing such a large volume of data takes a long time, sometimes more than an hour, and ends up causing a timeout. I believe simply increasing the timeout is not the best solution.

But in this case it is. However fast your application may be, I think a higher timeout is worthwhile here.

I also need to show the user the processing status, preferably in real time. What are my options?

Ajax and a nice progress bar. I suggest NProgress.js.
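On the server side, the Ajax poll needs an endpoint that reports how far along the import is. A minimal MVC sketch (controller name, action names, and the static counters are all illustrative; a real implementation would track progress per job):

```csharp
using System.Web.Mvc;

public class ImportController : Controller
{
    // Single shared counters for simplicity; in production you would key
    // these by an import-job identifier.
    private static int _processed;
    private static int _total;

    // Called periodically by the page's Ajax poll; the client can feed
    // the fraction into NProgress.set(fraction).
    public ActionResult Progress()
    {
        double fraction = _total == 0 ? 0 : (double)_processed / _total;
        return Json(new { processed = _processed, total = _total, fraction },
                    JsonRequestBehavior.AllowGet);
    }

    // Called by the import loop as it advances.
    public static void Report(int processed, int total)
    {
        _processed = processed;
        _total = total;
    }
}
```

The client-side poll then just requests `/Import/Progress` every few seconds and updates the bar.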

  • This NProgress is very cool; I will certainly use it, if not in this project, then in others. Regarding the file processing, do you really believe increasing the timeout is worthwhile? Suppose I have no way of knowing (and I really don't) how long the processing will take: if it takes 10 hours, would it really make sense to set a timeout of 12 hours or more? I don't think the user wants to wait that long. What are my options for improving processing performance? Async with Parallel?

  • Yes, you can allow as much time as needed. I don't think it takes that long to process a file with thousands of lines. Async with Parallel can be a good option.
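The "Async with Parallel" idea mentioned above can be sketched as parallelizing the CPU-bound work (parsing and validation) per batch, while keeping database writes as a single bulk operation per batch. Batch size and degree of parallelism below are assumptions, not tuned values:

```csharp
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class BatchProcessor
{
    public static void ProcessFile(string path)
    {
        // Group the lines into batches of 1000 (an arbitrary example size).
        var batches = File.ReadLines(path)
            .Select((line, i) => new { line, i })
            .GroupBy(x => x.i / 1000, x => x.line);

        foreach (var batch in batches)
        {
            // Parse/validate each line in parallel; keep database writes
            // out of this hot loop and flush the whole batch at once.
            Parallel.ForEach(batch,
                new ParallelOptions { MaxDegreeOfParallelism = 4 },
                line =>
                {
                    // per-line parsing and validation goes here
                });

            // ...then one bulk insert/update per batch.
        }
    }
}
```

Batching also keeps memory bounded and gives natural checkpoints for progress reporting.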

0

Take a look at this Macoratti tutorial http://www.macoratti.net/vbn_prvd.htm and also at this link https://www.connectionstrings.com/textfile/

I adapted that code and use the Windows OLEDB library to read the file. Once the connection is created, I select the file and generate a DataTable in memory.

Once it is loaded into memory, I use a bulk operation to load it into the database.
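A sketch of the approach this answer describes: read the TXT file through the OLEDB text driver into a DataTable, then push it to SQL Server with SqlBulkCopy. The paths, connection strings, and table name are placeholders:

```csharp
using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;

public static class TextFileImporter
{
    public static void Import()
    {
        // Data Source points at the FOLDER; the file name goes in the query.
        var oleConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\import\;" +
                      @"Extended Properties=""text;HDR=No;FMT=Fixed""";

        var table = new DataTable();
        using (var conn = new OleDbConnection(oleConn))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM [data.txt]", conn))
        {
            // The Schema.ini next to data.txt defines the column layout.
            adapter.Fill(table);
        }

        using (var bulk = new SqlBulkCopy(
            "Server=.;Database=MyDb;Integrated Security=true"))
        {
            bulk.DestinationTableName = "dbo.ImportStaging";
            bulk.BatchSize = 5000;
            bulk.WriteToServer(table);
        }
    }
}
```

Loading into a staging table first also gives you a place to run set-based validation and the insert/update logic in SQL.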

I have done this several times, and it has worked well for text files with quite varied layouts.

Important tip: for this to work, you need to create a Schema.ini file in the same directory as the source file. This file contains the configuration data the OLEDB library needs in order to read the TXT file.
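For a fixed-width file (no delimiters, as in the question), the Schema.ini might look like the following. The file name, column names, and widths here are illustrative:

```ini
; Schema.ini placed in the same folder as data.txt
[data.txt]
Format=FixedLength
ColNameHeader=False
Col1=Code Char Width 10
Col2=Name Char Width 40
Col3=Amount Currency Width 12
```

`Format` also accepts values such as `Delimited(;)` or `TabDelimited` for files that do have separators.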
