Manage Datatable with lots of information

Asked

Viewed 123 times

3

I have here a "small" problem with a DataTable.

From an SQL query I am getting a huge amount of records (> 1,000,000), and although the consultation is relatively quick, the loading of the DataTable from the Fill() is extremely slow and in some cases generates an error of System.OutOfMemoryException because the process where the code is being executed exceeds the 4gb (compulsory compilation in x86).

My question is whether there is any way to reduce the charging time of DataTable and at the same time prevent the memory from exceeding the limit.

Note that this information will be used later for XML serialization.

  • 1

    Not using Datatabel? : ) Really. Other technologies were created because this was problematic.

  • Yes, if there really is no other chance we may have to change! Will using a DataReader and go filling the objects, one by one, will it be faster? Or using a IEnumerable<T>?

  • 3

    Much faster. Doing it by hand or using Dapper will be the quickest options. Even the Entity Framework will be much better most of the time, especially Core, although a little more complex.

  • Thanks @Maniero, I think that with one of these options we can overcome the problem!

  • 1

    also think about paging, after all if only the data weighs enough, imagine an XML with more than 1kk of records...

  • @Rovannlinhalis in this case has to be a single file, however large it is.

  • 1

    complicated... but then would think of using a datareader to go reading and writing, without storing everything in memory

Show 2 more comments

1 answer

4


The DataTable was an interesting idea, but very poorly implemented, it really carries too much memory.

Hence the need to make a real ORM and Microsoft did the Entity Framework which is heavy, boring to use at first, is slow, but works much better. And the EF Core it’s not that bad. I’m not a fan of it for several reasons, but it’s an evolution.

Curiously, right at the beginning, I already had better technology for most of the cases, which is the DataReader, if you only need to read data, if you don’t need to be online with the database it is a more correct solution always. In fact even if you need to write and be online the DataTable is only feasible for simple things, and just because the problem is not very noticeable, not because it is good.

Direct access to database provider or use a simple ORM such as Dapper are the fastest options you can get. The first has no extra layer, it just gives you access and you do all access at hand. It’s less work than you think in most cases, even if you have to take care of certain things. Dapper generates the objects for you and achieves incredible performance even so and delivers more ready, which can be very useful, especially if you are going to serialize.

  • +1 for quoting Dapper, although many "lazy" say that he is very verbose for writing Sql in the code nor care for his performance that he brings.

  • 1

    I find the EF verbose :D Knowing to create the right abstractions the code gets much smaller c/ Dapper :D

Browser other questions tagged

You are not signed in. Login or sign up in order to post.