Modifying a list item shared by multiple threads

Viewed 283 times

6

I have the following pseudo-code:

public void Associar(List<Data> dados)
{
   List<Task> tasks = new List<Task>();
   foreach (var dado in dados)
   {
       tasks.Add(AdicionarAsync(dado));
   }
   Task.WaitAll(tasks.ToArray());

   Debug.WriteLine(dados.Select(e => e.Colecao).Sum(e => e.Count));
}

public async Task AdicionarAsync(Data dado)
{
   dado.Colecao = await consultanobanco(dado.Id);
}

The output of this code should always be 411 (the total number of records in the database). However, the result varies each time the Associar method is executed. I added a Thread.Sleep(10); just to check whether it was a concurrency problem, and the problem was "solved". What is the correct way to use a thread-safe list to modify each item of a collection distributed across several Tasks?

Debugging the code a little more, I noticed that the difference in values is actually on the line dado.Colecao = await consultanobanco();

Inside the consultanobanco() method, the return value is correct. However, by the time it is assigned to dado.Colecao, it is wrong. If I replace await consultanobanco() with consultanobanco().Result, the result is as expected.

Is there any reason for this behavior? What is the difference between await and .Result in this scenario?

  • Have you already tried awaiting AdicionarAsync?

2 answers

1

Well, the problem is that you are changing the object’s state on a different thread from the one that reads it. More precisely, this is the line causing trouble, as you noted:

dado.Colecao = await consultanobanco();

I’m not sure why .Result works; it is best not to rely on it too much. The .NET framework is free to decide whether or not to use a thread from the thread pool to run a task, and in this case it may not be using one.

I’m not going to tell you what the solution to this problem is, because I believe you have a bigger problem to deal with first. Note that the Colecao property of every object in the dados list will hold the same data, unless consultanobanco is non-deterministic. In fact, you may even get a race condition in which the data in the database changes and only some of the objects in the list reflect that change.

It is better to read the data once and share it among all the objects. That is also more efficient.

public async Task Associar(List<Data> dados)
{
   // Query the database once and share the result among all items.
   var dadosDoBanco = await consultanobanco();
   foreach (var dado in dados)
   {
       dado.Colecao = dadosDoBanco;
   }

   Debug.WriteLine(dados.Select(e => e.Colecao).Sum(e => e.Count));
}

  • The consultanobanco pseudo-method depends on specific data to return its values; I updated the question with an example. The queries run in parallel because they take too long (on the order of minutes), and running them one after another would take far too long.

  • @Joãopaulo Instead of lazy loading, use eager loading. That is, when you fetch your Data object from the database, it should already come with the collection information. Another alternative is to load all the objects at once, instead of running one query per object.

  • This query is performed separately for a reason; the data cannot come along with the object in the same database query. And, as I said, there are millions of records (billions in certain circumstances), so you can’t load them all into memory. Parallelizing these queries was a solution that fit the case, but it comes with this "feature" whose cause I cannot find... With .Result, it is running normally today.

  • @Joãopaulo I think you are missing something. I never said to bring in all the data. I only said to bring in everything you need at once. If that is still millions of records, as you say, maybe you need pagination.

  • I’m talking about the data that is really needed. Even with pagination I would have to parallelize these queries, because they would take just as long.

  • @Joãopaulo It isn’t quite like that. What are you trying to do in your program? It looks like a data migrator/exporter/backup tool... or a huge warehouse manager xD

  • That’s exactly it, a real-time data exporter. =D

  • @Joãopaulo Looks like I guessed right. I advise you to edit your question and include all the details. What is your database? Can you use a migration tool for this task? Can you just write a script and run it in the RDBMS? Can you post your real code instead of pseudo-code? Can you simply copy the database file and give it another name? Ask yourself these questions, because they help you understand your need and the best way to meet it.

0

According to this question, your problem may be related to the foreach closure:

Try this code to see whether it fixes the problem:

public void Associar(List<Data> dados)
{
   List<Task> tasks = new List<Task>();
   foreach (var dado in dados)
   {
       var tempDado = dado;
       tasks.Add(AdicionarAsync(tempDado));
   }
   Task.WaitAll(tasks.ToArray());

   Debug.WriteLine(dados.Select(e => e.Colecao).Sum(e => e.Count));
}

public async Task AdicionarAsync(Data dado)
{
   dado.Colecao = await consultanobanco(dado.Id);
}
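
Separately from the closure question, here is a minimal sketch of awaiting the tasks with Task.WhenAll instead of blocking on them with Task.WaitAll, along the lines of the comment under the question. The Data class, the Exportador class, ConsultaNoBancoAsync (a fake stand-in for consultanobanco that returns Id dummy records) and AssociarAsync are assumptions made for illustration, not the asker's real code, and the cause of the varying total is not established here, so this may or may not remove it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the Data type from the question.
public class Data
{
    public int Id { get; set; }
    public List<int> Colecao { get; set; } = new List<int>();
}

public static class Exportador
{
    // Hypothetical stand-in for consultanobanco: waits briefly,
    // then returns `id` dummy records.
    public static async Task<List<int>> ConsultaNoBancoAsync(int id)
    {
        await Task.Delay(10);
        return Enumerable.Range(0, id).ToList();
    }

    // Each task writes only to its own Data instance.
    public static async Task AdicionarAsync(Data dado)
    {
        dado.Colecao = await ConsultaNoBancoAsync(dado.Id);
    }

    public static async Task<int> AssociarAsync(List<Data> dados)
    {
        // ToList() starts every query up front so they run concurrently.
        List<Task> tasks = dados.Select(d => AdicionarAsync(d)).ToList();

        // Asynchronously wait for all of them; no thread is blocked meanwhile.
        await Task.WhenAll(tasks);

        // Read the collections only after every assignment has completed.
        return dados.Sum(d => d.Colecao.Count);
    }
}
```

From an async caller this would be used as int total = await Exportador.AssociarAsync(dados);. Awaiting Task.WhenAll also rethrows an exception from a failed query directly, whereas Task.WaitAll wraps failures in an AggregateException.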
