await Task.WhenAll: how do I run multiple tasks?

5

I am trying to create several (async) tasks that will perform the following logic:

  1. Parse the HTML at a given URL with HtmlAgilityPack
  2. Return a product model after the parse
  3. Insert the product into the database
  4. Download the product images
  5. Mark the URL as read

Items 1 and 4, especially 4, take time because of the speed of the internet link, so they should be async. But I'm having difficulties: all my code runs, but synchronously.

private static void Main(string[] args)
{
    IEnumerable<UrlsProdutos> registros = db.UrlsTable.Where(w => w.Lido == false).Take(1000);

    ExecutaTarefasAsync(registros).Wait();
}

public static async Task ExecutaTarefasAsync(IEnumerable<UrlsProdutos> registros)
{
    var urlTasks = registros.Select((registro, index) =>
    {
        Task downloadTask = default(Task);

        // parse the HTML
        var produtoTask = ExtraiDados.ParseHtml(registro.Url);
        if (produtoTask.IsCompleted)
        {
            var produto = produtoTask.Result;
            // here I do an insert with Dapper
            downloadTask = InsertAdo.InsertAdoStpAsync(produto);
        }

        // mark the URL as read, same as the product insert
        InsertAdo.MarcaComoLido(registro.UrlProdutoId);

        Output(index);

        return downloadTask;
    });

    await Task.WhenAll(urlTasks);
}

public static void Output(int id)
{
    Console.WriteLine($"Executando {id.ToString()}");
}

The insert is hard-coded, just for testing:

public static async Task InsertAdoStpAsync(Imovel imovel)
{
    var stringConnection = db.Database.Connection.ConnectionString;
    var con = new SqlConnection(stringConnection);
    var sqlQuery = "insert tblProdutos...etc..etc";
    con.ExecuteAsync(sqlQuery);
}

I don't know whether every function should be async, or whether I could make only the download and the parse async.

My async photo download works perfectly:

public static async Task DownloadData(IEnumerable<FotosProdutos> urls)
{
    var urlTasks = urls.Select((url, index) =>
    {
        var wc = new WebClient();
        // verbatim string, so the backslashes are not treated as escapes
        string path = @"C:\teste\" + url.FileName;

        var downloadTask = wc.DownloadFileTaskAsync(new Uri(url.Url), path);
        return downloadTask;
    });

    await Task.WhenAll(urlTasks);
}

I need help making ExecutaTarefasAsync truly asynchronous, like the photo download, which I haven't even been able to incorporate into this project yet.

NOTE: I don't know whether the photo download should happen inside the parse or in this task.

2 answers

3

To make a function asynchronous, you use Task, as you've already discovered. The way to do something like this is:

public static Task MakeRequestAsync(int i)
{
    return Task.Run(() =>
    {
        // your code here
    });
}

public static void Main(string[] args)
{
    var tasks = new List<Task>();

    for (int i = 0; i < 10; i++)
    {
        tasks.Add(MakeRequestAsync(i));
    }

    // Waits for every MakeRequestAsync call to finish.
    Task.WaitAll(tasks.ToArray());
}

That way, when you call MakeRequestAsync from an async method, you can use await as expected with asynchronous methods:

await MakeRequestAsync(i);

It is also worth saying that a Task gives no guarantee as to when it will be executed: it can either start running immediately or be queued.

Another important thing is to understand the difference between Task.WhenAll and Task.WaitAll. Task.WhenAll returns another Task that you can await whenever it suits you while the code keeps running; that is, it is a "non-blocking" function. Task.WaitAll, on the other hand, blocks the flow of the code until all the Tasks have finished.

In short: a method that is meant to be asynchronous should actually start a Task and return an awaitable object. The easiest way to do this is by using Task.Run.
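To make the difference concrete, here is a minimal sketch. SimulateWorkAsync is a hypothetical stand-in for a real I/O call (Task.Delay simulates the wait), and the class and method names are made up for illustration only.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllVersusWaitAll
{
    // Hypothetical stand-in for a slow I/O call; Task.Delay simulates the wait.
    public static async Task<int> SimulateWorkAsync(int id)
    {
        await Task.Delay(200);
        return id * 10;
    }

    // Task.WhenAll returns a Task; awaiting it does not block the calling thread.
    public static async Task<int[]> RunWithWhenAllAsync()
    {
        var tasks = Enumerable.Range(1, 3).Select(id => SimulateWorkAsync(id));
        return await Task.WhenAll(tasks);
    }

    // Task.WaitAll returns nothing and blocks the calling thread until all tasks finish.
    public static int[] RunWithWaitAll()
    {
        var tasks = Enumerable.Range(1, 3).Select(id => SimulateWorkAsync(id)).ToArray();
        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }

    public static void Main()
    {
        int[] blocking = RunWithWaitAll();
        // A console Main has to block somewhere; inside an async method you would await instead.
        int[] awaited = RunWithWhenAllAsync().GetAwaiter().GetResult();

        Console.WriteLine(string.Join(",", blocking)); // 10,20,30
        Console.WriteLine(string.Join(",", awaited));  // 10,20,30
    }
}
```

Both versions produce the same results in the same order; what changes is whether the calling thread is held while the work is pending.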

3


A point I always stress in questions about async/await: the keywords do not by themselves make a method run asynchronously; they only let the programmer write asynchronous methods in a flow close to what would be written for synchronous ones. What async/await does is tell the compiler that, on reaching an await, the method should wait until the awaited operation completes, but without blocking the calling thread (the UI thread in Windows Forms or the IIS pipeline thread in web applications, for example).

In your specific case, async/await is not being used correctly. When a method is async, the fact that it returns a Task does not mean the caller has to capture that Task as a return value. By using await on the call, you are already telling the compiler that it should wait for the method to finish. I'm not sure I explained that clearly, but I think the example will make this dynamic easier to understand.

To make the execution truly asynchronous, in the sense of having each record run on a different thread in "parallel", I did the following:
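A minimal sketch of that first point, assuming two hypothetical methods (NotReallyAsync and ReallyAsync are invented for illustration): a method marked async that never awaits runs entirely on the caller's thread, while one that awaits a pending operation hands control back at the await.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncKeywordDemo
{
    // Marked async but never awaits: the whole body runs synchronously on the
    // caller's thread, and the compiler reports warning CS1998.
    public static async Task<int> NotReallyAsync()
    {
        Thread.Sleep(200); // blocks the caller for 200 ms
        return 1;
    }

    // Awaits a pending operation: control returns to the caller at the await.
    public static async Task<int> ReallyAsync()
    {
        await Task.Delay(200); // does not block the caller
        return 2;
    }

    public static void Main()
    {
        Task<int> first = NotReallyAsync();
        Console.WriteLine(first.IsCompleted);  // True: the work already ran before the call returned

        Task<int> second = ReallyAsync();
        Console.WriteLine(second.IsCompleted); // False: the delay is still pending

        Console.WriteLine(second.GetAwaiter().GetResult()); // 2
    }
}
```

This is also why checking produtoTask.IsCompleted right after the call, as in the question, only succeeds when ParseHtml has in fact already finished by the time it returns.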

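// I created these classes here just so the example can run and show a case where each call takes 5 seconds

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading.Tasks;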

public class FotosProdutos
{
    public string FileName { get; set; }
    public string Url { get; set; }
}

public class UrlsProdutos
{
    public int UrlProdutoId { get; set; }
    public string Url { get; set; }
}

public class Imovel
{

}

public class InsertAdo
{
    public static async Task InsertAdoStpAsync(Imovel imovel, int index)
    {
        await Task.Delay(TimeSpan.FromSeconds(5));

        Console.WriteLine(String.Format("{0} - InsertAdoStpAsync {1}", DateTime.Now, index));
    }

    public static async Task MarcaComoLido(int urlProdutoId, int index)
    {
        await Task.Delay(TimeSpan.FromSeconds(5));

        Console.WriteLine(String.Format("{0} - MarcaComoLido {1}", DateTime.Now, index));
    }
}

public class ExtraiDados
{
    public static async Task<Imovel> ParseHtml(string url, int index)
    {
        await Task.Delay(TimeSpan.FromSeconds(5));

        Console.WriteLine(String.Format("{0} - ParseHtml {1}", DateTime.Now, index));

        return new Imovel();
    }
}

class Program
{
    static void Main(string[] args)
    {
        // I put placeholder data here just so I could reproduce this without your Dapper dependency
        IEnumerable<UrlsProdutos> registros = new List<UrlsProdutos>() { new UrlsProdutos { UrlProdutoId = 1 }, new UrlsProdutos { UrlProdutoId = 2 }, new UrlsProdutos { UrlProdutoId = 3 } };

        // Runs a separate Task for each record.
        // The way you were doing it, without Task.Run(), was basically the same as a for loop going through your collection synchronously, item by item
        var tarefas = registros.Select((registro, index) =>
        {
            return Task.Run(async () => await ExecutaTarefaAsync(registro, index));
        });

        Task.WaitAll(tarefas.ToArray());

        // meant to wait a bit longer before the final log; note that this Task is never awaited, so it does not actually delay anything
        Task.Delay(TimeSpan.FromSeconds(5));

        Console.WriteLine(String.Format("{0} - Acabou!", DateTime.Now));

        Console.Read();
    }

    // I changed your method to handle a single record, just to make the example clearer
    public static async Task ExecutaTarefaAsync(UrlsProdutos registro, int index)
    {
        Output(index);

        // calls your parse method, waiting for everything asynchronous inside it, and takes the result right after
        var produto = await ExtraiDados.ParseHtml(registro.Url, index);

        // since the parse has finished, insert the record using its result
        await InsertAdo.InsertAdoStpAsync(produto, index);

        // finally, mark the record as read
        await InsertAdo.MarcaComoLido(registro.UrlProdutoId, index);

        Output(index);
    }

    // not used here, but I changed it so you can see how await works
    public static async Task DownloadData(FotosProdutos url, int index)
    {
        var wc = new WebClient();
        string path = @"C:\teste\" + url.FileName;

        // here you don't need to capture the Task; with await, the compiler already knows you want to wait for the async method's result before continuing
        await wc.DownloadFileTaskAsync(new Uri(url.Url), path);

        Console.WriteLine(String.Format("{0} - DownloadData {1}", DateTime.Now, index.ToString()));
    }

    public static void Output(int id)
    {
        Console.WriteLine(String.Format("{0} - Executando {1}", DateTime.Now, id.ToString()));
    }
}

When running this example, I had the following output:

13/10/2016 00:56:27 - Executando 1
13/10/2016 00:56:27 - Executando 2
13/10/2016 00:56:27 - Executando 0
13/10/2016 00:56:32 - ParseHtml 1
13/10/2016 00:56:32 - ParseHtml 2
13/10/2016 00:56:32 - ParseHtml 0
13/10/2016 00:56:37 - InsertAdoStpAsync 0
13/10/2016 00:56:37 - InsertAdoStpAsync 1
13/10/2016 00:56:37 - InsertAdoStpAsync 2
13/10/2016 00:56:42 - MarcaComoLido 2
13/10/2016 00:56:42 - MarcaComoLido 1
13/10/2016 00:56:42 - Executando 2
13/10/2016 00:56:42 - Executando 1
13/10/2016 00:56:42 - MarcaComoLido 0
13/10/2016 00:56:42 - Executando 0
13/10/2016 00:56:42 - Acabou!

That is: since each method takes 5 seconds to run and all three records complete each step at the same moment, we can see that the records were processed concurrently, each in its own Task, rather than one after another.

  • I'll try your code here, but thank you for the lesson! Very good, very detailed...

  • Just one question: where would be the right place for DownloadData? Inside the product parse, or at the root, in Program.cs, in ExecutaTarefaAsync? I ask because it takes the longest, and it can only run after ParseHtml has completed and returned the product model.

  • 1

    @Dorathoto since, as you said, the download is one of your steps and depends on your parse, I would put the call inside ExecutaTarefaAsync(). Since the processing of each image seems to be independent, I would test the difference between running the list in parallel and one by one, to see which performs better in your application.
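A rough sketch of the comparison suggested in that last comment, under stated assumptions: DownloadImageAsync is a hypothetical stand-in for the real wc.DownloadFileTaskAsync call (a fixed Task.Delay simulates network time), and the file names are invented. Real timings will depend on the link and the server.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ImageDownloadComparison
{
    // Hypothetical stand-in for the real download; the delay simulates network time.
    public static async Task<string> DownloadImageAsync(string fileName)
    {
        await Task.Delay(200);
        return fileName;
    }

    // One by one: each download starts only after the previous one has finished.
    public static async Task<List<string>> DownloadSequentiallyAsync(IEnumerable<string> fileNames)
    {
        var done = new List<string>();
        foreach (var fileName in fileNames)
        {
            done.Add(await DownloadImageAsync(fileName));
        }
        return done;
    }

    // All at once: every download is started first, then they are awaited together.
    public static async Task<List<string>> DownloadConcurrentlyAsync(IEnumerable<string> fileNames)
    {
        string[] done = await Task.WhenAll(fileNames.Select(f => DownloadImageAsync(f)));
        return done.ToList();
    }

    public static void Main()
    {
        var files = new[] { "a.jpg", "b.jpg", "c.jpg" };

        var watch = Stopwatch.StartNew();
        DownloadSequentiallyAsync(files).GetAwaiter().GetResult();
        Console.WriteLine($"One by one: {watch.ElapsedMilliseconds} ms");

        watch.Restart();
        DownloadConcurrentlyAsync(files).GetAwaiter().GetResult();
        Console.WriteLine($"In parallel: {watch.ElapsedMilliseconds} ms");
    }
}
```

Either method could be awaited from ExecutaTarefaAsync right after ParseHtml returns the product, since the image list is only known at that point.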
