Low Performance when viewing Datatable C# - Plan Coded Ui Test

Asked

Viewed 85 times

1

I’m having a little trouble checking the numbers of a DataTable-X with a DataTable-Y, I have to check if the number of DataTable-X exists in the DataTable-Y (and if the DataTable-X with the data of DataTable-Y), the way I’m doing spending right 50 minutes.

Number of Lines of each Datatable:

DataTable-X => 38.258 linhas (10 colunas)

DataTable-X => 61.235 linhas (4 colunas)

In practice, the loop occurs DataTable-X x DataTable-X times, that’s a lot, but 50 minutes I think too much time.

Follows the code:

for (int i = 0; i < dTx.Rows.Count; i++)
{    
    bool ganbis1 = false;
    foreach (DataRow oLinha in dTy.Rows)
    {
        if (oLinha[0] != "")
        {
            if (Convert.ToInt64(oLinha[0].ToString()) ==             
                 Convert.ToInt64(dTx.Rows[i][1].ToString().Replace(" ","")))
            {
                // dTx = "Coluna1", "Coluna2", "Coluna3", "Coluna4",
                // "Coluna5", "Coluna6", "Coluna7", "Coluna8", "Coluna9", "Coluna10"
                
                // dTy = "Coluna1", "Coluna2", "Coluna3","Coluna4"
                // Preenche coluna 7
                
                if ((Convert.ToString(oLinha[1]) == "") || 
                    (Convert.ToString(oLinha[1]) == "#N/A"))
                {
                    dTx.Rows[i][7] = "#n/D";
                }
                else
                {
                    dTx.Rows[i][7] = Convert.ToString(oLinha[1]);    
                }
                // Preenche coluna 8
                if ((Convert.ToString(oLinha[2]) == "") || (Convert.ToString(oLinha[2]) == "#N/A"))
                {
                    dTx.Rows[i][8] = "#n/D";
                }
                else
                {
                    dTx.Rows[i][8] = Convert.ToString(oLinha[2]);
                }
                // Preenche coluna 9
                if ((Convert.ToString(oLinha[3]) == "") || (Convert.ToString(oLinha[3]) == "#N/A"))
                {
                    dTx.Rows[i][9] = "#n/D";
                }
                else
                {
                    dTx.Rows[i][9] = Convert.ToString(oLinha[3]);
                }
                // Preenche coluna 10
                if ((Convert.ToString(oLinha[4]) == "") || (Convert.ToString(oLinha[4]) == "#N/A"))
                {
                    dTx.Rows[i][10] = "#n/D";
                }
                else
                {
                    dTx.Rows[i][10] = Convert.ToString(oLinha[4]);
                }            
                ganbis1 = true;
            }
        } 
    }   
}

if ((ganbis1 == false) && (i != dTx.Rows.Count-1))
{
    // Quando não for encontrado preencher com #n/D
    dTx.Rows[i][7] = "#n/D";
    dTx.Rows[i][8] = "#n/D";
    dTx.Rows[i][9] = "#n/D";
    dTx.Rows[i][10] = "#n/D";    
}
  • Do you bring the database data? If it is database could not make an SQL comparison and do this operation in the database?

  • Unfortunately I have to read N xlsx files, and while performing this write treatment on a new xlsx with all unified.

  • 1

    I understood @D1dih but, if you did so, the reading of data play on two tables in the database (could be even temporary) and then make a process to expedite the process and soon after the result generate again the xlsx?

  • @Virgilionovic understood. This way it is possible to do yes. You believe there is no improvement in the form used above?

  • 1

    I can’t tell you and measure, but it’s slow to really read so many lines and keep comparing two very large items in memory, in my opinion you need to organize the data first (I would throw it in a database) and then extract the data. This process can improve performance since organized and logical data tends to be easier to compare ... !!!

  • @Virgilionovic Very observant and good suggestion.

  • I followed that line of organizing the data and include a break when finding the item. I sorted the columns I’m comparing in ascending order: Dataview dv = dTx.Defaultview; dv.Sort = "Column name to be sorted Asc"; dTx = dv.Totable(); Dataview dvTwo = dTy.Defaultview; dvTwo.Sort = "Column name to be sorted Asc"; dTy = dvTwo.Totable(); .

  • haaa... this has halved the processing time. :)

Show 3 more comments

2 answers

2

You’re iterating over these 2342728630 arrays, converting strings and comparing in all iterations. No wonder it’s taking so long.

I believe that the least you can do is create an associative array to avoid one being inside another is, so you only need to go through each array once, reducing the iterations to 99493.

Dictionary<long, DataRow> dic = new Dictionary<long, DataRow>();

foreach (DataRow oLinhaX in dTx.Rows) { 
    dic.Add(Convert.ToInt64(oLinhaX[1].ToString().Replace(" ",""))), oLinhaX);
}

foreach (DataRow oLinhaY in dTy.Rows) {
    if (oLinhaY[0] == "") continue;

    long oLinhaYInt = Convert.ToInt64(oLinhaY[0].ToString());

    if (dic.ContainsKey(oLinhaYInt)) {
        DataRow oLinhaX = dic[oLinhaYInt];
        //seu código aqui
    }
}

This should be a good relief, but still wait a few minutes of processing. Want more performance? Leave for multi-core use.

Edit

It was not mentioned in the question, but this solution only works if the relationship between the tables is one-to-one.

  • Very good answer, I saw that you are new to the site, after a look at our code of conduct https://answall.com/conduct and get to know how our community works.

1

Here’s another way to address the problem:

var rowsX = dtX.Rows.Cast<DataRow>().Select(r => r[1].ToString().Trim());
var rowsY = dtY.Rows.Cast<DataRow>().Where(r => r[0].ToString() != string.Empty).Select(r => r[0].ToString());

var rows = rowsX.Intersect(rowsY);
var dataRowsX = dtX.Rows.Cast<DataRow>().Where(r => rows.Contains(r[1]));

foreach (var row in dataRowsX)
{
    var dataRowY = dtY.Rows.Cast<DataRow>().FirstOrDefault(r => r[0].ToString() == row[1].ToString());

    // código...
}

Basically compares the lines of the two DataTable and then intersect in one.
Finally, it uses this intersection to traverse the DataTable-X and obtaining, for each line, the DataTable-Y for changing values.

  • can explain better. Unfortunately for me the cast was not recognized in dtX/dtY. To recognize you have to use dataset.table[table name].Rows.Cast... I would like to test your suggestion to measure processing time.

  • Which of the Casts was not recognized? The dtX.Rows.Cast<DataRow>?

  • In fact all cast did not recognize. And about my comment on dataset above, there has been nothing, also not recognized.

  • Ha, you probably need the using System.Linq to work :)

  • that’s right :D. Able to compile up to foreach, now presents the msg 'Unable to cast Object of type 'System.Double' to type 'System.String'. '. I’m debugging to understand your suggestion, because I still don’t understand :)

  • I assumed that the information contained in the lines was text, but being numerical gives error. I changed everything to ToString(). Try again.

  • I know it’s a bad question, but I haven’t used lambda expressions and the debug goes very transparent :D... would it be possible to explain what these passages did? 'var rowsX = dtX.Rows.Cast<Datarow>(). Select(r => r[1]. Tostring(). Trim());'

  • Basically creates an enumerate with all values, converted to String, column 1 of each row in table X (in this case).

Show 3 more comments

Browser other questions tagged

You are not signed in. Login or sign up in order to post.