How to concatenate two lists efficiently?

Asked

Viewed 373 times

3

Aiming at the union represented by the figure:

Resultado esperado.

In view of the structure below:

TMyRecord = record
    Index: Int64;
    Kind: TMyEnum;
    Value: String;
end

Having two lists (Generics.Collections.TList<TMyRecord>) A and B use the following method to achieve the union:

ListA.AddRange(ListB.ToArray);
ListA.TrimExcess;

The problem with this method is that it results in low performance for lengths above 400 elements per list.

Another requirement would be to maintain the order where after the last element of A the first of B is to be found. Such ordination would hinder the use of parallelism.

Therefore, what would be some effective ways to achieve union guaranteeing performance and ordination?

  • What is the content of the record?

  • Okay I saw you put the record structure, then check what’s equal you’re doing how? (I get that it’s compiler stuff, but I don’t know what you want to do yet, so it doesn’t hurt to ask)

  • There is no such check. I should simply concatenate the two lists somehow. It is a part of the Tokens.

  • Well, the index I believe is the ordering of the disposition of your lexemas, so what I suggest to you is this: order both lists, and then iterate step by step, always keeping the A with the lowest index. If the index of A is higher than that of B means that it does not have the lexema, and then you can add it in the middle of the list, in the position before the current of your iteration. Even with the extra sorting step, if the sorting is given by the index you will probably get better performance.

  • I don’t think you understand the question.

  • Oh right, I reread it two more times and finally understood what you want. This is a very common problem of using lists... The method of AddRange is usually not optimized in any language... The first thing I would try to do is to for each of List B giving Add in List A. Very stupid, but maybe I’ll give you some performance gain.

  • I believe you don’t have much to do, because adding a list to another is an On operation, where n is the amount of elements on your list, as you have to go through all the elements on your list, the more elements on your list, the longer the operation will be

  • You tried to use the addRange without converting the ListaB to arrray? Gave some difference?

Show 3 more comments

1 answer

1

I could not reproduce low performance even with 3 thousand elements per list. Also the order is being maintained. I believe other things are interfering with your code.

See the code below:

program Project2;

{$APPTYPE CONSOLE}

uses
  SysUtils, Generics.Collections, Diagnostics;

type
  TMyEnum = (meUno, meDuo, meFoo, meBar);

  TMyRecord = record
    Index: Int64;
    Kind: TMyEnum;
    Value: String;
  end;

  TMyLista = Generics.Collections.TList<TMyRecord>;


const
  TamanhoCadaArray = 3000;

var
 ListaA,ListaB: TMyLista;
 ts1: TStopwatch;

procedure PreencheLista(Lista: TMyLista; const nn:string);
var
  I: Integer;
  myRec: TMyRecord;
begin
  for I := 1 to TamanhoCadaArray do
  begin
    myRec.Index := I;
    myRec.Kind := tmyenum(I mod 4);
    myRec.Value := nn+ ' '+ IntToStr(I);
    Lista.Add(myRec);
  end;

end;

begin
  try
    ListaA := TMyLista.Create;
    ListaB := TMyLista.Create;

    PreencheLista(ListaA, 'A');
    PreencheLista(ListaB, 'B');

    ts1.Create;
    ts1.Start;
    ListaA.AddRange(ListaB.ToArray);
    ListaA.TrimExcess;
    ts1.Stop;

    Writeln(ts1.Elapsed.TotalMilliseconds);
    Writeln(ListaA[0].Value);
    Writeln(ListaA[1].Value);
    Writeln(ListaA[2].Value);
    Writeln(ListaA[3].Value);
    Writeln(ListaA[4].Value);

    Writeln(ListaA[TamanhoCadaArray].Value);
    Readln;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.