Faster way to access properties in a C#list

Asked

Viewed 1,360 times

7

I have a project that works with a large volume of data, and I need to optimize it to bring results of some calculations in a considerably small time. I know that I have several aspects to take into consideration, such as the structure in which the data was saved in the database, the way I am doing access to them, how I am performing the calculations among others, but disregarding all these items, I would like to take into consideration only the question raised below.

My doubt is more conceptual than a problem in my code. But something specific...

Consider the following list:

var minhaLista = new List<MeuObjeto>
{
   // Objetos...
};

meuObjeto has the following properties:

public class MeuObjeto
{
  public int Prop1 {get; set;}
  public string Prop2 {get; set;}
  public decimal Prop3 {get; set;}
  public bool Prop4 {get; set;}
}


I need to access each of the properties in a loop with n items as quickly as possible and economically in relation to memory. But if I have to choose between speed and memory, I must choose speed.

Every millisecond is very important, so I’m taking into consideration some aspects like for be faster than the foreach, or declare a constant with 0 and uses it to instantiate the loop control variable to be better than instantiating directly with 0.

So consider INICIO as follows:

private const int INICIO = 0;

Consider OutroObjeto as an object similar to MeuObjeto for example only.

Form 1:

var outraLista = new List<OutroObjeto>();

for (int i = INICIO; i < minhaLista.Count; i++)
{
   var outroObjeto = new OutroObjeto
   {
      Prop1 = minhaLista[i].Prop1,
      Prop2 = minhaLista[i].Prop2,
      Prop3 = minhaLista[i].Prop3,
      Prop4 = minhaLista[i].Prop4
   };
   outraLista.Add(outroObjeto );
}

In this case, for each property a search is made on the list by object in position i?


Form 2:

var outraLista = new List<OutroObjeto>();

for (int i = INICIO; i < minhaLista.Count; i++)
{
   var meuObjetoI = minhaLista[i];

   var outroObjeto = new OutroObjeto
   {
      Prop1 = meuObjetoI.Prop1,
      Prop2 = meuObjetoI.Prop2,
      Prop3 = meuObjetoI.Prop3,
      Prop4 = meuObjetoI.Prop4
   };
   outraLista.Add(outroObjeto );
}

Apparently this section works in a similar way to foreach, but access to each property of the object in position i of the list will be faster than in Form 1?

Technically meuObjetoI only points to the list object in the position i which is already allocated in memory, correct?


What would be the most suitable way taking into account the time and the memory consumption?
Or is there a third option that’s better?

  • How are you getting to minhaLista ?

  • @Randrade, thank you so much for your interest in helping, but I’d like you to consider minhaLista as a common list of n MeuObjeto, and only take into account access to them in the forms presented. Is that possible or do you really need that detail to formulate a response?

  • Is that usually the best way to do this could be at the source, where you’re getting the data originally.

  • 1

    Youthful, install this.

3 answers

7


Pre-buy what you can

Zero is a constant, so you don’t have to worry about it. But .Count will be re-evaluated in each interaction if the compiler fails to "prove" that minhaLista is a strictly local variable. As there are no considerations on this then first optimization is to use for to decrease pressure on the Garbage Collector, and pre-compute .Count:

var count = minhaLista.Count;

for (int i = 0; i < count; i++)
{
    ...
}

This optimization assumes that the list has constant size.

Decrease the indirect 1

As commented, each indirect to solve is a further processing, at least in theory. This may be a dumb optimization, but again, if minhaLista is not strictly local, within the for:

for (int i = 0; i < limit; i++)
{
    var item = minhaLista[ i ];
    var outroObjeto = new OutroObjeto
    {
        Prop1 = item.Prop1,
        Prop2 = item.Prop2,
        Prop3 = item.Prop3,
        Prop4 = item.Prop4
    };
    outraLista.Add(outroObjeto );
}

This cuts the repeated access type minhaLista[i].Obj.

Decreasing Indirect 2

Simple properties, or properties with simple lifts ({get;set;}) has extremely optimized code, so the code above, although it sounds like verbiage, may already be as fast as possible. However, such a copy may violate the principle of DRY, then create a constructor of OutroObjeto that accepts MeuObjeto is interesting because:

  1. Improves the expressiveness of your code
  2. Decreases the indirect because half of the properties will be local accesses of one of the objects.

An alternative is to hang a method .ToOutro() in MeuObjeto, doing something similar to the content of for, above.

var minhaLista = new List<MeuObjeto>();
...

var outraLista = new List<OutroObjeto>();
var count = minhaLista.Count;

for (int i = 0; i < count; i++)
{
    outraLista.Add( minhaLista[ i ].ToOutro() );
}

It depends somewhat on personal taste. If you prefer a .To of life but ugly mix domains of the application, an extension method is the path of having a code that connects the objects, without been in any of the objects.

Readonly Collection

The machinery of List<> It’s complicated. It has to be. But trying to replace that feature with a homemade implementation doesn’t have the best prognosis there. However "all" what your above code does is copy one list of objects into another. Hence the question, new list has to be dynamic?

If the answer is no, specifically if the new created list is not then modified in terms of amount of records, then:

var minhaLista = new List<MeuObjeto>();
...
var count = minhaLista.Count;
var outraLista = new OutroObjeto[count];

for (int i = 0; i < count; i++)
    outraLista[ i ] = minhaLista[ i ].ToOutro();

Readonly Object

This is a radical proposal, but one that is actually used in places where performance is absolutely priority: read-only objects, and if they are small enough, structs instead of objects.

It is not a transition that should be done without much study. Getting out of mutant classes for struct is difficult, and prone to errors. If these objects come from a database, hence the suggestion is nay go that way.

But if these objects are somehow ephemeral data, loaded to be discarded or calculation products, using read-only classes/structs gives you the surety that this data will not be changed under any circumstances, which may prevent this type of conversion, and opens some compiler-level optimizations, which can speed up the execution of the code well.

4

In this case, for each property a search is made in the list by the object at position i?

Search is not exactly the word, I think it implies a long processing. A direct access to the element is made according to the index. It is an access to a memory point like any variable.

Essentially it makes no difference whether you are accessing the variable as an element or as a whole object. If in the end you have to access all elements this little changes. Of course it depends on what you’re going to do, what might be better in one context may be worse in another.

I wouldn’t venture to say for sure which one would be faster without testing under appropriate conditions, which we don’t always know how to do. Still the test result in simple operation may be irrelevant within another context. I have seen several times the person test under controlled conditions to obtain the most accurate result and then when use in production happens something else.

Apparently this section works in a similar way to foreach, but access to each object property at position i of the list will be faster than in Form 1?

It has nothing like the foreach. The foreach works with an enumerator object, accesses other methods, has more difficulty of optimization and mainly has a proposal quite different from the presented.

Technically meObjetoI only points to the list object at the i position that is already allocated in memory, correct?

Correct in this case. You will receive the value stored in the list element. As it is an object by reference, shall receive the reference (or pointer if you prefer) for the object and not the object itself. If the type were by value it would receive it.

What would be the most appropriate way taking into account memory time and consumption? Or is there a third option that is better?

I see no reason for the second form. it is worse. I have serious doubts whether it really needs to do this. What I can say is that without context everything can be right or wrong. I think you can do better, but it depends on the context.

Completion

Overall I don’t know if every millisecond is that important. Mostly without context. I see very experienced people doing everything wrong trying to optimize. Optimizing right is harder than it seems.

I don’t think that’s the case but often choosing not to worry about memory is choosing to have worse speed. To see how complex it is.

I don’t know if it’s so true foreach is slower than the for. I I’ve shown here at Sopt that it’s not like that. This is one of the reasons people make mistakes trying to optimize, believe in false truths.

I hope that this answer will serve even more to demystify the idea of optimization than to answer the specific case that does not seem to be relevant.

2

The two shapes grow linearly with the size of your list, so the difference is small.

But the second way is a little worse in terms of performance and memory because despite being just a pointer, you need to allocate the address and memory to it to point to the object of the worm[i] every time and this is an extra step and unnecessary.

As you asked otherwise, you could pass to the Otherobject constructor and let it handle the required properties internally, without needing to allocate temporary variables, since you add the reference to your otherList.

Form 3:

for (int i = INICIO; i < minhaLista.Count; i++)
{
    outraLista.Add(new OutroObjeto(minhaLista[i]));
}
  • Great reply @Rodrigo-Guiotti and also thank you very much for the proposal of Form 3. You helped me a lot and I will definitely implement! I would just like to ask, if possible for you, to address a little more about the doubts highlighted in the question...

Browser other questions tagged

You are not signed in. Login or sign up in order to post.