How to check the execution time of a method?

Example: I have Metodo1 and Metodo2; each does different processing and actions. I want to check how long each one takes to run.

How do I do that?

1 answer

You can use the Stopwatch class:

using static System.Console;
using System.Diagnostics;

public class Program {
    public static void Main() {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        Teste();
        stopwatch.Stop();
        // Elapsed is a TimeSpan with the total time of the single call
        WriteLine($"Elapsed time: {stopwatch.Elapsed}");
        stopwatch.Restart();
        for (var i = 0; i < 1000; i++) Teste();
        stopwatch.Stop();
        // ElapsedTicks gives the raw Stopwatch ticks for all 1000 calls
        WriteLine($"Elapsed time: {stopwatch.ElapsedTicks}");
    }
    public static void Teste() => WriteLine("Doing something here");
}

See it working on ideone and on .NET Fiddle. I also put it on GitHub for future reference.

Note that measuring the speed of a single call usually has little relevance: several factors can influence it and give a wrong result. In some cases measuring processor ticks is better than measuring time.
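
Repeating the call many times and dividing by the number of iterations gives a more stable number than a single call. A minimal sketch of that idea (Metodo1 here is just a stand-in for the method you actually want to measure):

using static System.Console;
using System.Diagnostics;

public class Program {
    public static void Main() {
        const int iterations = 100000;
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++) Metodo1();
        sw.Stop();
        // Stopwatch.Frequency is the number of Stopwatch ticks per second
        WriteLine($"Average: {(double)sw.ElapsedTicks / iterations} ticks per call");
        WriteLine($"Average: {sw.Elapsed.TotalMilliseconds / iterations} ms per call");
    }
    // stand-in for the method being measured
    static void Metodo1() { }
}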

Care when benchmarking

The main thing is not to take it too seriously. Measuring the time spent under controlled conditions is not the same as knowing how long it will take in a real environment. Measuring without controlled conditions means there are variables that may not repeat under other conditions, making the measurement unreliable. In other words, don't rely too heavily on the measurement.

Know what you’re measuring

Another important factor is knowing that you might be measuring the wrong thing. Getting the measuring mechanism right is easy (though many people get even that wrong), but writing the right test algorithm is not always obvious. You may be testing the wrong thing.

It's common to see tests where 80 or 90 percent of the time spent is not on what you think you're measuring. If you want to know how long the postman takes to deliver a letter, don't start measuring when you drop the letter in a mailbox in another state.

Often the problem is defining what you want to measure. Just as the programmer often does not know what he is going to program, he often does not know exactly what to test. I've seen tests "prove" that something is fast, and when it was used for real it turned out to be a disaster.

And there are situations where finding the right way to test is virtually impossible.

Biased algorithms

Remember that you may be testing something that was made to do well in tests but not in real situations (Volkswagen?!?!?). Worse, I've seen algorithms actually get worse in order to meet a specific performance requirement. I remember a case in the 80s where a database war drove the creation of an index to be faster and faster in competing products, while access to the index, which is what really mattered, got worse and worse.

Measurement validity

Measurements are only valid for that situation. That is, if you use the same code under other conditions, the result may be different. If you change the library version, compiler, runtime, etc., the result may be different. If the operating system or hardware is different, the result may be different. And it is not linear. And when I say version, I'm not just talking about 1.0 to 2.0; anything that makes any difference counts.

For example: if you have a 1.6 GHz processor and then run the test on a 3.2 GHz machine, it doesn't mean you'll have twice the speed. Each test has its own characteristics, so a test can behave well in one environment and badly in another. This goes for the number of cores, the size and layout of the cache, the amount and speed of memory, mass storage, the quality of the other chips, etc.

Interference

Prepare your environment for minimal interference.

In environments where the operating system can run other tasks, the result will be compromised to some degree. Modern operating systems usually give an application a time slice to run and then return control to the scheduler, which may hand the processor to another task. Even though it is very short, the time that other task runs is counted in your measurement, yet it was not spent doing what you are testing.

Ideally the test should be done in an environment where the application has the processor to itself. If that is not possible, try to avoid external interference. I have seen many tests be completely distorted by an antivirus, for example.
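
As a sketch of the kind of preparation that can reduce (not eliminate) this interference, you can raise the priority of the process and of the benchmark thread so the scheduler preempts it less often; whether this is appropriate depends on your environment, and it assumes using System.Diagnostics and System.Threading:

// before starting the Stopwatch: reduce scheduler interference (does not eliminate it)
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
Thread.CurrentThread.Priority = ThreadPriority.Highest;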

Debug and configuration

Needless to say, testing in debug mode does not help much.

Depending on the platform and technology used, it is possible to make adjustments to the installation or configuration that give more accurate or more biased results (unintentionally or by choice). Know everything that can be done and choose what to use.

You obviously need to be careful with optimizations. There are cases where the compiler detects that something can be removed: you think you are testing what you wanted, but you are only testing an empty loop, or at least an emptied one. Worse, you might not be testing anything at all, because a compiler can remove an empty loop. Again: learn how your platform works.
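
A minimal sketch of the usual workaround (Calcula is just an illustrative stand-in for the code under test): accumulate the result and consume it after the measurement, so the work cannot simply be discarded.

using static System.Console;
using System.Diagnostics;

public class Program {
    public static void Main() {
        var sw = Stopwatch.StartNew();
        long soma = 0;
        for (var i = 0; i < 1000000; i++) soma += Calcula(i); // accumulating keeps the work from being discarded
        sw.Stop();
        WriteLine(soma);           // consuming the result keeps the loop "alive"
        WriteLine(sw.ElapsedTicks);
    }
    // illustrative stand-in for the code under test
    static long Calcula(int i) => i * 2L;
}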

There are cases where it is interesting to turn off certain optimizations. But if you do this wrong you may end up with a fabricated result.

Garbage Collector

Another factor that has influence is the Garbage Collector. We should all know that you should not call .NET's GC manually (this is not the place to go into why). So why is it even possible to do it? Precisely for testing. It is common to run the GC before starting the test to relieve memory pressure, so it does not end up being called in the middle of the test and change the result.

But think about it: in the real world it can be called, right? So why should I avoid it? To get a cleaner result. It is still removed from reality, from the worst case; in general we want to test the best case.

Can I force the GC to run, to measure a situation where it is called? You can, but it will probably still not reproduce the common case in production; it will probably add more pressure than the real execution would. In other words, it is very hard to get a result that is clean, controlled and close to reality.

Needless to say, the behavior of the GC is also an implementation detail, and the test will differ across versions or modes of use. Keep in mind that it behaves very differently in a simple test and in an application full of objects. So it is better to leave it out and, as far as possible, measure the case where no collection runs, until it is shown that the opposite is needed.

// force a collection before the test, to start from a "clean" memory state
GC.Collect();
GC.WaitForPendingFinalizers();

So if you don’t have enough memory, the results can vary enormously.
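
Putting that together, a common sketch is to do this cleanup immediately before starting the Stopwatch (reusing Teste() and the usings from the first example), so a collection triggered by earlier allocations is less likely to land inside the measurement:

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();                      // collect whatever the finalizers released

var stopwatch = Stopwatch.StartNew();
for (var i = 0; i < 1000; i++) Teste();
stopwatch.Stop();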

What to test?

Time is not the only scarce resource. Often making something faster means consuming more memory, or hurting readability, maintainability, security, etc.

We usually measure the average running time under certain circumstances. We don't analyze the worst case, and we don't analyze how those resources behave in a quadratic situation. In general we pick one way of using the code and test that.

It's common to have algorithms that grow geometrically while we think arithmetically. So we test with 10 iterations and assume that with 100 it will take 10 times as long, when it may actually take thousands or even millions of times more.
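
A sketch of how to catch this: measure the same code with a few different input sizes and look at how the time grows, instead of extrapolating linearly from one size. Processa here is an illustrative quadratic workload:

using static System.Console;
using System.Diagnostics;

public class Program {
    public static void Main() {
        foreach (var n in new[] { 10, 100, 1000, 10000 }) {
            var sw = Stopwatch.StartNew();
            var result = Processa(n);
            sw.Stop();
            WriteLine($"n = {n}: {sw.ElapsedTicks} ticks (result {result})");
        }
    }
    // illustrative workload that grows quadratically with n
    static long Processa(int n) {
        long total = 0;
        for (var i = 0; i < n; i++)
            for (var j = 0; j < n; j++)
                total += i + j;
        return total;
    }
}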

Testing data that is in sequence versus random data can make a huge difference, and it can happen counterintuitively.

Testing something in isolation is quite different from testing it as part of a whole. There are cases where the test is done on something too isolated. I have tested software on my machine and it was great; I put it on the client's machine and it was an embarrassment.

It gets a lot worse when there’s the possibility of parallelizing the processing. I’m not even going into this.

Sample size

It is obvious that you cannot run the test once and get a reliable result. On the other hand, a single run is probably how the code will actually be used. Of course, low-volume usage means performance is not critical and you are testing mostly out of curiosity.

Especially in .NET, the first run can be much more expensive than the others because of the JITter. Some people like to execute the code under test once before measuring it, as in the sketch below; this isolates the worst case of the first execution. When you repeat the test many times this interference is small, but there are cases where it matters a lot. There are also cases where the cache and the locality of the data can make a huge difference.
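
A minimal sketch of that warm-up, reusing Teste() and the usings from the first example: call it once outside the measurement so the JIT cost of the first execution does not pollute the result.

Teste();                               // warm-up: pays the JIT cost outside the measurement
var stopwatch = Stopwatch.StartNew();
for (var i = 0; i < 1000; i++) Teste();
stopwatch.Stop();
WriteLine($"Elapsed time: {stopwatch.ElapsedTicks}");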

When you run a repeated test you also pay for the artificial loop that controls the repetition. Is that acceptable? Will there be a similar loop under real conditions? Is the cost of the code under test so low that the loop interferes heavily? Is it possible to reliably measure the cost of the empty loop and subtract it from the total?
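
One sketch for very cheap code, again assuming Teste() and the usings from the first example, is to time an empty loop with the same number of iterations and subtract it; keep in mind this estimate is itself approximate, since the optimizer may treat (or remove) the empty loop differently.

const int n = 1000000;

var empty = Stopwatch.StartNew();
for (var i = 0; i < n; i++) { }        // loop overhead only
empty.Stop();

var full = Stopwatch.StartNew();
for (var i = 0; i < n; i++) Teste();   // loop plus the code under test
full.Stop();

// rough estimate of the cost of the code itself, without the loop
WriteLine(full.ElapsedTicks - empty.ElapsedTicks);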

Repeating many times also reduces the error of stopping the clock at the wrong time. To save code we may let something else run before capturing the result, or start the clock before all the preparation is done. This distorts a single execution, but if you execute thousands of times the effect is negligible.

There are also cases where it is interesting to run the test more than once to see whether it gives very different results on each attempt. This is different from repeating the execution of the code under test; here I am talking about repeating the test itself, manually.

Right mechanism

Did you notice that the test above used Stopwatch? Do not measure running time with the clock, but with a processing-time meter: it has higher resolution. The clock is made to show the time of day; it does not need precision. It would even be acceptable for tests that last minutes, but why use something worse? And avoid testing during the daylight-saving-time change :) You might get a test that takes negative time or more than an hour :)

It's no accident that it lives in Diagnostics. It is not perfectly precise either; in general its resolution is in the range of tens or even a few hundred processor ticks. That is small, but it is not "microscopic". You can find out the resolution with Stopwatch.Frequency.

Avoid using the Milliseconds component of Elapsed unless you know the test will never go beyond hundreds of milliseconds, because it truncates the result when it passes one second; TotalMilliseconds (or ElapsedMilliseconds) gives the full value.

It makes little difference whether you use Elapsed or ElapsedTicks; the first is just a calculation based on the second. If you use Elapsed, it may be more convenient to take exactly what you want from it: ticks, seconds, milliseconds, minutes, total seconds, etc. It is a TimeSpan.
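
A sketch that shows the resolution and the different ways of reading the result; Elapsed is a TimeSpan, so you can take whatever unit is convenient (the Math.Sqrt loop is just illustrative work):

using System;
using static System.Console;
using System.Diagnostics;

public class Program {
    public static void Main() {
        WriteLine($"Resolution: {Stopwatch.Frequency} ticks per second");
        WriteLine($"High resolution timer: {Stopwatch.IsHighResolution}");

        double acc = 0;
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < 1000000; i++) acc += Math.Sqrt(i); // accumulate so the work isn't discarded
        sw.Stop();
        WriteLine(acc);

        WriteLine(sw.ElapsedTicks);               // raw Stopwatch ticks (convert with Stopwatch.Frequency)
        WriteLine(sw.Elapsed.TotalMilliseconds);  // TimeSpan: total milliseconds, with fraction
        WriteLine(sw.Elapsed.TotalSeconds);       // TimeSpan: total seconds, with fraction
    }
}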

Benchmarking is not profiling

The purpose of the question here was to measure the execution time of a method. It may be that the right thing to do is to use a profiling tool rather than timing the execution yourself.

Library

I strongly recommend using the BenchmarkDotNet library for these measurements, as sketched below. At the time this answer was written I either did not know about it or it was not yet available.
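
A minimal sketch of how BenchmarkDotNet is typically used; it takes care of warm-up, repetitions, statistics and process isolation for you (run it in Release mode, and check the library's documentation for current details):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class Benchmarks {
    [Benchmark]
    public void Metodo1() { /* code under test */ }

    [Benchmark]
    public void Metodo2() { /* code under test */ }
}

public class Program {
    public static void Main() => BenchmarkRunner.Run<Benchmarks>();
}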

  • These factors really do have a lot of influence. I noticed that the values for my methods change a lot on each execution. Does this "tick measurement" also have that variation?

  • Yes, I will go into detail about this later, but I don't know if I can do it today. I want to do it right.

  • All right, I'll wait for the rest of the answer. Thanks :D

  • I just wanted to see how to make a counter and I ended up reading a whole class on benchmarking. Thank you, Maniero, your material is very good.
