What is "performance"?
Application performance is often a primary concern for mission-critical systems. Use this tag for questions about optimization, whether of database queries, algorithms, network overhead, transactions, or anything else concerning speed or capacity.
A good question states the performance targets that must be met, as well as any other constraints. Trying to optimize something without measuring it first is guesswork, not a matter of "performance".
Algorithmic performance is usually expressed in [Big O notation](https://en.wikipedia.org/wiki/Big_O_notation), which classifies how the resource needs of an algorithm grow as the input size grows.
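As a minimal illustration of what Big O classification means in practice, the sketch below counts worst-case comparisons for a linear scan, O(n), versus a binary search, O(log n). The function names are purely illustrative:

```python
def linear_search_steps(n):
    # Worst case: every one of the n elements is examined. O(n).
    return n

def binary_search_steps(n):
    # Worst case: the search range is halved until one element remains. O(log n).
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

# Doubling the input doubles the linear work, but adds
# only a single step to the binary search.
linear_steps = linear_search_steps(1024)   # 1024
binary_steps = binary_search_steps(1024)   # 10
```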
This tag also covers performance tuning, one of the key non-functional requirements of an application or system.
The two main performance measures are:
- Transfer rate (how much in a period of time), for example TPS (transactions per second), MB/s (megabytes per second), Gb/s (gigabits per second), or messages/requests/pages per second.
- Latency (how long to wait for an action), for example 8 ms search time.
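As a rough sketch of how the two measures relate, the snippet below derives throughput and average latency from a batch of per-request timings. The numbers are hypothetical, and the calculation assumes the requests ran sequentially:

```python
# Hypothetical per-request latencies, in seconds, measured sequentially.
durations_s = [0.008, 0.010, 0.007, 0.012, 0.009]

total_time_s = sum(durations_s)
# Transfer rate: completed requests divided by elapsed time.
throughput_rps = len(durations_s) / total_time_s      # ~108.7 requests/s
# Latency: average wait per request, in milliseconds.
avg_latency_ms = 1000 * total_time_s / len(durations_s)   # 9.2 ms
```

Note that the two are not interchangeable: a batched or pipelined system can raise throughput considerably without lowering per-request latency.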
Latency is often reported as a statistical measure. Note: latencies typically do not follow a normal distribution and have a very long upper tail compared to the average, so measures such as standard deviation are not useful.
- Average latency. The arithmetic mean of all measured latencies.
- Typical or median latency. The middle value of the measured latencies, usually 50% to 90% of the average latency. As this is generally the lowest of these figures, it is the one most often reported by vendors.
- Percentile latency. The value that N% of results fall at or below; e.g., the 99th percentile latency is not exceeded 99 times out of 100.
- Worst or maximum latency. The highest latency measured.
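The measures above can be computed from a set of samples. The sketch below uses hypothetical latency samples with one long-tail outlier, which is typical of real latency data, and a simple nearest-rank percentile:

```python
import math
import statistics

# Hypothetical latency samples in milliseconds, with one long-tail outlier.
samples_ms = [8, 9, 8, 10, 9, 8, 11, 9, 250, 8]

avg = statistics.mean(samples_ms)       # 33.0 ms: dragged up by the outlier
median = statistics.median(samples_ms)  # 9.0 ms: the "typical" latency
# Nearest-rank percentile: the value at rank ceil(P/100 * N) in sorted order.
p99 = sorted(samples_ms)[math.ceil(0.99 * len(samples_ms)) - 1]  # 250 ms
worst = max(samples_ms)                 # 250 ms
```

The gap between the median (9 ms) and the average (33 ms) shows why a single summary number can mislead, and why percentiles and the maximum matter for latency-sensitive systems.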
Before improving performance, prototype and measure first; optimize only if and when needed.
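A minimal sketch of "measure first" using the standard library's `timeit` module; the two string-building functions are illustrative candidates being compared:

```python
import timeit

def concat_loop(n):
    # Repeated string concatenation in a loop.
    s = ""
    for _ in range(n):
        s += "x"
    return s

def concat_join(n):
    # Building the same string with str.join.
    return "".join("x" for _ in range(n))

# Time each candidate under identical conditions, then decide
# based on the measured numbers, not intuition.
loop_t = timeit.timeit(lambda: concat_loop(10_000), number=100)
join_t = timeit.timeit(lambda: concat_join(10_000), number=100)
```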
See also: optimization, profiling, assembly, compiler, low-latency