The following is a table with the clock latency of the multiplication and division operations for comparison.
Note that if your program actually does perform such operations, this is what will occur at the hardware level and there is no escape from this.
| Intel Core i7 |
|-----------------------------------|
| Instruction | Operand | Latency |
|-------------|-----------|---------|
| MUL/IMUL | r8 | 3 |
| MUL/IMUL | r16 | 5 |
| MUL/IMUL | r32 | 5 |
| MUL/IMUL | r64 | 3 |
| IMUL | r16,r16 | 3 |
| IMUL | r32,r32 | 3 |
| IMUL | r64,r64 | 3 |
| IMUL | r16,r16,i | 3 |
| IMUL | r32,r32,i | 3 |
| IMUL | r64,r64,i | 3 |
| MUL/IMUL | m8 | 3 |
| MUL/IMUL | m16 | 5 |
| MUL/IMUL | m32 | 5 |
| MUL/IMUL | m64 | 3 |
| IMUL | r16,m16 | 3 |
| IMUL | r32,m32 | 3 |
| IMUL | r64,m64 | 3 |
| IMUL | r16,m16,i | |
| IMUL | r32,m32,i | |
| IMUL | r64,m64,i | |
| DIV | r8 | 11-21 |
| DIV | r16 | 17-22 |
| DIV | r32 | 17-28 |
| DIV | r64 | 28-90 |
| IDIV | r8 | 10-22 |
| IDIV | r16 | 18-23 |
| IDIV | r32 | 17-28 |
| IDIV | r64 | 37-100 |
| FMUL | r | 5 |
| FMUL | m | |
| FDIV | r | 7-27 |
| FDIV | m | 7-27 |
| FIMUL | m | 5 |
| FIDIV | m | 7-27 |
| AMD Steamroller |
|-------------------------------------|
| Instruction | Operand | Latency |
|-------------|-------------|---------|
| MUL/IMUL | r8/m8 | 4 |
| MUL/IMUL | r16/m16 | 4 |
| MUL/IMUL | r32/m32 | 4 |
| MUL/IMUL | r64/m64 | 6 |
| IMUL | r16,r16/m16 | 4 |
| IMUL | r32,r32/m32 | 4 |
| IMUL | r64,r64/m64 | 6 |
| IMUL | r16,(r16),i | 5 |
| IMUL | r32,(r32),i | 4 |
| IMUL | r64,(r64),i | 6 |
| IMUL | r16,m16,i | |
| IMUL | r32,m32,i | |
| IMUL | r64,m64,i | |
| DIV | r8/m8 | 17-22 |
| DIV | r16/m16 | 15-25 |
| DIV | r32/m32 | 13-39 |
| DIV | r64/m64 | 13-70 |
| IDIV | r8/m8 | 17-22 |
| IDIV | r16/m16 | 14-25 |
| IDIV | r32/m32 | 13-39 |
| IDIV | r64/m64 | 13-70 |
| FMUL | r/m | 5 |
| FIMUL | m | |
| FDIV | r | 9-37 |
| FDIV | m | |
| FIDIV | m | |
| VIA Nano L3050 |
|-----------------------------------|
| Instruction | Operand | Latency |
|-------------|-----------|---------|
| MUL/IMUL | r8 | 2 |
| MUL/IMUL | r16 | 3 |
| MUL/IMUL | r32 | 3 |
| MUL/IMUL | r64 | 8 |
| IMUL | r16,r16 | 2 |
| IMUL | r32,r32 | 2 |
| IMUL | r64,r64 | 5 |
| IMUL | r16,r16,i | 2 |
| IMUL | r32,r32,i | 2 |
| IMUL | r64,r64,i | 5 |
| DIV | r8 | 22-24 |
| DIV | r16 | 24-28 |
| DIV | r32 | 22-30 |
| DIV | r64 | 145-162 |
| IDIV | r8 | 21-24 |
| IDIV | r16 | 24-28 |
| IDIV | r32 | 18-26 |
| IDIV | r64 | 182-200 |
| FMUL | r/m | 4 |
| FDIV | r/m | 14-23 |
Source: Instruction Tables: Lists of Instruction latencies, throughputs and micro-Operation Breakdowns for Intel, AMD and VIA Cpus
This shows that in fact your processor will perform divisions slower than multiplications. This is due to the hardware implementation of these operations suffering from the same questions of algorithmic complexity already discussed in other answers. Regardless, your program in one way or another will make use of these instructions to perform such operations efficiently, and will be limited to the performance they offer.
As for the benchmarks and dubious tips that are supposedly based on this, beware. Each language will treat the expressions as it defines, the expression 1/2
in the source in C and C++ will end up becoming a constant 0
in the executable, and no division will be performed while running the program, in a dynamic language like javascript for example, the cost of interpreting the two operations may make the cost of the discussed arithmetic instructions irrelevant to the analysis. This is just an example, and the wise programmer will know when multiplication vs division makes sense or not, there are cases that actually makes, algorithms (I come to mind Bresenham now...) and languages where this makes a difference. When the programmer is unaware of this, and practices habits in a religious way, he falls into the case of micro optimization that is the root of all evil.
I did the test on the console and did not notice any difference
– Silvio Andorinha
@Silvioandorinha I believe it is noticeable if the execution of the code needs to be repeated numerous times.
– Guilherme de Jesus Santos
I created a performance test in Javascript: http://jsperf.com/multiply-vs-divide-fight from time to time the division is faster, so if there is even a difference in Javascript, it should be insignificant.
– Oralista de Sistemas
Now the irony: the split seems to be consistently faster than multiplication in Internet Explorer 10.
– Oralista de Sistemas
I’m using Chrome and in the jsperf test the division was faster... I don’t think this question is pertinent in js
– rafaelcastrocouto
In my opinion, a good answer needs to contemplate the following: 1) hardware, division (iterative) is inherently slower than multiplication (parallel); 2) even with division having more latency in hardware than multiplication, what is the impact of the flow of these operations; 3) Constant operations are optimized by modern compilers. Does the same occur in Javascript engines? If yes, a state-of-the-art discussion; 4) improve the evaluation tests: the ones listed here appear to be inconclusive, either by use of constants or by not (des)considering the administrative burden of the tests.
– user7835
Do the bill by hand. Division is more work than multiplication, right? For the processor too :)
– epx