I was looking at some publications on the BLAKE cryptographic algorithm, which was one of the finalists in the SHA-3 competition, whose winner was Keccak.
Right at the beginning of the book "The Hash Function BLAKE", a specific excerpt says:
Keccak offers acceptable performance in software, and excellent performance in hardware.
Source 1 Source 2 (original by NIST)
NIST could just as easily have stated that BLAKE offers excellent performance in software and acceptable performance in hardware; nowhere did NIST suggest that hardware is more important than software.
My question is how something can be fast in hardware and slow in software, and especially the other way around. Everything I found talks about FPGAs (and also ASICs), which also appear in the NIST text on the SHA-3 competition:
3.2 Performance
NIST was fortunate to have a great depth of performance data on the five finalists that could also be compared with the performance data of the SHA-2 algorithms. This data included software implementations on many different kinds of Central Processing Units (CPUs), and hardware implementations in both Field Programmable Gate Arrays (FPGAs) and Application Specific Integrated Circuits (ASICs). All this data made simple comparisons very difficult; most algorithms excelled on some platforms and lagged on others. However, a few patterns emerged from the performance data, which affected NIST’s decision.
So we know that there are CPUs, FPGAs and ASICs. This has also been mentioned in other answers, but the only thing I found in the O.R. was "An attacker with GPU or FPGA may want to do this, but will have difficulty.".
What is the difference between running on a CPU and running on an FPGA? How can something be faster in software, on a CPU, than on an FPGA? What would make it difficult for an FPGA to be as fast as a CPU?
I think this is out of scope. Still, I would like it to be accepted and I am interested in an answer; it has my upvote.
– Jéf Bueno
I’m not very familiar with encryption algorithms, but I think an implementation can be faster in software when the calculation is essentially sequential. An FPGA works naturally with parallelism, while a CPU is designed to execute sequential instructions.
– Woss
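The sequential-versus-parallel point in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `mix`, `chained` and `independent` are hypothetical names, the `mix` step is a toy add-rotate-xor on 32-bit words and not the actual BLAKE or Keccak round function, and `depth` simply counts how many steps must run one after another.

```python
MASK32 = 0xFFFFFFFF  # keep values within 32 bits, like a CPU register


def rotr32(x, n):
    """Rotate a 32-bit word right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def mix(a, b):
    """Toy add-rotate-xor step (illustrative only, not a real hash round)."""
    a = (a + b) & MASK32
    return rotr32(a, 7) ^ b


def chained(words, state=0):
    """Each step needs the previous result, so the steps cannot overlap."""
    depth = 0
    for w in words:
        state = mix(state, w)  # depends on the state from the step before
        depth += 1             # one more step on the critical path
    return state, depth


def independent(words, key=0x9E3779B9):
    """Each output depends only on its own input, so all steps could run at once."""
    outputs = [mix(w, key) for w in words]
    depth = 1 if words else 0  # with enough parallel units, a single step deep
    return outputs, depth


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print("chained depth:    ", chained(data)[1])
    print("independent depth:", independent(data)[1])
```

In the chained case extra parallel hardware has nothing to work on, because every step waits for the one before it; in the independent case the work spreads across as many units as are available, which is the situation where parallel hardware would be expected to help most.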