Grøstl – a SHA-3 candidate
Implementations
Please read the Grøstl Implementation Guide before implementing Grøstl yourself.
The reference implementation and some optimized code in C and assembly language are part of the new NIST submission package for tweaked Grøstl.
These and other software implementations have also been submitted to the eBASH benchmarking project. This tarball contains the Grøstl implementations currently (or soon to be) included in eBASH. Among these are:
- NEW! The first Grøstl-256 implementation based on AVX2 VPGATHERQQ instructions, which needs only 169 instructions per round, can be downloaded here.
- NEW! A new Grøstl-256 NEON implementation using bitslicing (48.6 cycles/byte on Cortex-A8, Hercules eCAFE) is available in eBASH.
- NEW! A vperm implementation of Grøstl-256 using NEON can be downloaded from eBASH. This implementation can serve as a basis for future NEON implementations using ARMv8 AES instructions.
- NEW! The first Grøstl-256 NEON implementation running at 45.8 cycles/byte (Cortex-A8, Hercules eCAFE) can be downloaded from eBASH.
- NEW! Improved lightweight Grøstl implementations have been submitted to eBASH, resulting in an overall 3rd place in XBX.
- NEW! The Grøstl-512 AVX2 implementation presented at the 3rd SHA-3 Candidate Conference with 40% fewer instructions (compared to AVX) can be found here.
- NEW! First Grøstl-256 implementation below 10 cycles/byte on Intel Core i7-2600K with AES-NI (see below).
- Very fast constant-time inline assembly and intrinsics implementations containing Intel AES and AVX instructions
- Fast constant-time Grøstl implementations using Mike Hamburg's technique to compute the S-box using vector permute instructions (partially based on Çağdaş Çalık's implementations)
- Inline assembly implementations optimized for Intel Core 2 Duo and AMD Opteron processors
- C implementations optimized for 32-bit and 64-bit processors
Software benchmarks of Grøstl from eBASH
All eBASH benchmarking results for Grøstl-256 can be found here; all results for Grøstl-512 can be found here.
Digest size | Processor | Mode | Speed |
---|---|---|---|
224/256 | Intel Core i7-2600K (with AES-NI) | 64-bit | 9.6 cycles/byte |
224/256 | Intel Xeon E5620 (with AES-NI) | 64-bit | 11.3 cycles/byte |
224/256 | AMD Phenom II X6 | 64-bit | 17.3 cycles/byte |
224/256 | AMD Opteron 8354 | 64-bit | 19.8 cycles/byte |
224/256 | Intel Core 2 Duo E6400 | 64-bit | 22.3 cycles/byte |
224/256 | Intel Pentium M | 32-bit | 38.8 cycles/byte |
384/512 | Intel Core i7-2600K (with AES-NI) | 64-bit | 13.8 cycles/byte |
384/512 | Intel Xeon E5620 (with AES-NI) | 64-bit | 16.2 cycles/byte |
384/512 | AMD Phenom II X6 | 64-bit | 31.7 cycles/byte |
384/512 | AMD Opteron 8354 | 64-bit | 33.6 cycles/byte |
384/512 | Intel Core 2 Duo E8400 | 64-bit | 32.2 cycles/byte |
384/512 | Intel Pentium M | 32-bit | 76.1 cycles/byte |
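To read the cycles/byte figures as throughput, divide the clock frequency by the cycles/byte value. The sketch below does this for the two Core i7-2600K rows. The 3.4 GHz clock is an assumption (the processor's nominal frequency), not a value reported in the table, and the function name is chosen here for illustration only.

```python
def throughput_mb_per_s(cycles_per_byte: float, clock_hz: float) -> float:
    """Convert a cycles/byte figure into MB/s (1 MB = 10**6 bytes)."""
    return clock_hz / cycles_per_byte / 1e6


# Assumption: nominal 3.4 GHz clock for the Intel Core i7-2600K.
# The benchmark table itself only reports cycles/byte.
CLOCK_HZ = 3.4e9

# Cycles/byte values copied from the Core i7-2600K rows of the table.
SPEEDS = [("Grøstl-224/256", 9.6), ("Grøstl-384/512", 13.8)]

for name, cpb in SPEEDS:
    print(f"{name}: {throughput_mb_per_s(cpb, CLOCK_HZ):.0f} MB/s")
```

The result is only a rough estimate for long messages: the actual clock during a measurement may differ from the nominal one.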
8-bit software implementations of Grøstl
Johannes Feichtner has implemented new 8-bit versions of Grøstl-256 for an 8-bit ATmega163 microcontroller, based on the work by Günther A. Roland. This results in the implementation with by far the smallest RAM footprint (136 bytes) of any SHA-3 finalist. The detailed results are given below.
Version (state) | RAM (bytes) | ROM (bytes) | Speed for 1536 (cycles/byte) |
---|---|---|---|
Low RAM | 136 | 1898 | 571 |
Low ROM | 519 | 1322 | 498 |
High speed | 533 | 4982 | 458 |
Hardware ASIC implementations of Grøstl-0
Stefan Tillich developed high-speed Grøstl-0-256 ASIC implementations in UMC 0.18 µm technology. The synthesis results are given below.
Total area (µm²) | Total area (GE) | Throughput (Gbit/s) |
---|---|---|
547,227.47 | 58,403 | 6.290 |
538,462.41 | 57,467 | 6.141 |
523,472.74 | 55,867 | 5.690 |
471,626.06 | 50,334 | 2.725 |
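One way to compare the four design points is throughput per unit of area, here expressed as Mbit/s per thousand gate equivalents (kGE). The sketch below computes this from the table values. The metric and the function name are chosen here for illustration and are not part of the original synthesis report.

```python
def mbit_per_s_per_kge(throughput_gbit_s: float, gate_equivalents: int) -> float:
    """Area efficiency: throughput in Mbit/s divided by area in kGE."""
    return (throughput_gbit_s * 1000.0) / (gate_equivalents / 1000.0)


# (gate equivalents, throughput in Gbit/s), copied from the table rows.
DESIGNS = [
    (58403, 6.290),
    (57467, 6.141),
    (55867, 5.690),
    (50334, 2.725),
]

EFFICIENCY = [mbit_per_s_per_kge(gbps, ge) for ge, gbps in DESIGNS]

for (ge, gbps), eff in zip(DESIGNS, EFFICIENCY):
    print(f"{ge} GE, {gbps} Gbit/s: {eff:.1f} Mbit/s per kGE")
```

By this measure the three larger designs are close to each other, while the smallest design gives up roughly half of the efficiency for its area saving.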