sha/asm/keccak1600-armv4.pl: improve non-NEON performance by ~10%.
author Andy Polyakov <appro@openssl.org>
Mon, 31 Jul 2017 07:36:46 +0000 (09:36 +0200)
committer Andy Polyakov <appro@openssl.org>
Wed, 2 Aug 2017 21:22:28 +0000 (23:22 +0200)
commit d9ca12cbf6287aee7d86579f4c03be1155696c9f
tree 8c5da203a938449b0cca52f8633bb58d63555475
parent 7e885b7bdfad897596e3c954e7c3a2d53a9a5cbe

This is achieved mostly by a ~10% reduction in the number of
instructions per round, thanks to a) switching to the KECCAK_2X
variant; b) merging almost half of the rotations with logical
instructions. Performance improves on all observed processors except
Cortex-A15, which is capable of exploiting more parallelism and can
therefore execute the original code in the same amount of time.
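
As a rough illustration of point b), and not the code from this
commit: ARM data-processing instructions accept a rotated second
operand, so a separate rotate followed by a logical operation can
be folded into one instruction. The Python model below is a sketch
under that assumption; the helper names and the operation counts
are hypothetical and only stand in for the ARM instructions named
in the comments.

```python
MASK32 = 0xFFFFFFFF


def ror32(x, n):
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32


def xor_separate(a, b, n):
    """Rotate, then XOR: models two instructions.

    mov t, b, ror #n
    eor r, a, t
    """
    t = ror32(b, n)
    return a ^ t, 2  # (result, modelled instruction count)


def xor_merged(a, b, n):
    """XOR with a rotated operand: models one instruction.

    eor r, a, b, ror #n
    """
    return a ^ ror32(b, n), 1  # (result, modelled instruction count)


if __name__ == "__main__":
    a, b, n = 0x12345678, 0x9ABCDEF0, 7
    print(xor_separate(a, b, n))
    print(xor_merged(a, b, n))
```

Both helpers return the same value; only the modelled instruction
count differs, which is the kind of saving the message attributes
to merging rotations with logical instructions.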

Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/4057)
crypto/sha/asm/keccak1600-armv4.pl