ec: Add PPC64 vector assembly version of p521 field operations
Only field multiplication and squaring (but not reduction) show a
significant improvement. This is enabled on Power ISA >= 3.0.
On a Power 9 CPU an average 10% performance improvement is seen (ECHDE:
14%, ECDSA sign: 6%, ECDSA verify 10%), compared to existing code.
On an upcoming Power 10 CPU we see an average performance improvement
of 26% (ECHDE: 38%, ECDSA sign: 16%, ECDSA verify 25%), compared to
existing code.
Signed-off-by: Amitay Isaacs <amitay@ozlabs.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15401)