LoongArch64 assembly pack: Fix ChaCha20 ABI breakage
The [LP64D ABI][1] requires the floating-point registers f24-f31
(aka fs0-fs7) callee-saved. The low 64 bits of a LSX/LASX vector
register aliases with the corresponding FPR, so we must save and restore
the callee-saved FPR when we writes into the corresponding vector
register.
This ABI breakage can be easily demonstrated by injecting the use of a
saved FPR into the test in bio_enc_test.c:
static int test_bio_enc_chacha20(int idx)
{
register double fs7 asm("f31") = 114.514;
asm("#optimize barrier":"+f"(fs7));
return do_test_bio_cipher(EVP_chacha20(), idx) && fs7 == 114.514;
}
So fix it. To make the logic simpler, jump into the scalar
implementation earlier when LSX and LASX are not enumerated in AT_HWCAP,
or the input is too short.
[1]: https://github.com/loongson/la-abi-specs/blob/v2.20/lapcs.adoc#floating-point-registers
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22817)