LoongArch64 assembly pack: Fix ChaCha20 ABI breakage
author Xi Ruoyao <xry111@xry111.site>
Sat, 25 Nov 2023 09:53:57 +0000 (17:53 +0800)
committer Tomas Mraz <tomas@openssl.org>
Tue, 19 Dec 2023 13:12:24 +0000 (14:12 +0100)
commit b46de72c260e7c4d9bfefa35b02295ba32ad2ac6
tree f3a4258174259d6135c1faf0d457c5debd93732e
parent dfd986b6f5402e5646e42425d14f098ed6bc4544

The [LP64D ABI][1] requires the floating-point registers f24-f31
(aka fs0-fs7) to be callee-saved.  The low 64 bits of an LSX/LASX vector
register alias the corresponding FPR, so we must save and restore
the callee-saved FPR whenever we write into the corresponding vector
register.
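
The aliasing can be illustrated with a toy model in portable C. This is
only a sketch, not LoongArch code: the type vreg_model and the helper
names are hypothetical, and a 256-bit register is modelled as four
64-bit lanes with lane 0 standing in for the FPR.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 256-bit vector register.
 * lane[0] plays the role of the low 64 bits that the FPR aliases. */
typedef struct {
    uint64_t lane[4];
} vreg_model;

/* Read the modelled FPR, i.e. the low 64 bits viewed as a double. */
static double fpr_read(const vreg_model *v)
{
    double d;
    memcpy(&d, &v->lane[0], sizeof d);
    return d;
}

/* Write the modelled FPR; the upper lanes are left untouched. */
static void fpr_write(vreg_model *v, double d)
{
    memcpy(&v->lane[0], &d, sizeof d);
}

/* A callee that uses the whole vector register as scratch space
 * without saving anything: the caller's FPR value is lost. */
static void callee_clobbering(vreg_model *fs7)
{
    for (int i = 0; i < 4; i++)
        fs7->lane[i] = 0x0123456789abcdefULL;
}

/* The same callee, but saving the low 64 bits on entry and restoring
 * them on exit, as the commit message says callee-saved FPRs need. */
static void callee_preserving(vreg_model *fs7)
{
    uint64_t saved = fs7->lane[0];
    callee_clobbering(fs7);
    fs7->lane[0] = saved;
}
```

In this model a caller that stores 114.514 with fpr_write() reads the
same value back after callee_preserving(), but not after
callee_clobbering().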

This ABI breakage can be easily demonstrated by injecting the use of a
saved FPR into the test in bio_enc_test.c:

    static int test_bio_enc_chacha20(int idx)
    {
        register double fs7 asm("f31") = 114.514;
        asm("#optimize barrier":"+f"(fs7));
        return do_test_bio_cipher(EVP_chacha20(), idx) && fs7 == 114.514;
    }

So fix it.  To make the logic simpler, jump into the scalar
implementation earlier when LSX and LASX are not enumerated in AT_HWCAP,
or the input is too short.
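
The early dispatch can be sketched in C as follows. This is an
illustration only: the real logic lives in the generated assembly, the
names choose_path and VECTOR_MIN_LEN and the threshold value are
hypothetical, and the HWCAP bit positions are assumed to match the Linux
uapi header asm/hwcap.h for LoongArch and should be checked against it.
The sketch also reads "LSX and LASX are not enumerated" as "neither is
available".

```c
#include <stddef.h>

/* Assumed to match Linux's asm/hwcap.h on LoongArch; verify before use. */
#define HWCAP_LOONGARCH_LSX  (1UL << 4)
#define HWCAP_LOONGARCH_LASX (1UL << 5)

/* Hypothetical cut-off below which the vector code is not worth using. */
#define VECTOR_MIN_LEN 256

enum chacha_path { PATH_SCALAR, PATH_LSX, PATH_LASX };

/* Decide once, up front, which implementation to run.  The scalar path
 * never touches vector registers, so it needs no FPR save/restore. */
static enum chacha_path choose_path(unsigned long hwcap, size_t len)
{
    unsigned long vec = hwcap & (HWCAP_LOONGARCH_LSX | HWCAP_LOONGARCH_LASX);

    if (vec == 0 || len < VECTOR_MIN_LEN)
        return PATH_SCALAR;

    return (hwcap & HWCAP_LOONGARCH_LASX) ? PATH_LASX : PATH_LSX;
}
```

On Linux the hwcap argument would typically come from
getauxval(AT_HWCAP) in <sys/auxv.h>.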

[1]: https://github.com/loongson/la-abi-specs/blob/v2.20/lapcs.adoc#floating-point-registers

Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22817)
crypto/chacha/asm/chacha-loongarch64.pl