md5: add assembly implementation for loongarch64
author Min Zhou <zhoumin@loongson.cn>
Wed, 13 Dec 2023 14:40:14 +0000 (22:40 +0800)
committer Tomas Mraz <tomas@openssl.org>
Wed, 27 Dec 2023 09:15:29 +0000 (10:15 +0100)
commit 3d68e2937ee5c50eacef5f4c34abdf7c0e4dc479
tree 62032cc26380c6a302b45a3db9d6d447d0366bc7
parent 9277ed0a4fc082807ad8d8f66925fb7968437cf6

This change improves MD5 performance by adding a hand-optimized
assembly implementation of the MD5 inner loop for loongarch64.
The implementation is based on md5-x86_64.pl, with additional
instruction reordering to separate data dependencies as much as
possible.
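
To illustrate what separating data dependencies can mean for an MD5
step, here is a minimal Python sketch of a single round-1 step. It is
not taken from md5-loongarch64.pl; the function names (rotl32,
step_plain, step_reordered) are hypothetical, and the split shown is
only an assumption about the kind of reordering involved. In each step
the value b is the most recently computed word, so the terms that do
not involve b (a + x + t and c ^ d) can be prepared before b is ready.

```python
MASK = 0xFFFFFFFF  # MD5 works on 32-bit words


def rotl32(x, s):
    """Rotate a 32-bit word left by s bits (0 < s < 32)."""
    return ((x << s) | (x >> (32 - s))) & MASK


def step_plain(a, b, c, d, x, t, s):
    """Round-1 step as written in RFC 1321: F(b,c,d) = (b & c) | (~b & d)."""
    f = ((b & c) | (~b & d)) & MASK
    return (b + rotl32((a + f + x + t) & MASK, s)) & MASK


def step_reordered(a, b, c, d, x, t, s):
    """Same step, split so the work independent of b comes first."""
    # These two values do not depend on b, the result of the previous
    # step, so they can be computed while that result is still pending.
    partial = (a + x + t) & MASK
    cd = c ^ d
    # Only this part has to wait for b. ((c ^ d) & b) ^ d is an
    # equivalent form of F(b, c, d).
    f = (cd & b) ^ d
    return (b + rotl32((partial + f) & MASK, s)) & MASK


# First MD5 step on the standard initial state with message word 0.
state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
args = state + (0x00000000, 0xD76AA478, 7)
same = step_plain(*args) == step_reordered(*args)
```

Both functions return the same word; the second only changes the order
in which the intermediate values become available.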

Test with:
$ openssl speed md5

3A5000
type             16 bytes    64 bytes     256 bytes    1024 bytes   8192 bytes   16384 bytes
md5              45061.04k   130440.75k   291105.28k   421101.23k   484639.27k   488320.43k
md5-modified     47179.95k   139015.57k   308836.69k   445963.26k   512540.67k   518215.00k
                   +5%         +7%          +6%          +6%          +6%          +6%

3A6000
type             16 bytes    64 bytes     256 bytes    1024 bytes   8192 bytes   16384 bytes
md5              60070.06k   161822.76k   325817.60k   438017.02k   486864.21k   492243.31k
md5-modified     62827.74k   170294.04k   343795.03k   463324.50k   515831.13k   520060.93k
                   +5%         +5%          +6%          +6%          +6%          +6%
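
The percentage rows above can be reproduced from the throughput
figures. The short Python sketch below assumes each percentage is the
relative gain of md5-modified over md5, rounded to the nearest whole
percent; the helper name gain_percent is hypothetical.

```python
def gain_percent(old, new):
    """Relative throughput gain of new over old, as a whole percent."""
    return round((new / old - 1) * 100)


# Throughput figures copied from the tables above (16 bytes .. 16384 bytes).
md5_3a5000 = [45061.04, 130440.75, 291105.28, 421101.23, 484639.27, 488320.43]
mod_3a5000 = [47179.95, 139015.57, 308836.69, 445963.26, 512540.67, 518215.00]
md5_3a6000 = [60070.06, 161822.76, 325817.60, 438017.02, 486864.21, 492243.31]
mod_3a6000 = [62827.74, 170294.04, 343795.03, 463324.50, 515831.13, 520060.93]

gains_3a5000 = [gain_percent(o, n) for o, n in zip(md5_3a5000, mod_3a5000)]
gains_3a6000 = [gain_percent(o, n) for o, n in zip(md5_3a6000, mod_3a6000)]

print("3A5000:", gains_3a5000)  # [5, 7, 6, 6, 6, 6]
print("3A6000:", gains_3a6000)  # [5, 5, 6, 6, 6, 6]
```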

Signed-off-by: Min Zhou <zhoumin@loongson.cn>
Co-authored-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21704)
crypto/md5/asm/md5-loongarch64.pl [new file with mode: 0755]
crypto/md5/build.info
crypto/md5/md5_local.h