-# on UltraSPARC-III&IV, and 2 cycles latency, such as on SPARC64-V[?],
-# respectively. Being 2x-parallelized the procedure is "worth" 5, 8.5
-# or 6 ticks per SHA1 round. As FPU/VIS instructions are perfectly
-# pairable with IALU ones, the round timing is defined by the maximum
-# between VIS and IALU timings. The latter varies from round to round
-# and averages out at 6.25 ticks. This means that USI&II and SPARC64-V
-# should operate at IALU rate, while USIII&IV - at VIS rate. This
-# explains why performance improvement varies among processors. Well,
-# it should be noted that pure IALU sha1-sparcv9.pl module exhibits
-# virtually uniform performance of ~9.3 cycles per SHA1 round. Timings
-# mentioned above are theoretical lower limits. Real-life performance
-# was measured to be 6.6 cycles per SHA1 round on USIIi and 8.3 on
-# USIII. The latter is lower than half-round VIS timing, because there
-# are 16 Xupdate-free rounds, which "push down" average theoretical
-# timing to 8 cycles...
+# on UltraSPARC-III&IV, and 2 cycles latency(*), respectively. Being
+# 2x-parallelized the procedure is "worth" 5, 8.5 or 6 ticks per SHA1
+# round. As [long as] FPU/VIS instructions are perfectly pairable with
+# IALU ones, the round timing is defined by the maximum between VIS
+# and IALU timings. The latter varies from round to round and averages
+# out at 6.25 ticks. This means that USI&II should operate at IALU
+# rate, while USIII&IV - at VIS rate. This explains why performance
+# improvement varies among processors, given that the pure IALU
+# sha1-sparcv9.pl module exhibits virtually uniform performance of
+# ~9.3 cycles per SHA1 round. Timings mentioned above are theoretical
+# lower limits. Real-life performance was measured to be 6.6 cycles
+# per SHA1 round on USIIi and 8.3 on USIII. The latter is lower than
+# half-round VIS timing, because there are 16 Xupdate-free rounds,
+# which "push down" average theoretical timing to 8 cycles...
+
+# (*) SPARC64-V[II] was originally believed to have 2 cycles VIS
+# latency. Well, it might have, but it doesn't have a dedicated
+# VIS unit. Instead, VIS instructions are executed by other
+# functional units, and the ones used here are executed by the
+# IALU. This doesn't improve effective ILP...