X-Git-Url: https://git.openssl.org/gitweb/?p=openssl.git;a=blobdiff_plain;f=crypto%2Fbn%2Fasm%2Fia64.S;h=abc11000c4e7249c11e3aa90e175b97e85680955;hp=2fdf5bbabe15f8e25e3db2bcfbac71aac101cd29;hb=dcf6e50f48e6bab92dcd2dacb27fc17c0de34199;hpb=44c8a5e2b9af8909844cc002c53049311634b314;ds=sidebyside diff --git a/crypto/bn/asm/ia64.S b/crypto/bn/asm/ia64.S index 2fdf5bbabe..abc11000c4 100644 --- a/crypto/bn/asm/ia64.S +++ b/crypto/bn/asm/ia64.S @@ -29,7 +29,7 @@ // ports is the same, i.e. 2, while I need 4. In other words, to this // module Itanium2 remains effectively as "wide" as Itanium. Yet it's // essentially different in respect to this module, and a re-tune was -// required. Well, because some intruction latencies has changed. Most +// required. Well, because some instruction latencies has changed. Most // noticeably those intensively used: // // Itanium Itanium2 @@ -370,7 +370,7 @@ bn_mul_words: // The loop therefore spins at the latency of xma minus 1, or in other // words at 6*(n+4) ticks:-( Compare to the "production" loop above // that runs in 2*(n+11) where the low latency problem is worked around -// by moving the dependency to one-tick latent interger ALU. Note that +// by moving the dependency to one-tick latent integer ALU. Note that // "distance" between ldf8 and xma is not latency of ldf8, but the // *difference* between xma and ldf8 latencies. .L_bn_mul_words_ctop: @@ -432,7 +432,7 @@ bn_mul_add_words: // version was performing *all* additions in IALU and was starving // for those even on Itanium 2. In this version one addition is // moved to FPU and is folded with multiplication. This is at cost -// of propogating the result from previous call to this subroutine +// of propagating the result from previous call to this subroutine // to L2 cache... In other words negligible even for shorter keys. // *Overall* performance improvement [over previous version] varies // from 11 to 22 percent depending on key length. @@ -1529,9 +1529,8 @@ bn_div_words: // output: f8 = (int)(a/b) // clobbered: f8,f9,f10,f11,pred pred=p15 -// One can argue that this snippet is copyrighted to Intel -// Corporation, as it's essentially identical to one of those -// found in "Divide, Square Root and Remainder" section at +// This snippet is based on text found in the "Divide, Square +// Root and Remainder" section at // http://www.intel.com/software/products/opensource/libraries/num.htm. // Yes, I admit that the referred code was used as template, // but after I realized that there hardly is any other instruction