crypto/engine/README

   1 NOTES, THOUGHTS, and EVERYTHING
   2 -------------------------------
   3
   4 (1) Concurrency and locking ... I made a change to the ENGINE_free code
   5     because I spotted a potential hold-up in proceedings (doing too
   6     much inside a lock including calling a callback), there may be
   7     other bits like this. What do the speed/optimisation freaks think
   8     of this aspect of the code and design? There's lots of locking for
   9     manipulation functions and I need that to keep things nice and
  10     solid, but this manipulation is mostly (de)initialisation, I would
  11     think that most run-time locking is purely in the ENGINE_init and
  12     ENGINE_finish calls that might be made when getting handles for
  13     RSA (and friends') structures. These would be mostly reference
  14     count operations as the functional references should always be 1
  15     or greater at run-time to prevent init/deinit thrashing.
  16
  17 (2) nCipher support, via the HWCryptoHook API, is now in the code.
  18     Apparently this hasn't been tested too much yet, but it looks
  19     good. :-) Atalla support has been added too, but shares a lot in
  20     common with Ben's original hooks in bn_exp.c (although it has been
  21     ENGINE-ified, and error handling wrapped around it) and it's also
  22     had some low-volume testing, so it should be usable.
  23
  24 (3) Of more concern, we need to work out (a) how to put together usable
  25     RAND_METHODs for units that just have one "get n or fewer random
  26     bytes" function, and (b) how to hook the code in crypto/rand/ to
  27     use the ENGINE defaults in a way similar to what has been done in
  28     crypto/rsa/, crypto/dsa/, etc.
  29
  30 (4) ENGINE should really grow to encompass more than 3 public key
  31     algorithms and randomness gathering. The structure/data level of
  32     the engine code is hidden from code outside the crypto/engine/
  33     directory so change shouldn't be too viral. More important though
  34     is how things should evolve ... this needs thought and discussion.
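
On note (3)(a): one way to wrap a unit whose only call hands back "at most n
bytes" is to loop until the caller's buffer is full. The following is a
minimal sketch of that idea in plain C. It is not the RAND_METHOD interface,
and short_read_fn, fill_bytes and toy_device are hypothetical names invented
for the illustration.

```c
#include <stddef.h>

/* Hypothetical device hook: writes at most `max` bytes into `out` and
 * returns how many it wrote, or a value <= 0 on failure. */
typedef int (*short_read_fn)(unsigned char *out, size_t max);

/* Keep calling the device until exactly `num` bytes have been gathered.
 * Returns 1 on success and 0 if the device fails or misbehaves. */
static int fill_bytes(short_read_fn dev, unsigned char *buf, size_t num)
{
    size_t have = 0;

    while (have < num) {
        int got = dev(buf + have, num - have);

        /* Reject errors and devices that claim more than was asked for. */
        if (got <= 0 || (size_t)got > num - have)
            return 0;
        have += (size_t)got;
    }
    return 1;
}

/* Toy stand-in for a unit that never hands out more than 3 bytes per call.
 * It emits a counter, NOT randomness; it only exists to exercise the loop. */
static int toy_device(unsigned char *out, size_t max)
{
    static unsigned char next = 0;
    size_t n = max < 3 ? max : 3;
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = next++;
    return (int)n;
}
```

Calling fill_bytes(toy_device, buf, 8) takes three device calls (3 + 3 + 2
bytes). Whether a real wrapper should retry, block, or fail on a short read
is a policy question this sketch does not settle.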
  35
  36
  37 -----------------------------------==*==-----------------------------------
  38
  39 More notes 2000-08-01
  40 ---------------------
  41
  42 Geoff Thorpe, who designed the engine part, wrote a pretty good description
  43 of the thoughts he had when he built it, good enough to include verbatim here
  44 (with his permission)                                   -- Richard Levitte
  45
  46
  47 Date: Tue, 1 Aug 2000 16:54:08 +0100 (BST)
  48 From: Geoff Thorpe
  49 Subject: Re: The thoughts to merge BRANCH_engine into the main trunk are
  50  emerging
  51
  52 Hi there,
  53
  54 I'm going to try and do some justice to this, but I'm a little short on
  55 time and there is an endless amount that could be discussed on this
  56 subject. sigh ... please bear with me :-)
  57
  58 > The changes in BRANCH_engine dig deep into the core of OpenSSL, for example
  59 > into the RSA and RAND routines, adding a level of indirection which is needed
  60 > to keep the abstraction, as far as I understand.  It would be a good thing if
  61 > those who do play with those things took a look at the changes that have been
  62 > done in the branch and say out loud how much (or hopefully little) we've made
  63 > fools of ourselves.
  64
  65 The point here is that the code that has emerged in the BRANCH_engine
  66 branch was based on some initial requirements of mine that I went in and
  67 addressed, and Richard has picked up the ball and run with it too. It
  68 would be really useful to get some review of the approach we've taken, but
  69 first I think I need to describe as best I can the reasons behind what has
  70 been done so far, in particular what issues we have tried to address when
  71 doing this, and what issues we have intentionally (or necessarily) tried
  72 to avoid.
  73
  74 methods, engines, and evps
  75 --------------------------
  76
  77 There has been some discussion, particularly with Steve, about where this
  78 ENGINE stuff might fit into the conceptual picture as/when we start to
  79 abstract algorithms a little bit to make the library more extensible. In
  80 particular, it would be desirable to have algorithms (symmetric, hash, pkc,
  81 etc) abstracted in some way that allows them to be just objects sitting in
  82 a list (or database) ... it'll just happen that the "DSA" object doesn't
  83 support encryption whereas the "RSA" object does. This requires a lot of
  84 consideration to begin to know how to tackle it; in particular how
  85 encapsulated should these things be? If the objects also understand their
  86 own ASN1 encodings and what-not, then it would for example be possible to
  87 add support for elliptic-curve DSA in as a new algorithm and automatically
  88 have ECC-DSA certificates supported in SSL applications. Possible, but not
  89 easy. :-)
  90
  91 Whatever, it seems that the way to go (if I've grok'd Steve's comments on
  92 this in the past) is to amalgamate these things in EVP as is already done
  93 (I think) for ciphers or hashes (Steve, please correct/elaborate). I
  94 certainly think something should be done in this direction because right
  95 now we have different source directories, types, functions, and methods
  96 for each algorithm - even when conceptually they are very much different
  97 feathers of the same bird. (This is certainly all true for the public-key
  98 stuff, and may be partially true for the other parts.)
  99
 100 ENGINE was *not* conceived as a way of solving this, far from it. Nor was
 101 it conceived as a way of replacing the various "***_METHOD"s. It was
 102 conceived as an abstraction of a sort of "virtual crypto device". If we
 103 lived in a world where "EVP_ALGO"s (or something like them) encapsulated
 104 particular algorithms like RSA,DSA,MD5,RC4,etc, and "***_METHOD"s
 105 encapsulated interfaces to algorithms (eg. some algo's might support a
 106 PKC_METHOD, a HASH_METHOD, or a CIPHER_METHOD, who knows?), then I would
 107 think that ENGINE would encapsulate an implementation of arbitrarily many
 108 of those algorithms - perhaps as alternatives to existing algorithms
 109 and/or perhaps as new previously unimplemented algorithms. An ENGINE could
 110 be used to contain an alternative software implementation, a wrapper for a
 111 hardware acceleration and/or key-management unit, a comms-wrapper for
 112 distributing cryptographic operations to remote machines, or any other
 113 "devices" your imagination can dream up.
 114
 115 However, what has been done in the ENGINE branch so far is nothing more
 116 than starting to get our toes wet. I had a couple of self-imposed
 117 requirements when putting the initial abstraction together, and I may have
 118 already posed these in one form or another on the list, but briefly:
 119
 120    (i) only bother with public key algorithms for now, and maybe RAND too
 121        (motivated by the need to get hardware support going and the fact
 122        this was a comparatively easy subset to address to begin with).
 123
 124   (ii) don't change (if at all possible) the existing crypto code, ie. the
 125        implementations, the way the ***_METHODs work, etc.
 126
 127  (iii) ensure that if no function from the ENGINE code is ever called then
 128        things work the way they always did, and there is no memory
 129        allocation (otherwise the failure to cleanup would be a problem -
 130        this is part of the reason no STACKs were used, the other part of
 131        the reason being I found them inappropriate).
 132
 133   (iv) ensure that all the built-in crypto was encapsulated by one of
 134        these "ENGINE"s and that this engine was automatically selected as
 135        the default.
 136
 137    (v) provide the minimum hooking possible in the existing crypto code
 138        so that global functions (eg. RSA_public_encrypt) do not need any
 139        extra parameter, yet will use whatever the current default ENGINE
 140        for that RSA key is, and that the default can be set "per-key"
 141        and globally (new keys will assume the global default, and keys
 142        without their own default will be operated on using the global
 143        default). NB: Try and make (v) conflict as little as possible with
 144        (ii). :-)
 145
 146   (vi) wrap the ENGINE code up in duct tape so you can't even see the
 147        corners. Ie. expose no structures at all, just black-box pointers.
 148
 149  (vii) maintain internally a list of ENGINEs on which a calling
 150        application can iterate, interrogate, etc. Allow a calling
 151        application to hook in new ENGINEs, remove ENGINEs from the list,
 152        and enforce uniqueness within the global list of each ENGINE's
 153        "unique id".
 154
 155 (viii) keep reference counts for everything - eg. this includes storing a
 156        reference inside each RSA structure to the ENGINE that it uses.
 157        This is freed when the RSA structure is destroyed, or has its
 158        ENGINE explicitly changed. The net effect needs to be that at any
 159        time, it is deterministic to know whether an ENGINE is in use or
 160        can be safely removed (or unloaded in the case of the other type
 161        of reference) without invalidating function pointers that may or
 162        may not be used inadvertently in the future. This was actually
 163        one of the biggest problems to overcome in the existing OpenSSL
 164        code - implementations had always been assumed to be ever-present,
 165        so there was no trivial way to get round this.
 166
 167   (ix) distinguish between structural references and functional
 168        references.
 169
 170 A *little* detail
 171 -----------------
 172
 173 While my mind is on it, I'll illustrate the bit in item (ix). This idea
 174 turned out to be very handy - the ENGINEs themselves need to be operated
 175 on and manipulated simply as objects without necessarily trying to
 176 "enable" them for use. Eg. most host machines will not have the necessary
 177 hardware or software to support all the engines one might compile into
 178 OpenSSL, yet it needs to be possible to iterate across the ENGINEs,
 179 querying their names, properties, etc - all happening in a thread-safe
 180 manner that uses reference counts (if you imagine two threads iterating
 181 through a list and one thread removing the ENGINE the other is currently
 182 looking at - you can see the gotcha waiting to happen). For all of this,
 183 *structural references* are used and operate much like the other reference
 184 counts in OpenSSL.
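
The iterate-while-someone-removes situation described above can be sketched
roughly as follows. This is a single-threaded illustration in plain C, not
the ENGINE code itself: toy_engine, list_add, list_remove, list_first,
list_next and toy_release are hypothetical names, and the locking that real
code would need around every list operation is only indicated in comments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an ENGINE; only the list-related fields. */
typedef struct toy_engine {
    const char *id;
    int struct_ref;              /* structural reference count */
    struct toy_engine *next;
} toy_engine;

static toy_engine *list_head = NULL;

/* Drop one structural reference and return the count that is left. */
static int toy_release(toy_engine *e)
{
    return --e->struct_ref;
}

/* Append `e` unless its id is already taken. The list keeps its own
 * structural reference. Real code would hold a lock here. */
static int list_add(toy_engine *e)
{
    toy_engine **pp;

    for (pp = &list_head; *pp != NULL; pp = &(*pp)->next)
        if (strcmp((*pp)->id, e->id) == 0)
            return 0;            /* ids must be unique */
    e->next = NULL;
    e->struct_ref++;
    *pp = e;
    return 1;
}

/* Unlink `e` and drop the list's reference. The caller may only reclaim
 * `e` once the returned count is 0, ie. when no iterator still holds it.
 * An iterator standing on `e` will simply see the end of the list. */
static int list_remove(toy_engine *e)
{
    toy_engine **pp;

    for (pp = &list_head; *pp != NULL; pp = &(*pp)->next)
        if (*pp == e) {
            *pp = e->next;
            e->next = NULL;
            return toy_release(e);
        }
    return -1;                   /* not in the list */
}

/* Start iterating: the first entry is returned with a reference held. */
static toy_engine *list_first(void)
{
    if (list_head != NULL)
        list_head->struct_ref++;
    return list_head;
}

/* Step forward: reference the next entry before releasing the current one. */
static toy_engine *list_next(toy_engine *e)
{
    toy_engine *n = e->next;

    if (n != NULL)
        n->struct_ref++;
    toy_release(e);
    return n;
}
```

If one caller holds an entry from list_first() and another removes that same
entry, the count stays above zero until the first caller steps on with
list_next(), so the structure is never pulled out from under the iterator.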
 185
 186 The other kind of reference count is for *functional* references - these
 187 indicate a reference on which the caller can actually assume the
 188 particular ENGINE to be initialised and usable to perform the operations
 189 it implements. Any increment or decrement of the functional reference
 190 count automatically invokes a corresponding change in the structural
 191 reference count, as it is fairly obvious that a functional reference is a
 192 restricted case of a structural reference. So struct_ref >= funct_ref at
 193 all times. NB: functional references are usually obtained by a call to
 194 ENGINE_init(), but can also be created implicitly by calls that require a
 195 new functional reference to be created, eg. ENGINE_set_default(). Either
 196 way the only time the underlying ENGINE's "init" function is really called
 197 is when the (functional) reference count increases to 1; similarly the
 198 underlying "finish" handler is only called as the count goes down to 0.
 199 The effect of this, for example, is that if you set the default ENGINE for
 200 RSA operations to be "cswift", then its functional reference count will
 201 already be at least 1 so the CryptoSwift shared-library and the card will
 202 stay loaded and initialised until such time as all RSA keys using the
 203 cswift ENGINE are changed or destroyed and the default ENGINE for RSA
 204 operations has been changed. This prevents repeated thrashing of init and
 205 finish handling if the count keeps getting down as far as zero.
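
The two counts and the "init on 0 to 1, finish on 1 to 0" rule can be
sketched like this. It is a minimal single-threaded illustration, not the
ENGINE_init()/ENGINE_finish() implementation: toy_engine, toy_init,
toy_finish and the counting handlers are hypothetical names, and locking is
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ENGINE with both kinds of reference. */
typedef struct toy_engine {
    int struct_ref;              /* structural references */
    int funct_ref;               /* functional references */
    int (*init)(void);           /* device bring-up, returns 1 on success */
    void (*finish)(void);        /* device tear-down */
} toy_engine;

/* Obtain a functional reference. The init handler only runs when the
 * functional count goes from 0 to 1. */
static int toy_init(toy_engine *e)
{
    if (e->funct_ref == 0 && e->init != NULL && !e->init())
        return 0;                /* bring-up failed, counts untouched */
    e->funct_ref++;
    e->struct_ref++;             /* a functional ref is also a structural one */
    assert(e->struct_ref >= e->funct_ref);
    return 1;
}

/* Release a functional reference. The finish handler only runs when the
 * functional count drops back to 0. */
static void toy_finish(toy_engine *e)
{
    assert(e->funct_ref > 0);
    e->funct_ref--;
    e->struct_ref--;
    if (e->funct_ref == 0 && e->finish != NULL)
        e->finish();
}

/* Handlers that just count how often they are invoked. */
static int init_calls = 0;
static int finish_calls = 0;

static int count_init(void)
{
    init_calls++;
    return 1;
}

static void count_finish(void)
{
    finish_calls++;
}
```

With count_init and count_finish plugged in, two toy_init() calls followed
by two toy_finish() calls run each handler exactly once. A long-lived first
reference (playing the role of a default being set) therefore keeps later
per-key references from re-initialising the device.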
 206
 207 Otherwise, the way the ENGINE code has been put together I think pretty
 208 much reflects the above points. The reason for the ENGINE structure having
 209 individual RSA_METHOD, DSA_METHOD, etc pointers is simply that it was the
 210 easiest way to go about things for now, to hook it all into the raw
 211 RSA,DSA,etc code, and I was trying to keep the structure invisible
 212 anyway so that the way this is internally managed could be easily changed
 213 later on when we start to work out what's to be done about these other
 214 abstractions.
 215
 216 Down the line, if some EVP-based technique emerges for adequately
 217 encapsulating algorithms and all their various bits and pieces, then I can
 218 imagine that "ENGINE" would turn into a reference-counting database of
 219 these EVP things, of which the default "openssl" ENGINE would be the
 220 library's own object database of pre-built software implemented algorithms
 221 (and such). It would also be cool to see the idea of "METHOD"s detached
 222 from the algorithms themselves ... so RSA, DSA, ElGamal, etc can all
 223 expose essentially the same METHOD (aka interface), which would include
 224 any querying/flagging stuff to identify what the algorithm can/can't do,
 225 its name, and other stuff like max/min block sizes, key sizes, etc. This
 226 would result in ENGINE similarly detaching its internal database of
 227 algorithm implementations from the function definitions that return
 228 interfaces to them. I think ...
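
One possible shape for such a shared, queryable interface is sketched below.
Nothing here exists in OpenSSL: algo_method, the ALGO_CAN_* flags, algo_find
and algo_supports are hypothetical, and the key sizes in the table are
placeholder values chosen for the illustration, not real limits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capability flags an algorithm could advertise. */
#define ALGO_CAN_SIGN     0x01u
#define ALGO_CAN_ENCRYPT  0x02u

/* Hypothetical common descriptor shared by every public-key algorithm. */
typedef struct algo_method {
    const char *name;
    unsigned int caps;           /* bitwise OR of ALGO_CAN_* */
    int min_key_bits;            /* placeholder value */
    int max_key_bits;            /* placeholder value */
} algo_method;

/* A tiny "object database": RSA can encrypt, DSA cannot. */
static const algo_method algo_table[] = {
    { "RSA", ALGO_CAN_SIGN | ALGO_CAN_ENCRYPT, 512, 4096 },
    { "DSA", ALGO_CAN_SIGN, 512, 1024 },
};

/* Look an algorithm up by name; returns NULL if it is not registered. */
static const algo_method *algo_find(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(algo_table) / sizeof(algo_table[0]); i++)
        if (strcmp(algo_table[i].name, name) == 0)
            return &algo_table[i];
    return NULL;
}

/* Returns 1 if `m` is non-NULL and advertises every bit in `cap`. */
static int algo_supports(const algo_method *m, unsigned int cap)
{
    return m != NULL && (m->caps & cap) == cap;
}
```

A caller can then ask algo_supports(algo_find("DSA"), ALGO_CAN_ENCRYPT)
before attempting an operation, instead of knowing per-algorithm types and
functions in advance.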
 229
 230 As for DSOs etc. Well the DSO code is pretty handy (but could be made much
 231 more so) for loading vendors' driver-libraries and talking to them in some
 232 generic way, but right now there are still big problems associated with
 233 actually putting OpenSSL code (ie. new ENGINEs, or anything else for that
 234 matter) in dynamically loadable libraries. These problems won't go away in
 235 a hurry so I don't think we should expect to have any kind of
 236 shared-library extensions any time soon - but solving the problems is a
 237 good thing to aim for, and would as a side-effect probably help make
 238 OpenSSL more usable as a shared-library itself (looking at the things
 239 needed to do this will show you why).
 240
 241 One of the problems is that if you look at any of the ENGINE
 242 implementations, eg. hw_cswift.c or hw_ncipher.c, you'll see how it needs
 243 a variety of functionality and definitions from various areas of OpenSSL,
 244 including crypto/bn/, crypto/err/, crypto/ itself (locking for example),
 245 crypto/dso/, crypto/engine/, crypto/rsa, etc etc etc. So if similar code
 246 were to be suctioned off into shared libraries, the shared libraries would
 247 either have to duplicate all the definitions and code and avoid loader
 248 conflicts, or OpenSSL would have to somehow expose all that functionality
 249 to the shared-library. If this isn't a big enough problem, the issue of
 250 binary compatibility will be - anyone writing Apache modules can tell you
 251 that (Ralf? Ben? :-). However, I don't think OpenSSL would need to be
 252 quite so forgiving as Apache should be, so OpenSSL could simply tell its
 253 version to the DSO and leave the DSO with the problem of deciding whether
 254 to proceed or bail out for fear of binary incompatibilities.
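
The "tell the DSO the version and let it decide" idea could look something
like the sketch below. The entry point module_bind, the HOST_VERSION
packing, and the compatibility rule (same major number, host minor at least
as new as the one the module was built against) are all assumptions made for
this illustration, not an existing OpenSSL convention.

```c
/* Hypothetical packing of a host version: major in the high bits,
 * minor in the low 16 bits. */
#define HOST_VERSION(maj, min) \
    (((unsigned long)(maj) << 16) | ((unsigned long)(min) & 0xffffUL))

/* The version the (hypothetical) module was compiled against. */
#define MODULE_BUILT_AGAINST HOST_VERSION(1, 2)

/* Entry point the host would call right after loading the module. The
 * module, not the host, decides whether it is safe to proceed.
 * Returns 1 to proceed and 0 to bail out. */
static int module_bind(unsigned long host_version)
{
    unsigned long host_major = host_version >> 16;
    unsigned long host_minor = host_version & 0xffffUL;
    unsigned long want_major = MODULE_BUILT_AGAINST >> 16;
    unsigned long want_minor = MODULE_BUILT_AGAINST & 0xffffUL;

    if (host_major != want_major)
        return 0;                /* assume majors are never compatible */
    if (host_minor < want_minor)
        return 0;                /* host is older than what we need */
    return 1;
}
```

The host's side then reduces to a single call with its own version number
and an unload if the module answers 0. The compatibility policy itself stays
entirely inside the module.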
 255
 256 Certainly one thing that would go a long way to addressing this is to
 257 embark on a bit of an opaqueness mission. I've set the ENGINE code up with
 258 this in mind - it's so draconian that even to declare your own ENGINE, you
 259 have to get the engine code to create the underlying ENGINE structure, and
 260 then feed in the new ENGINE's function/method pointers through various
 261 "set" functions. The more of the code that takes on such a black-box
 262 approach, the more of the code that will be (a) easy to expose to shared
 263 libraries that need it, and (b) easy to expose to applications wanting to
 264 use OpenSSL itself as a shared-library. From my own explorations in
 265 OpenSSL, the biggest leviathan I've seen that is a problem in this respect
 266 is the BIGNUM code. Trying to "expose" the bignum code through any kind of
 267 organised "METHODs", let alone do all the necessary bignum operations
 268 solely through functions rather than direct access to the structures and
 269 macros, will be a massive pain in the "r"s.
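
As a rough picture of the black-box style described above: the caller only
ever sees an incomplete type, asks the library to create the object, and
feeds pointers in through "set" functions. BOX and every box_* function
below are hypothetical names used for the sketch; they are not the ENGINE
API. In a real layout the first part would live in a public header and the
second part in a private source file.

```c
#include <stdlib.h>

/* ---- what a public header would expose ---- */
typedef struct box_st BOX;               /* layout deliberately hidden */
typedef int (*box_op_fn)(int);

BOX *box_new(void);
void box_free(BOX *b);
int box_set_id(BOX *b, const char *id);
int box_set_op(BOX *b, box_op_fn op);
const char *box_get_id(const BOX *b);
int box_run(const BOX *b, int arg, int *result);

/* ---- what stays private to the implementation ---- */
struct box_st {
    const char *id;
    box_op_fn op;
};

BOX *box_new(void)
{
    BOX *b = malloc(sizeof(*b));

    if (b != NULL) {
        b->id = NULL;
        b->op = NULL;
    }
    return b;
}

void box_free(BOX *b)
{
    free(b);
}

int box_set_id(BOX *b, const char *id)
{
    if (b == NULL || id == NULL)
        return 0;
    b->id = id;
    return 1;
}

int box_set_op(BOX *b, box_op_fn op)
{
    if (b == NULL || op == NULL)
        return 0;
    b->op = op;
    return 1;
}

const char *box_get_id(const BOX *b)
{
    return b != NULL ? b->id : NULL;
}

/* Runs the installed operation; returns 0 if none has been set yet. */
int box_run(const BOX *b, int arg, int *result)
{
    if (b == NULL || b->op == NULL || result == NULL)
        return 0;
    *result = b->op(arg);
    return 1;
}

/* Example operation a caller might plug in. */
static int double_it(int x)
{
    return 2 * x;
}
```

Because callers never touch struct box_st directly, its fields can change
without breaking code compiled against the header, which is the property
that matters for shared libraries.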
 270
 271 Anyway, I'm done for now - hope it was readable. Thoughts?
 272
 273 Cheers,
 274 Geoff
 275
 276
 277 -----------------------------------==*==-----------------------------------
 278