Issue 16487: Allow ssl certificates to be specified from memory rather than files. (original) (raw)

Created on 2012-11-16 15:10 by kristjan.jonsson, last changed 2022-04-11 14:57 by admin.

Messages (67)

msg175691 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2012-11-16 15:10

The _ssl module (and indeed the openssl lib) relies heaviliy on actual filesystem locations to load certificates. A client or a server may not want to rely on physical filesystem locations to load certificates for authentication or verification. Physical disc files are cumbersome and present a management burden in the presence of multiple processes.

This patch adds extensions to the _ssl.c file which allows certificates, keys and certification chains to be provided by file contents, rather than file name.

The ctx.load_cert_chain and ctx.load_verify_locations take additional arguments to specify the data on this form.

the ssl.wrap_socket does not add arguments, rather the function is polymorphic in that the conents of the certfil/keyfile are examined and treated as file-data if beginning with -----BEGIN. the ca_certs is similarly treated as a list of file contents, if it is a list, (rather than a string)

This patch is the result of work at CCP for deploying ssl clients and servers in an isolated environment without having to resort to temporary disk files.

msg175692 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2012-11-16 15:10

(the patch contains a local change to set the location of the 'external' dir, please disregard this.

msg175701 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2012-11-16 18:30

(the patch contains a local change to set the location of the 'external' dir, please disregard this.

Can you upload a cleaned up patch then? :)

msg175709 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2012-11-16 19:41

Can't right now. It's only relevant for pcbuild anyway so you can test it for Unix if you want without fear. Don't worry, I always give my patches a cleanup before committing them. Meanwhile, I'd welcome comments on the substance.

msg175741 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2012-11-17 14:35

Can't right now. It's only relevant for pcbuild anyway so you can test it for Unix if you want without fear. Don't worry, I always give my patches a cleanup before committing them.

I don't want to review a patch if you tell me that it has problematic stuff in it. Please first upload a cleaned up patch.

msg175779 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2012-11-17 17:12

By the way, without looking too much at the patch, I think it would be cleaner to let the certificate functions accept file-like objects, rather than adding keydata/certdata arguments.

msg175806 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2012-11-17 20:24

The please don't. I'm actually growing tired of the way you seem to always latch onto every submission I make almost with hostility. I will find someone else to look this over.

msg175809 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2012-11-17 21:12

The please don't. I'm actually growing tired of the way you seem to always latch onto every submission I make almost with hostility. I will find someone else to look this over.

I fully realize my reply was blunt. But perhaps you could try to understand that people care about the quality of the codebase and it is why they care about reviewing patches before letting them in. Asking for reviews of patches that you know contain unneeded cruft is rude. Disregarding their comments and saying "I'll ask someone else because I don't want to deal with you anymore" is even worse.

This certainly happens often to you, because (IME) you are often pushy with your patches. So, yes, you must expect a certain amount of defensive replies when being so. Just be prepared to deal with them.

msg180734 - (view)

Author: Jesús Cea Avión (jcea) * (Python committer)

Date: 2013-01-27 02:06

Kristján, are you pursuing this yet?. Can we move on?

msg180736 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-01-27 02:49

I wasn't pushing it anymore because there seemed to be no interest and Antoine attitude made me lose interest in contributing this. No one liked to comment on my approach except Antoine and I had grown tired of his persistent negativity and aloof manners. I might as well say that here that it is this kind of feedback that has turned me away from contributing to the project on several occations. At CCP we have a heavily patched version of python that we have modified throug time to deal with real world projects. I (and my company) genuinely like to contribute some of those improvements back to the trunk and regularly make efforts to do so (e.g. by taking the time and effort to port to python 3) but often they are met with with almost hostile negativity. I hope, for the sake of Python, that I am the only contributor meeting this kind of attitude.

Antoine, please take the time to read what I posted here again. I had a working patch, used in production. I uploaded, it, but when I got back home realized that an unrelated change was part of it. Unable to change it from there I asked politely that you disregard that "unrelated" change while looking at the substance of patch. As you know, I am a committer and will polish things before checking them in, as anyone does. How is that rude?

The patch still stands. I on a poor internet in a Chinese hotel and will attempt to download and edit the patch so that the irrelevant change to the PC build solution should no longer need to hinder anyone from reading it.

msg180737 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-01-27 02:55

Manually edited the .patch file The build changes are gone but should probably go in separately in some form later.

msg190997 - (view)

Author: Brandon Rhodes (brandon-rhodes) *

Date: 2013-06-11 23:22

Kristján, your patch is a wonderful idea—I am about to commit production code that will have to create tens of thousands of temporary files during operation, one file each time SSL is started up on a socket, which could be avoided if something like this patch were applied. I had always assumed that it was simply a limitation of the underlying SSL library! An interface that takes a filename like this, instead of a live file-like object, seems so un-Pythonic that I assumed the only reason for it was a limitation in OpenSSL. Thank you very much for looking under the covers and discovering that this was not the case!

I do, though, feel a slight twinge when we add Even More Parameters to a Standard Library routine but in such a way that it cannot be used with an existing parameter — as here in your patch, where we gain a parameter like certdata that cannot be (should not be?) used at the same time as certfile. It seems redundant to have two names for the same parameter to the underlying library, and makes it look like the routine needs more information than it really does.

Since my own instinct was to think, ten minutes ago, "Maybe I can pass a StringIO, since it says it wants a fine!", I am very much in support of the idea of keeping only the existing parameters, but making them accept both strings (which, for compatibility, would continue to be interpreted as filenames) and file-like objects as arguments. I think this would have a great deal of symmetry with how other parts of the Standard Library work, while keeping this patch's central value of making it possible for those of us with cert-heavy code to avoid the creation of thousands of files a minute.

Again, thank you VERY much for discovering that OpenSSL can do this, and I will try to provide whatever encouragement I can as you try to shepherd this past the other committers.

msg191018 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-12 09:25

Hi there. Thanks for your comments. This is the kind of discussion I was hoping to have about my draft patch.

I too have reservations about adding arguments. In the version of this that we have in house, we actually don't use a "certdata" argument, but rather this:

The reason I didn't do it this way in the patch is because: a) it could be confusing and possibly un-pythonic to have that level of polymorphism employed b) There are possible edge cases. What if you had a filename that started with the magic tokens (unlikely but technically possible).

I also would prefer passing in the certificates as raw bytes data, rather than file-like objects. there is no reason for them to be files, this is not data that is read on demand. libssl reads the entire contents of these files in one gulp, so there is no efficiency to be gained through any sort of pipelining by providing a file-like-object.. Adding the plumbing inside the library to do python "read()" calls is completely unnecessary when the caller can simply do that.

The only possible reason would be to resolve the above ambiguity, i.e. to allow the api to try to invoke a "read()" function on the 'file' to determine if it is a file-like object.

msg191828 - (view)

Author: Brandon Rhodes (brandon-rhodes) *

Date: 2013-06-25 01:05

Kristján, you are certainly correct that a single-argument that can be either a filename or a cert is inappropriate; we should not be peeking inside of strings to guess what they contain.

And I think you also have a good point about Pythonic-ness when it comes to polymorphism; routines that are sensitive to object type have been going more and more out of style for the past twenty years, and for good reason.

But, having ceded those points, I still think string-or-file-like-object is the correct approach here, simply because it is the overwhelming approach of the Standard Library; everyone is used to it, and will already "know the ropes" of that approach; and it prevents noisying up the interface with redundant arguments. Were we doing this over again, we would simply not allow a filename at all, and let the user open the file if they needed to. But since a filename is allowed, it feels like the Official Standard Library Approach to also allow a file-like object.

Some examples:

zipfile.ZipFile: "Open a ZIP file, where file can be either a path to a file (a string) or a file-like object."

http://docs.python.org/2/library/zipfile#zipfile.ZipFile

binhex.hexbin: "Decode a binhex file input. input may be a filename or a file-like object supporting read() and close() methods."

http://docs.python.org/release/2.7.5/library/binhex.html#binhex.hexbin

xml.dom.minidom.parse: "filename_or_file may be either a file name, or a file-like object."

http://docs.python.org/2/library/xml.dom.minidom.html#xml.dom.minidom.parse

mailbox.Mailbox.add: "Parameter message may be a Message instance, an email.message.Message instance, a string, or a file-like object (which should be open in text mode)."

http://docs.python.org/2/library/mailbox.html#mailbox.Mailbox.add

pickletools.dis: "pickle can be a string or a file-like object."

http://docs.python.org/2/library/pickletools.html#pickletools.dis

I suggest that these precedents, along with others that I believe we could find with a more exhaustive search of the Standard Library, are sufficient to suggest that in this case the least-surprise approach is a single argument that's either a filename or file-like object. I would suggest reviewing quickly the code for the above examples and following their example for how to distinguish most cleanly between a filename and file-like object; I wonder if they call any common code to get the contents out, or each do it completely by themselves? :)

Thanks again for wanting to add this to the SSL module, it will be a great addition that solves an important use case!

msg191838 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-25 08:34

ok, I concede that a file-like object makes sense from a polymorphism point of view. It makes no sense from a streaming point of view. A caller can then wrap their data into a StringIO instance.

I'll rework the patch in the manner you suggest.

msg191847 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2013-06-25 11:04

Agreed, a file-like object is the way to go. I don't think you need to write the logic in C, by the way. You can write a high-level function and defer to a low-level C func for the basic API wrapping.

msg191859 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-25 13:59

Antoine, are you suggesting that we remove the current c-level capability to use file system files (using open()) and just go with raw bytes data at the C api level, and then do the 'filename or filelike object' in Python land?

msg191860 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2013-06-25 14:39

Antoine, are you suggesting that we remove the current c-level capability to use file system files (using open()) and just go with raw bytes data at the C api level, and then do the 'filename or filelike object' in Python land?

Yes, I think that's reasonable. Hopefully it won't break any existing uses.

msg191862 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2013-06-25 14:52

I didn't know about this issue and have worked on a similar feature in #18138.

msg191868 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2013-06-25 16:51

Ha, funny. Now it's time to reconciliate your respective patches :)

msg191922 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-26 22:50

Here is an updated patch. We now support file-like objects. New helper functions try to turn file arguments into either Py_buffer objects containing the read data, or PyBytesObject argument with the file system encoding of the path. A file-like object is recognized by the successful execution of a read() method call. docs and tests updated.

msg191923 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2013-06-27 00:48

Thx Kristján!

My patch maps PyUnicode to PEM encoded cert data and objects with Py_Buffer support to DER encoded cert data. Perhaps you like to you the same concept in your patch to support TextIO and BytesIO read() methods. Feel free to reuse as much of my patch as you like.

You don't check ERR_GET_LIB() in some places.

Could you please add a comment about extra_chain_cert in PySSL_CTX_use_certificate_chain_mem() and SSL_CTX_get_cert_store() in PySSL_CTX_load_verify_certs_mem()? The difference between the extra chain and the trusted store got me confused more than once. As far as I recall the trusted store is suppose to contain only trusted root CA. The extra chain is uesd to provide intermediate certs for the cert chain between a root CA and the service's cert.

msg191924 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2013-06-27 00:59

PS: I like to have DER cert support for #17134. I'd rather not convert DER to PEM.

msg191936 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-27 09:38

Thanks for your comments Christian. "You don't check ERR_GET_LIB() in some places." Do you have a particular place in mind?

About DER. As I understand, currently _ssl only supports PEM. If that is the case, then supporting DER should, IMHO, be a separate patch. It will probably involve adding an argument to the functions.

Both DER and PEM files are binary formats. the PEM is base64 encoded ascii with some strict ascii headers and footers. As such, the encoding is quite explicit. I am not sure that automatic conversion from unicode to ascii should be undertaken on the C level, particularly if the intention later is to support DER, which is pure binary and where "string" has no place.

I'll have a look at your patch as well, haven't gotten round to do that yet.

msg191937 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2013-06-27 09:54

I found two places:

if (ERR_GET_REASON(err) == X509_R_CERT_ALREADY_IN_HASH_TABLE) { if (ERR_GET_REASON(err) == PEM_R_BAD_BASE64_DECODE)

AFAIK the _ssl module only supports PEM certs for loading. On the other hands cert data can only be retrieved as dict representation or binary DER data, e.g. getpeercert(binary_form=True) -> DER bytes. It's a bit of a puzzle to me.

It feels a bit strange to treat PEM certs as binary data, especially since the SSL module treats PEM as ASCII unicode. For example DER_cert_to_PEM_cert() accepts bytes and returns str, PEM_cert_to_DER_cert() converts str to bytes.

msg191938 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2013-06-27 10:00

It feels a bit strange to treat PEM certs as binary data, especially since the SSL module treats PEM as ASCII unicode. For example DER_cert_to_PEM_cert() accepts bytes and returns str, PEM_cert_to_DER_cert() converts str to bytes.

I agree that PEM is logically text (i.e. unicode). Also, having the unicode == PEM, bytes == DER distinction sounds reasonably practical.

msg191941 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-27 10:43

Ok, thanks. The consistency argument is strong, also Antoine's suggestion to use the return type of read() as a discriminant.

also please excuse me because I am not a habitual user of Python 3 and haven't become used to the str/binary dichotomy yet. I didn´t know, for instance, about io.BytesIO until I found to my horror that io.StringIO.read() would output unicode (i.e str).

msg191942 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2013-06-27 10:53

EVE Online is still using Python 2.7? You gotta hurry up or Guido will beat you with Dropbox's 3.x port. :)

msg191957 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-27 17:17

2.7 is the pinnacle of pythonic achievement. Particularly our branch of it :) One day we'll move, I'm sure, when there is an opportune moment. For example, if we were to start supporting a new game, a new platform. But for now, if it ain't broke, we won't fix it.

In fact, today I was asked about python for the ps4... It might be worth taking a look at porting p3k for that and making a clean break :)

msg191970 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-06-27 23:15

Okay, I have updated the patch as suggested. string mode means PEM encoding, binary mode DER encoding. Note that DER encoding is more limited, there is no way to concatentate private keys and certificates into one file (the PEM decoder searches the file until it finds the proper data it is looking for, certificate or private key).

Also, the unittests don't test for DER encoded private key, because there is no way to generate them currently. The conversion functions in ssl only cope with certificates. Although adding them, changing them would be trivial. These functions should also probably know how to split a PEM file into sections so that they can be individually converted.

msg202800 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2013-11-13 23:32

What's the status of this patch? I need a way to load CA certs from memory for another patch of mine.

msg202828 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-11-14 11:47

Update the latest patch to the current state of python.

msg202829 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-11-14 11:48

The patch should be valid, please try it out.

msg202832 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2013-11-14 12:04

Your patch looks like Benjamin's fix for issue #17828 and not like a SSL improvement. :)

msg202833 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2013-11-14 12:11

Haha, indeed. what nonsense. Here is the correct one.

msg246488 - (view)

Author: Martin Richard (martius) *

Date: 2015-07-09 13:01

Hi,

I would like to update this patch so it can finally land in cpython, hopefully 3.6.

tl;dr of the thread: In a nutshell, the latest patch from Kristján Valur Jónsson updates SSLContext.load_cert_chain(certfile, keyfile=None, password=None) and SSLContext.load_verify_locations(cafile=None, capath=None)

so certfile, keyfile and cafile can be either a string representing a path to a file or a file-like object.

The discussion seems to favor this API (pass file-like objects) rather than using new arguments (certdata, keydata) to pass string or bytes objects.

However, Christian Heimes proposed a patch (which landed in 3.4) which adds a cadata argument to load_verify_locations().

So, what should we do?

I'd go the the 2nd solution to be consistent with the API and keep things simple.

msg246489 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2015-07-09 13:14

I thing adding "keydata" and "certdata" makes things more complicated, on the contrary. You start having an API with many optional arguments but some of them are exclusive with each other (because you can only specify a single key and cert chain).

The "cafile", "capath", "cadata" in load_verify_locations() are cumulative, but they would be exclusive in load_cert_chain(): there's not much symmetry.

msg246492 - (view)

Author: Martin Richard (martius) *

Date: 2015-07-09 13:40

You are right.

And if certfile and keyfile (args of load_cert_chain()) accept file-like objects, we agree that cafile (load_verify_location()) should accept them too?

msg246494 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2015-07-09 13:42

Le 09/07/2015 15:40, Martin Richard a écrit :

And if certfile and keyfile (args of load_cert_chain()) accept file-like objects, we agree that cafile (load_verify_location()) should accept them too?

It could, but that's a separate issue. Let's stay focused on what is necessary to solve this use case, IMO (i.e. specify non-file-backed data as a certification chain or private key).

msg246495 - (view)

Author: STINNER Victor (vstinner) * (Python committer)

Date: 2015-07-09 13:43

Sorry, I didn't take time to read the whole discussion. For me, it's a good idea to accept a filename or a file object in the same parameter. Having two exclusive parameters for the same thing (ex: CA) doesn't smell like a great API.

msg246496 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2015-07-09 13:48

I'd rather introduce new types and have the function accept either a string (for path to fiel) or a X509 object and a PKey object. It's more flexible and secure. With a private key type we can properly support crypto ENGINEs and wipe memory when the object gets deallocated.

msg246555 - (view)

Author: Martin Richard (martius) *

Date: 2015-07-10 10:41

I'm not sure I know how to do this correctly: I lack of experience both with openssl C API and writing python modules in C.

It may be more flexible, but unless the key is protected/crypted somehow, one would need a string or bytes buffer to hold the key when creating the private key object: not much secure. Don't you think that it should be addressed in a separate issue?

2015-07-09 15:48 GMT+02:00 Christian Heimes <report@bugs.python.org>:

Christian Heimes added the comment:

I'd rather introduce new types and have the function accept either a string (for path to fiel) or a X509 object and a PKey object. It's more flexible and secure. With a private key type we can properly support crypto ENGINEs and wipe memory when the object gets deallocated.



Python tracker <report@bugs.python.org> <http://bugs.python.org/issue16487>


msg272585 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2016-08-13 12:58

Hello!

Like everybody in this thread I would love to see this land and have prepared a new patch, hoping that we can process this still for 3.6.

Antoine summarized the core task here very well:

Let's stay focused on what is necessary to solve this use case, IMO (i.e. specify non-file-backed data as a certification chain or private key).

What follows is what I believe is the consensus in this thread about a minimum viable solution (points mentioned earlier, enriched with some detail):

More behavioral detail that I planned with:

A summary of a few things that are in my opinion beyond a minimal implementation (but probably useful in the future!):

Supporting additional serialization formats greatly increases the complexity of the task at hand here. We can save that for later.

An important note on backwards compatibility:

Antoine, are you suggesting that we remove the current c-level capability to use file system files (using open()) and just go with raw bytes data at the C api level, and then do the 'filename or filelike object' in Python land?

Yes, I think that's reasonable. Hopefully it won't break any existing uses.

I absolutely agree that Python code should be responsible for doing the distinction between file paths and file objects. However, I do not agree with

remove the current c-level capability to use file system files

After following the code I am rather sure that we should leave the current C implementation (_ssl__SSLContext_load_cert_chain_impl() in Modules/_ssl.c) as untouched as possible, and not try to integrate the new logic into that very same function.

Instead, I figured, we should use Python code to delegate to one of two different C functions after the distinction between file path and file object has been made, whereas one of the two C functions is the one we used so far.

The central argument for having two separate C code paths is that old code must proceed using OpenSSL's SSL_CTX_use_certificate_chain_file().

Hence, I opted for implementing SSLContext.load_cert_chain() in Python, delegating the actual work

Comments on tests:

Now I'd appreciate your review!

Jan-Philip

msg275288 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2016-09-09 08:17

Thanks for your patch. I have left some comments (sorry for the delay).

msg277173 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2016-09-21 16:58

Thanks Christian, much appreciated. Just responded to your review.

msg281457 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2016-11-22 09:42

Christian, Senthil, would appreciate if I got another round of feedback (in the review thread) :-)

msg293908 - (view)

Author: Senthil Kumaran (orsenthil) * (Python committer)

Date: 2017-05-18 02:30

Hi Jan-Philip, we might be close on this one. Could you convert your latest patch into PR against (https://github.com/python/cpython). Is will help to record you as the author as we can discuss the patch in python sprints and get this in. :-) Thanks!

msg297068 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2017-06-27 18:59

Hey Senthil and Christian!

Could you convert your latest patch into PR against https://github.com/python/cpython

That was fun. There we go: https://github.com/python/cpython/pull/2449

I hope I was not too late with that for the 3.7 development.

msg303514 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2017-10-02 09:20

Hey Antoine, Christian, Senthil!

I have invested quite a bit more time to double-check my responses to the questions asked so far, clarified where appropriate, and updated the pull request on GitHub after manually resolving the merge conflicts that accumulated over time.

I would very much appreciate to get another round of feedback. Can we somehow expedite this? I am available for pushing this further, so that maybe we can still land this in 3.7 after we missed 3.6 last year.

Based on the diff this change appears to be risky but I would like to stress again that I approached this issue pretty conservatively. I tried to make this a conceptually backwards-compatible change by keeping the old code path intact (the old tests pass, too, but this is not the best criterion here).

Cheers and thanks!

Jan-Philip

msg307153 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2017-11-28 17:48

Hello! I would very much appreciate if we can find a way for us to get another review for the last patch.

I did most of the work in August 2016 and got a review from Senthil and Christian which I processed. When I got back to the patch for converting it to the GitHub workflow about a year later I realized that the details faded from my mind. 4 months later, after re-digging into the details, I went into the patch again and resolved the merge conflicts that accumulated over time. Antoine kindly left another round of comments which I replied to.

Now, 2 months after that last effort I'd like to not forget about the details again, but instead be able to respond to review feedback while still being on top of the matter. So, any kind and level of feedback would be greatly appreciated.

Thank you!

msg307154 - (view)

Author: Senthil Kumaran (orsenthil) * (Python committer)

Date: 2017-11-28 18:07

Hello JP, I had been little inactive for a while (>2 years). However, I think, I can take now some load in CPython world and I will take this ticket/feature forward this Sunday (1-Dec-2017).

If Christian (or currently active committers) don't get to it before, then expect from me to move this forward. Thank you.

msg307296 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2017-11-30 11:52

I'm -1 on the patch. Any new feature and API shall follow https://www.python.org/dev/peps/pep-0543/. I have some code that adds a proper Certificate class that wraps OpenSSL's X509* type.

msg307298 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2017-11-30 11:58

Christian, what is the status of PEP 543? Is someone actively working on it? Should we hope for a resolution (i.e. an implementation accepted in the stdlib) in the near future?

I'm worried about such purity arguments. It reminds me of when we decided to freeze distutils because distutils2 was to become the future (and then distlib took that place, etc.). In the end "the future" hasn't really happened and distutils has rotten without a clear path forward for users (except the venerable setuptools, which has its own issues).

msg307299 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2017-11-30 12:03

I'm working on a PEP that builds on top of PEP 543 and addresses some issues like IDNA #28414, OpenSSL/LibreSSL compatibility, hostname verification, verification chain, and TLS 1.3. As part of the PEP implementation, I'll add a certificate class.

I don't to introduce yet another way to load a certificate. The C code is already complicated enough.

msg307320 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2017-11-30 15:23

OP here, lurking. The need to load server certificates from memory is quite real. Some seven years ago I wrote custom code to handle that for CCPs python branch, and contributed patches to that effect. It's always dismaying to see how peoples efforts get bogged down by one thing or another. So, now there is a pep that prohibits this change? Fun times.

2017-11-30 12:03 GMT+00:00 Christian Heimes <report@bugs.python.org>:

Christian Heimes <lists@cheimes.de> added the comment:

I'm working on a PEP that builds on top of PEP 543 and addresses some issues like IDNA #28414, OpenSSL/LibreSSL compatibility, hostname verification, verification chain, and TLS 1.3. As part of the PEP implementation, I'll add a certificate class.

I don't to introduce yet another way to load a certificate. The C code is already complicated enough.



Python tracker <report@bugs.python.org> <https://bugs.python.org/issue16487>


msg307322 - (view)

Author: Martin Richard (martius) *

Date: 2017-11-30 15:55

FWIW, PyOpenSSL allows to load certificates and keys from a memory buffer and much more. It's also fairly easy to switch from ssl to PyOpenSSL.

It's probably a viable alternative in many cases.

msg307357 - (view)

Author: Nathaniel Smith (njs) * (Python committer)

Date: 2017-12-01 04:34

My impression was that progress on PEP 543 is temporarily stalled and not going to be finished for 3.7. Is that wrong?

There's going to need to be a significant amount of shimming to implement the PEP 543 interfaces on top of ssl no matter what you do. Maybe it would be best to merge a simple load_cert_chain_from_buffer method now for 3.7, and worry about redesigning the world later?

msg307372 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2017-12-01 09:25

Correct, PEP 543 won't land in Python 3.7 because neither me nor Cory have the resources to land it. However I want all changes and new additions to the SSL module to follow PEP 543 so I can provide a PEP 543-compatible interface in the near future (3.8 or 3.9). We need a proper certificate class anyway to address other issues and feature requests. I don't want to have three ways to load certificates, especially when it involves more C code.

msg307373 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2017-12-01 09:40

Is it possible to have a single C function that loads the certificates from memory, or would that make the file-loading case less secure?

Le 01/12/2017 à 10:25, Christian Heimes a écrit :

Christian Heimes <lists@cheimes.de> added the comment:

Correct, PEP 543 won't land in Python 3.7 because neither me nor Cory have the resources to land it. However I want all changes and new additions to the SSL module to follow PEP 543 so I can provide a PEP 543-compatible interface in the near future (3.8 or 3.9). We need a proper certificate class anyway to address other issues and feature requests. I don't want to have three ways to load certificates, especially when it involves more C code.



Python tracker <report@bugs.python.org> <https://bugs.python.org/issue16487>


msg307377 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2017-12-01 09:47

It doesn't matter for certificates. I guess you also want to load the key from a memory buffer, don't you? That's going to be less secure.

msg307380 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2017-12-01 10:33

Le 01/12/2017 à 10:47, Christian Heimes a écrit :

It doesn't matter for certificates. I guess you also want to load the key from a memory buffer, don't you? That's going to be less secure.

Yes, that's what I meant.

msg307681 - (view)

Author: Senthil Kumaran (orsenthil) * (Python committer)

Date: 2017-12-05 20:17

Hi Cristian,

I don't want to have three ways to load certificates, especially when it involves more C code.

I think this (more C code) is the primary and the only negative point against the current patch. And that seems necessary for the feature specific to OpenSSL.

Not sure if you looked at the latest version (https://github.com/python/cpython/pull/2449/files) recently.

The current patch does not deviate in-principle from the PEP 543.

It maintains the same API arguments SSLContext.load_cert_chain(certfile, keyfile=None, password=None)

These are the benefits in my opinion.

PEP-543 is important and seems like a major effort. The current patch might still be valuable and perhaps might be useful towards PEP-543 implementation. It deals only with certificates only.

msg309841 - (view)

Author: Christian Heimes (christian.heimes) * (Python committer)

Date: 2018-01-12 09:25

Senthil,

I'm not a fan of PR 2449 because it provides yet another way to load certificates and keys from memory. It's a clever idea to use MemoryBIO here. But the approach is not compatible with PEP 543. The PEP requires an API that can turn a memory blob into a series of certificate objects. PR 2449 doesn't enable memory -> certificate. The new API is OpenSSL specific and restricted to PKCS#8 key in PEM/DER encoding. PEP 543 is extensible for PKCS#11, engine and HSM support. PR 2449 is not.

There are other issues with PR 2449. For example it's missing several GIL releases calls. The password callback doesn't look correct in some places.

https://github.com/python/cpython/pull/5162 provides a PEP 543-compatible implementation (plus additions from pending PR):

import ssl

chain = ssl.Certificate.chain_from_file("Lib/test/ssl_cert.pem") cas = ssl.Certificate.bundle_from_file("Lib/test/pycacert.pem") pkey = ssl.PrivateKey.from_file("Lib/test/ssl_key.passwd.pem") Traceback (most recent call last): File "", line 1, in ssl.SSLError: [PEM: BAD_PASSWORD_READ] bad password read (_ssl.c:58) pkey = ssl.PrivateKey.from_file("Lib/test/ssl_key.passwd.pem", password="somepass")

chain (<_ssl.Certificate '/C=XY/L=Castle Anthrax/O=Python Software Foundation/CN=localhost'>,) cas [<_ssl.Certificate '/C=XY/O=Python Software Foundation CA/CN=our-ca-server'>]

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER) ctx.load_cert_chain(chain, pkey) ctx.load_verify_locations(cadata=cas)

get_ca_certs() doesn't return ssl.Certificates yet

ctx.get_ca_certs() [{'subject': ((('countryName', 'XY'),), (('organizationName', 'Python Software Foundation CA'),), (('commonName', 'our-ca-server'),)), 'issuer': ((('countryName', 'XY'),), (('organizationName', 'Python Software Foundation CA'),), (('commonName', 'our-ca-server'),)), 'version': 3, 'serialNumber': 'B09264B1F2DA21D0', 'notBefore': 'Jan 4 19:47:07 2013 GMT', 'notAfter': 'Jan 2 19:47:07 2023 GMT'}]

msg324686 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2018-09-06 12:00

Thanks for the discussion.

Since I tried to join the efforts here in 2016 two years ago I was (and still am) enthusiastic, and willing to invest quite a bit of energy. Still, we have missed the 3.6 and 3.7 releases to change something about the fact that there is no simple way to load certificate data from memory rather than files.

I really appreciate the work that as been done on https://www.python.org/dev/peps/pep-0543/. It's a truly great effort. But I think the fact that we did not land a solution since 2012 is somewhat an indicator showing that building smaller chunks and showing the willingness to iterate on a solution would have led to a more tangible outcome, something that I could use right now when building Python applications (how great would that be?). Honestly, the small API changes that were discussed here for years and that I think I implemented do not seem problematic to me from a maintenance and deprecation perspective. Yes, PEP 543 proposes a more idealistic way of doing things. But I do not see how implementing PEP 543 would have been slowed down or negatively affected in any way by the last patch I proposed.

https://github.com/python/cpython/pull/5162 looks very big, and C is not as much my comfort zone as is Python. Still, please let me know if you would like me to contribute!

By the way, I would have really liked to see the following feedback around the first time I submitted the patch:

For example it's missing several GIL releases calls. The password callback doesn't look correct in some places.

I remember that this aspect of writing the patch consumed quite a bit of my time and I operated with care, knowing the risks. I very much would have appreciated specific feedback on this. To learn, and to actually address problems.

Jan-Philip

msg362692 - (view)

Author: Lord Anton Hvornum (Lord Anton Hvornum)

Date: 2020-02-26 14:26

I agree with Jan-Philip Gehrcke, would have been nice to have had this.

Pretty pissed reading through this ancient issue.

In regards to how people treat other volunteers: We're all working for free, and I think expectations from people with authoritative powers must have room for accepting a initially lower contribution standard than they themselves claim to produce. Mainly to not crush the will to help and contibute, but also because the attitude could be a lot better towards fellow programmers. It would also allow for a more iterative process in achieving higher standards. Otherwise you limit yourself to a very select few in the contribution process. Not to mention a lot of "grunt work" could be done by people having an idea, but not nessecarily the same "academic background" in terms of producing perfect code.

I've seen this across numerous Open Source projects, and quite frankly it deter me from ever contributing. PoC code shouldn't be seen as garbage, but a first step in creating new fresh things. Someone with more knowledge should and could step in and clean up things that are deamed "unfit" for production or help said contributor to understanding why certain things needs to be improved, not just blatently say patches needs to be improved.

This whole "It ain't perfect, we don't touch it" is elitism at it's finest. Stop it, and have a meaningful discussion instead.

msg363074 - (view)

Author: Dr. Jan-Philip Gehrcke (jgehrcke) *

Date: 2020-03-01 18:10

I am not too attached to "my" patch, but because I love Python I really would like us to land on a solution.

However I want all changes and new additions to the SSL module to follow PEP 543 so I can provide a PEP 543-compatible interface in the near future (3.8 or 3.9).

Christian, I wonder: did we make progress on this front? Will the stdlib in the 3.9 release make it easy to load cert and key material from memory?

Mainly to not crush the will to help and contribute

Thanks Anton for your words, truly appreciated. I do not want to sound too negative, but I would like to confirm that emotionally this issue here was a bit of an unpleasant experience for me, and it certainly inhibited my intentions, energy, motivation to (try to) contribute more. Thanks, though, to all supporters here.

msg363101 - (view)

Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer)

Date: 2020-03-01 22:19

I gave up contributing a long time ago now because it was too emotionally exhausting to me. This issue was one that helped tip the scales. I hope things have become easier now because good projects like Python need the enthusiasm and spirit of volunteer contributors. Good luck.