Issue 16134: Add support for RTMP schemes to urlparse

Please add support in urlparse for the family of RTMP schemes: rtmp, rtmpe, rtmps, rtmpt.

I believe these schemes should be added to the following module variables: uses_relative, uses_netloc, uses_params, uses_query [essentially, the ones where rtsp already is].

The RTMP spec is hosted at http://www.adobe.com/devnet/rtmp.html, which describes the format as "protocol://servername:port/". The example provided there is rtmp://localhost:1935/test

An example YouTube RTMP service URL looks like: rtmp://a.rtmp.youtube.com/videolive?ns=yt-live&id=123456&itag=35&signature=blahblahblah

Please let me know if further information is required.

Thanks!

========================================
Footnote:

A full YouTube RTMP stream URL may look like this:

rtmp://a.rtmp.youtube.com/videolive?ns=yt-live&id=123456&itag=35&signature=blahblahblah/yt-live.123456.35

i.e. it is the stream service url suffixed with '/' + the_stream_name.

When one uses urlparse (extended with the 'rtmp' scheme), the stream name part gets lumped in with the last query value. I think it's reasonable to expect the user of the urlparse library to strip the stream name off, leaving just the service URL, which can be parsed normally. However, if urlparse could handle this sort of use case generically, that would be great.
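A minimal sketch of stripping the stream name before parsing, written for Python 3 (urllib.parse). The helper name split_stream_url is hypothetical, and it assumes the stream name itself contains no '/', so that the last '/' in the string is the separator described above.

```python
from urllib.parse import urlparse, parse_qs


def split_stream_url(stream_url):
    """Split 'service_url/stream_name' at the last '/'.

    Hypothetical helper: assumes the stream name contains no '/'.
    """
    service_url, _, stream_name = stream_url.rpartition("/")
    return service_url, stream_name


full_url = (
    "rtmp://a.rtmp.youtube.com/videolive"
    "?ns=yt-live&id=123456&itag=35&signature=blahblahblah"
    "/yt-live.123456.35"
)

service_url, stream_name = split_stream_url(full_url)
parts = urlparse(service_url)   # parse only the service URL
query = parse_qs(parts.query)   # query values no longer carry the stream name

print(stream_name)              # yt-live.123456.35
print(query["signature"])       # ['blahblahblah']
```

Splitting on the last '/' keeps the workaround outside urlparse, which matches the suggestion that the caller strips the stream name off.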

Personally, I want to do away with all of that scheme-specific stuff, if we can. I have tried previously, but failed due to some backwards incompatibility. 3.4 gives us a good chance, and enough time, to make those changes and get rid of the scheme-specific handling (again).

So, instead of adding the rtmp* schemes to the various categories, I would like to see if we can find a way out.

- another related one which.

Also, Jorge Gomes: if you care about the 2.7 version only, the way I have seen this issue handled in production is to extend the relevant uses_* lists (uses_netloc in the example here) with the protocols that you want to support. Like

from urlparse import uses_netloc
uses_netloc.extend(['rtmp','rtmpe'])

2.7.x is in bugfix mode, and this change may not be considered a bug fix that would find its place in 2.7.x.
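For reference, a sketch of the same workaround on Python 3, where the lists live in urllib.parse. The helper name register_schemes is hypothetical. It only touches uses_relative and uses_netloc, and it skips schemes that are already present so that calling it twice does not add duplicates.

```python
from urllib.parse import urljoin, uses_netloc, uses_relative

RTMP_SCHEMES = ["rtmp", "rtmpe", "rtmps", "rtmpt"]


def register_schemes(schemes):
    """Append each scheme to the module-level lists if it is missing.

    Hypothetical helper: mutates urllib.parse globals for the whole process.
    """
    for registry in (uses_relative, uses_netloc):
        for scheme in schemes:
            if scheme not in registry:
                registry.append(scheme)


register_schemes(RTMP_SCHEMES)
register_schemes(RTMP_SCHEMES)  # second call is a no-op

# With the schemes registered, relative references resolve against the base.
joined = urljoin("rtmp://localhost:1935/test/", "stream")
print(joined)  # rtmp://localhost:1935/test/stream
```

Because this changes module globals, it affects every caller of urllib.parse in the same interpreter, which is one reason a generic fix without whitelists is attractive.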

Looks like Issue 9374 already covers most of this, with fixes in 2.7, 3.2 and 3.3.

$ python3.3
Python 3.3.2 (default, May 16 2013, 23:40:52)
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> urlparse("protocol://servername:port/")
ParseResult(scheme='protocol', netloc='servername:port', path='/', params='', query='', fragment='')
>>> urlparse("rtmp://a.rtmp.youtube.com/videolive?ns=yt-live&id=123456&itag=35&signature=blahblahblah/yt-live.123456.35")
ParseResult(scheme='rtmp', netloc='a.rtmp.youtube.com', path='/videolive', params='', query='ns=yt-live&id=123456&itag=35&signature=blahblahblah/yt-live.123456.35', fragment='')

Now there are only the three unresolved aspects listed below, as I see it. Personally, I think the first, for urljoin(), should be fixed (hopefully in a generic way without whitelists). I mentioned this in Issue 18828. I wonder if the last two really matter?