## December 27, 2018

### Python urllib2 and TLS

I was thinking about dropping support for TLSv1.0 in this webserver. All the major browser vendors have announced that they are dropping it from their browsers. And you’d think that since TLSv1.2 has been around for a decade, even very old clients ought to be able to negotiate a TLSv1.2 connection.

But when I checked, I was surprised to find that this webserver receives a ton of TLSv1 connections… including from the application that powers Planet Musings. Yikes!

The latter is built around the Universal Feed Parser, which uses the standard Python urllib2 to negotiate the connection. And therein lay the problem …

At least in its default configuration, urllib2 won’t negotiate anything higher than a TLSv1.0 connection. And, sure enough, that’s a problem:

    ERROR:planet.runner:Error processing http://excursionset.com/blog?format=RSS
    ...
    ERROR:planet.runner:Error processing https://www.scottaaronson.com/blog/?feed=atom
    ERROR:planet.runner:URLError: <urlopen error [Errno 54] Connection reset by peer>
    ...
    ERROR:planet.runner:Error processing https://www.science20.com/quantum_diaries_survivor/feed
    ERROR:planet.runner:URLError: <urlopen error EOF occurred in violation of protocol (_ssl.c:590)>

Even if I’m still supporting TLSv1.0, others have already dropped support for it.
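For comparison, here is a minimal sketch of a client that refuses anything older than TLSv1.2. It assumes Python 3.7 or later (where `urllib2` has become `urllib.request` and `SSLContext.minimum_version` is available); the helper name `make_client_context` is made up for illustration and is not part of feedparser or Venus.

```python
import ssl

def make_client_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Return a verifying client context that refuses protocols older than `minimum`."""
    # create_default_context() enables certificate and hostname checking.
    context = ssl.create_default_context()
    # Set the lowest protocol version this client is willing to negotiate.
    context.minimum_version = minimum
    return context

context = make_client_context()
print(context.minimum_version.name)
```

A server that only speaks TLSv1.0 would fail the handshake with such a client, which is the mirror image of the errors above.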

Now, you might find it strange that urllib2 defaults to a TLSv1.0 connection, when it’s certainly capable of negotiating something more secure (whatever OpenSSL supports). But, prior to Python 2.7.9, urllib2 didn’t even check the server’s SSL certificate. Any encryption was bogus (wide open to a MiTM attack). So why bother negotiating a more secure connection?

Switching from the system Python to Python 2.7.15 (installed by Fink) yielded a slew of

    ERROR:planet.runner:URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>

errors. Apparently, no root certificate file was getting loaded.
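One way to investigate a missing root certificate file is to ask the `ssl` module where its OpenSSL build looks for roots. The following is a rough sketch, assuming Python 3; the function name `describe_default_roots` and the dictionary keys are invented for illustration.

```python
import os
import ssl

def describe_default_roots():
    """Report where this Python's OpenSSL looks for root certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # None if the default file does not exist
        "capath": paths.capath,                  # None if the default directory does not exist
        "openssl_cafile": paths.openssl_cafile,  # compiled-in default file location
        "openssl_capath": paths.openssl_capath,  # compiled-in default directory location
        # Value of the environment variable that can override the default file.
        "cafile_env_value": os.environ.get(paths.openssl_cafile_env),
    }

for name, value in describe_default_roots().items():
    print(name, "=", value)
```

If `cafile` and `capath` both come back as `None`, the interpreter has no default roots to load, which would be consistent with a blanket `CERTIFICATE_VERIFY_FAILED`.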

The solution to both of these problems turned out to be:

    --- a/feedparser/http.py
    +++ b/feedparser/http.py
    @@ -5,13 +5,15 @@ import gzip
     import re
     import struct
     import zlib
    +import ssl
    +import certifi
     
     try:
         import urllib.parse
         import urllib.request
     except ImportError:
         from urllib import splithost, splittype, splituser
    -    from urllib2 import build_opener, HTTPDigestAuthHandler, HTTPRedirectHandler, HTTPDefaultErrorHandler, Request
    +    from urllib2 import build_opener, HTTPSHandler, HTTPDigestAuthHandler, HTTPRedirectHandler, HTTPDefaultErrorHandler, Request
         from urlparse import urlparse
     
         class urllib(object):
    @@ -170,7 +172,9 @@ def get(url, etag=None, modified=None, agent=None, referrer=None, handlers=None,
     
         # try to open with urllib2 (to use optional headers)
    -    opener = urllib.request.build_opener(*tuple(handlers + [_FeedURLHandler()]))
    +    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    +    opener = urllib.request.build_opener(*tuple(handlers + [HTTPSHandler(context=context)] + [_FeedURLHandler()]))
         opener.addheaders = [] # RMK - must clear so we only send our custom User-Agent
         f = opener.open(request)
         data = f.read()

Actually, the lines in red (the ones that involve certifi) aren’t strictly necessary. As long as you pass in an ssl.SSLContext(), a suitable set of root certificates gets loaded. But, honestly, I don’t trust the internals of urllib2 to do the right thing anymore, so I want to make sure that a well-curated set of root certificates is used.
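It may be worth confirming that certificate verification is actually switched on, since a bare `ssl.SSLContext(ssl.PROTOCOL_TLS)` starts out with `verify_mode` set to `CERT_NONE`. Below is a minimal sketch of an opener that pins certifi's roots and verifies, assuming Python 3 (`urllib.request` instead of `urllib2`). The helper name `build_verifying_opener` is hypothetical, and the fallback to the system roots when certifi is not installed is an addition for illustration, not something the patch above does.

```python
import ssl
import urllib.request

try:
    import certifi                  # third-party package; assumed to be installed
    CA_FILE = certifi.where()       # path to certifi's curated root bundle
except ImportError:
    CA_FILE = None                  # assumption: fall back to the system's default roots

def build_verifying_opener(*extra_handlers):
    """Return (opener, context) where HTTPS requests verify the server certificate."""
    # create_default_context() sets verify_mode to CERT_REQUIRED and enables
    # hostname checking; with cafile=None it loads the system default roots.
    context = ssl.create_default_context(cafile=CA_FILE)
    https_handler = urllib.request.HTTPSHandler(context=context)
    opener = urllib.request.build_opener(https_handler, *extra_handlers)
    return opener, context

opener, context = build_verifying_opener()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Checking `context.verify_mode` and `context.check_hostname` before trusting a connection is a cheap way to tell "the errors went away because the roots loaded" apart from "the errors went away because nothing is being verified".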

With these changes, Venus negotiates a TLSv1.3 connection. Yay!
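To see which protocol a client actually ends up with, one can ask the wrapped socket after the handshake. This is a small sketch assuming Python 3; the function names and the `example.org` host in the usage comment are placeholders, not anything taken from Venus or feedparser.

```python
import socket
import ssl

def negotiated_tls_version(host, port=443, timeout=10):
    """Connect to host:port and return the protocol name, e.g. 'TLSv1.3'."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # server_hostname enables SNI and hostname verification.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()

def is_modern(version):
    """True for TLSv1.2 and TLSv1.3, False for anything older or unknown."""
    return version in ("TLSv1.2", "TLSv1.3")

# Usage (needs network access; the host name is a placeholder):
#     print(negotiated_tls_version("example.org"))
```

The result depends on both ends: the client's Python and OpenSSL build, and whatever the server is configured to offer.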

Now, if only everyone else would update their Python scripts …

#### Update:

This article goes some of the way towards explaining the brokenness of Python’s TLS implementation on MacOSX. But only some of the way …

#### Update 2:

Another offender turned out to be the very application (MarsEdit 3) that I used to prepare this post. Upgrading to MarsEdit 4 was a bit of a bother. Apple’s App-sandboxing prevented my Markdown+itex2MML text filter from working. One is no longer allowed to use IPC::Open2 to pipe text through the commandline itex2MML. So I had to create a Perl Extension Module for itex2MML. Now there’s a MathML::itex2MML module on CPAN to go along with the Rubygem.

Posted by distler at December 27, 2018 11:28 AM