WiFiServerSecure: Cache SSL sessions by ZakCodes · Pull Request #7774 · esp8266/Arduino
WiFiClientSecure::setSession allows users to use a BearSSL feature that caches the SSL session on the client side, so that a client can resume its session with a server. BearSSL also supports caching SSL sessions on the server side, so I've created the method WiFiServerSecure::setCache, which lets the user set up a cache that allows BearSSL to resume clients' SSL sessions and greatly shorten TLS handshakes.
Here are the steps that I followed when implementing this feature:
- Implement the feature in the ESP8266WiFi library
- Add the new classes and methods to the ESP8266WiFi library's keywords.txt file
- Use the feature in an example in the ESP8266WiFi library (I've chosen the BearSSL_Server example)
- Use the feature in an example in the ESP8266WebServer library (I've chosen the HelloServerBearSSL example)
- Document the feature in the rst docs
If you want to test this feature, I encourage you to use these examples and to enable and disable the cache to see the performance improvements. In order to reset the server's cache, you simply have to reset the microcontroller.
Testing the examples
Here's a Ruby script that I've written to measure the performance improvement that this PR brings. It makes 100 requests using a new SSL session each time and 100 more reusing the same session.
    #!/bin/env ruby

    require 'net/http'
    require 'benchmark'

    DOMAIN = "<your controller's IP>"
    TIMES = 100

    TEST_URI = URI("https://#{DOMAIN}/")

    def start_session()
      http = Net::HTTP.new(TEST_URI.host, TEST_URI.port)
      http.use_ssl = true
      # Allow self signed certificates
      http.verify_mode = OpenSSL::SSL::VERIFY_NONE
      return http
    end

    request = Net::HTTP::Get.new(TEST_URI)
    request["Connection"] = "close"

    Benchmark.bm(20) do |bm|
      http = start_session()
      bm.report("don't reuse session:") {
        TIMES.times do |_|
          http = start_session()
          http.request(request)
        end
      }

      # The cached session of the last request is used.
      # Otherwise this would massively slow down the first request
      # when we're trying to test the improvement of cached sessions.
      bm.report("reuse session:") {
        TIMES.times do |_|
          response = http.request(request) # Reuse the cached session.
        end
      }
    end
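The benchmark only measures time, so it can be useful to confirm separately that a second connection really resumes the first one's session. The following is a minimal sketch that is not part of the PR: the helper name tls_handshake and the default port 443 are assumptions made for illustration. It drives the handshake through Ruby's openssl library and caps the protocol at TLS 1.2, the highest version BearSSL supports.

```ruby
require 'socket'
require 'openssl'

# Hypothetical helper (not part of the PR): performs one TLS handshake and
# returns the negotiated session plus whether the offered session was resumed.
def tls_handshake(host, port = 443, session = nil)
  context = OpenSSL::SSL::SSLContext.new
  context.verify_mode = OpenSSL::SSL::VERIFY_NONE # allow self-signed certificates
  context.max_version = OpenSSL::SSL::TLS1_2_VERSION

  tcp = TCPSocket.new(host, port)
  ssl = OpenSSL::SSL::SSLSocket.new(tcp, context)
  ssl.sync_close = true            # closing the TLS socket also closes the TCP socket
  ssl.session = session if session # offer a previously saved session
  ssl.connect
  [ssl.session, ssl.session_reused?]
ensure
  ssl ? ssl.close : tcp&.close
end

# Example usage (needs a reachable server, so it is left commented out):
#   session, _ = tls_handshake("<your controller's IP>")
#   _, reused  = tls_handshake("<your controller's IP>", 443, session)
#   puts "session resumed: #{reused}"
```

With the server cache enabled, the second call would be expected to report a resumed session; with the cache disabled, it should fall back to a full handshake.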
Results
I've used this script to test the BearSSL_Server and HelloServerBearSSL examples in three configurations: before this PR, after this PR with the cache disabled, and after this PR with the cache enabled. For BearSSL_Server, I ran the test once with the RSA key and once with the EC key.
BearSSL_Server with the RSA key
Before the PR
                              user     system      total        real
    don't reuse session:  0.288973   0.082934   0.371907 (183.731719)
    reuse session:        0.285080   0.091825   0.376905 (184.243461)
After the PR without caching
                              user     system      total        real
    don't reuse session:  0.367846   0.071787   0.439633 (183.834841)
    reuse session:        0.340344   0.089750   0.430094 (184.414041)
After the PR with caching
                              user     system      total        real
    don't reuse session:  0.312145   0.102594   0.414739 (184.679486)
    reuse session:        0.204378   0.063052   0.267430 (  6.986146)
Summary
Configuration | Don't reuse the session (s) | Reuse the session (s) | Improvement |
---|---|---|---|
Before the PR | 183.731719 | 184.243461 | 0.997 |
After the PR (without caching) | 183.834841 | 184.414041 | 0.997 |
After the PR (with caching) | 184.679486 | 6.986146 | 26.435 |
The improvement ratio is the "Don't reuse the session" time divided by the "Reuse the session" time.
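The improvement column can be reproduced directly from the real times. Here is a minimal sketch using the RSA figures reported above; the constant and method names are chosen here purely for illustration.

```ruby
# Wall-clock times in seconds, copied from the "real" column of the RSA runs:
# [don't reuse the session, reuse the session].
RSA_RUNS = {
  "Before the PR"                  => [183.731719, 184.243461],
  "After the PR (without caching)" => [183.834841, 184.414041],
  "After the PR (with caching)"    => [184.679486, 6.986146],
}

# Improvement = time without session reuse / time with session reuse.
def improvement(no_reuse_s, reuse_s)
  (no_reuse_s / reuse_s).round(3)
end

RSA_RUNS.each do |label, (no_reuse_s, reuse_s)|
  puts format("%-32s %.3f", label, improvement(no_reuse_s, reuse_s))
end
```

A ratio close to 1 means that reusing the session made no difference; only the run with caching enabled departs from it.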
BearSSL_Server with the EC key
Before the PR
                              user     system      total        real
    don't reuse session:  0.302631   0.104204   0.406835 ( 35.997831)
    reuse session:        0.305480   0.119427   0.424907 ( 36.882044)
After the PR without caching
                              user     system      total        real
    don't reuse session:  0.339462   0.088985   0.428447 ( 36.511242)
    reuse session:        0.320082   0.097038   0.417120 ( 37.105757)
After the PR with caching
                              user     system      total        real
    don't reuse session:  0.338105   0.107970   0.446075 ( 36.216895)
    reuse session:        0.224127   0.066317   0.290444 (  5.896962)
Summary
Configuration | Don't reuse the session (s) | Reuse the session (s) | Improvement |
---|---|---|---|
Before the PR | 35.997831 | 36.882044 | 0.976 |
After the PR (without caching) | 36.511242 | 37.105757 | 0.984 |
After the PR (with caching) | 36.216895 | 5.896962 | 6.142 |
The improvement ratio is the "Don't reuse the session" time divided by the "Reuse the session" time.
HelloServerBearSSL
Before the PR
                              user     system      total        real
    don't reuse session:  0.304455   0.081010   0.385465 (184.723887)
    reuse session:        0.321123   0.100000   0.421123 (184.567364)
After the PR without caching
                              user     system      total        real
    don't reuse session:  0.344634   0.115067   0.459701 (187.112183)
    reuse session:        0.371673   0.073724   0.445397 (185.084550)
After the PR with caching
                              user     system      total        real
    don't reuse session:  0.345185   0.102189   0.447374 (185.611363)
    reuse session:        0.221588   0.078481   0.300069 (  7.925369)
Summary
Configuration | Don't reuse the session (s) | Reuse the session (s) | Improvement |
---|---|---|---|
Before the PR | 184.723887 | 184.567364 | 1.001 |
After the PR (without caching) | 187.112183 | 185.084550 | 1.011 |
After the PR (with caching) | 185.611363 | 7.925369 | 23.420 |
The improvement ratio is the "Don't reuse the session" time divided by the "Reuse the session" time.
Analysis
Those numbers show that this PR makes the HTTPS requests about 25x faster with an RSA key and 6x with an EC key when caching is enabled. When caching isn't enabled, this PR doesn't seem to negatively affect performance at all.
We can see that BearSSL_Server is faster than HelloServerBearSSL and that its improvement is greater. This PR only improves the TLS handshake, so the longer the server takes to parse the request and create a response, the smaller the overall improvement. HelloServerBearSSL implements a full web server, which is slower than BearSSL_Server, which answers every request with the same response without parsing it.
Note that in the part of the script that reuses the session, the first request of the session isn't timed, because this PR doesn't improve it.
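The way request processing time dilutes the gain can be sketched with a simple model: if a request costs a handshake plus some processing, and only the handshake gets cheaper, the overall speedup shrinks as the processing share grows. The handshake costs below (1700 ms full, 40 ms resumed) and the processing times are hypothetical values picked for illustration, not measurements from this PR.

```ruby
# Speedup of a whole request when only the handshake gets faster.
# All arguments are in milliseconds.
def request_speedup(full_handshake_ms, resumed_handshake_ms, processing_ms)
  (full_handshake_ms + processing_ms).to_f / (resumed_handshake_ms + processing_ms)
end

# Hypothetical handshake costs: 1700 ms for a full handshake, 40 ms when resumed.
[0, 50, 200, 1000].each do |processing_ms|
  speedup = request_speedup(1700, 40, processing_ms)
  puts format("processing %4d ms -> %.1fx faster", processing_ms, speedup)
end
```

Under this model, a server that spends more time building each response sees a smaller ratio even though the absolute time saved per request is the same.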
Testing the TLS handshake improvement
The previous test measured the speed improvement for the full HTTP request, but this PR should only improve the TLS handshake.
To see this, I've tested BearSSL_Server with an RSA key and an EC key using Firefox's network timing analyzer.
This test is a little less rigorous because I didn't run it 100 times like the others, but it lets us see the improvement for each part of the request.
RSA key
[Screenshots of Firefox's network timings: before the PR, after the PR without caching, after the PR with caching]
Summary
Configuration | Connection (ms) | Connection improvement | TLS setup (ms) | TLS setup improvement | Waiting (ms) | Waiting improvement | Total without connection (ms) | Total improvement |
---|---|---|---|---|---|---|---|---|
Before the PR | 6 | 1 | 1730 | 1 | 96 | 1 | 1826 | 1 |
After the PR, without caching | 48 | 0.125 | 1730 | 1 | 92 | 1.043 | 1822 | 1.002 |
After the PR, with caching, not cached | 52 | 0.115 | 1740 | 0.994 | 90 | 1.067 | 1830 | 0.998 |
After the PR, with caching, cached | 3 | 2 | 39 | 44.359 | 15 | 6.4 | 54 | 33.815 |
The improvement is the measure before the PR over the current measure.
EC key
[Screenshots of Firefox's network timings: before the PR, after the PR without caching, after the PR with caching]
Summary
Configuration | Connection (ms) | Connection improvement | TLS setup (ms) | TLS setup improvement | Waiting (ms) | Waiting improvement | Total without connection (ms) | Total improvement |
---|---|---|---|---|---|---|---|---|
Before the PR | 28 | 1 | 309 | 1 | 38 | 1 | 347 | 1 |
After the PR, without caching | 2 | 14 | 312 | 0.99 | 39 | 0.974 | 351 | 0.989 |
After the PR, with caching, not cached | 41 | 0.683 | 309 | 1 | 44 | 0.864 | 353 | 0.983 |
After the PR, with caching, cached | 3 | 9.333 | 84 | 3.679 | 13 | 2.923 | 97 | 3.577 |
The improvement is the measure before the PR over the current measure.
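The derived columns of this table follow from the raw timings: the "Total without connection" values are the TLS setup time plus the waiting time, and each improvement is the baseline divided by the current measure. A small sketch of that arithmetic using the EC key figures, with names chosen here for illustration:

```ruby
# [TLS setup, waiting] in milliseconds, copied from the EC key table.
EC_TIMINGS = {
  "Before the PR"                          => [309, 38],
  "After the PR, without caching"          => [312, 39],
  "After the PR, with caching, not cached" => [309, 44],
  "After the PR, with caching, cached"     => [84, 13],
}

# "Total without connection" is TLS setup plus waiting.
def total_ms(tls_setup_ms, waiting_ms)
  tls_setup_ms + waiting_ms
end

# Improvement = measure before the PR / current measure.
def improvement_over_baseline(baseline_ms, current_ms)
  (baseline_ms.to_f / current_ms).round(3)
end

baseline_total = total_ms(*EC_TIMINGS["Before the PR"])
EC_TIMINGS.each do |label, timings|
  total = total_ms(*timings)
  ratio = improvement_over_baseline(baseline_total, total)
  puts format("%-40s %4d ms  %.3f", label, total, ratio)
end
```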
Analysis
These numbers show the same thing as the previous tests: this PR greatly improves the speed of cached requests and doesn't have any noticeable downside.
However, this test clearly shows how much of a bottleneck the TLS handshake is without this PR and how much it improves when the server caches the clients' sessions.
Somehow it also slightly improves the waiting time for the server response. I don't think it means that the server decrypts the request faster or processes it faster in any way. I simply think that this is because when resuming cached sessions the client ends the handshake instead of the server. This means that the client can start sending the application data at the same time as it sends the TLS record to end the handshake. This would therefore reduce the time the client has to wait for the server response.
You can see this on pages 35 and 36 of the TLS 1.2 standard (RFC 5246):
      Client                                               Server

      ClientHello                  -------->
                                                      ServerHello
                                                     Certificate*
                                               ServerKeyExchange*
                                              CertificateRequest*
                                   <--------      ServerHelloDone
      Certificate*
      ClientKeyExchange
      CertificateVerify*
      [ChangeCipherSpec]
      Finished                     -------->
                                               [ChangeCipherSpec]
                                   <--------             Finished
      Application Data             <------->     Application Data

             Figure 1. Message flow for a full handshake
      Client                                                Server

      ClientHello                   -------->
                                                       ServerHello
                                                [ChangeCipherSpec]
                                    <--------             Finished
      [ChangeCipherSpec]
      Finished                      -------->
      Application Data              <------->     Application Data

          Figure 2. Message flow for an abbreviated handshake
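To make the difference between the two figures concrete, the flows can be encoded as data and compared. This is only an illustrative sketch of the message order shown above, not BearSSL code; the constant and method names are made up for the example.

```ruby
# Each entry is one flight: [sender, messages], in the order shown in the figures.
# Messages marked with * are situation-dependent and not always sent.
FULL_HANDSHAKE = [
  [:client, ["ClientHello"]],
  [:server, ["ServerHello", "Certificate*", "ServerKeyExchange*",
             "CertificateRequest*", "ServerHelloDone"]],
  [:client, ["Certificate*", "ClientKeyExchange", "CertificateVerify*",
             "[ChangeCipherSpec]", "Finished"]],
  [:server, ["[ChangeCipherSpec]", "Finished"]],
]

ABBREVIATED_HANDSHAKE = [
  [:client, ["ClientHello"]],
  [:server, ["ServerHello", "[ChangeCipherSpec]", "Finished"]],
  [:client, ["[ChangeCipherSpec]", "Finished"]],
]

# The side that sends the final Finished message.
def last_sender(flights)
  flights.last.first
end

{ "full" => FULL_HANDSHAKE, "abbreviated" => ABBREVIATED_HANDSHAKE }.each do |name, flights|
  puts "#{name}: #{flights.length} flights, #{last_sender(flights)} sends the last Finished"
end
```

The abbreviated flow has one fewer flight and no key exchange or certificate messages, and it ends on the client side, so the client is free to send application data immediately after its own Finished.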
Conclusion
This PR doesn't seem to have any negative performance impact, only positive ones. Once enabled by the user, it makes requests on cached sessions roughly 6 times (EC key) to 25 times (RSA key) faster in the tests above. The slower the handshake for the chosen key type, the more this feature boosts performance. However, users need to be aware that the TLS client they're using also needs to cache the session in order to experience this performance boost. This is why it is clearly mentioned in the documentation.