Issue 34745: asyncio ssl memory leak (original) (raw)

Created on 2018-09-19 22:41 by thehesiod, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
memory_usage.png cnpeterson,2018-12-20 22:32 Memory Usage
Pull Requests
URL Status Linked Edit
PR 12386 merged fantix,2019-03-17 22:18
PR 12387 merged miss-islington,2019-03-17 22:51
PR 12393 closed fantix,2019-03-18 04:39
Messages (8)
msg325811 - (view) Author: Alexander Mohr (thehesiod) * Date: 2018-09-19 22:41
I've been trying to track down a leak in aiohttp: https://github.com/aio-libs/aiohttp/issues/3010 it seems like this leak now occurs with raw asyncio SSL sockets. when the gist script is run like so: python3.7 `which mprof` run --interval=1 ~/dev/test_leak.py -test asyncio_test it slowly leaks memory. This is effectively doing the following: URLS = { 'https://s3.us-west-2.amazonaws.com/archpi.dabase.com/style.css': { 'method': 'get', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025044Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/us-west-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=ae552641b9aa9a7a267fcb4e36960cd5863e55d91c9b45fd39b30fdcd2e81489', 'Accept-Encoding': 'identity'} }, 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist': { 'method': 'GET' if sys.argv[1] == 'get_object' else 'HEAD', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025221Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/ap-southeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=7a7675ef6d70cb647ed59e02d532ffa80d437fb03976d8246ea9ef102d118794', 'Accept-Encoding': 'identity'} } } class HttpClient(asyncio.streams.FlowControlMixin): transport = None def __init__(self, *args, **kwargs): self.__url = kwargs.pop('url') self.__logger = logging.getLogger() super().__init__() def connection_made(self, transport): self.transport = transport url_parts = urlparse(self.__url) entry = URLS[self.__url] body = f'{entry["method"]} {url_parts.path} HTTP/1.1\r\nAccept: */*\r\nHost: {url_parts.hostname}\r\n' for name, value in entry['headers'].items(): body += f'{name}: {value}\r\n' body += '\r\n' self.transport.write(body.encode('ascii')) self.__logger.info(f'data sent: {body}') def data_received(self, data): self.__logger.info(f'data received: {data}') self.transport.close() # asyncio.get_event_loop().call_later(1.0, ) def eof_received(self): self.__logger.info('eof_received') def connection_lost(self, exc): self.__logger.info(f'connection lost: {exc}') super().connection_lost(exc) @classmethod def create_factory(cls, url: str): def factory(*args, **kwargs): return cls(*args, url=url, **kwargs) return factory async def test_asyncio(ssl_context): loop = asyncio.get_event_loop() url = 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist' url_parts = urlparse(url) port = url_parts.port or (80 if url_parts.scheme == 'http' else 443) infos = await loop.getaddrinfo(url_parts.hostname, port, family=socket.AF_INET) family, type, proto, canonname, sockaddr = infos[0] await loop.create_connection(HttpClient.create_factory(url), sockaddr[0], port, ssl=ssl_context, family=family, proto=proto, flags=socket.AI_NUMERICHOST, server_hostname=url_parts.hostname, local_addr=None) async def asyncio_test(): ssl_context = ssl.create_default_context() while True: await test_asyncio(ssl_context) await asyncio_test()
msg325813 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-09-19 22:44
What is "raw asyncio SSL sockets"? We don't use SSL sockets in asyncio, we use SSL Memory BIO. Do you think that some SSL context objects aren't being properly released? BTW, can you see the leak when run under uvloop?
msg325817 - (view) Author: Alexander Mohr (thehesiod) * Date: 2018-09-19 23:10
sorry, by "raw" I mean in the context of aiohttp, so just using the normal python ssl context and asyncio sockets. I don't think it's an object not getting GC'd because I didn't see any increase on object counts, nor leaks per tracemalloc. I think it's some low-level native memory leaked by openssl. I've updated the gist w/ uvloop and ran with it and still get a leak, see gist + aiohttp issue for plot
msg325819 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-09-19 23:13
Would you be able to test uvloop/master branch? Current uvloop 0.11.x uses pretty much the asyncio implementation; the master branch has a completely rewritten SSL layer. If the master branch has the leak it might mean that the root cause is indeed in either the ssl module or openssl itself.
msg332273 - (view) Author: Chase Peterson (cnpeterson) * Date: 2018-12-20 22:32
I ran the code snippet below using uvloop/master in a docker container. As it ran, the container continually leaked memory. I included a graph with the memory usage. Environment: # cat /etc/*-release PRETTY_NAME="Debian GNU/Linux 9 (stretch)" NAME="Debian GNU/Linux" VERSION_ID="9" VERSION="9 (stretch)" ID=debian HOME_URL="https://www.debian.org/" SUPPORT_URL="https://www.debian.org/support" BUG_REPORT_URL="https://bugs.debian.org/" # uname -r 4.18.16-300.fc29.x86_64 # python -V Python 3.7.1 # pip freeze asyncio==3.4.3 Cython==0.29.2 idna==2.8 multidict==4.5.2 uvloop==0.12.0rc2 yarl==1.3.0 I had to tweak the code a bit to run in a docker container successfully, but here is the code I used: import asyncio import logging import ssl import socket import sys import yarl import uvloop CREDENTIAL = '' URLS = { 'https://s3.us-west-2.amazonaws.com/archpi.dabase.com/style.css': { 'method': 'get', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025044Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/us-west-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=ae552641b9aa9a7a267fcb4e36960cd5863e55d91c9b45fd39b30fdcd2e81489', 'Accept-Encoding': 'identity'} }, 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist': { 'method': 'GET' if sys.argv[1] == 'get_object' else 'HEAD', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025221Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/ap-southeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=7a7675ef6d70cb647ed59e02d532ffa80d437fb03976d8246ea9ef102d118794', 'Accept-Encoding': 'identity'} } } asyncio.set_event_loop_policy(uvloop.EventLoopPolicy()) class HttpClient(asyncio.streams.FlowControlMixin): transport = None def __init__(self, *args, **kwargs): self.__url = kwargs.pop('url') self.__logger = logging.getLogger() super().__init__() def connection_made(self, transport): self.transport = transport url_parts = yarl.URL(self.__url) entry = URLS[self.__url] body = f'{entry["method"]} {url_parts.path} HTTP/1.1\r\nAccept: */*\r\nHost: {url_parts.host}\r\n' for name, value in entry['headers'].items(): body += f'{name}: {value}\r\n' body += '\r\n' self.transport.write(body.encode('ascii')) self.__logger.info(f'data sent: {body}') def data_received(self, data): self.__logger.info(f'data received: {data}') self.transport.close() # asyncio.get_event_loop().call_later(1.0, ) def eof_received(self): self.__logger.info('eof_received') def connection_lost(self, exc): self.__logger.info(f'connection lost: {exc}') super().connection_lost(exc) @classmethod def create_factory(cls, url: str): def factory(*args, **kwargs): return cls(*args, url=url, **kwargs) return factory async def test_asyncio(ssl_context): loop = asyncio.get_event_loop() url = 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist' url_parts = yarl.URL(url) port = url_parts.port or (80 if url_parts.scheme == 'http' else 443) infos = await loop.getaddrinfo(url_parts.host, port, family=socket.AF_INET) family, type, proto, canonname, sockaddr = infos[0] await loop.create_connection(HttpClient.create_factory(url), sockaddr[0], port, ssl=ssl_context, family=family, proto=proto, flags=socket.AI_NUMERICHOST, server_hostname=url_parts.host, local_addr=None) async def asyncio_test(): ssl_context = ssl.create_default_context() while True: await test_asyncio(ssl_context) def main(): print('running') loop = asyncio.get_event_loop() loop.run_until_complete(asyncio_test()) main()
msg338145 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-03-17 22:51
New changeset f683f464259715d620777d7ed568e701337a703a by Yury Selivanov (Fantix King) in branch 'master': bpo-34745: Fix asyncio sslproto memory issues (GH-12386) https://github.com/python/cpython/commit/f683f464259715d620777d7ed568e701337a703a
msg338146 - (view) Author: miss-islington (miss-islington) Date: 2019-03-17 23:09
New changeset 7f7485c0605fe979e39c58b688f2bb6a4f43e65e by Miss Islington (bot) in branch '3.7': bpo-34745: Fix asyncio sslproto memory issues (GH-12386) https://github.com/python/cpython/commit/7f7485c0605fe979e39c58b688f2bb6a4f43e65e
msg338784 - (view) Author: Alexander Mohr (thehesiod) * Date: 2019-03-25 06:10
going to close, I've verified that it fixes my original issue, ty!!
History
Date User Action Args
2022-04-11 14:59:06 admin set github: 78926
2019-03-25 06:10:45 thehesiod set status: open -> closedresolution: fixedmessages: + stage: patch review -> resolved
2019-03-18 04:39:31 fantix set pull_requests: + <pull%5Frequest12348>
2019-03-17 23:09:17 miss-islington set nosy: + miss-islingtonmessages: +
2019-03-17 22:51:37 miss-islington set pull_requests: + <pull%5Frequest12344>
2019-03-17 22:51:15 yselivanov set messages: +
2019-03-17 22🔞29 fantix set keywords: + patchstage: patch reviewpull_requests: + <pull%5Frequest12343>
2018-12-20 22:34:52 thehesiod set versions: + Python 3.6
2018-12-20 22:32:04 cnpeterson set files: + memory_usage.pngnosy: + cnpetersonmessages: +
2018-09-30 03:57:02 fantix set nosy: + fantix
2018-09-19 23:13:32 yselivanov set messages: +
2018-09-19 23:10:57 thehesiod set messages: +
2018-09-19 22:44:28 yselivanov set messages: +
2018-09-19 22:41:00 thehesiod create