msg325811 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2018-09-19 22:41 |
I've been trying to track down a leak in aiohttp: https://github.com/aio-libs/aiohttp/issues/3010 it seems like this leak now occurs with raw asyncio SSL sockets. when the gist script is run like so: python3.7 `which mprof` run --interval=1 ~/dev/test_leak.py -test asyncio_test it slowly leaks memory. This is effectively doing the following: URLS = { 'https://s3.us-west-2.amazonaws.com/archpi.dabase.com/style.css': { 'method': 'get', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025044Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/us-west-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=ae552641b9aa9a7a267fcb4e36960cd5863e55d91c9b45fd39b30fdcd2e81489', 'Accept-Encoding': 'identity'} }, 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist': { 'method': 'GET' if sys.argv[1] == 'get_object' else 'HEAD', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025221Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/ap-southeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=7a7675ef6d70cb647ed59e02d532ffa80d437fb03976d8246ea9ef102d118794', 'Accept-Encoding': 'identity'} } } class HttpClient(asyncio.streams.FlowControlMixin): transport = None def __init__(self, *args, **kwargs): self.__url = kwargs.pop('url') self.__logger = logging.getLogger() super().__init__() def connection_made(self, transport): self.transport = transport url_parts = urlparse(self.__url) entry = URLS[self.__url] body = f'{entry["method"]} {url_parts.path} HTTP/1.1\r\nAccept: */*\r\nHost: {url_parts.hostname}\r\n' for name, value in entry['headers'].items(): body += f'{name}: {value}\r\n' body += '\r\n' self.transport.write(body.encode('ascii')) self.__logger.info(f'data sent: {body}') def data_received(self, data): self.__logger.info(f'data received: {data}') self.transport.close() # asyncio.get_event_loop().call_later(1.0, ) def eof_received(self): self.__logger.info('eof_received') def connection_lost(self, exc): self.__logger.info(f'connection lost: {exc}') super().connection_lost(exc) @classmethod def create_factory(cls, url: str): def factory(*args, **kwargs): return cls(*args, url=url, **kwargs) return factory async def test_asyncio(ssl_context): loop = asyncio.get_event_loop() url = 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist' url_parts = urlparse(url) port = url_parts.port or (80 if url_parts.scheme == 'http' else 443) infos = await loop.getaddrinfo(url_parts.hostname, port, family=socket.AF_INET) family, type, proto, canonname, sockaddr = infos[0] await loop.create_connection(HttpClient.create_factory(url), sockaddr[0], port, ssl=ssl_context, family=family, proto=proto, flags=socket.AI_NUMERICHOST, server_hostname=url_parts.hostname, local_addr=None) async def asyncio_test(): ssl_context = ssl.create_default_context() while True: await test_asyncio(ssl_context) await asyncio_test() |
|
|
msg325813 - (view) |
Author: Yury Selivanov (yselivanov) *  |
Date: 2018-09-19 22:44 |
What is "raw asyncio SSL sockets"? We don't use SSL sockets in asyncio, we use SSL Memory BIO. Do you think that some SSL context objects aren't being properly released? BTW, can you see the leak when run under uvloop? |
|
|
msg325817 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2018-09-19 23:10 |
sorry, by "raw" I mean in the context of aiohttp, so just using the normal python ssl context and asyncio sockets. I don't think it's an object not getting GC'd because I didn't see any increase on object counts, nor leaks per tracemalloc. I think it's some low-level native memory leaked by openssl. I've updated the gist w/ uvloop and ran with it and still get a leak, see gist + aiohttp issue for plot |
|
|
msg325819 - (view) |
Author: Yury Selivanov (yselivanov) *  |
Date: 2018-09-19 23:13 |
Would you be able to test uvloop/master branch? Current uvloop 0.11.x uses pretty much the asyncio implementation; the master branch has a completely rewritten SSL layer. If the master branch has the leak it might mean that the root cause is indeed in either the ssl module or openssl itself. |
|
|
msg332273 - (view) |
Author: Chase Peterson (cnpeterson) * |
Date: 2018-12-20 22:32 |
I ran the code snippet below using uvloop/master in a docker container. As it ran, the container continually leaked memory. I included a graph with the memory usage. Environment: # cat /etc/*-release PRETTY_NAME="Debian GNU/Linux 9 (stretch)" NAME="Debian GNU/Linux" VERSION_ID="9" VERSION="9 (stretch)" ID=debian HOME_URL="https://www.debian.org/" SUPPORT_URL="https://www.debian.org/support" BUG_REPORT_URL="https://bugs.debian.org/" # uname -r 4.18.16-300.fc29.x86_64 # python -V Python 3.7.1 # pip freeze asyncio==3.4.3 Cython==0.29.2 idna==2.8 multidict==4.5.2 uvloop==0.12.0rc2 yarl==1.3.0 I had to tweak the code a bit to run in a docker container successfully, but here is the code I used: import asyncio import logging import ssl import socket import sys import yarl import uvloop CREDENTIAL = '' URLS = { 'https://s3.us-west-2.amazonaws.com/archpi.dabase.com/style.css': { 'method': 'get', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025044Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/us-west-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=ae552641b9aa9a7a267fcb4e36960cd5863e55d91c9b45fd39b30fdcd2e81489', 'Accept-Encoding': 'identity'} }, 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist': { 'method': 'GET' if sys.argv[1] == 'get_object' else 'HEAD', 'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025221Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/ap-southeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=7a7675ef6d70cb647ed59e02d532ffa80d437fb03976d8246ea9ef102d118794', 'Accept-Encoding': 'identity'} } } asyncio.set_event_loop_policy(uvloop.EventLoopPolicy()) class HttpClient(asyncio.streams.FlowControlMixin): transport = None def __init__(self, *args, **kwargs): self.__url = kwargs.pop('url') self.__logger = logging.getLogger() super().__init__() def connection_made(self, transport): self.transport = transport url_parts = yarl.URL(self.__url) entry = URLS[self.__url] body = f'{entry["method"]} {url_parts.path} HTTP/1.1\r\nAccept: */*\r\nHost: {url_parts.host}\r\n' for name, value in entry['headers'].items(): body += f'{name}: {value}\r\n' body += '\r\n' self.transport.write(body.encode('ascii')) self.__logger.info(f'data sent: {body}') def data_received(self, data): self.__logger.info(f'data received: {data}') self.transport.close() # asyncio.get_event_loop().call_later(1.0, ) def eof_received(self): self.__logger.info('eof_received') def connection_lost(self, exc): self.__logger.info(f'connection lost: {exc}') super().connection_lost(exc) @classmethod def create_factory(cls, url: str): def factory(*args, **kwargs): return cls(*args, url=url, **kwargs) return factory async def test_asyncio(ssl_context): loop = asyncio.get_event_loop() url = 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist' url_parts = yarl.URL(url) port = url_parts.port or (80 if url_parts.scheme == 'http' else 443) infos = await loop.getaddrinfo(url_parts.host, port, family=socket.AF_INET) family, type, proto, canonname, sockaddr = infos[0] await loop.create_connection(HttpClient.create_factory(url), sockaddr[0], port, ssl=ssl_context, family=family, proto=proto, flags=socket.AI_NUMERICHOST, server_hostname=url_parts.host, local_addr=None) async def asyncio_test(): ssl_context = ssl.create_default_context() while True: await test_asyncio(ssl_context) def main(): print('running') loop = asyncio.get_event_loop() loop.run_until_complete(asyncio_test()) main() |
|
|
msg338145 - (view) |
Author: Yury Selivanov (yselivanov) *  |
Date: 2019-03-17 22:51 |
New changeset f683f464259715d620777d7ed568e701337a703a by Yury Selivanov (Fantix King) in branch 'master': bpo-34745: Fix asyncio sslproto memory issues (GH-12386) https://github.com/python/cpython/commit/f683f464259715d620777d7ed568e701337a703a |
|
|
msg338146 - (view) |
Author: miss-islington (miss-islington) |
Date: 2019-03-17 23:09 |
New changeset 7f7485c0605fe979e39c58b688f2bb6a4f43e65e by Miss Islington (bot) in branch '3.7': bpo-34745: Fix asyncio sslproto memory issues (GH-12386) https://github.com/python/cpython/commit/7f7485c0605fe979e39c58b688f2bb6a4f43e65e |
|
|
msg338784 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2019-03-25 06:10 |
going to close, I've verified that it fixes my original issue, ty!! |
|
|