Issue 21790: Change blocksize in http.client to the value of resource.getpagesize (original) (raw)

Created on 2014-06-17 15:40 by demian.brecht, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (8)

msg220839 - (view)

Author: Demian Brecht (demian.brecht) * (Python triager)

Date: 2014-06-17 15:40

When sending data, the blocksize is currently hardcoded to 8192. It should likely be set to the value of resource.getpagesize().

msg220887 - (view)

Author: PCManticore (Claudiu.Popa) * (Python triager)

Date: 2014-06-17 20:47

resource module is available only on Unix.

msg221093 - (view)

Author: Demian Brecht (demian.brecht) * (Python triager)

Date: 2014-06-20 14:49

Updated to mmap.PAGESIZE, which seems to be available on all systems.

msg221095 - (view)

Author: STINNER Victor (vstinner) * (Python committer)

Date: 2014-06-20 15:03

When sending data, the blocksize is currently hardcoded to 8192.

Yeah, same value for io.DEFAULT_BUFFER_SIZE.

It should likely be set to the value of resource.getpagesize().

Could you please elaborate? A page size is 4096 bytes on my Linux. So your page double the number of calls to read(), right?

shutil.copyfileobj() uses a buffer of 16 KB by default.

See also the issue #21679 which adds a private copy of the stat().st_blksize attribute in a FileIO object.

msg221120 - (view)

Author: Demian Brecht (demian.brecht) * (Python triager)

Date: 2014-06-20 19:56

I very well could be missing something here (and, admittedly, my OS knowledge is rusty at best), but for the most part, page sizes are 4096, except for SPARC, which is 8192.

So your page double the number of calls to read(), right?

No. read() is called until EOF. I'm assuming that 8192 may have been used to accommodate worst case architecture?

I'd have to dig through the C side of things (which I haven't done yet) to see what's going on down at that level, but my assumption is that it's allocating 8192 bytes for each read. Should EOF be reached with <= 1 page filled, it'd result in a wasted page.

Far from the end of the world, but just something I noticed in passing.

msg292688 - (view)

Author: Clay Gerrard (Clay Gerrard)

Date: 2017-05-01 18:13

I don't think the default is bad/wrong so much as it's an inflexible magic number.

I think blocksize should be parameterized as an attribute on the connection - or at least a class attribute or module level constant so we can monkey patch it without having to create a subclass and redefine the entire send method (!?).

Not having a good string to pull at here is causing a non-trivial amount of pain:

https://github.com/sigmavirus24/requests-toolbelt/pull/84/files https://github.com/sigmavirus24/requests-toolbelt/issues/75 https://webcache.googleusercontent.com/search?q=cache:WPSngjyUzg4J:https://technology.jana.com/2015/12/30/dont-use-python-httplib-or-any-library-that-uses-it-to-transfer-large-files-especially-to-china/+&cd=5&hl=en&ct=clnk&gl=us https://bugs.launchpad.net/python-swiftclient/+bug/1671621

msg292708 - (view)

Author: Martin Panter (martin.panter) * (Python committer)

Date: 2017-05-02 01:26

For plain-text (non-SSL) HTTP uploads, perhaps using “socket.sendfile” would help: Issue 13559.

Another idea would be to expose and document the low-level socket (or SSL wrapper) and let the user copy data with whatever chunk size they desire.

msg305612 - (view)

Author: Martin Panter (martin.panter) * (Python committer)

Date: 2017-11-05 23:28

Issue 31945 proposes adding a “blocksize” parameter to HTTPConnection objects, so I suggest to closing in favour of that one.

History

Date

User

Action

Args

2022-04-11 14:58:05

admin

set

github: 65989

2017-11-08 16:17:02

berker.peksag

set

status: open -> closed
stage: resolved

2017-11-05 23:28:20

martin.panter

set

superseder: Configurable blocksize in HTTP(S)Connection
resolution: rejected
messages: +

2017-05-02 01:26:19

martin.panter

set

nosy: + martin.panter
messages: +

2017-05-01 18:13:15

Clay Gerrard

set

nosy: + Clay Gerrard
messages: +

2014-06-20 19:56:49

demian.brecht

set

messages: +

2014-06-20 15:03:11

vstinner

set

nosy: + vstinner
messages: +

2014-06-20 14:49:06

demian.brecht

set

files: + issue21790.patch

messages: +

2014-06-17 20:47:43

Claudiu.Popa

set

nosy: + Claudiu.Popa
messages: +

2014-06-17 15:40:24

demian.brecht

create