msg250401 - (view) |
Author: desbma (desbma) * |
Date: 2015-09-10 18:38 |
Sorry if it has already been discussed, or if this is a stupid idea. By looking at the signature, my thought was that the only use of shutil.copyfileobj was to wrap the use of sendfile, and use a fallback if it is not available on the system or not usable with "fake" Python files (not having a file descriptor, eg. like gzip.GzipFile). By looking at the implementation, I was surprised that it does not try to call os.sendfile. |
|
|
msg250406 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2015-09-10 19:10 |
No, it's to copy file like objects that aren't necessarily true file system objects. The only requirement is that they support a minimal file-like object interface. There is no suggestion or requirement that the object supports file descriptors (which would make it a real os file object). I think changing it to use os.sendfile if possible would be too risky in terms of causing unexpected breakage (objects that look like they have a file descriptor but really don't), especially since handling file system files is explicitly not the purpose of copyfileobj. |
|
|
msg250426 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2015-09-10 22:31 |
I agree it is a nice idea, but have the same concern as David. Many pseudo file objects wrap real file descriptors. E.g. BufferedReader hides unread data in a buffer, GzipFile decompresses data, HTTPResponse decodes chunks. Detecting when sendfile() is appropriate would be hard to do. And I understand some platforms only support sockets. Also I notice that 3.5 has a new socket.sendfile() method (Issue 17552), which partly overlaps with this proposal. BTW the 3.4 os documentation should not mention the socket method, but it does! |
|
|
msg250503 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2015-09-11 23:56 |
New changeset 3d9cbbad8a04 by Martin Panter in branch '3.4': Issue #25063: socket.sendfile() does not exist in 3.4 https://hg.python.org/cpython/rev/3d9cbbad8a04 |
|
|
msg250578 - (view) |
Author: desbma (desbma) * |
Date: 2015-09-13 17:18 |
If I understand you correctly, a naive implementation like this: use_chunk_copy = False try: fdsrc = fsrc.fileno() fddst = fdst.fileno() except OSError: # fallback on current chunk based Python copy use_chunk_copy = True else: # we have 2 fd, try sendfile try: os.sendfile(fdsrc, fddst) except AttributeError: # sendfile not available use_chunk_copy = True if use_chunk_copy: # use current Python based chunk copy ...would not work because some file like objects expose fileno(), even though it does not accurately represent the file because they wrap it to add special behavior (like GzipFile). I don't understand, if they alter the file behavior, why do they expose fileno()? |
|
|
msg250586 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2015-09-13 20:50 |
I don’t know the exact reasoning, but fileno() could be useful for other purposes, e.g. passing to stat() to find the last modification time, passing to select() to see if data is available. |
|
|
msg250657 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2015-09-14 12:02 |
I'm going to reject this. If you think there is enough value in this to warrant the complications that would be required to ensure backward compatibility (assuming that is even possible), you could bring it up again on python-ideas. |
|
|