Issue 34088: [EASY] sndhdr.what() throws exceptions on unknown files (original) (raw)

Created on 2018-07-10 20:50 by Barro, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 8319 open davidawad,2018-07-17 18:12
Messages (9)
msg321396 - (view) Author: Jussi Judin (Barro) Date: 2018-07-10 20:50
sndhdr.what() function throws several types of exceptions on unknown files instead of returning None (as documentation says). Following code can replicate these crashes: ``` import sndhdr import sys sndhdr.what(sys.argv[1]) ``` First crash is from wave or chunk module (input data is base64 encoded in the echo command): ``` $ echo UklGRjAwMDBXQVZFZm10IDAwMDABADAwMDAwMDAwMDAwMDAw | python3.7 -mbase64 -d > in.file $ python3.7 sndhdr/test.py in.file Traceback (most recent call last): File "sndhdr/test.py", line 4, in sndhdr.what(sys.argv[1]) File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 54, in what res = whathdr(filename) File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 63, in whathdr res = tf(h, f) File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 163, in test_wav w = wave.open(f, 'r') File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 510, in open return Wave_read(f) File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 164, in __init__ self.initfp(f) File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 153, in initfp chunk.skip() File "/tmp/python-3.7-bin/lib/python3.7/chunk.py", line 160, in skip self.file.seek(n, 1) File "/tmp/python-3.7-bin/lib/python3.7/chunk.py", line 113, in seek raise RuntimeError RuntimeError ``` Second crash comes from sndhdr module itself (again base64 encoded data is first decoded on command line): ``` $ echo AAA= python3.7 -mbase64 -d > in.file $ python3.7 sndhdr/test.py in.fileTraceback (most recent call last): File "sndhdr/test.py", line 4, in sndhdr.what(sys.argv[1]) File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 54, in what res = whathdr(filename) File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 63, in whathdr res = tf(h, f) File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 192, in test_sndr rate = get_short_le(h[2:4]) File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 213, in get_short_le return (b[1] << 8) b[0] IndexError: index out of range ```
msg321457 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-11 14:32
I agree that sndhdr.what() should return None and not raise an exception if the file format is not supported or the file is invalid.
msg321543 - (view) Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2018-07-12 12:09
Will you make a PR or shall I try to fix this?
msg321545 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-12 12:19
If someone writes a PR, I can review it ;-)
msg321573 - (view) Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2018-07-12 18:20
Upon checking to see where the RuntimeError is coming from, I noticed that there is some notorious behaviour in chunk.py. If one runs python with the exact same parameters as Jussi Judin did in their first case and after stepping through the call stack until Chunk/skip is reached, there is a call to Chunk/seek. On the time of the call size_read has the value 16, whereas within the seek method it has the value 28. Is this intended? If so, where is that implemented? I cannot figure out why this is happening. The following is my pdb output. > cpython/Lib/chunk.py(160)skip() -> self.file.seek(n, 1) (Pdb) p self.size_read 16 (Pdb) s --Call-- > cpython/Lib/chunk.py(98)seek() -> def seek(self, pos, whence=0): (Pdb) p self.size_read 28
msg321738 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2018-07-16 13:22
Hey Lysandros. Take a look at https://github.com/python/cpython/blob/master/Lib/wave.py#L126-L139 Notice how there's a Chunk made from the `file` argument but then another Chunk created using the previous chunk as a file. While it might seem like self.size_read is magically changing between the steps, the thing to realize there is that self.file is another chunk and when you step, you're stepping into the seek of another Chunk instance, which has its own self.size_read
msg321849 - (view) Author: david awad (davidawad) * Date: 2018-07-17 18:18
Hey, I saw the issue and spent some time piecing together a PR for it. I've never contributed to Python before but this seemed like a good place to start! I posted these questions on the PR but I'll echo them here as well. - Should we only handle the specific exceptions that are created in these two examples? or is there a cleaner, more pythonic way to handle it? - It doesn't appear that test_sndhdr.py does any testing of bad inputs. Can I add test cases to the test_sndhdr.py file? Thanks! I'm here to learn so if there's anything I missed please feel free to let me know.
msg321850 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2018-07-17 18:49
Hey David, thanks for your contribution, I've added some feedback in context of the code on Github.
msg341235 - (view) Author: david awad (davidawad) * Date: 2019-05-01 22:40
It's been a while since then but I really think we can get this fix merged in the next two weeks or so. Can someone give this another look? I had a quick question for Serhiy Here: https://github.com/python/cpython/pull/8319#discussion_r273539593 If we could get that resolved I think this can finally be done and I can try to help out on a different PR and do a better job in the future. Thanks!
History
Date User Action Args
2022-04-11 14:59:03 admin set github: 78269
2019-05-01 22:40:51 davidawad set messages: +
2018-07-28 12:49:02 steve.dower set keywords: - easy
2018-07-17 18:49:51 ammar2 set messages: +
2018-07-17 18🔞03 davidawad set nosy: + davidawadmessages: +
2018-07-17 18:12:42 davidawad set keywords: + patchstage: patch reviewpull_requests: + <pull%5Frequest7854>
2018-07-16 13:22:27 ammar2 set nosy: + ammar2messages: +
2018-07-12 18:20:48 lys.nikolaou set messages: +
2018-07-12 12:19:55 vstinner set messages: +
2018-07-12 12:09:08 lys.nikolaou set nosy: + lys.nikolaoumessages: +
2018-07-11 14:32:01 vstinner set title: sndhdr.what() throws exceptions on unknown files -> [EASY] sndhdr.what() throws exceptions on unknown filesnosy: + vstinnermessages: + keywords: + easy
2018-07-10 20:50:06 Barro create