Result.serialize() path handling is broken for Windows paths and some other cases · Issue #2067 · RDFLib/rdflib
I have been trying to figure out what is happening with these xfails:
if sys.platform == "win32":
    xfails[("csv", DestinationType.STR_PATH, "utf-8")] = pytest.mark.xfail(
        raises=FileNotFoundError,
        reason="string path handling does not work on windows",
    )
    xfails[("csv", DestinationType.STR_PATH, "utf-16")] = pytest.mark.xfail(
        raises=FileNotFoundError,
        reason="string path handling does not work on windows",
    )
    xfails[("json", DestinationType.STR_PATH, "utf-8")] = pytest.mark.xfail(
        raises=FileNotFoundError,
        reason="string path handling does not work on windows",
    )
    xfails[("json", DestinationType.STR_PATH, "utf-16")] = pytest.mark.xfail(
        raises=FileNotFoundError,
        reason="string path handling does not work on windows",
    )
    xfails[("xml", DestinationType.STR_PATH, "utf-8")] = pytest.mark.xfail(
        raises=FileNotFoundError,
        reason="string path handling does not work on windows",
    )
    xfails[("xml", DestinationType.STR_PATH, "utf-16")] = pytest.mark.xfail(
        raises=FileNotFoundError,
        reason="string path handling does not work on windows",
    )
The problem is with the approach to path handling:
location = cast(str, destination)
scheme, netloc, path, params, query, fragment = urlparse(location)
if netloc != "":
    print(
        "WARNING: not saving as location" + "is not a local file reference"
    )
    return None
fd, name = tempfile.mkstemp()
stream = os.fdopen(fd, "wb")
serializer.serialize(stream, encoding=encoding, **args)
stream.close()
shutil.move(name, path)
The problem with this approach is that file URIs and OS paths are quite different. For one, with Windows OS paths such as C:\Users\runneradmin\AppData\Local\Temp\pytest-of-unknown\pytest-0\test_select_result_serialize_p6\file-DestinationType.STR_PATH, the drive letter gets interpreted as the URL scheme:
$ python3 -c 'from urllib.parse import urlparse; print(urlparse(r"C:\Users\runneradmin\AppData\Local\Temp\pytest-of-unknown\pytest-0\test_select_result_serialize_p6\file-DestinationType.STR_PATH"))'
ParseResult(scheme='c', netloc='', path='\Users\runneradmin\AppData\Local\Temp\pytest-of-unknown\pytest-0\test_select_result_serialize_p6\file-DestinationType.STR_PATH', params='', query='', fragment='')
Furthermore, URIs support percent encoding, while OS paths do not.
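To make the mismatch concrete, here is a small sketch using only the standard library. The paths and file names are made up for illustration; the point is how `urlparse` and `unquote` treat strings that are perfectly valid OS paths.

```python
from urllib.parse import unquote, urlparse

# Hypothetical Windows path: the drive letter is parsed as the URL scheme,
# so the remaining "path" component has lost its drive.
win = urlparse(r"C:\Users\example\out.csv")
print(win.scheme, win.path)

# Hypothetical file name containing '#': everything after it is parsed as a
# URI fragment, so the "path" component is truncated.
hashed = urlparse("report#1.csv")
print(hashed.path, hashed.fragment)

# "%20" is three literal characters in an OS path, but URI handling
# decodes it to a space.
print(unquote("my%20file.csv"))
```

In each case, passing the parsed `path` component on to `shutil.move` would target a different file than the one the caller named.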
Here is an example of things going wrong:
------------------------------ Captured log call ------------------------------
2022-07-30T12:11:21.926 ERROR root test_result.py:317:test_select_result_serialize_parse destination = C:\Users\runneradmin\AppData\Local\Temp\pytest-of-unknown\pytest-0\test_select_result_serialize_p6\file-DestinationType.STR_PATH
2022-07-30T12:11:21.926 ERROR root test_result.py:318:test_select_result_serialize_parse format = csv
2022-07-30T12:11:21.926 ERROR root test_result.py:319:test_select_result_serialize_parse encoding = utf-16
___________ test_select_result_serialize_parse[csv-STR_PATH-utf-8] ____________
Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\shutil.py", line 566, in move
    os.rename(src, real_dst)
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpgk0vyq6q' -> '\\Users\\runneradmin\\AppData\\Local\\Temp\\pytest-of-unknown\\pytest-0\\test_select_result_serialize_p7\\file-DestinationType.STR_PATH'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "D:\a\rdflib\rdflib\test\test_sparql\test_result.py", line 323, in test_select_result_serialize_parse
    encoding=encoding,
  File "D:\a\rdflib\rdflib\rdflib\query.py", line 283, in serialize
    shutil.move(name, path)
  File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\shutil.py", line 580, in move
    copy_function(src, real_dst)
  File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\shutil.py", line 266, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '\\Users\\runneradmin\\AppData\\Local\\Temp\\pytest-of-unknown\\pytest-0\\test_select_result_serialize_p7\\file-DestinationType.STR_PATH'
I think the best we can do to fix the path handling is to do the same as what happens in Graph.serialize:
if isinstance(destination, pathlib.PurePath):
    location = str(destination)
else:
    location = cast(str, destination)
scheme, netloc, path, params, _query, fragment = urlparse(location)
if netloc != "":
    raise ValueError(
        f"destination {destination} is not a local file reference"
    )
fd, name = tempfile.mkstemp()
stream = os.fdopen(fd, "wb")
serializer.serialize(stream, base=base, encoding=encoding, **args)
stream.close()
dest = url2pathname(path) if scheme == "file" else location
shutil.move(name, dest)
This will also fix relative path handling in some cases; however, it will break relative URI handling.
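As a rough sketch of that trade-off, the destination logic from the Graph.serialize snippet can be pulled into a standalone function. The name `resolve_destination` is hypothetical and not part of rdflib, and the example paths are invented; only the branching mirrors the snippet.

```python
import pathlib
from typing import Union
from urllib.parse import urlparse
from urllib.request import url2pathname


def resolve_destination(destination: Union[str, pathlib.PurePath]) -> str:
    """Hypothetical helper: map a destination to an OS path string."""
    if isinstance(destination, pathlib.PurePath):
        location = str(destination)
    else:
        location = destination
    scheme, netloc, path, _params, _query, _fragment = urlparse(location)
    if netloc != "":
        raise ValueError(f"destination {destination} is not a local file reference")
    # Only explicit file: URIs are converted; anything else is used verbatim.
    return url2pathname(path) if scheme == "file" else location


# A Windows path is passed through unchanged, drive letter included.
print(resolve_destination(r"C:\Users\example\out.csv"))

# A relative string is treated as an OS path, so "%20" is not decoded.
# This is the relative URI handling that the approach gives up.
print(resolve_destination("a%20b.csv"))
```

Under this sketch, callers that want URI semantics would need to pass a full `file:` URI, while plain strings are always taken as OS paths.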