msg288355 - (view) |
Author: Mba (mba) |
Date: 2017-02-22 12:24 |
ino_t is of type 'unsigned long' or 'unsigned long long'. Casting it to signed type means that negative values are incorrectly returned for large inodes. |
|
|
msg289213 - (view) |
Author: Xiang Zhang (xiang.zhang) *  |
Date: 2017-03-08 07:36 |
Any reason our _Py_stat_struct on Windows uses a signed type to represent st_ino? |
|
|
msg289216 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-03-08 08:07 |
On Windows, _Py_attribute_data_to_stat() converts BY_HANDLE_FILE_INFORMATION to _Py_stat_struct, and then _pystat_fromstructstat() creates Python objects. The file index in BY_HANDLE_FILE_INFORMATION is made of two DWORD, so yes, it's unsigned. On Linux, stat.st_ino type is ino_t which is unsigned too. So I created a pull request to fix the bug, even if I don't think that a filesystem produce inodes larger than 2^63-1. Not sure if it's worth it to backport the fix to Python 2.7, 3.5 and 3.6? |
|
|
msg289221 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-08 09:24 |
I think it is worth to backport this at least to 3.6. |
|
|
msg289288 - (view) |
Author: Mba (mba) |
Date: 2017-03-09 12:52 |
> I don't think that a filesystem produce inodes larger than 2^63-1 The problem is real: the large inode number was assigned to a file created by MacOS on a share exported via CIFS, and then stated by Linux using NFS export. |
|
|
msg290247 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-03-24 22:37 |
New changeset 68d29809405dc766966b2b973b8597212fbc3dbd by Victor Stinner in branch '3.6': bpo-29619: Convert st_ino using unsigned integer (#557) (#584) https://github.com/python/cpython/commit/68d29809405dc766966b2b973b8597212fbc3dbd |
|
|
msg290250 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-03-24 22:38 |
New changeset 0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c by Victor Stinner in branch 'master': bpo-29619: Convert st_ino using unsigned integer (#557) https://github.com/python/cpython/commit/0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c |
|
|
msg292141 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-04-22 22:26 |
The compilation of Modules/posixmodule.c fails now on Android architecture 'x86' at api level 21, after the above change 0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c: /opt/android-ndk/toolchains/llvm/prebuilt/linux-x86_64/bin/clang --sysroot=/opt/android-ndk/platforms/android-21/arch-x86 -target i686-none-linux-androideabi -gcc-toolchain /opt/android-ndk/toolchains/x86-4.9/prebuilt/linux-x86_64 -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wno-unused-value -Wno-empty-body -Qunused-arguments -Wno-parentheses-equality -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -IObjects -IInclude -IPython -I. -I./Include -I/opt/android-ndk/platforms/android-21/arch-x86/usr/include -DPy_BUILD_CORE -c ./Modules/posixmodule.c -o Modules/posixmodule.o ./Modules/posixmodule.c:1935:5: error: array size is negative Py_BUILD_ASSERT(sizeof(unsigned long) >= sizeof(st->st_ino)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./Include/pymacro.h:43:15: note: expanded from macro 'Py_BUILD_ASSERT' (void)Py_BUILD_ASSERT_EXPR(cond); \ ^~~~~~~~~~~~~~~~~~~~~~~~~~ ./Include/pymacro.h:40:19: note: expanded from macro 'Py_BUILD_ASSERT_EXPR' (sizeof(char [1 - 2*!(cond)]) - 1) ^~~~~~~~~~~~~ 1 error generated. make: *** [Makefile:1720: Modules/posixmodule.o] Error 1 The following code: #include <stdio.h> #include <sys/stat.h> int main() { struct stat *st; printf("sizeof(unsigned long): %d - sizeof(st->st_ino): %d\n", sizeof(unsigned long), sizeof(st->st_ino)); return 0; } prints the following result on this same system: sizeof(unsigned long): 4 - sizeof(st->st_ino): 8 |
|
|
msg292142 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-04-22 22:48 |
Would it work with unsigned long long? What is the value of HAVE_LARGEFILE_SUPPORT? Maybe the code should be simplified to always use unsigned long long. |
|
|
msg292174 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-04-23 17:18 |
sizeof(unsigned long long) is 8 on Android x86 and HAVE_LARGEFILE_SUPPORT is undefined. According to HAVE_LARGEFILE_SUPPORT is also undefined on Android x86_64 (and also on Linux x86_64). > Maybe the code should be simplified to always use unsigned long long. Yes. If I understand correcty, one could only use 'unsigned long long' to assert at build time and use the smallest of the two unsigned long that fits with the size of st->st_ino at runtime. |
|
|
msg293859 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-05-17 16:11 |
In configure.ac, HAVE_LARGEFILE_SUPPORT is defined if (off_t > long && longlong >= off_t) and thus, when it is not defined, it means that off_t <= long (since longlong < off_t is false) so an off_t should fit into a long. But on Android the size of off_t is different from the size of ino_t so one cannot use the value of HAVE_LARGEFILE_SUPPORT to know if an ino_t would fit into an unsigned long. On Android architecture 'x86' at api level 21: sizeof(st->st_ino) = 8 and from the output of configure: checking size of off_t... 4 checking size of long... 4 checking size of long long... 8 |
|
|
msg293873 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-05-17 18:48 |
Before 0f6d73343d342c106cda2219ebb8a6f0c4bd9b3c, the code used PyLong_FromLong((long)st->st_ino), so the change doesn't introduce a regression but detected a real bug, right? > On Android architecture 'x86' at api level 21: > sizeof(st->st_ino) = 8 > and from the output of configure: > checking size of off_t... 4 > checking size of long... 4 > checking size of long long... 8 What is the type of st_ino? It's not off_t? A workaround would be to force the usage unsigned long long on Android, right? Can you please try the following patch on Android, and propose a PR if it works? diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c index e8c15a9..78b3cb9 100644 --- a/Modules/posixmodule.c +++ b/Modules/posixmodule.c @@ -1927,7 +1927,7 @@ _pystat_fromstructstat(STRUCT_STAT *st) return NULL; PyStructSequence_SET_ITEM(v, 0, PyLong_FromLong((long)st->st_mode)); -#if defined(HAVE_LARGEFILE_SUPPORT) | |
defined(MS_WINDOWS) +#if defined(HAVE_LARGEFILE_SUPPORT) |
|
msg293883 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-05-17 19:55 |
> What is the type of st_ino? It's not off_t? The type of st_ino is unsigned long long as defined in sys/stat.h for x86 and off_t is defined in sys/types.h this way;, quoting: /* This historical accident means that we had a 32-bit off_t on 32-bit architectures. */ typedef __kernel_off_t off_t; There is also a related section in the bionic github repo [1] quoted here: ========================= This probably belongs in the NDK documentation rather than here, but these are the known ABI bugs in the 32-bit ABI: time_t is 32-bit. http://b/5819737. In the 64-bit ABI, time_t is 64-bit. off_t is 32-bit. There is off64_t, and in newer releases there is almost-complete support for _FILE_OFFSET_BITS. Unfortunately our stdio implementation uses 32-bit offsets and -- worse -- function pointers to functions that use 32-bit offsets, so there's no good way to implement the last few pieces http://b/24807045. In the 64-bit ABI, off_t is off64_t. sigset_t is too small on ARM and x86 (but correct on MIPS), so support for real-time signals is broken. http://b/5828899 In the 64-bit ABI, sigset_t is the correct size for every architecture. ========================= I will test your patch on x86 and x86_64 on Android and propose a patch if it works. [1] https://android.googlesource.com/platform/bionic.git/#32_bit-ABI-bugs |
|
|
msg293919 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-05-18 11:39 |
With the patch proposed by Victor in the cross-compilation of posixmodule.c succeeds on android-21-x86 and android-21-x86_64. Following my suggestion in , I propose the following patch instead: diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c index e8c15a9473..d7ff3c78bd 100644 --- a/Modules/posixmodule.c +++ b/Modules/posixmodule.c @@ -1927,14 +1927,12 @@ _pystat_fromstructstat(STRUCT_STAT *st) return NULL; PyStructSequence_SET_ITEM(v, 0, PyLong_FromLong((long)st->st_mode)); -#if defined(HAVE_LARGEFILE_SUPPORT) | |
defined(MS_WINDOWS) Py_BUILD_ASSERT(sizeof(unsigned long long) >= sizeof(st->st_ino)); - PyStructSequence_SET_ITEM(v, 1, + if (sizeof(unsigned long) >= sizeof(st->st_ino)) + PyStructSequence_SET_ITEM(v, 1, PyLong_FromUnsignedLong(st->st_ino)); + else + PyStructSequence_SET_ITEM(v, 1, PyLong_FromUnsignedLongLong(st->st_ino)); -#else - Py_BUILD_ASSERT(sizeof(unsigned long) >= sizeof(st->st_ino)); - PyStructSequence_SET_ITEM(v, 1, PyLong_FromUnsignedLong(st->st_ino)); -#endif #ifdef MS_WINDOWS PyStructSequence_SET_ITEM(v, 2, PyLong_FromUnsignedLong(st->st_dev)); #else This patch removes the conditional on HAVE_LARGEFILE_SUPPORT (explicit is better than implicit) and fixes the problem for any operating system where the size of off_t and the size of st_ino are different. The compiler removes the dead branch of the <if (sizeof(unsigned long)...> conditional, so actually it compiles to the same binary as the other patch (gcc does this without any optimization configured on the command line, I didn't check this for clang). The same kind of fix could also be applied elsewhere in posixmodule.c where behind the use of HAVE_LARGEFILE_SUPPORT there is an incorrect assumption that the size of off_t is the same as the size of another type: * In _pystat_fromstructstat() - a little bit further down this patch - where the other type is st_size. * In os_DirEntry_inode_impl() where the other type is ino_t. Maybe this should be done in a new issue then ? |
|
msg293920 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-05-18 11:45 |
> The same kind of fix could also be applied elsewhere in posixmodule.c By also adding a Py_BUILD_ASSERT() statement which is a welcome improvement on the previous code that allowed to pinpoint the problem on Android. |
|
|
msg293924 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-05-18 14:51 |
Please propose the change as a PR. |
|
|
msg293959 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-05-19 15:24 |
PR 1666 added. With Unified Headers introduced in android-ndk-r14, the size of off_t for 32 bits Android devices is 8 instead of 4 and the cross-compilation of Python on Android does not fail in this case, see . |
|
|
msg293987 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2017-05-20 00:09 |
What is the advantage of compiling calls to both Long and LongLong converters? Isn’t it simpler to blindly call PyLong_FromUnsignedLongLong or similar? |
|
|
msg294030 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-05-20 14:12 |
Good point. PyLong_FromLong() is faster than PyLong_FromLongLong(), IMO the fastest should be used when it is possible. PyLong_FromUnsignedLong() and PyLong_FromUnsignedLongLong() are almost identical, they implement the same algorithm. The first one uses unsigned long (on the stack and for a register) while the other uses unsigned long long, so the difference in performance may be in the order of nano seconds on current processors. Should only PyLong_FromUnsignedLongLong() be used then or should this be left as is for consistency with the handling of PyLong_FromLong() ? Another point (not related) is that the d_ino member of the DirEntry structure is of type ino_t which is unsigned and PR 1666 should be corrected to use unsigned instead in the last modifications made by the PR in os_DirEntry_inode_impl(). I will push this change when a decision has been taken on the point raised by Martin. |
|
|
msg294034 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-05-20 15:23 |
I'm fine with always using unsigned long long. I don't think that it's even possible to measure the perf difference and if it's doable, I expect don't expect it to be significant... especially compared to the cost of the stat () syscall! Well, most computers are 64-bit. Even smartphones, no? |
|
|
msg294135 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2017-05-22 09:15 |
New changeset 50e86033de85294d87b7e942701d456342abde8e by xdegaye in branch 'master': bpo-29619: Do not use HAVE_LARGEFILE_SUPPORT for type conversions (GH-1666). https://github.com/python/cpython/commit/50e86033de85294d87b7e942701d456342abde8e |
|
|
msg300412 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-08-17 12:33 |
New changeset ffbb6f7334ccf54f09dcc9e760766d861928f13e by Victor Stinner in branch '3.6': bpo-29619: Do not use HAVE_LARGEFILE_SUPPORT for type conversions (GH-1666) (#3102) https://github.com/python/cpython/commit/ffbb6f7334ccf54f09dcc9e760766d861928f13e |
|
|