[Python-Dev] Compiling Python on Linux with Intel's icc

Alex Leach albl500 at york.ac.uk
Thu Mar 1 19:39:19 CET 2012


Dear Python Devs,

I've been attempting to compile a fully functional version of Python 2.7 using Intel's C compiler, having built supposedly optimal versions of numpy and scipy using Intel Composer XE and Intel's Math Kernel Library. I can build a working Python binary, but I'd really appreciate it if someone could check my compile options, and perhaps suggest ways I could further optimise the build.

*** COMPILE FAILURE - ffi64.c ***

I've managed to compile everything in the python distribution except for Modules/_ctypes/libffi/src/x86/ffi64.c. So to get the compilation to actually work, I've had to use the config option '--with-system-ffi'. If someone could suggest a patch for ffi64.c, I'd happily test it, as I've been unable to fix the code myself! The problem is with register_args, which uses GCC's __int128_t, but this doesn't exist when using icc.

The include guard to use could be:

    #ifdef __INTEL_COMPILER
    ...
    #else
    ...
    #endif

I've tried using this guard around the register_args struct, at the top of ffi64.c, and where I see register_args used, around lines 592-616, according to the suggestion at http://software.intel.com/en-us/forums/showthread.php?t=56652, but have been unable to get a working solution... A patch would be appreciated!

*** Tests ***

After compilation, a few tests consistently fail, mainly ones involving floating-point precision: test_cmath, test_math and test_float.
Also, I wrote a very short script to time for-loop execution and integer multiplication. This script (below) has nearly always completed faster using the default Ubuntu Python than using my own build.

Obviously, I was hoping to get a faster python, but the size of the final binary is almost twice the size of the default Ubuntu version (5.2MB cf. 2.7MB), which I thought might cause a startup overhead that leads to slower execution times when running such a basic script.
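The startup-overhead guess above can be checked on its own, separately from the loop timing. The following is a minimal sketch, not part of the original report: the helper name startup_seconds is hypothetical, and it simply times how long an interpreter takes to start and exit with an empty program. It is written to run on both Python 2.7 and 3.

```python
from __future__ import print_function

import subprocess
import sys
import time


def startup_seconds(python_exe, runs=5):
    """Return the fastest wall-clock time to start and exit the interpreter.

    `python_exe` is a path to an interpreter binary; the helper itself is a
    hypothetical illustration, not something from the original post.
    """
    timings = []
    for _ in range(runs):
        start = time.time()
        # "-c pass" runs an empty program, so almost all of the elapsed
        # time is process creation and interpreter initialisation.
        subprocess.check_call([python_exe, "-c", "pass"])
        timings.append(time.time() - start)
    return min(timings)


if __name__ == "__main__":
    print("{0}: {1:.4f} s".format(sys.executable,
                                  startup_seconds(sys.executable)))
```

Calling startup_seconds("./python") and startup_seconds("/usr/bin/python") would compare the two builds directly; if the two figures are close, binary size is unlikely to explain a difference in the loop timings.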

*** TEST SCRIPT ***
$ cat ~/bin/timetest.py

RANGE = 10000

print "running {0}^2 = {1} for loop iterations".format(RANGE, RANGE**2)

for i in xrange(RANGE):
    for j in xrange(RANGE):
        i * j

*** TIMES ***

ICC-compiled python

$ time ./python ~/bin/timetest.py
running 10000^2 = 100000000 for loop iterations

real    0m2.767s
user    0m2.720s
sys     0m0.008s

System python

$ time python ~/bin/timetest.py
running 10000^2 = 100000000 for loop iterations

real    0m2.781s
user    0m2.776s
sys     0m0.000s

Oh... My python appears to run faster than gcc's now - checked this a few times now, mine's staying faster... :) I've compiled and re-compiled python dozens of times now, but it's still failing some tests...
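Since the two wall-clock times differ by well under 1%, a measurement taken inside the interpreter and repeated several times may be more telling than a single run under `time`. Below is a minimal sketch using the standard timeit module; the function names nested_loop and best_time are hypothetical, and the loop size is kept small only so the example finishes quickly. It runs on both Python 2.7 and 3 (on 2.7, range builds a list, so xrange would be closer to the original script).

```python
from __future__ import print_function

import timeit


def nested_loop(n):
    """Multiply every pair (i, j) with 0 <= i, j < n and discard the result."""
    for i in range(n):
        for j in range(n):
            i * j


def best_time(n=300, repeats=3):
    """Return the fastest of `repeats` single runs of nested_loop(n), in seconds."""
    # The minimum is the usual figure to report: slower runs mostly
    # reflect interference from other processes, not the interpreter.
    return min(timeit.repeat(lambda: nested_loop(n), repeat=repeats, number=1))


if __name__ == "__main__":
    print("best of 3: {0:.4f} s".format(best_time()))
```

Running the same file under ./python and under the system python, with n raised towards the original 10000, gives figures that exclude interpreter startup.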

*** Build Environment ***

Ubuntu 10.10 server kernel (uname -r=3.0.0-16-server) with KDE 4.7.4

$ tail ~/.bashrc

# Custom Commands

export PATH=$PATH:/usr/local/cuda/bin:$HOME/bin
export PYTHONPATH=$HOME/bin:/usr/lib/pymodules/python2.7
export PYTHONSTARTUP=$HOME/.pystartup
export LD_LIBRARY_PATH=/lib64:/usr/lib64:/usr/local/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib

# Load Intel compiler and library variables.

source /usr/intel/bin/compilervars.sh intel64
source /usr/intel/impi/4.0.3/bin/mpivars.sh intel64
source /usr/intel/tbb/bin/tbbvars.sh intel64

$ env | grep 'PATH\|FLAGS'
MANPATH=/usr/intel/impi/4.0.3.008/man:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/impi/4.0.3.008/man:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/impi/4.0.3.008/man:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/local/man:/usr/local/share/man:/usr/share/man:/usr/intel/man:::
LIBRARY_PATH=/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/../compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/mkl/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21
FPATH=/usr/intel/composer_xe_2011_sp1.9.293/mkl/include:/usr/intel/composer_xe_2011_sp1.9.293/mkl/include
LD_LIBRARY_PATH=/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21:/usr/intel/impi/4.0.3.008/ia32/lib:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/../compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/mkl/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21:/biol/arb/lib:/lib64:/usr/lib64:/usr/local/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib:/usr/intel/composer_xe_2011_sp1.9.293/debugger/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/mpirt/lib/intel64
CPATH=/usr/intel/composer_xe_2011_sp1.9.293/tbb/include:/usr/intel/composer_xe_2011_sp1.9.293/mkl/include:/usr/intel/composer_xe_2011_sp1.9.293/tbb/include
NLSPATH=/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/locale/%l_%t/%N:/usr/intel/composer_xe_2011_sp1.9.293/ipp/lib/intel64/locale/%l_%t/%N:/usr/intel/composer_xe_2011_sp1.9.293/mkl/lib/intel64/locale/%l_%t/%N:/usr/intel/composer_xe_2011_sp1.9.293/debugger/intel64/locale/%l_%t/%N
PATH=/usr/intel/impi/4.0.3.008/ia32/bin:/usr/intel/composer_xe_2011_sp1.9.293/bin/intel64:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/intel/bin:/usr/local/cuda/bin:/usr/local/cuda/bin:/usr/intel/composer_xe_2011_sp1.9.293/mpirt/bin/intel64
PYTHONPATH=/usr/lib/pymodules/python2.7/
WINDOWPATH=7
QT_PLUGIN_PATH=$HOME/.kde/lib/kde4/plugins/:/usr/lib/kde4/plugins/

*** Download, configure and Build instructions ***
$ hg clone -r 2.7 http://hg.python.org/cpython
Since...
$ hg update -r 2.7

*** Generate Profile-Guided Optimisation stuff with first build ***
$ make distclean && mkdir PGO
$ CC=icc AR=xiar LD=xild CXX=icpc \
  CPPFLAGS+="-I/usr/include \
  -I/usr/include/x86_64-linux-gnu \
  -I/usr/src/linux-headers-3.0.0-16-server/include/" \
  CFLAGS+="-O3 \
  -fomit-frame-pointer \
  -shared-intel \
  -fpic \
  -prof-gen \
  -prof-dir $PWD/PGO \
  -fp-model precise \
  -fp-model source \
  -xHost \
  -ftz" \
  ./configure --with-system-ffi --with-libc="-lirc" --with-libm="-limf"
$ make -j9

*** Use the PGO-generated information in new build ***
$ make clean
$ CC=icc AR=xiar LD=xild CXX=icpc \
  CPPFLAGS+="-I/usr/include \
  -I/usr/include/x86_64-linux-gnu \
  -I/usr/src/linux-headers-3.0.0-16-server/include/" \
  CFLAGS+="-O3 \
  -fomit-frame-pointer \
  -shared-intel \
  -fpic \
  -prof-use \
  -prof-dir $PWD/PGO \
  -fp-model precise \
  -fp-model source \
  -xHost \
  -ftz \
  -fomit-frame-pointer" \
  ./configure --with-system-ffi --with-libc="-lirc" --with-libm="-limf"
$ make -j9
...

$ make test
building dbm using gdbm

Python build finished, but the necessary bits to build these modules were not found:
_bsddb             bsddb185           dl
imageop            sunaudiodev
To find the necessary bits, look in setup.py in detect_modules() for the module's name.

find ./Lib -name '*.py[co]' -print | xargs rm -f
./python -Wd -3 -E -tt ./Lib/test/regrtest.py -l
/usr/local/src/pysrc/cpython/Lib/unittest/util.py:2: ImportWarning: Not importing directory '/usr/local/src/pysrc/cpython/Lib/collections': missing __init__.py
  from collections import namedtuple, OrderedDict
== CPython 2.7.3rc1 (2.7:5c52e7c6d868+, Feb 29 2012, 22:10:22) [GCC Intel(R) C++ gcc 4.6 mode]
==   Linux-3.0.0-16-server-x86_64-with-debian-wheezy-sid little-endian
==   /usr/local/src/pysrc/cpython/build/test_python_16278
Testing with flags: sys.flags(debug=0, py3k_warning=1, division_warning=1, division_new=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=1, tabcheck=2, verbose=0, unicode=0, bytes_warning=0, hash_randomization=0)

.........

test_cmath
test test_cmath failed -- Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_cmath.py", line 352, in test_specific_values
    msg=error_message)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_cmath.py", line 94, in rAssertAlmostEqual
    'got {!r}'.format(a, b))
AssertionError: acos0000: acos(complex(0.0, 0.0))
Expected: complex(1.5707963267948966, -0.0)
Received: complex(1.5707963267948966, 0.0)
Received value insufficiently close to expected value.
...
test_curses skipped -- Use of the `curses' resource not enabled
...
test_float
test test_float failed -- Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_float.py", line 1273, in test_from_hex
    self.identical(fromHex('0x0.ffffffffffffd6p-1022'), MIN-3*TINY)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_float.py", line 983, in identical
    self.fail('%r not identical to %r' % (x, y))
AssertionError: 0.0 not identical to 2.2250738585072014e-308
.....
test test_strtod failed -- multiple errors occurred; run in verbose mode for details
......

347 tests OK.
5 tests failed:
    test_cmath test_float test_gdb test_math test_strtod
1 test altered the execution environment:
    test_distutils
37 tests skipped:
    test_aepack test_al test_applesingle test_bsddb test_bsddb185
    test_bsddb3 test_cd test_cl test_codecmaps_cn test_codecmaps_hk
    test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_curses
    test_dl test_gl test_imageop test_imgfile test_kqueue
    test_linuxaudiodev test_macos test_macostools test_msilib
    test_ossaudiodev test_scriptpackages test_smtpnet
    test_socketserver test_startfile test_sunaudiodev test_timeout
    test_tk test_ttk_guionly test_urllib2net test_urllibnet
    test_winreg test_winsound test_zipfile64
4 skips unexpected on linux2:
    test_bsddb test_bsddb3 test_tk test_ttk_guionly
make: *** [test] Error 1
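The test_cmath failure above differs from the expected result only in the sign of a zero, which an ordinary == comparison cannot see because -0.0 == 0.0 is true. The following is a minimal sketch for reproducing that one case outside the test suite; the helper names are hypothetical, and the expected value (-0.0 for the imaginary part) is taken from the acos0000 message in the log.

```python
from __future__ import print_function

import cmath
import math


def is_negative_zero(x):
    """Return True only for -0.0; copysign exposes the sign that == hides."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


def acos_zero_has_negative_imag():
    """Check the acos0000 case: acos(0+0j) is expected to be (pi/2, -0.0)."""
    result = cmath.acos(complex(0.0, 0.0))
    return is_negative_zero(result.imag)


if __name__ == "__main__":
    print("acos(0+0j) =", cmath.acos(complex(0.0, 0.0)))
    print("imaginary part is -0.0:", acos_zero_has_negative_imag())
```

Running this under both interpreters shows whether a given set of compiler flags changes the sign of the zero, without waiting for the whole regression suite.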

*** Drill down to test_strtod error ***

$ ./python
Python 2.7.3rc1 (2.7:5c52e7c6d868+, Feb 29 2012, 22:10:22)
[GCC Intel(R) C++ gcc 4.6 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from test import test_strtod
>>> test_strtod.test_main()
test_bigcomp (test.test_strtod.StrtodTests) ... FAIL
test_boundaries (test.test_strtod.StrtodTests) ... FAIL
test_halfway_cases (test.test_strtod.StrtodTests) ... ok
test_parsing (test.test_strtod.StrtodTests) ... FAIL
test_particular (test.test_strtod.StrtodTests) ... FAIL
test_short_halfway_cases (test.test_strtod.StrtodTests) ... ok
test_underflow_boundary (test.test_strtod.StrtodTests) ... FAIL

======================================================================
FAIL: test_bigcomp (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 214, in test_bigcomp
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 81608e-328: expected 0x0.0000000000002p-1022, got 0x0.0p+0

======================================================================
FAIL: test_boundaries (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 191, in test_boundaries
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 22250738585072002149149e-330: expected 0x0.ffffffffffffep-1022, got 0x0.0p+0

======================================================================
FAIL: test_parsing (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 243, in test_parsing
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for -6.E-310: expected -0x0.06e7344a56502p-1022, got -0x0.0p+0

======================================================================
FAIL: test_particular (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 393, in test_particular
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 12579816049008305546974391768996369464963024663104e-357: expected 0x0.90bbd7412d19fp-1022, got 0x0.0p+0

======================================================================
FAIL: test_underflow_boundary (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 205, in test_underflow_boundary
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 24703282292062327208828439643411068618252990130716238221279284125033775363572e-400: expected 0x0.0000000000001p-1022, got 0x0.0p+0

----------------------------------------------------------------------
Ran 7 tests in 0.280s

FAILED (failures=5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 396, in test_main
    test_support.run_unittest(StrtodTests)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_support.py", line 1094, in run_unittest
    _run_suite(suite)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_support.py", line 1077, in _run_suite
    raise TestFailed(err)
test.test_support.TestFailed: multiple errors occurred
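All five failures above have the same shape: a subnormal (denormal) double was expected, with exponent p-1022 and a leading 0x0., and 0x0.0p+0 came back instead. One plausible cause, not confirmed here, is the -ftz flag in the CFLAGS, which asks icc to flush denormal results to zero. The sketch below is a quick check that could be run under each interpreter; the function name subnormals_preserved is hypothetical.

```python
from __future__ import print_function

import sys


def subnormals_preserved():
    """Return True if subnormal doubles survive both parsing and arithmetic."""
    # Smallest positive subnormal double: the value expected in the
    # test_underflow_boundary failure.
    parsed = float.fromhex("0x0.0000000000001p-1022")
    # Halving the smallest normal double must give a subnormal, not 0.0.
    computed = sys.float_info.min / 2.0
    return parsed != 0.0 and computed != 0.0


if __name__ == "__main__":
    print("subnormals preserved:", subnormals_preserved())
```

If this prints False under the icc build and True under the system build, rebuilding without -ftz would be a reasonable first experiment for test_strtod and the test_float case shown earlier.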

*** Binary size and linked libraries ***

My Intel build

$ ls -l ./python && ldd ./python
-rwxrwxr-x 1 user user 5.2M 2012-02-29 22:10 ./python
    linux-vdso.so.1 => (0x00007fffde1ec000)
    libirc.so => /usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/libirc.so (0x00007fe5f0f30000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe5f0cde000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe5f0ada000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fe5f08d7000)
    libimf.so => /usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/libimf.so (0x00007fe5f050b000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe5f0287000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe5f0071000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe5efcd1000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe5f107e000)
    libintlc.so.5 => /usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/libintlc.so.5 (0x00007fe5efb85000)

System build

$ ls -lhH /usr/bin/python && ldd /usr/bin/python
-rwxr-xr-x 1 root root 2.7M 2011-10-04 22:26 /usr/bin/python
    linux-vdso.so.1 => (0x00007fff509ff000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3e339b0000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3e337ab000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f3e335a8000)
    libssl.so.1.0.0 => /lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007f3e33357000)
    libcrypto.so.1.0.0 => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007f3e32fa7000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f3e32d8f000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3e32b0b000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3e3276b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f3e33c03000)

*** Conclusion (finally!) ***

The Intel Python build looks very promising, but I don't yet trust it to the extent that I'd go ahead and install it or use it in place of the system build. None of the errors look too alarming though, so I'm confident that I could actually get this to work, with the right help.

If someone could help me pass these final tests and compile the ffi64.c module, that'd be amazing!

I hope to hear back from you,
Kind regards,
Alex

ps. Sorry for how long this email turned out!
pps. I'd be happy to write up the fully working solution on a wiki or somewhere, if anyone has any suggestions where?


