GCC 4.4 Release Series — Changes, New Features, and Fixes
The latest release in the 4.4 release series is GCC 4.4.7.
Caveats
- __builtin_stdarg_start has been completely removed from GCC. Support for <varargs.h> had been deprecated since GCC 4.0. Use __builtin_va_start as a replacement.
- Some of the errors issued by the C++ front end that could be downgraded to warnings in previous releases by using -fpermissive are now warnings by default. They can be converted into errors by using -pedantic-errors.
- Use of the cpp assertion extension will now emit a warning when -Wdeprecated or -pedantic is used. This extension has been deprecated for many years, but never warned about.
- Packed bit-fields of type char were not properly bit-packed on many targets prior to GCC 4.4. On these targets, the fix in GCC 4.4 causes an ABI change. For example, there is no longer a 4-bit padding between field a and b in this structure:
struct foo
{
char a:4;
char b:8;
} __attribute__ ((packed));
There is a new warning to help identify fields that are affected:
foo.c:5: note: Offset of packed bit-field 'b' has changed in GCC 4.4
The warning can be disabled with -Wno-packed-bitfield-compat.
- On ARM EABI targets, the C++ mangling of the va_list type has been changed to conform to the current revision of the EABI. This does not affect the libstdc++ library included with GCC.
- The SCOUNT and POS bits of the MIPS DSP control register are now treated as global. Previous versions of GCC treated these fields as call-clobbered instead.
- The MIPS port no longer recognizes the h asm constraint. It was necessary to remove this constraint in order to avoid generating unpredictable code sequences.
One of the main uses of the h constraint was to extract the high part of a multiplication on 64-bit targets. For example:
asm ("dmultu\t%1,%2" : "=h" (result) : "r" (x), "r" (y));
You can now achieve the same effect using 128-bit types:
typedef unsigned int uint128_t __attribute__((mode(TI)));
result = ((uint128_t) x * y) >> 64;
The second sequence is better in many ways. For example, if x and y are constants, the compiler can perform the multiplication at compile time. If x and y are not constants, the compiler can schedule the runtime multiplication better than it can schedule an asm statement.
- Support for a number of older systems and recently unmaintained or untested target ports of GCC has been declared obsolete in GCC 4.4. Unless there is activity to revive them, the next release of GCC will have their sources permanently removed.
The following ports for individual systems on particular architectures have been obsoleted:
- Generic a.out on IA32 and m68k (i[34567]86-*-aout*, m68k-*-aout*)
- Generic COFF on ARM, H8300, IA32, m68k and SH (arm-*-coff*, armel-*-coff*, h8300-*-*, i[34567]86-*-coff*, m68k-*-coff*, sh-*-*). This does not affect other more specific targets using the COFF object format on those architectures, or the more specific H8300 and SH targets (h8300-*-rtems*, h8300-*-elf*, sh-*-elf*, sh-*-symbianelf*, sh-*-linux*, sh-*-netbsdelf*, sh-*-rtems*, sh-wrs-vxworks).
- 2BSD on PDP-11 (pdp11-*-bsd)
- AIX 4.1 and 4.2 on PowerPC (rs6000-ibm-aix4.[12]*, powerpc-ibm-aix4.[12]*)
- Tuning support for Itanium1 (Merced) variants. Note that code tuned for Itanium2 should also run correctly on Itanium1.
- The protoize and unprotoize utilities have been obsoleted and will be removed in GCC 4.5. These utilities have not been installed by default since GCC 3.0.
- Support has been removed for all the configurations obsoleted in GCC 4.3.
- Unknown -Wno-* options are now silently ignored by GCC if no other diagnostics are issued. If other diagnostics are issued, then GCC warns about the unknown options.
- More information on porting to GCC 4.4 from previous versions of GCC can be found in the porting guide for this release.
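The 128-bit replacement for the MIPS h constraint described in the caveats above can be wrapped in a small function. This is a minimal sketch, assuming a 64-bit target on which GCC provides the TImode integer type; the helper name mulhi_u64 is hypothetical and used only for illustration.

```c
#include <stdint.h>

/* 128-bit unsigned integer type, declared as in the release note.
   Assumption: a 64-bit target where GCC supports TImode. */
typedef unsigned int uint128_t __attribute__((mode(TI)));

/* Hypothetical helper: return the high 64 bits of the full
   128-bit product x * y. */
uint64_t mulhi_u64(uint64_t x, uint64_t y)
{
    return (uint64_t)(((uint128_t)x * y) >> 64);
}
```

Because the multiplication is ordinary C rather than an asm statement, the compiler remains free to fold it at compile time when both operands are constants.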
General Optimizer Improvements
A new command-line switch -findirect-inlining has been added. When turned on it allows the inliner to also inline indirect calls that are discovered to have known targets at compile time thanks to previous inlining.
A new command-line switch -ftree-switch-conversion has been added. This new pass turns simple initializations of scalar variables in switch statements into initializations from a static array, given that all the values are known at compile time and the ratio between the new array size and the original switch branches does not exceed the parameter --param switch-conversion-max-branch-ratio (default is eight).
A new command-line switch -ftree-builtin-call-dce has been added. This optimization eliminates unnecessary calls to certain builtin functions when the return value is not used, in cases where the calls cannot be eliminated entirely because the function may set errno. This optimization is on by default at -O2 and above.
A new command-line switch -fconserve-stack directs the compiler to minimize stack usage even if it makes the generated code slower. This affects inlining decisions.
When the assembler supports it, the compiler will now emit unwind information using assembler .cfi directives. This makes it possible to use such directives in inline assembler code. The new option -fno-dwarf2-cfi-asm directs the compiler to not use .cfi directives.
The Graphite branch has been merged. This merge has brought in a new framework for loop optimizations based on a polyhedral intermediate representation. These optimizations apply to all the languages supported by GCC. The following new code transformations are available in GCC 4.4:
- -floop-interchange performs loop interchange transformations on loops. Interchanging two nested loops switches the inner and outer loops. For example, given a loop like:
DO J = 1, M
DO I = 1, N
A(J, I) = A(J, I) * C
ENDDO
ENDDO
loop interchange will transform the loop as if the user had written:
DO I = 1, N
DO J = 1, M
A(J, I) = A(J, I) * C
ENDDO
ENDDO
which can be beneficial when N is larger than the caches, because in Fortran, the elements of an array are stored in memory contiguously by column, and the original loop iterates over rows, potentially creating at each access a cache miss.
- -floop-strip-mine performs loop strip mining transformations on loops. Strip mining splits a loop into two nested loops. The outer loop has strides equal to the strip size and the inner loop has strides of the original loop within a strip. For example, given a loop like:
DO I = 1, N
A(I) = A(I) + C
ENDDO
loop strip mining will transform the loop as if the user had written:
DO II = 1, N, 4
DO I = II, min (II + 3, N)
A(I) = A(I) + C
ENDDO
ENDDO
- -floop-block performs loop blocking transformations on loops. Blocking strip mines each loop in the loop nest such that the memory accesses of the element loops fit inside caches. For example, given a loop like:
DO I = 1, N
DO J = 1, M
A(J, I) = B(I) + C(J)
ENDDO
ENDDO
loop blocking will transform the loop as if the user had written:
DO II = 1, N, 64
DO JJ = 1, M, 64
DO I = II, min (II + 63, N)
DO J = JJ, min (JJ + 63, M)
A(J, I) = B(I) + C(J)
ENDDO
ENDDO
ENDDO
ENDDO
which can be beneficial when M is larger than the caches, because the innermost loop will iterate over a smaller amount of data that can be kept in the caches.
A new register allocator has replaced the old one. It is called integrated register allocator (IRA) because coalescing, register live range splitting, and hard register preferencing are done on-the-fly during coloring. It also has better integration with the reload pass. IRA is a regional register allocator which uses modern Chaitin-Briggs coloring instead of Chow's priority coloring used in the old register allocator. More info about IRA internals and options can be found in the GCC manuals.
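For readers more familiar with C, the strip mining transformation shown in Fortran earlier in this section can be written out by hand. This is a minimal sketch for illustration only: the function names are hypothetical, the strip size of 4 is copied from the Fortran example, and Graphite performs the equivalent rewrite on its internal representation rather than at source level.

```c
#include <stddef.h>

/* Original loop: add c to every element of a[0..n-1]. */
void add_const(double *a, size_t n, double c)
{
    for (size_t i = 0; i < n; i++)
        a[i] += c;
}

/* Hand-written strip-mined form with a strip size of 4. The outer loop
   advances one strip at a time; the inner loop walks the elements of
   that strip, clamped so it never runs past n. */
void add_const_strip_mined(double *a, size_t n, double c)
{
    const size_t strip = 4;
    for (size_t ii = 0; ii < n; ii += strip) {
        size_t end = (n - ii < strip) ? n : ii + strip;
        for (size_t i = ii; i < end; i++)
            a[i] += c;
    }
}
```

Both functions visit the same elements in the same order, so they produce identical results. Strip mining on its own does not change the access pattern; it becomes useful as a building block for blocking, where the strip loops of several nested loops are reordered.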
A new instruction scheduler and software pipeliner, based on the selective scheduling approach, has been added. The new pass performs instruction unification, register renaming, substitution through register copies, and speculation during scheduling. The software pipeliner is able to pipeline non-countable loops. The new pass is targeted at scheduling-eager in-order platforms. In GCC 4.4 it is available for the Intel Itanium platform working by default as the second scheduling pass (after register allocation) at the -O3 optimization level.
When using -fprofile-generate with a multi-threaded program, the profile counts may be slightly wrong due to race conditions. The new -fprofile-correction option directs the compiler to apply heuristics to smooth out the inconsistencies. By default the compiler will give an error message when it finds an inconsistent profile.
The new -fprofile-dir=PATH option permits setting the directory where profile data files are stored when using -fprofile-generate and friends, and the directory used when reading profile data files using -fprofile-use and friends.
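To make the -ftree-switch-conversion description earlier in this section more concrete, the following sketch shows the kind of switch the pass is described as targeting, next to a hand-written table lookup that approximates the converted form. The function names are hypothetical, and the exact code GCC generates may differ from the second function.

```c
/* A switch whose only effect is to assign compile-time constants to a
   scalar variable: a candidate for switch conversion. */
int days_in_month_switch(int month)
{
    int days;
    switch (month) {
    case 0:  days = 31; break;
    case 1:  days = 28; break;
    case 2:  days = 31; break;
    case 3:  days = 30; break;
    case 4:  days = 31; break;
    case 5:  days = 30; break;
    case 6:  days = 31; break;
    case 7:  days = 31; break;
    case 8:  days = 30; break;
    case 9:  days = 31; break;
    case 10: days = 30; break;
    case 11: days = 31; break;
    default: days = 0;  break;
    }
    return days;
}

/* Approximately what the conversion aims for: a range check followed by
   a load from a static array. */
int days_in_month_table(int month)
{
    static const int table[12] = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };
    if (month < 0 || month >= 12)
        return 0;
    return table[month];
}
```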
New warning options
- The new -Wframe-larger-than=NUMBER option directs GCC to emit a warning if any stack frame is larger than NUMBER bytes. This may be used to help ensure that code fits within a limited amount of stack space.
- The command-line option -Wlarger-than-N is now written as -Wlarger-than=N and the old form is deprecated.
- The new -Wno-mudflap option disables warnings about constructs which cannot be instrumented when using -fmudflap.
New Languages and Language specific improvements
- Version 3.0 of the OpenMP specification is now supported for the C, C++, and Fortran compilers.
- New character data types, per TR 19769: New character types in C, are now supported for the C compiler in -std=gnu99 mode, as __CHAR16_TYPE__ and __CHAR32_TYPE__, and for the C++ compiler in -std=c++0x and -std=gnu++0x modes, as char16_t and char32_t too.
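In C, the new character types are reached through the predefined macros named above. The following is a minimal sketch, assuming a GCC-compatible compiler that predefines __CHAR16_TYPE__ and __CHAR32_TYPE__; the typedef names c16 and c32 and the helper c32_length are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical typedef names built on the compiler-provided macros. */
typedef __CHAR16_TYPE__ c16;
typedef __CHAR32_TYPE__ c32;

/* Count the code units before the terminating zero in a UTF-32 string. */
size_t c32_length(const c32 *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```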
C family
- A new optimize attribute was added to allow programmers to change the optimization level and particular optimization options for an individual function. You can also change the optimization options via the GCC optimize pragma for functions defined after the pragma. The GCC push_options pragma and the GCC pop_options pragma allow you to temporarily save and restore the options used. The GCC reset_options pragma restores the options to what was specified on the command line.
- Uninitialized warnings do not require enabling optimization anymore, that is, -Wuninitialized can be used together with -O0. Nonetheless, the warnings given by -Wuninitialized will probably be more accurate if optimization is enabled.
- -Wparentheses now warns about expressions such as (!x | y) and (!x & y). Using explicit parentheses, such as in ((!x) | y), silences this warning.
- -Wsequence-point now warns within if, while, do while and for conditions, and within for begin/end expressions.
- A new option -dU is available to dump definitions of preprocessor macros that are tested or expanded.
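The optimize attribute and the option pragmas from the first item of this list can be combined as in the sketch below. It assumes GCC 4.4 or later; the function names are hypothetical, and the chosen levels (O3 and O0) are arbitrary examples rather than recommendations.

```c
/* Compile this one function at -O3, whatever the command line says. */
__attribute__((optimize("O3")))
long sum_squares(const int *v, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += (long)v[i] * v[i];
    return total;
}

/* Save the current options, switch to -O0 for the functions defined
   in between, then restore the saved options. */
#pragma GCC push_options
#pragma GCC optimize ("O0")
long sum_plain(const int *v, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += v[i];
    return total;
}
#pragma GCC pop_options
```

The attribute affects a single function, while the pragma applies to every function defined after it until the matching pop_options, which makes the pragma form more convenient for a group of related functions.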
C++
- Improved experimental support for the upcoming ISO C++ standard, C++0x, including support for auto, inline namespaces, generalized initializer lists, defaulted and deleted functions, new character types, and scoped enums.
- Those errors that may be downgraded to warnings to build legacy code now mention -fpermissive when -fdiagnostics-show-option is enabled.
- -Wconversion now warns if the result of a static_cast to enumeral type is unspecified because the value is outside the range of the enumeral type.
- -Wuninitialized now warns if a non-static reference or non-static const member appears in a class without constructors.
- G++ now properly implements value-initialization, so objects with an initializer of () and an implicitly defined default constructor will be zero-initialized before the default constructor is called.
Runtime Library (libstdc++)
- Improved experimental support for the upcoming ISO C++ standard, C++0x, including:
- Support for , <condition_variable>, , <forward_list>, <initializer_list>, , , <system_error>, and .
- unique_ptr, additions, exception propagation, and support for the new character types in and .
- Existing facilities now exploit initializer lists, defaulted and deleted functions, and the newly implemented core C++0x features.
- Some standard containers are more efficient together with stateful allocators, i.e., no allocator is constructed on the fly at element construction time.
- Experimental support for non-standard pointer types in containers.
- The long standing libstdc++/30928 has been fixed for targets running glibc 2.10 or later.
- As usual, many small and larger bug fixes, in particular quite a few corner cases in .
Fortran
- GNU Fortran now employs libcpp directly instead of using cc1 as an external preprocessor. The -cpp option was added to allow manual invocation of the preprocessor without relying on filename extensions.
- The -Warray-temporaries option warns about array temporaries generated by the compiler, as an aid to optimization.
- The -fcheck-array-temporaries option has been added, printing a notification at run time, when an array temporary had to be created for a function argument. Contrary to -Warray-temporaries the warning is only printed if the array is noncontiguous.
- Improved generation of DWARF debugging symbols
- If using an intrinsic not part of the selected standard (via -std= and -fall-intrinsics) gfortran will now treat it as if this procedure were declared EXTERNAL and try to link to a user-supplied procedure. -Wintrinsics-std will warn whenever this happens. The now-useless option -Wnonstd-intrinsic was removed.
- The flag -falign-commons has been added to control the alignment of variables in COMMON blocks, which is enabled by default in line with previous GCC versions. Using -fno-align-commons one can force commons to be contiguous in memory as required by the Fortran standard; however, this slows down the memory access. The option -Walign-commons, which is enabled by default, warns when padding bytes were added for alignment. The proper solution is to sort the common objects by decreasing storage size, which avoids the alignment problems.
- Fortran 2003 support has been extended:
- Wide characters (ISO 10646, UCS-4, kind=4) and UTF-8 I/O are now supported (except internal reads from/writes to wide strings). -fbackslash now also supports \unnnn and \Unnnnnnnn to enter Unicode characters.
- Asynchronous I/O (implemented as synchronous I/O) and the decimal=, size=, sign=, pad=, blank=, and delim= specifiers are now supported in I/O statements.
- Support for Fortran 2003 structure constructors and for array constructor with typespec has been added.
- Procedure Pointers (but not yet as component in derived types and as function results) are now supported.
- Abstract types, type extension, and type-bound procedures (both PROCEDURE and GENERIC but not as operators). Note: As CLASS/polymorphic types are not implemented, type-bound procedures with PASS accept as non-standard extension TYPE arguments.
- Fortran 2008 support has been added:
- The -std=f2008 option and support for the file extensions .f2008 and .F2008 have been added.
- The g0 format descriptor is now supported.
- The Fortran 2008 mathematical intrinsics ASINH, ACOSH, ATANH, ERF, ERFC, GAMMA, LOG_GAMMA, BESSEL_*, HYPOT, and ERFC_SCALED are now available (some of them existed as GNU extensions before). Note: The hyperbolic functions do not yet support complex arguments and the three-argument version of BESSEL_*N is not available.
- The bit intrinsics LEADZ and TRAILZ have been added.
Java (GCJ)
Ada
- The Ada runtime now supports multilibs on many platforms including x86_64, SPARC and PowerPC. Their build is enabled by default.
New Targets and Target Specific Improvements
ARM
- GCC now supports optimizing for the Cortex-A9, Cortex-R4 and Cortex-R4F processors and has many other improvements to optimization for ARM processors.
- GCC now supports the VFPv3 variant with 16 double-precision registers with -mfpu=vfpv3-d16. The option -mfpu=vfp3 has been renamed to -mfpu=vfpv3.
- GCC now supports the -mfix-cortex-m3-ldrd option to work around an erratum on Cortex-M3 processors.
- GCC now supports the __sync_* atomic operations for ARM EABI GNU/Linux.
- The section anchors optimization is now enabled by default when optimizing for ARM.
- GCC now uses a new EABI-compatible profiling interface for EABI targets. This requires a function __gnu_mcount_nc, which is provided by GNU libc versions 2.8 and later.
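The __sync_* builtins mentioned in this list are not specific to ARM; the same source compiles on any target where GCC implements them. A minimal sketch of their use follows, with hypothetical function names and a single shared counter chosen only for illustration.

```c
/* Shared counter updated only through the __sync_* builtins. */
static int counter = 0;

/* Atomically add one and return the new value. */
int counter_increment(void)
{
    return __sync_add_and_fetch(&counter, 1);
}

/* Atomically replace 'expected' with 'desired'.
   Returns nonzero if the swap happened. */
int counter_try_set(int expected, int desired)
{
    return __sync_bool_compare_and_swap(&counter, expected, desired);
}

/* Read the current value with an atomic fetch-and-add of zero. */
int counter_read(void)
{
    return __sync_fetch_and_add(&counter, 0);
}
```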
AVR
- The -mno-tablejump option has been deprecated because it has the same effect as the -fno-jump-tables option.
- Added support for these new AVR devices:
- ATA6289
- ATtiny13A
- ATtiny87
- ATtiny167
- ATtiny327
- ATmega8C1
- ATmega16C1
- ATmega32C1
- ATmega8M1
- ATmega16M1
- ATmega32M1
- ATmega32U4
- ATmega16HVB
- ATmega4HVD
- ATmega8HVD
- ATmega64C1
- ATmega64M1
- ATmega16U4
- ATmega32U6
- ATmega128RFA1
- AT90PWM81
- AT90SCR100
- M3000F
- M3000S
- M3001B
IA-32/x86-64
- Support for Intel AES built-in functions and code generation is available via -maes.
- Support for Intel PCLMUL built-in function and code generation is available via -mpclmul.
- Support for Intel AVX built-in functions and code generation is available via -mavx.
- Automatically align the stack for local variables with alignment requirement.
- GCC can now utilize the SVML library for vectorizing calls to a set of C99 functions if -mveclibabi=svml is specified and you link to an SVML ABI compatible library.
- On x86-64, the ABI has been changed in the following cases to conform to the x86-64 ABI:
- Passing/returning structures with flexible array member:
struct foo
{
int i;
int flex[];
};
- Passing/returning structures with complex float member:
struct foo
{
int i;
complex float f;
};
- Passing/returning unions with long double member:
union foo
{
int x;
long double ld;
};
Code built with previous versions of GCC that uses any of these is not compatible with code built with GCC 4.4.0 or later.
- A new target attribute was added to allow programmers to change the target options like -msse2 or -march=k8 for an individual function. You can also change the target options via the GCC target pragma for functions defined after the pragma.
- GCC can now be configured with options --with-arch-32, --with-arch-64, --with-cpu-32, --with-cpu-64, --with-tune-32 and --with-tune-64 to control the default optimization separately for 32-bit and 64-bit modes.
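The target attribute described above can be sketched as follows. This example assumes an x86 or x86-64 target; the function names are hypothetical, and "sse2" is used only because it is one of the options named in the text.

```c
/* Compile only this function with SSE2 enabled, as if -msse2 had been
   given for it alone. Assumption: an x86 or x86-64 target. */
__attribute__((target("sse2")))
double dot3_sse2(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* The same computation compiled with the command-line target options. */
double dot3(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

A function compiled for an instruction set extension must only be called on processors that implement it, so callers normally need some form of run-time check. On x86-64, SSE2 is part of the baseline, which keeps this particular sketch safe to call.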
IA-32/IA64
- Support for __float128 (TFmode) IEEE quad type and corresponding TCmode IEEE complex quad type is available via the soft-fp library on IA-32/IA64 targets. This includes basic arithmetic operations (addition, subtraction, negation, multiplication and division) on __float128 real and TCmode complex values, the full set of IEEE comparisons between __float128 values, conversions to and from float, double and long double floating point types, as well as conversions to and from signed or unsigned integer, signed or unsigned long integer and signed or unsigned quad (TImode, IA64 only) integer types. Additionally, all operations generate the full set of IEEE exceptions and support the full set of IEEE rounding modes.
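The extra precision of __float128 can be seen with a small experiment. This is a minimal sketch, assuming a target on which GCC provides __float128 (for example x86 or x86-64); the function names are hypothetical.

```c
/* Compute (x + 1) - x in quad precision and convert the result back
   to double. Uses only the arithmetic and conversions listed above. */
double residual_quad(double x)
{
    __float128 q = (__float128)x;
    return (double)((q + 1) - q);
}

/* The same computation carried out in double precision. */
double residual_double(double x)
{
    double y = x + 1.0;
    return y - x;
}
```

For x = 1e30 the quad version returns 1, because the 113-bit significand holds x + 1 exactly, while the double version returns 0, because the added 1 is lost to rounding.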
M68K/ColdFire
- GCC now supports instruction scheduling for ColdFire V1, V3 and V4 processors. (Scheduling support for ColdFire V2 processors was added in GCC 4.3.)
- GCC now supports the -mxgot option to support programs requiring many GOT entries on ColdFire.
- The m68k-*-linux-gnu target now builds multilibs by default.
MIPS
- MIPS Technologies have extended the original MIPS SVR4 ABI to include support for procedure linkage tables (PLTs) and copy relocations. These extensions allow GNU/Linux executables to use a significantly more efficient code model than the one defined by the original ABI.
GCC support for this code model is available via a new command-line option, -mplt. There is also a new configure-time option, --with-mips-plt, to make -mplt the default.
The new code model requires support from the assembler, the linker, and the runtime C library. This support is available in binutils 2.19 and GLIBC 2.9.
- GCC can now generate MIPS16 code for 32-bit GNU/Linux executables and 32-bit GNU/Linux shared libraries. This feature requires GNU binutils 2.19 or above.
- Support for RMI's XLR processor is now available through the -march=xlr and -mtune=xlr options.
- 64-bit targets can now perform 128-bit multiplications inline, instead of relying on a libgcc function.
- Native GNU/Linux toolchains now support -march=native and -mtune=native, which select the host processor.
- GCC now supports the R10K, R12K, R14K and R16K processors. The canonical -march= and -mtune= names for these processors are r10000, r12000, r14000 and r16000 respectively.
- GCC can now work around the side effects of speculative execution on R10K processors. Please see the documentation of the -mr10k-cache-barrier option for details.
- Support for the MIPS64 Release 2 instruction set has been added. The option -march=mips64r2 enables generation of these instructions.
- GCC now supports Cavium Networks' Octeon processor. This support is available through the -march=octeon and -mtune=octeon options.
- GCC now supports STMicroelectronics' Loongson 2E/2F processors. The canonical -march= and -mtune= names for these processors are loongson2e and loongson2f.
picochip
Picochip is a 16-bit processor. A typical picoChip contains over 250 small cores, each with small amounts of memory. There are three processor variants (STAN, MEM and CTRL) with different instruction sets and memory configurations and they can be chosen using the -mae option.
This port is intended to be a "C" only port.
Power Architecture and PowerPC
- GCC now supports the e300c2, e300c3 and e500mc processors.
- GCC now supports Xilinx processors with a single-precision FPU.
- Decimal floating point is now supported for e500 processors.
S/390, zSeries and System z9/z10
- Support for the IBM System z10 EC/BC processor has been added. When using the -march=z10 option, the compiler will generate code making use of instructions provided by the General-Instruction-Extension Facility and the Execute-Extension Facility.
VxWorks
- GCC now supports the thread-local storage mechanism used on VxWorks.
Xtensa
- GCC now supports thread-local storage (TLS) for Xtensa processor configurations that include the Thread Pointer option. TLS also requires support from the assembler and linker; this support is provided in the GNU binutils beginning with version 2.19.
Documentation improvements
Other significant improvements
GCC 4.4.1
This is the list of problem reports (PRs) from GCC's bug tracking system that are known to be fixed in the 4.4.1 release. This list might not be complete (that is, it is possible that some PRs that have been fixed are not listed here).
GCC 4.4.2
This is the list of problem reports (PRs) from GCC's bug tracking system that are known to be fixed in the 4.4.2 release. This list might not be complete (that is, it is possible that some PRs that have been fixed are not listed here).
GCC 4.4.3
This is the list of problem reports (PRs) from GCC's bug tracking system that are known to be fixed in the 4.4.3 release. This list might not be complete (that is, it is possible that some PRs that have been fixed are not listed here).
GCC 4.4.4
This is the list of problem reports (PRs) from GCC's bug tracking system that are known to be fixed in the 4.4.4 release. This list might not be complete (that is, it is possible that some PRs that have been fixed are not listed here).
GCC 4.4.5
This is the list of problem reports (PRs) from GCC's bug tracking system that are known to be fixed in the 4.4.5 release. This list might not be complete (that is, it is possible that some PRs that have been fixed are not listed here).
GCC 4.4.6
This is the list of problem reports (PRs) from GCC's bug tracking system that are known to be fixed in the 4.4.6 release. This list might not be complete (that is, it is possible that some PRs that have been fixed are not listed here).
GCC 4.4.7
This is the list of problem reports (PRs) from GCC's bug tracking system that are known to be fixed in the 4.4.7 release. This list might not be complete (that is, it is possible that some PRs that have been fixed are not listed here).