avr-gcc - GCC Wiki

Contents

  1. ABI
    1. Type Layout
      1. Deviations from the Standard
      2. Fixed-Point Support
    2. Register Layout
      1. Fixed Registers
      2. Call-Used Registers
      3. Call-Saved Registers
    3. Frame Layout
    4. Calling Convention
      1. Exceptions to the Calling Convention
    5. Reduced Tiny
  2. Extensions
    1. Types
    2. Attributes
    3. Pragmas
    4. Address Spaces
    5. Inline Assembly
      1. Constraint Modifiers
      2. Constraints
      3. Print Operand Modifiers
      4. Special Sequences
      5. Assembly Operand Modifiers
  3. Using avr-gcc
    1. Locating .rodata in Flash for AVR64* and AVR128* Devices
    2. Supporting "unsupported" Devices
      1. avr-gcc v5 and newer
      2. avr-gcc v4.9 and below
  4. Libf7
    1. Implementation
    2. Known Problems
    3. Using 64-bit long double without proper AVR-LibC
    4. Shortcomings
    5. Other Implementations

ABI

Application Binary Interface and implementation defined behaviour of avr-gcc. Object format bits are not discussed here. See also C Implementation-defined behaviour.

Type Layout

Endianness: Little

default sizeof Note
char 1 signed
short 2
int 2
long 4
long long 8
size_t 2 unsigned int
ptrdiff_t 2 int
void* 2
float 4
double 4,8 depends on configuration and command line options
long double 8,4 depends on configuration and command line options
wchar_t 2

Deviations from the Standard

double

long double

In avr-gcc up to v9, double and long double are only 32 bits wide and implemented in the same way as float.

In avr-gcc v10 and higher, the layout of double and long double is determined by the configure options --with-double= and --with-long-double=, respectively. The default layout of double is like float, and the default layout of long double is a 64-bit IEEE format; see GCC configure options for details. Depending on the configuration, the command line options -mdouble=32 and -mdouble=64 are available so that the type layout of double can be chosen at compile time, and similarly -mlong-double=32 and -mlong-double=64 for long double. To test in a program which type layout has been chosen, the GCC built-in macros __SIZEOF_DOUBLE__ and __SIZEOF_LONG_DOUBLE__ can be used.
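Because the width of double and long double depends on configuration and command line options, code that relies on a particular layout may want to check it at compile time. The following is a minimal sketch using the two built-in macros; the function names double_bits and long_double_bits are made up for this illustration.

```c
#include <limits.h>

/* Refuse to compile if double is neither the 32-bit nor the 64-bit layout. */
#if __SIZEOF_DOUBLE__ != 4 && __SIZEOF_DOUBLE__ != 8
#error "Unexpected layout of double"
#endif

/* Width of double in bits, as selected by -mdouble= or the configure default. */
static inline int double_bits (void)
{
    return CHAR_BIT * __SIZEOF_DOUBLE__;
}

/* Width of long double in bits, as selected by -mlong-double= or the default. */
static inline int long_double_bits (void)
{
    return CHAR_BIT * __SIZEOF_LONG_DOUBLE__;
}
```

The macros are ordinary integer constants, so they can be used in #if directives as well as in expressions.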

8-bit int with -mint8

With -mint8, int is only 8 bits wide, which does not comply with the C standard. Notice that -mint8 is not a multilib option and is supported neither by AVR-LibC (except stdint.h) nor by newlib.

Fixed-Point Support

avr-gcc 4.8 and up supports fixed point arithmetic according to ISO/IEC TR 18037. The support is not complete. The type layouts are as follows:

Type sizeof unsigned signed Note
short _Fract 1 0.8 ±.7
_Fract 2 0.16 ±.15
long _Fract 4 0.32 ±.31
long long _Fract 8 0.64 ±.63 GCC extension
short _Accum 2 8.8 ±8.7
_Accum 4 16.16 ±16.15
long _Accum 8 32.32 ±32.31
long long _Accum 8 16.48 ±16.47 GCC extension

Overflow behaviour of the non-saturated arithmetic is unspecified.

Please notice that some private ports found on the web implement different layouts.
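To illustrate how to read the table: the "±.15" entry for signed _Fract means 1 sign bit and 15 fractional bits in 2 bytes, so the stored integer is the represented value scaled by 2^15. The sketch below models that layout with plain integer types so that it runs on any host. It does not use the TR 18037 types themselves, and the helper names q15_from_double and q15_to_double are hypothetical.

```c
#include <stdint.h>

/* Convert a value in [-1, 1) to the raw s.15 representation, i.e. the
   value scaled by 2^15.  The conversion truncates towards zero and does
   not saturate; the caller is assumed to pass a value in range. */
static inline int16_t q15_from_double (double x)
{
    return (int16_t) (x * 32768.0);
}

/* Convert a raw s.15 representation back to a floating-point value. */
static inline double q15_to_double (int16_t raw)
{
    return raw / 32768.0;
}
```

With this reading, 0.5 is stored as 0x4000 and the most negative value, -1.0, as 0x8000.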

Register Layout

Values that occupy more than one 8-bit register start in an even register.

Fixed Registers

Fixed Registers are registers that won't be allocated by GCC's register allocator. Registers R0 and R1 are fixed and used implicitly while printing out assembler instructions:

R0

is used as a scratch register that need not be restored after its usage. It must be saved and restored in the interrupt service routine's (ISR) prologue and epilogue. In inline assembler you can use __tmp_reg__ for the scratch register.

R1

always contains zero. During an insn the content might be destroyed, e.g. by a MUL instruction that uses R0/R1 as implicit output register. If an insn destroys R1, the insn must restore R1 to zero afterwards. This register must be saved in ISR prologues and must then be set to zero because R1 might contain values other than zero. The ISR epilogue restores the value. In inline assembler you can use __zero_reg__ for the zero register.

T

the T flag in the status register (SREG) is used in the same way as the temporary scratch register R0.

User-defined global registers by means of global register asm and / or -ffixed-n won't be saved or restored in function pro- and epilogue.
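The rules for R0 and R1 matter most when writing inline assembly that uses MUL, which writes its result to R1:R0. The following sketch shows one way to honour them: R0 may be left clobbered, whereas R1 is cleared again before the asm ends. It assumes the built-in macro __AVR_HAVE_MUL__, which avr-gcc defines for devices with a MUL instruction; the function name mul8x8 is made up, and a plain C fallback is used elsewhere.

```c
#include <stdint.h>

/* Multiply two unsigned 8-bit values to a 16-bit product. */
static inline uint16_t mul8x8 (uint8_t a, uint8_t b)
{
#if defined (__AVR_HAVE_MUL__)
    uint16_t result;
    __asm__ ("mul  %1, %2"       "\n\t"  /* product goes to R1:R0        */
             "movw %0, r0"       "\n\t"  /* copy it to the output pair   */
             "clr  __zero_reg__"         /* restore R1 to zero           */
             : "=r" (result)
             : "r" (a), "r" (b));
    return result;
#else
    /* No MUL instruction, or not an AVR target: let the compiler do it. */
    return (uint16_t) ((uint16_t) a * b);
#endif
}
```

R0 needs no clobber or restore because it is the scratch register. MOVW can be used on the output because multi-byte values start in an even register.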

Call-Used Registers

The call-used or call-clobbered general purpose registers (GPRs) are registers that might be destroyed (clobbered) by a function call.

R18–R27, R30, R31

These GPRs are call clobbered. An ordinary function may use them without restoring the contents. Interrupt service routines (ISRs) must save and restore each register they use.

R0, T-Flag

The temporary register and the T-flag in SREG are also call-clobbered, but this knowledge is not exposed explicitly to the compiler (R0 is a fixed register).

Call-Saved Registers

R2–R17, R28, R29

The remaining GPRs are call-saved, i.e. a function that uses such a register must restore its original content. This applies even if the register is used to pass a function argument.

R1

The zero-register is implicitly call-saved (implicit because R1 is a fixed register).

Frame Layout

During compilation the compiler may come up with an arbitrary number of pseudo registers which will be allocated to hard registers during register allocation.

Calling Convention

For example, suppose a function with the following prototype:

then

Exceptions to the Calling Convention

GCC comes with libgcc, a runtime support library. This library implements functions that are too complicated to be emitted inline by GCC. Which functions are used, and when, depends on the target architecture, on what instructions are available, on how expensive they are, and on the optimization level.

Functions in libgcc are implemented in C or hand-written assembly. In the latter case, some functions use a special ABI that allows better code generation by the compiler.

For example, the function that computes unsigned 8-bit quotient and remainder, __udivmodqi4, just returns the quotient and the remainder and clobbers R22 and R23. The compiler knows that the function does not destroy R30, for example, and may hold a value in R30 across the function call. This reduces the register pressure in functions that call __udivmodqi4.
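As a small illustration, plain C code like the following needs no special syntax to benefit from such a function. AVR has no divide instruction, so the compiler is expected to emit a library call for the division; whether quotient and remainder actually share a single call to __udivmodqi4 depends on the compiler version and optimization level, so treat that as an assumption. The type divmod8_t and the function divmod8 are hypothetical names.

```c
#include <stdint.h>

/* Quotient and remainder of an unsigned 8-bit division. */
typedef struct
{
    uint8_t quot;
    uint8_t rem;
} divmod8_t;

/* Compute num / den and num % den; den must not be zero. */
static inline divmod8_t divmod8 (uint8_t num, uint8_t den)
{
    divmod8_t r = { (uint8_t) (num / den), (uint8_t) (num % den) };
    return r;
}
```

Inspecting the generated assembly (for example with -S) shows which libgcc function, if any, is called for a given device and optimization level.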

Function Availability Operation Clobbers Description
__umulhisi3 4.7+ && MUL SI:22 = HI:26 * HI:18 Rtmp Multiply 2 unsigned 16-bit integers to a 32-bit result
__mulhisi3 4.7+ && MUL SI:22 = HI:26 * HI:18 Rtmp Multiply 2 signed 16-bit integers to a 32-bit result
__usmulhisi3 4.7+ && MUL SI:22 = HI:26 * HI:18 Rtmp Multiply the signed 16-bit integer in R26 with the unsigned 16-bit integer in R18 to a 32-bit result
__muluhisi3 4.7+ && MUL SI:22 = HI:26 * SI:18 Rtmp Multiply an unsigned 16-bit integer with a 32-bit integer to a 32-bit result
__mulshisi3 4.7+ && MUL SI:22 = HI:26 * SI:18 Rtmp Multiply a signed 16-bit integer with a 32-bit integer to a 32-bit result
__udivmodqi4 QI:24 = QI:24 / QI:22 QI:25 = QI:24 % QI:22 R23 Unsigned 8-bit integer quotient and remainder
__divmodqi4 QI:24 = QI:24 / QI:22 QI:25 = QI:24 % QI:22 R23, Rtmp, T Signed 8-bit integer quotient and remainder
__udivmodhi4 HI:22 = HI:24 / HI:22 HI:24 = HI:24 % HI:22 R21, R26...27 Unsigned 16-bit integer quotient and remainder
__divmodhi4 HI:22 = HI:24 / HI:22 HI:24 = HI:24 % HI:22 R21, R26...27, Rtmp, T Signed 16-bit integer quotient and remainder

The Operation column uses GCC's machine modes to describe how values in registers are interpreted.

Machine Modes Quarter, 8 bit Half, 16 bit Single, 32 bit Double, 64 bit Partial Single, 24 bit
Integer QI HI SI DI PSI
Float SF DF
Signed _Accum HA SA DA
Signed _Fract (Q-Format) QQ HQ SQ DQ
Unsigned _Accum UHA USA UDA
Unsigned _Fract (Q-Format) UQQ UHQ USQ UDQ
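Machine modes are mostly a compiler-internal notion, but GCC's mode type attribute makes the integer modes visible in C. The following sketch declares one type per integer mode from the table; the typedef names are made up, and PSI is left out because it only exists on targets that define it.

```c
/* Integer types selected by machine mode instead of by C type name. */
typedef int          qi_t  __attribute__((mode (QI)));  /* Quarter,  8 bit */
typedef int          hi_t  __attribute__((mode (HI)));  /* Half,    16 bit */
typedef int          si_t  __attribute__((mode (SI)));  /* Single,  32 bit */
typedef int          di_t  __attribute__((mode (DI)));  /* Double,  64 bit */
typedef unsigned int uhi_t __attribute__((mode (HI)));  /* unsigned 16 bit */

/* Size in bytes of the 16-bit mode, for demonstration purposes. */
static inline unsigned hi_size (void)
{
    return (unsigned) sizeof (hi_t);
}
```

Read this way, an entry like "SI:22 = HI:26 * HI:18" describes a 32-bit value starting in R22 computed from two 16-bit values starting in R26 and R18.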

Reduced Tiny

On the Reduced Tiny cores (16 GPRs only) several modifications to the ABI above apply:

There is only limited library support both from libgcc and AVR-LibC, for example there is no float support and no printf support.

Extensions

Types

Attributes

Pragmas

Address Spaces

Address spaces are supported as part of GNU-C. They are not supported for ISO C, not supported on Reduced Tiny, and are not supported for C++.

avr-gcc puts objects in __flash into section .progmem.data, and objects in __memx or __flashx into section .progmemx.data. These sections are handled in the default linker description file and need no further attention. This is different for the address spaces __flashN, where objects are put into section .progmemN.data but are not mentioned in the linker script, because there is no one-fits-all memory layout. This means you have to provide location directives for these sections. Suppose for example that an application uses address space __flash2, and therefore it has to locate the respective section somewhere in the range 0x20000-0x2ffff of flash memory. One way to achieve it is to use the following linker script augmentation:

SECTIONS
{
    .text :
    {
        . = MAX (ABSOLUTE(0x20000), ABSOLUTE(.));
        . = ALIGN(2);
        __progmem2_start = .;
        *(.progmem2.text)
        *(.progmem2.data)
        __progmem2_end = .;

        ASSERT (__progmem2_start == __progmem2_end || __progmem2_end <= ABSOLUTE(0x30000),
                ".progmem2.data exceeds 0x30000");
    }
}

INSERT AFTER .text

Store this text in a file flash2.ld and link with -T flash2.ld. This will locate in the order text-progmem2-data where text refers to "ordinary" code (startup-code, vector table, functions, progmem, jump-tables, etc.) and data refers to the data from which the startup-code initializes non-zero objects in static storage. If you want the order to be text-data-progmem2 instead, then you would use INSERT AFTER .data in the snippet above.
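On the C side, an object that ends up in .progmem2.data could be declared as sketched below. The example relies on the built-in macro __FLASH2, which avr-gcc defines when the address space __flash2 is available; if it is not defined, the qualifier is defined away so that the code still compiles and the table simply lives in ordinary memory. The names squares and square_of are hypothetical.

```c
#include <stdint.h>

#ifndef __FLASH2
/* Address space __flash2 is not available (ISO C mode, C++, a small
   device, or a different compiler): make the qualifier vanish. */
#define __flash2
#endif

/* With avr-gcc in GNU-C mode this object is put into .progmem2.data. */
static const __flash2 uint8_t squares[] = { 0, 1, 4, 9, 16, 25 };

/* Read an entry; the compiler emits the appropriate flash access. */
uint8_t square_of (uint8_t i)
{
    return squares[i];
}
```

Reads through __flash2 need no special accessor functions; the address space qualifier is part of the type, so the compiler selects the access instructions itself.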

Inline Assembly

For introductions and tutorials on inline assembly, see

Constraint Modifiers

Modifier Meaning
= An output operand, like in "=r". Without &, the operand may overlap with input operands.
& An output operand that may not overlap with any input operand, like in "=&r". Referred to as "early-clobber".
+ An output operand that is also an input operand, like in "+r".

Constraints

Constraint Register Range
a Simple upper registers that support FMUL R16...R23
b Base registers Y and Z R28...R31
d Upper registers that support LDI, ORI, etc. R16...R31
e Pointer registers X, Y, and Z R26...R31
l Lower registers, empty on Reduced Tiny R2...R15
r General purpose registers R2...R31
w Registers for ADIW and SBIW R24...R31
x X register R26...R27
y Y register R28...R29
z Z register R30...R31
Constraint Constant Range
n Compile-time constant
s Symbolic operand known at link-time Address of a function or static variable
i Immediate operand known at link-time Same as "sn"
I Unsigned 6-bit integer constant 0...63
J Negative 6-bit integer constant −63...0
M Unsigned 8-bit integer constant 0...255
E 32-bit IEEE floating-point constant
F 64-bit IEEE floating-point constant
Ynn Fixed-point or integer constant
Constraint Explanation
m Memory
X Matches anything
0...9 Matches respective operand number

Print Operand Modifiers

Modifier Number of Arguments Explanation Suitable Constraints
%a0 1 Print pointer register as address X, Y or Z, like in "LD r0, %a0+" x, y, z, b, e
%i0 1 Print compile-time RAM address as I/O address, like in "OUT %i0, r0" with argument "n"(&RAMPZ) n
%n0 1 Print the negative of a compile-time constant n
%r0 1 Print the register number of a register, like in "CLR %r0+7" for the MSB of a 64-bit register reg
%x0 1 Print a function name without gs() modifier, like in "%~CALL %x0" with argument "s"(main) s
%A0 1 Add 0 to the register number (no effect) reg
%B0 1 Add 1 to the register number reg
%C0 1 Add 2 to the register number reg
%D0 1 Add 3 to the register number reg
%T0%t1 2 Print the register that holds bit number %1 of register %0 reg + n
%T0%T1 2 Print operands suitable for BLD/BST, like in "BST %T0%T1", including the required , reg + n
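A short sketch of %A0 and %B0 in use: the function below exchanges the low and high byte of a 16-bit value, addressing the two registers of the operand individually and using __tmp_reg__ as scratch. It is guarded by the built-in macro __AVR__ and falls back to plain C on other targets; the function name swap_bytes is made up.

```c
#include <stdint.h>

/* Exchange the low byte and the high byte of a 16-bit value. */
static inline uint16_t swap_bytes (uint16_t x)
{
#ifdef __AVR__
    __asm__ ("mov __tmp_reg__, %A0" "\n\t"  /* save low byte            */
             "mov %A0, %B0"         "\n\t"  /* high byte -> low byte    */
             "mov %B0, __tmp_reg__"         /* saved low -> high byte   */
             : "+r" (x));
    return x;
#else
    return (uint16_t) ((x << 8) | (x >> 8));
#endif
}
```

The constraint "+r" marks x as both input and output, so a single operand is enough.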

Special Sequences

Sequence Meaning
%~ "" or "r": "%~call" yields "call" on devices with CALL, and "rcall" on devices without CALL
%! "" or "e": "%!icall" yields "eicall" on devices with EICALL, and "icall" on devices without EICALL
%= A number that's unique for this inline assembly snippet and the compilation unit. Used to compose unique local labels
%% Insert a %, provided the inline asm has arguments
\n Insert a line break
\t Insert a TAB
\" Insert a "
\\ Insert a \
$ Logical line separator, like in "LDI %A0,1 $ LDI %B0,2"
__zero_reg__ The register containing zero, see section Register Layout
__tmp_reg__ The scratch register, see section Register Layout

Moreover, the following I/O addresses are defined provided the device supports the respective SFR: __SREG__, __SP_L__, __SP_H__, __CCP__, __RAMPX__, __RAMPY__, __RAMPZ__, __RAMPD__.
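The %= sequence is typically used for labels in loops, so that the same asm can be inlined several times without label clashes. A minimal sketch, assuming an AVR target for the asm part and with a C fallback elsewhere; the function name count_down and the label prefix L_loop are arbitrary choices.

```c
#include <stdint.h>

/* Decrement n until it reaches zero and return the final value.
   As with DEC / BRNE, an initial value of 0 wraps and runs 256 times. */
static inline uint8_t count_down (uint8_t n)
{
#ifdef __AVR__
    __asm__ volatile ("L_loop%=:"     "\n\t"  /* label unique per asm   */
                      "dec  %0"       "\n\t"
                      "brne L_loop%="
                      : "+r" (n));
#else
    do
    {
        n--;
    } while (n != 0);
#endif
    return n;
}
```

Without %=, two inlined copies of the asm in one compilation unit would define the same label twice and the assembler would reject the output.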

Assembly Operand Modifiers

Modifier Explanation Purpose
lo8() 1st Byte of a link-time constant, bits 0...7 Getting parts of a byte-address
hi8() 2nd Byte of a link-time constant, bits 8...15
hlo8() 3rd Byte of a link-time constant, bits 16...23
hhi8() 4th Byte of a link-time constant, bits 24...31
hh8() Same as hlo8()
pm_lo8() 1st Byte of a link-time constant divided by 2, bits 1...8 Getting parts of a word-address
pm_hi8() 2nd Byte of a link-time constant divided by 2, bits 9...16
pm_hh8() 3rd Byte of a link-time constant divided by 2, bits 17...24
pm() Link-time constant divided by 2 in order to get a program memory (word) address, like in lo8(pm(main)). word-address
gs() Function address divided by 2 in order to get a (word) address, like in lo8(gs(main)). Generate stub (trampoline) as needed. This is needed when computing the address of a function on devices with more than 128KiB of program memory that's supposed to be used in EICALL. For rationale, see GCC documentation. On devices with less program memory, gs() behaves like pm(). function address for [E]ICALL

When the argument of a modifier is not computable at assembler-time, the assembler has to encode the expression in an abstract form using relocs. As a consequence, only a very limited number of argument expressions is supported when they are not computable at assembler-time.
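As an illustration of lo8() and hi8(), the sketch below loads the address of a variable in static storage into a register pair with two LDI instructions. The address is only known at link-time, so the assembler emits relocs for both bytes. The asm part assumes an AVR target; the names counter and counter_address are hypothetical.

```c
#include <stdint.h>

/* Some object in static storage whose address is a link-time constant. */
uint8_t counter;

/* Return the address of counter, computed byte by byte. */
static inline uint8_t* counter_address (void)
{
#ifdef __AVR__
    uint8_t *p;
    __asm__ ("ldi %A0, lo8(%1)" "\n\t"  /* bits 0...7 of the address   */
             "ldi %B0, hi8(%1)"         /* bits 8...15 of the address  */
             : "=d" (p)                 /* LDI needs an upper register */
             : "i" (&counter));
    return p;
#else
    return &counter;
#endif
}
```

Constraint "d" is used for the output because LDI only works on R16...R31.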

Using avr-gcc

Locating .rodata in Flash for AVR64* and AVR128* Devices

Supporting "unsupported" Devices

avr-gcc v5 and newer

In contrast to older versions of the compiler that support -mmcu= natively, avr-gcc v5+ comes with a bunch of spec files in ./lib/gcc/avr//device-specs. These files are generated when the compiler is built and are part of each distribution since then. Spec files specify substitution and transformation rules for command line options for the compiler proper and for subprograms like assembler and linker.

Adding support for a new device consists of writing a new spec file for that device and supplying it by means of

where is a directory containing a folder named device-specs which contains a file named specs-. As a blueprint, start with an already existing spec file for a device as closely related to as possible. Also read the comments in that spec file.

Just like with older versions, you have to get the device headers, which are the realm of AVR-LibC, from somewhere; the same applies to the startup code in crt.o and to the device library lib.a. If you do not need or have a device library, -nodevicelib will do, but note that some non-standard functionality like EEPROM support is then missing.

Spec files make it possible to add support for new devices without the need to change the binaries of the compiler, the assembler or the linker. Spec files may depend on the versions of GCC and Binutils, and using an incompatible spec file may lead to errors or wrong or sub-optimal code. For example, this is the case when newer tool versions support more or different options, but a spec file doesn't reflect that.

As the tools evolve, new features and command line options are added. When porting a device-specs file across one of the following features and versions, extra care must be taken:

Notice that the compiler behaves differently depending on the Binutils features it finds during configuration.

Using .atpack Device Pack Files from Atmel / Microchip

To make your life easier, Atmel / Microchip provides device-pack files at http://packs.download.atmel.com and https://packs.download.microchip.com. The files have extension .atpack but apart from that, they are just ZIP files, so you can unzip them and use them. These files contain all you need: device header and device lib, startup-code, specs-file. Suppose you unzipped the pack to a folder , then amongst others, the following folders and files are present:

|--include
|  +--avr
|     +--io*.h
+--gcc
   +--dev
      +--
         |--device-specs
         |  +--specs-
         +--
            |--lib.a
            +--crt.o

Where comes from -mmcu=, and is the multilib-path as printed by avr-gcc -mmcu= -print-multi-directory. This means we can support a device like, say, ATtiny424 by means of:

avr-gcc -mmcu=attiny424 -B /gcc/dev/attiny424 -isystem /include ...

Known Issues

avr-gcc v4.9 and below

avr-gcc and avr-as support the -mmcu= command line option to generate code for a specific device . Currently (2012), there are more than 200 known AVR devices and the hardware vendor keeps releasing new devices. If you need support for such a device and don't want to rebuild the tools, you can

  1. Sit and wait until support for your -mmcu= is added to the tools.
  2. Use appropriate command line options to compile for your favourite .

Approach 1 is comfortable but slow. Lazy developers that don't care for time-to-market will use it.

Approach 2 is preferred if you want to start development as soon as possible and don't want to wait until the tool chain with respective device support is released. This approach is only possible if the compiler and Binutils already come with support for the core architecture of your device.

When you feed code into the compiler and compile for a specific device, the compiler will only care for the respective core; it won't care for the exact device. It does not matter to the compiler how many I/O pins the device has, at what voltage it operates, how much RAM is present, how many timers or UARTs are on the silicon or in what package it is shipped. The only thing the compiler does with -mmcu= is to define a specific built-in macro and to call the linker in a specific way, i.e. the compiler driver behaves a bit differently, but the sub-tools like compiler proper and assembler will generate exactly the same code.

Thus, you can support your device by setting these options by hand.

Additionally, we need the following to compile a C program:

This header and its subheaders contain almost all information about a particular device like SFR addresses, size of the interrupt table and interrupt names, etc.

After all, it's just text and you can write it yourself. Find a device that is already supported by AVR-LibC and that is similar enough to your new device to serve as a reasonable starting point for the new device description.

If you are lucky, the device is already supported by AVR-LibC but not yet by the compiler. In that case, you can use verbatim copies from AVR-LibC.

Yet another approach is to write the file from scratch, or not to use avr/io.h-like headers at all. In that case, you provide all needed definitions like, say, SP and size of the vector table yourself.

If your toolchain is distributed with AVR-LibC then avr/io.h is located in the installation directory at ./avr/include i.e. you find a file io.h in ./avr/include/avr. In that file you find the lines:

#if defined (__AVR_AT94K__)
#  include <avr/ioat94k.h>
#elif defined (__AVR_AT43USB320__)
#  include <avr/io43u32x.h>
/* many many more entries */
#else
#  if !defined(__COMPILING_AVR_LIBC__)
#    warning "device type not defined"
#  endif
#endif

Add an entry for __AVR_mydevice__ and include your new file avr/iomydevice.h.

If you don't want to change the existing avr/io.h then copy it to a new directory and add that directory as system search path by means of -isystem whenever you compile or preprocess a C or assembler source that shall include the extended avr/io.h. Notice that the new directory will contain a subdirectory named avr.

Compiling the Code

Let's start with a simple C program, source.c:

#include <avr/io.h>

int var;

int main (void) { return var + SP; }

Your source directory then contains the following files:

The startup code gcrt1.S and macros.inc are verbatim copies from AVR-LibC.

sectionname.h is included by macros.inc but we don't need it: Simply provide sectionname.h as an empty file.

For the sake of simplicity, we show how to compile for a device that is similar to ATmega8 so that we don't need to extend avr/io.h to show the work flow. In case you copied avr/io.h to a new place, don't forget to add the respective -isystem to the first two commands for source.c and gcrt1.S.

ATmega8 is a device in core family avr4, thus we compile and assemble our source.c for that core architecture. __AVR_ATmega8__ stands for the subheader selector you added to avr/io.h.

Similarly, we assemble the startup code for our device by means of:

Finally, we link the stuff together to get a working source.elf (assuming that RAM starts at address 0x124):

Voilà!

Libf7

Libf7 is an ad-hoc, AVR-specific, 64-bit floating point emulation written in GNU-C and (inline) assembly. It is hosted and deployed as part of libgcc. Hence, it will be part of any avr-gcc distribution from v10 onwards without any further ado.

Implementation

Known Problems

The following long standing patches to AVR-LibC are needed:

Without these additions to AVR-LibC, 64-bit double cannot work correctly and you will get non-working programs. The AVR-LibC patches were integrated February 2022 and should be available in AVR-LibC v2.2 or newer. Or you can build / use AVR-LibC from git master.

Using 64-bit long double without proper AVR-LibC

Even without the mentioned AVR-LibC patches, you can use 64-bit long double arithmetic if:

#if __WITH_LIBF7_MATH_SYMBOLS__ != 1
#error Using 64-bit double requires avr-gcc v10+ and --with-libf7=math-symbols.
#endif
#if __SIZEOF_LONG_DOUBLE__ != 8
#error Only 64-bit long double is supported without the AVR-LibC patches.
#endif
#if __WITH_DOUBLE_COMPARISON__ != 2
#error Wrong configuration of long double comparison.
#endif

Shortcomings

Libf7 is incomplete:

Other Implementations