
header

<cfloat> (float.h)

Characteristics of floating-point types

This header describes the characteristics of floating types for the specific system and compiler implementation used.

A floating-point number is composed of four elements:

a sign: either negative or non-negative
a base (or radix): the number of distinct values that a single digit can represent (2 for binary, 10 for decimal, 16 for hexadecimal, and so on)
a significand (or mantissa): which is a series of digits of the aforementioned base. The number of digits in this series is what is known as precision.
an exponent (also known as characteristic, or scale): which represents the offset of the significand, affecting the value in the following way:
value of floating-point = significand × base^exponent, with its corresponding sign.

Macro constants

The following table lists the macros defined in this header together with the minimum or maximum value required of every implementation (each implementation may define a value greater or smaller than that limit, as indicated):

When a group of macros exists prefixed by FLT_, DBL_ and LDBL_, the one beginning with FLT_ applies to the float type, the one with DBL_ to double and the one with LDBL_ to long double.

name	value	stands for	expresses
FLT_RADIX	2 or greater	RADIX	Base for all floating-point types (float, double and long double).
FLT_MANT_DIG DBL_MANT_DIG LDBL_MANT_DIG		MANTissa DIGits	Precision of significand, i.e. the number of base-FLT_RADIX digits that make up the significand.
FLT_DIG DBL_DIG LDBL_DIG	6 or greater 10 or greater 10 or greater	DIGits	Number of decimal digits that can be rounded into a floating-point value and back without any change to those digits.
FLT_MIN_EXP DBL_MIN_EXP LDBL_MIN_EXP		MINimum EXPonent	Minimum negative integer value for the exponent that generates a normalized floating-point number.
FLT_MIN_10_EXP DBL_MIN_10_EXP LDBL_MIN_10_EXP	-37 or smaller -37 or smaller -37 or smaller	MINimum base-10 EXPonent	Minimum negative integer value for the exponent of a base-10 expression that would generate a normalized floating-point number.
FLT_MAX_EXP DBL_MAX_EXP LDBL_MAX_EXP		MAXimum EXPonent	Maximum integer value for the exponent that generates a normalized floating-point number.
FLT_MAX_10_EXP DBL_MAX_10_EXP LDBL_MAX_10_EXP	37 or greater 37 or greater 37 or greater	MAXimum base-10 EXPonent	Maximum integer value for the exponent of a base-10 expression that would generate a normalized floating-point number.
FLT_MAX DBL_MAX LDBL_MAX	1E+37 or greater 1E+37 or greater 1E+37 or greater	MAXimum	Maximum finite representable floating-point number.
FLT_EPSILON DBL_EPSILON LDBL_EPSILON	1E-5 or smaller 1E-9 or smaller 1E-9 or smaller	EPSILON	Difference between 1 and the least value greater than 1 that is representable.
FLT_MIN DBL_MIN LDBL_MIN	1E-37 or smaller 1E-37 or smaller 1E-37 or smaller	MINimum	Minimum normalized positive floating-point number.
FLT_ROUNDS		ROUND	Rounding behavior. Possible values: -1 undetermined; 0 toward zero; 1 to nearest; 2 toward positive infinity; 3 toward negative infinity. Applies to all floating-point types (float, double and long double).
FLT_EVAL_METHOD		EVALuation METHOD	Properties of the evaluation format. Possible values: -1 undetermined; 0 evaluate just to the range and precision of the type; 1 evaluate float and double as double, and long double as long double; 2 evaluate all as long double. Other negative values indicate an implementation-defined behavior. Applies to all floating-point types (float, double and long double).
DECIMAL_DIG	10 or greater	DECIMAL DIGits	Number of decimal digits needed so that a value of the widest supported floating-point type can be converted to decimal and back again without any change in value.

Compatibility

FLT_EVAL_METHOD and DECIMAL_DIG are defined by libraries complying with the C standard of 1999 (C99) or later. In C++, they are available only since the 2011 standard (C++11).
