Microchip Technology SW006023-2N Data Sheet

MPLAB XC32 C/C++ Compiler User’s Guide

DS51686E-page 96

© 2012 Microchip Technology Inc.

6.5 FLOATING-POINT DATA TYPES

The compiler uses the IEEE-754 floating-point format. Details of the
implementation limits are available to a translation unit in float.h.
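
As a minimal sketch of how a translation unit might read those limits, the following C fragment prints a few of the standard float.h macros. The function name print_float_limits is hypothetical and chosen only for illustration; the macros themselves are defined by the C standard, and the values they report depend on the target.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: print a few implementation limits from <float.h>.
   Returns the number of lines written. */
int print_float_limits(void)
{
    int lines = 0;

    /* The MANT_DIG macros count significand digits in base FLT_RADIX
       (bits when the radix is 2), including the implied leading bit. */
    printf("float:       %d significand bits, max %e\n",
           FLT_MANT_DIG, (double)FLT_MAX);
    lines++;
    printf("double:      %d significand bits, max %e\n",
           DBL_MANT_DIG, DBL_MAX);
    lines++;
    printf("long double: %d significand bits, max %Le\n",
           LDBL_MANT_DIG, LDBL_MAX);
    lines++;

    return lines;
}
```

Calling print_float_limits() from main is enough to see the limits for whichever compiler built the file.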

Variables may be declared using the float, double and long double keywords
to hold values of these types. Floating-point types are always signed, and
the unsigned keyword is illegal when specifying a floating-point type. All floating-point
values are represented in little-endian format, with the Least Significant Byte at the
lowest address.
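
The byte ordering can be inspected with a short sketch such as the one below. The helper names float_from_bits and float_bytes are hypothetical, and the code assumes that float occupies 4 bytes. On a little-endian target, the pattern 7DA6B69Bh would be expected to appear in memory as 9B B6 A6 7D, lowest address first.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a float from a raw 32-bit pattern.
   memcpy is used so that no aliasing rules are broken.
   Assumes sizeof(float) == 4. */
float float_from_bits(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Hypothetical helper: copy the in-memory bytes of a float into out[],
   with out[0] holding the byte at the lowest address. */
void float_bytes(float value, unsigned char out[4])
{
    memcpy(out, &value, 4);
}
```

Printing out[0] through out[3] in hexadecimal shows the order in which the target stores the bytes.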

This format is described in Table 6-1, where:

• Sign is the sign bit, which indicates whether the number is positive or negative.

• For 32-bit floating-point values, the exponent is 8 bits and is stored as excess 127 (i.e., an exponent of 0 is stored as 127).

• For 64-bit floating-point values, the exponent is 11 bits and is stored as excess 1023 (i.e., an exponent of 0 is stored as 1023).

• Mantissa is the fractional part of the number, which is to the right of the radix point. There is an implied bit to the left of the radix point which is always 1 except for a zero value, where the implied bit is zero. A zero value is indicated by a zero exponent.
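
The field layout described in the list above can be sketched in C for the 32-bit case. The struct and function names here are hypothetical; the shift and mask values follow from a 1-bit sign, an 8-bit exponent and a 23-bit mantissa.

```c
#include <stdint.h>

/* Hypothetical container for the three fields of a 32-bit value. */
struct float32_fields {
    unsigned sign;      /* 1 bit: 0 = positive, 1 = negative */
    unsigned exponent;  /* 8 bits, stored as excess 127 */
    uint32_t mantissa;  /* 23 bits to the right of the radix point */
};

/* Split a raw 32-bit pattern into sign, biased exponent and mantissa. */
struct float32_fields split_float32(uint32_t bits)
{
    struct float32_fields f;
    f.sign     = (bits >> 31) & 0x1u;   /* bit 31 */
    f.exponent = (bits >> 23) & 0xFFu;  /* bits 30..23 */
    f.mantissa = bits & 0x7FFFFFu;      /* bits 22..0 */
    return f;
}
```

For the pattern 7DA6B69Bh used in Table 6-1, this yields a sign of 0 and a biased exponent of 251.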

The value of this number for 32-bit floating-point values is:

$$(-1)^{\text{sign}} \times 2^{(\text{exponent} - 127)} \times 1.\text{mantissa}$$

and for 64-bit values:

$$(-1)^{\text{sign}} \times 2^{(\text{exponent} - 1023)} \times 1.\text{mantissa}$$
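
A minimal sketch of the 32-bit formula, assuming the three fields have already been separated, might look like this. The names pow2_int and float32_value are hypothetical. The sketch follows the text in treating a zero exponent as a zero value, and it does not handle the special all-ones exponent (251 in the example is an ordinary exponent, 255 is not).

```c
#include <stdint.h>

/* 2 raised to an integer power by repeated scaling, so that the
   sketch does not depend on the math library. */
static double pow2_int(int e)
{
    double r = 1.0;
    while (e > 0) { r *= 2.0; e--; }
    while (e < 0) { r /= 2.0; e++; }
    return r;
}

/* Evaluate (-1)^sign x 2^(exponent-127) x 1.mantissa for a 32-bit value.
   Assumption: a biased exponent of 255 is not passed in. */
double float32_value(unsigned sign, unsigned biased_exp, uint32_t mantissa)
{
    if (biased_exp == 0)
        return 0.0;  /* a zero exponent indicates a zero value */

    /* 1.mantissa: implied 1 plus the 23 mantissa bits scaled by 2^-23 */
    double significand = 1.0 + (double)mantissa / 8388608.0;
    double magnitude = pow2_int((int)biased_exp - 127) * significand;

    return sign ? -magnitude : magnitude;
}
```

The 64-bit case would follow the same pattern with a bias of 1023 and a 52-bit mantissa.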

Table 6-1 shows an example of the IEEE 754 32-bit format. Note that the
Most Significant bit of the mantissa column (i.e., the bit to the left of the radix point) is
the implied bit, which is assumed to be 1 unless the exponent is zero (in which case
the float is zero).

The example in Table 6-1 can be calculated manually as follows.

The sign bit is zero; the biased exponent is 251, so the exponent is 251 - 127 = 124. Take
the binary number to the right of the radix point in the mantissa. Convert this to decimal
and divide it by $2^{23}$, where 23 is the number of bits taken up by the mantissa, to
give 0.302447676659. Add 1 to this fraction. The floating-point number is then given
by:

$$(-1)^{0} \times 2^{124} \times 1.302447676659$$

which becomes:

2.126764793256e+37 × 1.302447676659

which is approximately equal to:

2.77000e+37
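
The manual calculation above can be cross-checked with a short C sketch that records each intermediate result. The struct and function names are hypothetical, and the code assumes a non-negative exponent and a zero sign bit, as in this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record of the intermediate results of the calculation. */
struct worked_example {
    int    exponent;  /* biased exponent minus 127 */
    double fraction;  /* mantissa bits divided by 2^23 */
    double value;     /* 2^exponent x (1 + fraction) */
};

/* Repeat the manual steps for a raw 32-bit pattern.
   Assumptions: sign bit is 0 and the unbiased exponent is >= 0. */
struct worked_example work_example(uint32_t bits)
{
    struct worked_example w;
    w.exponent = (int)((bits >> 23) & 0xFFu) - 127;
    w.fraction = (double)(bits & 0x7FFFFFu) / 8388608.0;  /* 2^23 */
    w.value = 1.0 + w.fraction;
    for (int i = 0; i < w.exponent; i++)
        w.value *= 2.0;  /* multiply by 2^exponent one step at a time */
    return w;
}

/* Print the three intermediate results in the notation used above. */
void print_worked_example(uint32_t bits)
{
    struct worked_example w = work_example(bits);
    printf("exponent = %d\n", w.exponent);
    printf("fraction = %.12f\n", w.fraction);
    printf("value    = %.5e\n", w.value);
}
```

Calling print_worked_example(0x7DA6B69Bu) reproduces the exponent of 124 and a value that rounds to 2.77000e+37.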

Type           Bits
float
double
long double

TABLE 6-1: FLOATING-POINT FORMAT EXAMPLE IEEE 754

Format   Number      Biased exponent   1.mantissa                    Decimal
32-bit   7DA6B69Bh   11111011b         1.01001101011011010011011b    2.77000e+37
                     (251)             (1.302447676659)              —