<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 2022 Richard Stallman and Free Software Foundation, Inc.

(The work of Trevis Rothwell and Nelson Beebe has been assigned or
licensed to the FSF.)

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "GNU General Public License," with the
Front-Cover Texts being "A GNU Manual," and with the Back-Cover
Texts as in (a) below. A copy of the license is included in the
section entitled "GNU Free Documentation License."

(a) The FSF's Back-Cover Text is: "You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom." -->
<!-- Created by GNU Texinfo 6.7, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Fused Multiply-Add (GNU C Language Manual)</title>

<meta name="description" content="Fused Multiply-Add (GNU C Language Manual)">
<meta name="keywords" content="Fused Multiply-Add (GNU C Language Manual)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html" rel="start" title="Top">
<link href="Symbol-Index.html" rel="index" title="Symbol Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Floating-Point-in-Depth.html" rel="up" title="Floating Point in Depth">
<link href="Error-Recovery.html" rel="next" title="Error Recovery">
<link href="Significance-Loss.html" rel="prev" title="Significance Loss">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en">
<span id="Fused-Multiply_002dAdd"></span><div class="header">
<p>
Next: <a href="Error-Recovery.html" accesskey="n" rel="next">Error Recovery</a>, Previous: <a href="Significance-Loss.html" accesskey="p" rel="prev">Significance Loss</a>, Up: <a href="Floating-Point-in-Depth.html" accesskey="u" rel="up">Floating Point in Depth</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Symbol-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="Fused-Multiply_002dAdd-1"></span><h3 class="section">28.10 Fused Multiply-Add</h3>
<span id="index-fused-multiply_002dadd-in-floating_002dpoint-computations"></span>
<span id="index-floating_002dpoint-fused-multiply_002dadd"></span>

<p>In 1990, when IBM introduced the POWER architecture, the CPU
provided a previously unknown instruction, the <em>fused
multiply-add</em> (FMA). It computes the value <code>x * y + z</code> with
an <strong>exact</strong> double-length product, followed by an addition with a
<em>single</em> rounding. Numerical computation often needs pairs of
multiply and add operations, for which the FMA is well-suited.
</p>
<p>On the POWER architecture, there are two dedicated registers that
hold permanent values of <code>0.0</code> and <code>1.0</code>, and the
normal <em>multiply</em> and <em>add</em> instructions are just
wrappers around the FMA that compute <code>x * y + 0.0</code> and
<code>x * 1.0 + z</code>, respectively.
</p>
<p>In the early days, it appeared that the main benefit of the FMA
was getting two floating-point operations for the price of one,
almost doubling the performance of some algorithms. However,
numerical analysts have since shown numerous uses of the FMA for
significantly enhancing accuracy. We discuss one of the most
important ones in the next section.
</p>
<p>A few other architectures have since included the FMA, and most
provide variants for the related operations <code>x * y - z</code>
(FMS), <code>-x * y + z</code> (FNMA), and <code>-x * y - z</code> (FNMS).
</p>
<p>The functions <code>fmaf</code>, <code>fma</code>, and <code>fmal</code> implement fused
multiply-add for the <code>float</code>, <code>double</code>, and <code>long
double</code> data types. Correct implementation of the FMA in software is
difficult, and some systems that appear to provide those functions do
not satisfy the single-rounding requirement. That situation should
change as more programmers use the FMA operation, and more CPUs
provide FMA in hardware.
</p>
<p>Use the <samp>-ffp-contract=fast</samp> option to allow the compiler to
contract expressions such as <code>x * y + z</code> into FMA instructions
where the target provides them, or <samp>-ffp-contract=off</samp> to
disallow it.
</p>
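<p>As an illustration, the hypothetical function below contains an
expression that is a candidate for contraction. The file name in the
comment is made up, and whether an FMA instruction is actually emitted
depends on the target and the optimization level.
</p>

```c
/* Hypothetical function: the source asks for a multiply and an add.
   Whether they become one FMA instruction is up to the compiler. */
double mul_add(double x, double y, double z)
{
    return x * y + z;
}

/* Possible invocations (mul_add.c is an illustrative file name):

     gcc -O2 -ffp-contract=fast -c mul_add.c
         may emit an FMA on targets that have one

     gcc -O2 -ffp-contract=off -c mul_add.c
         keeps two separately rounded operations  */
```

<p>When a program needs exactly one rounding regardless of these
options, it can call <code>fma</code> explicitly instead of relying on
contraction.
</p>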
</body>
</html>