source: public/doc/gnu-c/Fused-Multiply_002dAdd.html@ 02598c2

Last change on this file since 02598c2 was 02598c2, checked in by Mikhail Kirillov <w96k@…>, on Oct 6, 2022 at 12:36:29 PM

Add gnu-c

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 2022 Richard Stallman and Free Software Foundation, Inc.

(The work of Trevis Rothwell and Nelson Beebe has been assigned or
licensed to the FSF.)

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "GNU General Public License," with the
Front-Cover Texts being "A GNU Manual," and with the Back-Cover
Texts as in (a) below. A copy of the license is included in the
section entitled "GNU Free Documentation License."

(a) The FSF's Back-Cover Text is: "You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom." -->
<!-- Created by GNU Texinfo 6.7, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Fused Multiply-Add (GNU C Language Manual)</title>

<meta name="description" content="Fused Multiply-Add (GNU C Language Manual)">
<meta name="keywords" content="Fused Multiply-Add (GNU C Language Manual)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html" rel="start" title="Top">
<link href="Symbol-Index.html" rel="index" title="Symbol Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Floating-Point-in-Depth.html" rel="up" title="Floating Point in Depth">
<link href="Error-Recovery.html" rel="next" title="Error Recovery">
<link href="Significance-Loss.html" rel="prev" title="Significance Loss">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en">
<span id="Fused-Multiply_002dAdd"></span><div class="header">
<p>
Next: <a href="Error-Recovery.html" accesskey="n" rel="next">Error Recovery</a>, Previous: <a href="Significance-Loss.html" accesskey="p" rel="prev">Significance Loss</a>, Up: <a href="Floating-Point-in-Depth.html" accesskey="u" rel="up">Floating Point in Depth</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Symbol-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="Fused-Multiply_002dAdd-1"></span><h3 class="section">28.10 Fused Multiply-Add</h3>
<span id="index-fused-multiply_002dadd-in-floating_002dpoint-computations"></span>
<span id="index-floating_002dpoint-fused-multiply_002dadd"></span>

<p>In 1990, when IBM introduced the POWER architecture, the CPU
provided a previously unknown instruction, the <em>fused
multiply-add</em> (FMA). It computes the value <code>x * y + z</code> with
an <strong>exact</strong> double-length product, followed by an addition with a
<em>single</em> rounding. Numerical computation often needs pairs of
multiply and add operations, for which the FMA is well suited.
</p>
<p>On the POWER architecture, there are two dedicated registers that
hold permanent values of <code>0.0</code> and <code>1.0</code>, and the
normal <em>multiply</em> and <em>add</em> instructions are just
wrappers around the FMA that compute <code>x * y + 0.0</code> and
<code>x * 1.0 + z</code>, respectively.
</p>
<p>In the early days, it appeared that the main benefit of the FMA
was getting two floating-point operations for the price of one,
almost doubling the performance of some algorithms. However,
numerical analysts have since shown numerous uses of the FMA for
significantly enhancing accuracy. We discuss one of the most
important ones in the next section.
</p>
<p>Other architectures have since included the FMA, and the 2008
revision of the IEEE 754 standard specifies it as a standard
operation. Most architectures that have it also provide variants for
the related operations <code>x * y - z</code>
(FMS), <code>-x * y + z</code> (FNMA), and <code>-x * y - z</code> (FNMS).
</p>
<p>The functions <code>fmaf</code>, <code>fma</code>, and <code>fmal</code>, declared in
<code>&lt;math.h&gt;</code>, implement fused
multiply-add for the <code>float</code>, <code>double</code>, and <code>long
double</code> data types. Correct implementation of the FMA in software is
difficult, and some systems that appear to provide those functions do
not satisfy the single-rounding requirement. That situation should
change as more programmers use the FMA operation and more CPUs
provide FMA in hardware.
</p>
<p>Use the <samp>-ffp-contract=fast</samp> option to allow the compiler to
generate FMA instructions on targets that support them, or
<samp>-ffp-contract=off</samp> to disallow it.
</p>
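<p>As an illustration, the function below contains the kind of
expression that contraction applies to. The function name
<code>axpy</code> and the file name in the comments are hypothetical.
Whether an FMA instruction actually appears in the output depends on
the compiler version, the optimization level, and the target.
</p>

```c
/* A multiply followed by an add in one expression is a candidate
   for contraction into a single FMA instruction.

   Assuming this is saved as axpy.c (a made-up name):

     gcc -O2 -ffp-contract=fast -S axpy.c    contraction allowed
     gcc -O2 -ffp-contract=off  -S axpy.c    contraction disallowed

   On x86-64, the compiler must also be told that the target has
   FMA instructions, for example with -mfma.  */
double
axpy (double a, double x, double y)
{
  return a * x + y;
}
```

<p>Code that needs the fused result regardless of compiler options can
call <code>fma</code> explicitly instead of relying on contraction.
</p>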

<hr>
<div class="header">
<p>
Next: <a href="Error-Recovery.html" accesskey="n" rel="next">Error Recovery</a>, Previous: <a href="Significance-Loss.html" accesskey="p" rel="prev">Significance Loss</a>, Up: <a href="Floating-Point-in-Depth.html" accesskey="u" rel="up">Floating Point in Depth</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Symbol-Index.html" title="Index" rel="index">Index</a>]</p>
</div>



</body>
</html>