Arithmetic algorithms for extended precision using floating-point expansions

Mioara Joldes; Olivier Marty; Jean-Michel Muller; Valentina Popescu

doi:10.1109/TC.2015.2441714

Journal article, IEEE Transactions on Computers, Year: 2016

Mioara Joldes

Role: Author
PersonId : 735561
IdHAL : mioara-joldes
IdRef : 156990415

Équipe Méthodes et Algorithmes en Commande

Olivier Marty

Role: Author

École normale supérieure - Cachan

Jean-Michel Muller

Role: Author
PersonId : 4171
IdHAL : jean-michel-muller
ORCID : 0000-0003-3588-0047
IdRef : 029762871

Arithmetic and Computing

Centre National de la Recherche Scientifique

Valentina Popescu

Role: Author

École normale supérieure de Lyon

Arithmetic and Computing

Abstract

Many numerical problems require higher computing precision than that offered by standard floating-point (FP) formats. One common way of extending the precision is to represent numbers in a multiple-component format. With so-called floating-point expansions, real numbers are represented as the unevaluated sum of standard machine-precision FP numbers. This representation has the advantage of relying only on directly available, hardware-implemented and highly optimized FP operations, and it is used by multiple-precision libraries such as Bailey's QD or its analogue tuned for Graphics Processing Units (GPUs), GQD. In this article we revisit algorithms for adding and multiplying FP expansions, then we introduce and prove new algorithms for normalizing, dividing and taking square roots of FP expansions. The new method for computing the reciprocal and the square root of an FP expansion is based on an adapted Newton-Raphson iteration in which the intermediate calculations are done using "truncated" operations (additions, multiplications) involving FP expansions. We give a thorough error analysis showing that the method allows very accurate computations. More precisely, after $q$ iterations, the computed FP expansion $x = x_0 + \ldots + x_{2^q-1}$ satisfies, for the reciprocal algorithm, the relative error bound $|(x - 1/a) \cdot a| \le 2^{-2^q(p-3)-1}$ and, for the square root algorithm, $|x - 1/\sqrt{a}| \le 2^{-2^q(p-3)-1}/\sqrt{a}$, where $p > 2$ is the precision of the FP representation used ($p = 24$ for single precision and $p = 53$ for double precision).
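The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the paper works with truncated operations on FP expansions, whereas this sketch uses exact rational arithmetic (`fractions.Fraction`) only to show why a Newton-Raphson step for the reciprocal squares the relative error. The helper names (`two_sum`, `expansion_value`, `newton_reciprocal_step`, `bound_exponent`) are chosen here for illustration. `two_sum` is Knuth's classical error-free addition, a standard building block for expansions, assuming binary round-to-nearest arithmetic and no overflow.

```python
from fractions import Fraction


def two_sum(a: float, b: float) -> tuple[float, float]:
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly.

    Assumes round-to-nearest binary floating point and no overflow.
    """
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def expansion_value(terms: list[float]) -> Fraction:
    """Exact value of an FP expansion, i.e. the unevaluated sum of its terms."""
    return sum((Fraction(t) for t in terms), Fraction(0))


def newton_reciprocal_step(x: Fraction, a: Fraction) -> Fraction:
    """One Newton-Raphson step x <- x * (2 - a * x) for 1/a, in exact arithmetic.

    If eps = 1 - a * x is the relative error before the step,
    the relative error after the step is exactly eps ** 2.
    """
    return x * (2 - a * x)


def bound_exponent(p: int, q: int) -> int:
    """Exponent e such that the bound quoted in the abstract equals 2 ** e."""
    return -(2**q * (p - 3)) - 1


if __name__ == "__main__":
    # A two-term expansion keeps the low-order part that a single double drops.
    s, e = two_sum(1.0, 2.0**-60)
    print(expansion_value([s, e]) == Fraction(1) + Fraction(2) ** -60)  # True

    # Quadratic convergence: the relative error is squared at every step.
    a, x = Fraction(3), Fraction(1, 4)
    for _ in range(3):
        x = newton_reciprocal_step(x, a)
        print(float(abs(1 - a * x)))

    # Double precision (p = 53), q = 2 iterations: bound is 2 ** -201.
    print(bound_exponent(53, 2))  # -201
```

Doubling the number of correct bits at each step is what makes it natural to double the number of expansion terms per iteration, which matches the $2^q$ terms $x_0, \ldots, x_{2^q-1}$ mentioned in the abstract.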

Keywords

floating-point expansions; floating-point arithmetic; reciprocal; multiple-precision arithmetic; division; high precision arithmetic; Newton-Raphson iteration; square root

Domains

Computer arithmetic

Main file

newton-raph.pdf (1.85 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01111551

Submitted on: Tuesday, 2 June 2015, 15:13:51

Last modified on: Monday, 20 November 2023, 11:44:22

Long-term archiving on: Tuesday, 25 April 2017, 00:03:26

Dates and versions

hal-01111551 , version 1 (30-01-2015)

hal-01111551 , version 2 (02-06-2015)

Identifiers

HAL Id : hal-01111551 , version 2
DOI : 10.1109/TC.2015.2441714

Cite

Mioara Joldes, Olivier Marty, Jean-Michel Muller, Valentina Popescu. Arithmetic algorithms for extended precision using floating-point expansions. IEEE Transactions on Computers, 2016, 65 (4), pp. 1197-1210. ⟨10.1109/TC.2015.2441714⟩. ⟨hal-01111551v2⟩

Collections

ENS-LYON UNIV-TLSE2 CNRS INRIA UNIV-LYON1 ENS-CACHAN INSA-TOULOUSE LAAS LAAS-MAC UT1-CAPITOLE LAAS-DECISION-ET-OPTIMISATION INRIA2 UNIV-PARIS-SACLAY INSA-GROUPE UDL ANR ENS-PARIS-SACLAY TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP

961 views

915 downloads
