How to calculate the value of a polynomial function at an arbitrary point?

I need to calculate the value of a polynomial function of arbitrary degree at arbitrary values of its domain.

I would like to know how to do it in a way that is not naive regarding the floating point calculation, considering that all coefficients and evaluation points of the polynomial are relatively normal and well behaved (no NaN, infinity, denormalized numbers or values close to denormalization; no need to worry about validating these).

  • It cannot be assumed that the polynomial will be sparse (many coefficients equal to 0) or that it will be full (all coefficients different from 0). The solution is expected to be able to handle both situations;

    • Example of sparse polynomial: f(x)=2x^123-x^30+5.366x
    • Example of full polynomial: f(x)=x^5-4x^4+3x^3-2x^2+x-9
  • If your algorithm achieves absurdly efficient and precise behavior under some of these conditions, note that, in an adaptive algorithm, it may be worth switching to your code; that is, you do not necessarily need to use the same algorithm for sparse and full polynomials, as long as the adaptation of the algorithm is justified;

  • In the case of an adaptive algorithm, it is expected that an algorithm will be used to validate the condition, and it must be shown that validating the condition costs less than the performance gained (precision gain is difficult to measure, but I accept suggestions for metrics);

I would also like to know the expected error of the adopted strategy. For example, distinct results can be obtained by evaluating the polynomial from left to right as f(x) = x^30+x^2+3000x compared to f(x) = 3000x+x^2+x^30, even though they are mathematically identical, and the difference can worsen for very large values of x (x >> 0). How will your algorithm behave in these cases?

I would also like to know the computational performance of this algorithm, taking locality of reference into account.

Test cases

  • Sparse polynomial: f(x)=2x^123-x^30+5.366x

    • f(x=1) = 6.366
    • f(x=10) = 2e123
    • f(x=0.01) = 0.05366
  • Full polynomial: f(x)=x^5-4x^4+3x^3-2x^2+x-9

    • f(x=1) = -10
    • f(x=10) = 62,801
    • f(x=0.01) = -8.9902

If the exact value is not obtained, the error must be justified, and it should be described whether it was predictable given the behavior of the algorithm.

  • Girard relations for a polynomial of degree n.

  • @Augustovasques, but I'm not interested in roots. I only know the Girard relations as applied to the roots of polynomials.

  • I would like to know how to improve the question. Could whoever downvoted give me the honor?

  • Those who downvoted misunderstood your question. Just to clarify, the question is about the minimal polynomial: the guy has zillions of daily entries on a website and, with no processing time available to analyze this data, he puts the data into a matrix of polynomial coefficients and finds the eigenvector, which is a computable representation of the original data, but that consumes a lot of processing time.

  • @Augustovasques What? Minimal polynomial? Zillions of entries? Website? Eigenvector? Matrix of polynomial coefficients? Are we talking about the same question?

  • I just wanted to calculate f(x) = x^4 - x^2 + 1...

  • That’s what I understood.

  • Then use Girard Relations.

  • So the guy wronged you.

  • Girard's relations determine the roots of the polynomial, but what if x is not a root? He wants to evaluate the polynomial at a certain point of the curve, not check its roots. "[...] calculate the value of an arbitrary degree polynomial function at arbitrary values of the domain". Unless your suggestion is to use them to determine the roots, simplify the polynomial and, consequently, simplify the calculation at arbitrary values...

  • @Augustovasques you know that x == x + 1.0 is true for some values of x, no?

  • I guess I was rude. I suggested Girard's relations because there is an analysis theorem which says that between two consecutive real roots there will always be an inflection point, and the derivatives of the curve at these roots determine the magnitude of the inflection. Armed with this information you can approximate the curve with a Bernstein polynomial.

  • @Augustovasques Now that seems like a valid approach xD It will be interesting to implement it to compare with other possible solutions.

  • Out of curiosity, can we consider only real coefficients? xD

  • @Andersoncarloswoss yes, nothing imaginary.

  • What do you want to know? Do you want to know how to store a polynomial and use that storage structure to compute a value from an argument x? Or do you want to know how to compute from a pre-specified structure, like a function that takes as arguments an array of monomials (the polynomial) and the value of x? Or do you wonder what function I would write to calculate f(x)=2x^123-x^30+5.366x and what function I would write to calculate f(x)=x^5-4x^4+3x^3-2x^2+x-9 separately?

  • @RHERWOLF, for one specific polynomial or another the algorithm is "easy" (despite concerns about losing precision in the last decimal places). The creation of the data structure that represents the polynomial is implicit, but I did not want to pin it down, so as to allow a greater variety of answers, since some structure may take better advantage of sparse polynomials than a vector of coefficients. The assembly of the polynomial can be magical, as long as its use is explicit in the algorithm, along with the assumptions of what a well-assembled structure would be (but no need to validate the structure)

  • About receiving the array of monomials and calculating them individually: the most naive strategy would do about n^2/2 + n multiplications and n - 1 additions for a polynomial of degree n. Not to mention that the order of summation could cause loss of precision

  • I’m building an answer.


2 answers


The answer is still incomplete, but it already has content on complete polynomials.

Complete Polynomials

By complete polynomials we mean the polynomials that can be represented by

p(x) = a_n*x^n + a_(n-1)*x^(n-1) + ... + a_1*x + a_0

with every a_i other than zero; that is, no coefficient is null and therefore the polynomial has all the terms relative to the powers of x.

Mathematical Considerations

Let us take as example a complete polynomial represented by

p(x) = Σ a_i*x^i , with i running from 0 to n

where p(x) is the polynomial, a_i is the nonzero coefficient of degree i and x^i the i-th power of x. If we expand this sum, we get:

p(x) = a_n*x^n + a_(n-1)*x^(n-1) + ... + a_2*x^2 + a_1*x + a_0

We can observe that, except for the term a_0, all terms contain x, and given the distributive property we can factor x out, obtaining:

p(x) = a_0 + x*( a_1 + a_2*x + ... + a_n*x^(n-1) )

If we do this recursively with the terms within the parentheses, we get:

p(x) = a_0 + x*( a_1 + x*( a_2 + ... + x*( a_(n-1) + x*a_n ) ... ) )

Thus, to calculate the value of p(x), we can solve:

  • b_n = a_n
  • b_(n-1) = a_(n-1) + b_n*x
  • b_(n-2) = a_(n-2) + b_(n-1)*x
  • ...
  • b_1 = a_1 + b_2*x
  • b_0 = a_0 + b_1*x

Thus we obtain p(x) = b_0, reducing the evaluation of a polynomial of degree n to n evaluations of first-degree polynomials. This is Horner's method.

Performance

While calculating a point of a degree-n polynomial by the traditional method requires n(n+1)/2 multiplications and n additions, with Horner's method the number of operations is reduced to n multiplications and n additions; and with fewer operations the calculation not only tends to be faster but is also less susceptible to errors.

A possible implementation of Horner's method, calculating p(x) = x^5 - 4x^4 + 3x^3 - 2x^2 + x - 9 at x = 1:

#include <stdio.h>

int main(void) {
    float x = 1;
    /* coefficients from highest to lowest degree */
    float coefficients[] = {1, -4, 3, -2, 1, -9};
    float result = 0;

    for (int i = 0; i < sizeof(coefficients)/sizeof(float); i++) {
        result = coefficients[i] + result*x;
    }

    printf("%f\n", result);
    return 0;
}

Although we gained from reducing the number of operations needed for the calculation, there are still considerations to make.

First, there is the intrinsic problem of computing with floating point numbers. The type float of the C language is represented according to the IEEE 754 specification using 32 bits: 1 sign bit, 8 bits for the exponent and 23 bits for the mantissa, which gives us an accuracy of only about 7 significant decimal digits. Moreover, the larger the integer part of the value, the fewer digits remain available for the fractional part, so it is expected that if the value grows too much the fractional part will be lost and our result will suffer truncation.

Second, our result is built by coefficients[i] + result*x. From mathematics, we know that a + b tends to a if a is much larger than b (a >> b). For example, 10000 + 1 is approximately 10000. This implies that if result*x is much larger than coefficients[i], the value of the latter becomes negligible, again truncating the value.

For comparison, executing the code snippet above with x = 1 we get -10, which is the expected result. However, when run with x = 1000, we get 996003050160128, while the expected value is 996002998000991.

  • I think it's worth pointing out here that if 1 > |x| > 0, the tendency is for the error to be minimal. OK, it's more of a conjecture of mine, but I don't see precision being lost there because of "big numbers"


Full Polynomials

In the case of full polynomials, I believe the most advisable approach is to structure the polynomial as an array of coefficients, making each cell a component monomial of the polynomial: the index corresponds to the exponent and the element stored there to the coefficient. For example, the array 5,4,-1,3 corresponds to (5)*x⁰+(4)*x¹+(-1)*x²+(3)*x³, therefore 5+4x-x²+3x³.

P(c0,c1,c2,...,c(n-2),c(n-1),cn)(x) = c0 + c1*x + c2*x^2 + ... + c(n-2)*x^(n-2) + c(n-1)*x^(n-1) + cn*x^n

The reason is that this takes advantage of the characteristics of full polynomials in at least two ways: (1) it makes it possible to store only one coefficient per monomial and no other data, not even the exponent; and (2) it allows calculating the value of the polynomial for a given argument using only one addition and one multiplication per monomial.

P(c0,c1,c2,...,c(n-2),c(n-1),cn)(x) =

cn*x^n + c(n-1)*x^(n-1) + c(n-2)*x^(n-2) + ... + c2*x^2 + c1*x^1 + c0*x^0 =

( cn*x + c(n-1) )*x^(n-1) + c(n-2)*x^(n-2) + ... + c2*x^2 + c1*x^1 + c0*x^0 =

P(c0,c1,c2,...,c(n-2), cn*x + c(n-1) )(x)

One perceives a recursive relation, right? Once an argument is fixed for the parameter, one can, with one addition and one multiplication (cn*x + c(n-1)), restructure the polynomial so that the values of the last two monomials are merged as if they were one. This can be repeated until only one monomial value remains: a constant that is the result.

Polinômio( [c[0],c[1],...,c[n-1],c[n]] , x )
    If n<1 then:
        If n<0 then:
            Return 0
        If n=0 then:
            Return c[0]
    Return Polinômio( [c[0],c[1],...,c[n-1]+x*c[n]] , x )

To avoid recursive calls, one can implement the following algorithm instead.

Polinômio( [c[0],c[1],...,c[n-1],c[n]] , x )
    v <- 0
    For i=n,n-1,...,1,0 do:
        v <- v*x+c[i]
    return v

Note that the second algorithm performs only one addition and one multiplication per iteration of a loop that executes n+1 times, that is, once per monomial of the polynomial. Thus, the cost in number of mathematical operations is O(n). Note also that zeroed coefficients in the structure, even though they represent null monomials, are still computed.

Sparse Polynomials

In the case of sparse polynomials, I believe the same solution adopted for full polynomials would have both unnecessarily costly storage and unnecessarily reduced performance in specific cases; after all, there are many zeros that could be skipped. An alternative is to structure the polynomial as an array of nonzero monomials, where each monomial is stored with its power's exponent (e) and its coefficient (c). For example, (0,4),(9,3),(99,2),(999,1) corresponds to 4+3x^9+2x^99+x^999.

One can write an obvious algorithm that calculates the value of each term and sums them, whether the powers are computed via strategic products or via the exponential function. Alternatively, however, it is possible to simulate the second full-polynomial algorithm in a way that avoids excessive products. This makes it possible to reach the result by computing powers with exponents smaller than those of the monomials, which requires fewer products (though if the exponential function is used, depending on how it is implemented, it may not matter). Also, if each exponent is immediately associated with a means of calculating its power, this algorithm may be suitable even for full polynomials.

P( (e0,c0) , (e1,c1) , (e2,c2) , ... , (e(n-2),c(n-2)) , (e(n-1),c(n-1)) , (en,cn) )(x) =

cn*x^en + c(n-1)*x^e(n-1) + c(n-2)*x^e(n-2) + ... + c2*x^e2 + c1*x^e1 + c0*x^e0 =

( cn*x^(en-e(n-1)) + c(n-1) )*x^e(n-1) + c(n-2)*x^e(n-2) + ... + c2*x^e2 + c1*x^e1 + c0*x^e0 =

P( (e0,c0) , (e1,c1) , (e2,c2) , ... , (e(n-2),c(n-2)) , (e(n-1), cn*x^(en-e(n-1)) + c(n-1) ) )(x)

In C/C++, intrinsic functions make it possible to compute logarithms at absurd speed and to implement extremely fast exponentials by hand, such that x^24 calculated by the formula exp(24*ln(x)) can be as efficient as computing it through successive squarings and products, or even more so. In fact, the decision on how to calculate the powers is not easy. There is a universe of options.

Chosen Algorithm

It was decided to use an approach based on simulating the second algorithm aimed at full polynomials, with powers computed via products. The algorithm requires the arrays of exponents and coefficients to be ordered by increasing exponent (so that e[i]-e[i-1] is always non-negative), in the same way as the version for full polynomials (where the difference is always +1).

It is known that, mirroring its base-two representation, a positive integer can be written as a sum of powers of two (such as 33=32+1), so a power with integer exponent can be split into a product of powers of the same base whose exponents are powers of two (such as x^33=x^32*x). Having pre-calculated the powers whose exponents are powers of two, their recalculation is saved.

I will assume cheap dispatch to the power-calculation procedure via "switch-case". The algorithm takes into account that one starts knowing that x^0=1, x^1=x, x^2=x*x, x^3=x*x^2 and x^(4+n)=(x^2)*(x^2)*(x^n). It could be extended as far as you like (x^8, x^16, x^32, etc.), but the code becomes long and perhaps hard to understand, so I will restrict the known powers to x, x^2 and (x^2)^2, with the procedures as described.

Polinômio( [( e[0],c[0] ),( e[1],c[1] ),...,( e[n-1],c[n-1] ),( e[n],c[n] )] , x )
    If n<0 then:
        return 0
    v <- c[n]
    x2 <- x*x
    x4 <- x2*x2
    For i=n-1,...,1,0 do:
        d <- e[i+1]-e[i]
        switch d:
            case 0:
                v <- v+c[i]
            case 1:
                v <- v*x+c[i]
            case 2:
                v <- v*x2+c[i]
            case 3:
                v <- v*x2*x+c[i]
            otherwise:
                v <- v*x4
                d <- d-4
                go to: switch d
    d <- e[0]
    switch d:
        case 0:
            return v
        case 1:
            return v*x
        case 2:
            return v*x2
        case 3:
            return v*x2*x
        otherwise:
            v <- v*x4
            d <- d-4
            go to: switch d
    return v

To accelerate a little more, one can do as follows (handling exponent differences up to 7 directly instead of only up to 3).

Polinômio( [( e[0],c[0] ),( e[1],c[1] ),...,( e[n-1],c[n-1] ),( e[n],c[n] )] , x )
    If n<0 then:
        return 0
    v <- c[n]
    x2 <- x*x
    x4 <- x2*x2
    For i=n-1,...,1,0 do:
        d <- e[i+1]-e[i]
        switch d:
            case 0:
                v <- v+c[i]
            case 1:
                v <- v*x+c[i]
            case 2:
                v <- v*x2+c[i]
            case 3:
                v <- v*x2*x+c[i]
            case 4:
                v <- v*x4+c[i]
            case 5:
                v <- v*x4*x+c[i]
            case 6:
                v <- v*x4*x2+c[i]
            case 7:
                v <- v*x4*x2*x+c[i]
            otherwise:
                v <- v*x4*x4
                d <- d-8
                go to: switch d
    d <- e[0]
    switch d:
        case 0:
            return v
        case 1:
            return v*x
        case 2:
            return v*x2
        case 3:
            return v*x2*x
        case 4:
            return v*x4
        case 5:
            return v*x4*x
        case 6:
            return v*x4*x2
        case 7:
            return v*x4*x2*x
        otherwise:
            v <- v*x4*x4
            d <- d-8
            go to: switch d
    return v

Note that, compared with the full-polynomial version, this adds only two extra products (to prepare x2 and x4) and uses a switch to select the calculation procedure, which can cut up to 75% (3/4) of the multiplication operations. Storing x^8 as well can cut up to 87.5% (7/8), and so on.

Running the following C++ module, which calculates the test cases given in the statement using this algorithm, both with floats and with doubles,

#include <iostream>
#include <string>



float p( float x , float *c , int *e , int n ){
    float x2=x*x , x4=x2*x2 , v=c[n] ;
    int i , d ;
    for( i=n-1 ; i>=0 ; i-- ){
        d = e[i+1]-e[i] ;
        _1: switch( d ){
            case 0: v = v+c[i] ; break ;
            case 1: v = v*x+c[i] ; break ;
            case 2: v = v*x2+c[i] ; break ;
            case 3: v = v*x2*x+c[i] ; break ;
            case 4: v = v*x4+c[i] ; break ;
            case 5: v = v*x4*x+c[i] ; break ;
            case 6: v = v*x4*x2+c[i] ; break ;
            case 7: v = v*x4*x2*x+c[i] ; break ;
            default: v = v*x4*x4 ; d-=8 ; goto _1 ;
        }
    }
    d = e[0] ;
    _2: switch( d ){
        case 0: return v ;
        case 1: return v*x ;
        case 2: return v*x2 ;
        case 3: return v*x2*x ;
        case 4: return v*x4 ;
        case 5: return v*x4*x ;
        case 6: return v*x4*x2 ;
        case 7: return v*x4*x2*x ;
        default: v = v*x4*x4 ; d-=8 ; goto _2 ;
    }
}



double p( double x , double *c , int *e , int n ){
    double x2=x*x , x4=x2*x2 , v=c[n] ;
    int i , d ;
    for( i=n-1 ; i>=0 ; i-- ){
        d = e[i+1]-e[i] ;
        _1: switch( d ){
            case 0: v = v+c[i] ; break ;
            case 1: v = v*x+c[i] ; break ;
            case 2: v = v*x2+c[i] ; break ;
            case 3: v = v*x2*x+c[i] ; break ;
            case 4: v = v*x4+c[i] ; break ;
            case 5: v = v*x4*x+c[i] ; break ;
            case 6: v = v*x4*x2+c[i] ; break ;
            case 7: v = v*x4*x2*x+c[i] ; break ;
            default: v = v*x4*x4 ; d-=8 ; goto _1 ;
        }
    }
    d = e[0] ;
    _2: switch( d ){
        case 0: return v ;
        case 1: return v*x ;
        case 2: return v*x2 ;
        case 3: return v*x2*x ;
        case 4: return v*x4 ;
        case 5: return v*x4*x ;
        case 6: return v*x4*x2 ;
        case 7: return v*x4*x2*x ;
        default: v = v*x4*x4 ; d-=8 ; goto _2 ;
    }
}



int main(){
  float ch_cf[]={ -9 , 1 , -2 , 3 , -4 , 1 } , es_cf[]={ 5.366f , -1.0f , 2.0f } ;
  double ch_cd[]={ -9 , 1 , -2 , 3 , -4 , 1 } , es_cd[]={ 5.366 , -1.0 , 2.0 } ;
  int E[]={ 0 , 1 , 2 , 3 , 4 , 5 } , e[]={ 1 , 30 , 123 } ;

  std::cout.precision(12) ;

  std::cout << std::fixed << "PoliCh( 10.0 ) = " << p( 10 , ch_cf , E , 5 ) << " [" << p( 10 , ch_cd , E , 5 ) ;
  std::cout << std::scientific << "]    PoliEs( 10.0 ) =    " << p( 10 , es_cf , e , 2 ) << "    [" << p( 10 , es_cd , e , 2 ) << "]\n" ;

  std::cout << std::fixed << "PoliCh( 1.00 ) =   " << p( 1 , ch_cf , E , 5 ) << " [  " << p( 1 , ch_cd , E , 5 ) ;
  std::cout << "]    PoliEs( 1.00 ) = " << p( 1 , es_cf , e , 2 ) << " [" << p( 1 , es_cd , e , 2 ) << "]\n" ;

  std::cout << "PoliCh( 0.01 ) =    " << p( 0.01f , ch_cf , E , 5 ) << " [   " << p( 0.01 , ch_cd , E , 5 ) ;
  std::cout << "]    PoliEs( 0.01 ) = " << p( 0.01f , es_cf , e , 2 ) << " [" << p( 0.01 , es_cd , e , 2 ) << "]\n" ;
}

this is printed. Note that in all cases the relative approximation error is below 0.000003% (3e-8, about 2^-25), which is a strong indication that the errors are a consequence of rounding in the basic operations, not of mathematical approximations in the formula.

PoliCh( 10.0 ) = 62801.000000000000 [62801.000000000000]    PoliEs( 10.0 ) =    inf    [2.000000000000e+123]
PoliCh( 1.00 ) =   -10.000000000000 [  -10.000000000000]    PoliEs( 1.00 ) = 6.366000175476 [6.366000000000]
PoliCh( 0.01 ) =    -8.990197181702 [   -8.990197039900]    PoliEs( 0.01 ) = 0.053660001606 [0.053660000000]

It is important to note that not only PoliEs(10) but also PoliEs(0.01) is a case of summing numbers of very different magnitudes. It is like computing 1e99+1e9, 1e99+(-1e9), -1e99+1e9 and -1e99+(-1e9), which involve a difference of 90 orders of magnitude; hence, even with twenty significant digits, one would not see the larger number change with the addition or subtraction of the smaller one. In practice, the smaller absolute value is rounded away to zero, so negligible relative to the larger that it does not even reach the last digit stored in double precision.

Still, these operations do not introduce major errors, because the exact relative change in value is tiny ((( 1e99+1e9 )-( 1e99 ))/1e99 = 1e9/1e99 = 1e-90 ~ 0), so discarding it causes an insignificant error. Errors are usually observed when the magnitudes of the operands are more balanced, so that the result can be affected down to the last stored digit.

In addition, larger errors are observed when the value type naturally has low precision (fewer digits stored), when the numbers involved in the calculation steps are "broken" (not exactly representable), when operations accumulate (and with them roundings, which generate errors), and when there are errors of mathematical origin, such as formulas that merely approximate others, truncated Taylor series and interpolations.

Since the algorithm uses all the monomials provided, without mathematical simplifications, and does not use approximation formulas (such as computing powers via an exponential approximation), no errors of mathematical origin are observed. Errors resulting from computational limits in storing the values produced during execution can be observed depending on the value type (float/double) used, especially when using full, large polynomials with coefficients and arguments that are not exactly representable in binary.

For example, executing the code for the polynomial
[image: the polynomial used in this example]
we find, compared to the exact values f(0.9)=1.80596532 and f(1.1)=-0.97304332, with floats the values f(0.9)=1.80596613884 and f(1.1)=-0.97304391861 (respectively 8.2e-7 and 6e-7 of absolute error, 4.5e-7 and 6.2e-7 of relative error), which indicates errors around the 20th bit (out of 23). With doubles, the observed values are f(0.9)=1.80596531999999943 and f(1.1)=-0.97304332000000215 (respectively 5.7e-16 and 2.2e-15 of absolute error, 3.2e-16 and 2.2e-15 of relative error), indicating an error in f(0.9) around the 51st bit (out of 52) and in f(1.1) around the 48th bit.

Improved Algorithm

Also taking into account Jefferson's comment about the O(log(e)) power algorithm, I decided to reimplement the previous algorithm using it for large exponents. It prints the same result.

#include <iostream>
#include <string>



float p( float x , float *c , int *e , int n ){
    float x2=x*x , x4=x2*x2 ;
    float xe[2]={1} , v=c[n] ;
    int i , d , de ;
    for( i=n-1 ; i>=0 ; i-- ){
        d = e[i+1]-e[i] ;
        _1: switch( d ){
            case 0: v = v+c[i] ; break ;
            case 1: v = v*x+c[i] ; break ;
            case 2: v = v*x2+c[i] ; break ;
            case 3: v = v*x2*x+c[i] ; break ;
            case 4: v = v*x4+c[i] ; break ;
            case 5: v = v*x4*x+c[i] ; break ;
            case 6: v = v*x4*x2+c[i] ; break ;
            case 7: v = v*x4*x2*x+c[i] ; break ;
            default:
                de = d>>3 ;
                xe[1] = x4 ;
                do {
                    xe[1] *= xe[1] ;
                    v *= xe[de&1] ;
                } while( de>>=1 ) ;
                d &= 7 ;
                goto _1 ;
        }
    }
    d = e[0] ;
    _2: switch( d ){
        case 0: return v ;
        case 1: return v*x ;
        case 2: return v*x2 ;
        case 3: return v*x2*x ;
        case 4: return v*x4 ;
        case 5: return v*x4*x ;
        case 6: return v*x4*x2 ;
        case 7: return v*x4*x2*x ;
        default:
            de = d>>3 ;
            xe[1] = x4 ;
            do {
                xe[1] *= xe[1] ;
                v *= xe[de&1] ;
            } while( de>>=1 ) ;
            d &= 7 ;
            goto _2 ;
    }
}



double p( double x , double *c , int *e , int n ){
    double x2=x*x , x4=x2*x2 ;
    double xe[2]={1} , v=c[n] ;
    int i , d , de ;
    for( i=n-1 ; i>=0 ; i-- ){
        d = e[i+1]-e[i] ;
        _1: switch( d ){
            case 0: v = v+c[i] ; break ;
            case 1: v = v*x+c[i] ; break ;
            case 2: v = v*x2+c[i] ; break ;
            case 3: v = v*x2*x+c[i] ; break ;
            case 4: v = v*x4+c[i] ; break ;
            case 5: v = v*x4*x+c[i] ; break ;
            case 6: v = v*x4*x2+c[i] ; break ;
            case 7: v = v*x4*x2*x+c[i] ; break ;
            default:
                de = d>>3 ;
                xe[1] = x4 ;
                do {
                    xe[1] *= xe[1] ;
                    v *= xe[de&1] ;
                } while( de>>=1 ) ;
                d &= 7 ;
                goto _1 ;
        }
    }
    d = e[0] ;
    _2: switch( d ){
        case 0: return v ;
        case 1: return v*x ;
        case 2: return v*x2 ;
        case 3: return v*x2*x ;
        case 4: return v*x4 ;
        case 5: return v*x4*x ;
        case 6: return v*x4*x2 ;
        case 7: return v*x4*x2*x ;
        default:
            de = d>>3 ;
            xe[1] = x4 ;
            do {
                xe[1] *= xe[1] ;
                v *= xe[de&1] ;
            } while( de>>=1 ) ;
            d &= 7 ;
            goto _2 ;
    }
}



int main(){
  float ch_cf[]={ -9 , 1 , -2 , 3 , -4 , 1 } , es_cf[]={ 5.366f , -1.0f , 2.0f } ;
  double ch_cd[]={ -9 , 1 , -2 , 3 , -4 , 1 } , es_cd[]={ 5.366 , -1.0 , 2.0 } ;
  int E[]={ 0 , 1 , 2 , 3 , 4 , 5 } , e[]={ 1 , 30 , 123 } ;

  std::cout.precision(12) ;

  std::cout << std::fixed << "PoliCh( 10.0 ) = " << p( 10 , ch_cf , E , 5 ) << " [" << p( 10 , ch_cd , E , 5 ) ;
  std::cout << std::scientific << "]    PoliEs( 10.0 ) =    " << p( 10 , es_cf , e , 2 ) << "    [" << p( 10 , es_cd , e , 2 ) << "]\n" ;

  std::cout << std::fixed << "PoliCh( 1.00 ) =   " << p( 1 , ch_cf , E , 5 ) << " [  " << p( 1 , ch_cd , E , 5 ) ;
  std::cout << "]    PoliEs( 1.00 ) = " << p( 1 , es_cf , e , 2 ) << " [" << p( 1 , es_cd , e , 2 ) << "]\n" ;

  std::cout << "PoliCh( 0.01 ) =    " << p( 0.01f , ch_cf , E , 5 ) << " [   " << p( 0.01 , ch_cd , E , 5 ) ;
  std::cout << "]    PoliEs( 0.01 ) = " << p( 0.01f , es_cf , e , 2 ) << " [" << p( 0.01 , es_cd , e , 2 ) << "]\n" ;
}

Additionally, in order to reduce waste on full polynomials, one can add to the algorithm an O(1) check of the degree of sparsity, based on the highest monomial degree and the number of terms, to determine which procedure is most appropriate for each case.

  • Good reminder about exponentiation via the logarithm, I had totally forgotten about that

  • A note on the strategy of using the exponential of the logarithm: it causes a rather annoying precision loss =/ Mathematically, f(x) = (1.0 + x)^32.0 - e^(ln(1.0 + x) * 32.0) equals 0 for x > -1; in practice, however, you lose some precision. For example, for a sparse polynomial of degree 32, the exp(log(x)*coef) strategy already starts losing precision when x >= 2. Python uses the libmath functions used by C/C++, so you can extrapolate this example to C. But this does not take away the merits of your answer =D

  • Jefferson, if you work with double, yes. But if you work with float, you can internally compute the logarithm as a double and convert the exponential back to float, which, depending on how it is done, gives almost full accuracy and can still be quite efficient. In other words, it depends on many factors.

  • There is a strategy that does integer exponentiation in O(log2(exp)), where exp is the exponent of the monomial. It is similar to your strategy, but it memoizes the results dynamically instead of using a predefined table. I really liked the representation of the sparse polynomial; I'm seeing if I can turn this into an adaptive algorithm, haha. I'm going to dig into the code to take a look at how many floating point operations were done.

  • Yes, it's the one that works with the binary representation of the integer, right? It does the same as I mentioned above: it sees the integer exponent e as e0*2^0+e1*2^1+e2*2^2+e3*2^3+... and therefore x^e as x^(e0+e1*2+e2*4+e3*8+...) = ( x^1 )^e0 * ( x^2 )^e1 * ( x^4 )^e2 * ( x^8 )^e3 * ..., correct? The difference from what I implemented is that it recalculates the powers square by square, while mine pre-calculates them up to a certain point and does not recalculate. Each one has its advantages.

  • I also thought of an algorithm that starts in a stretch of code that knows only up to x^1; then, if necessary, calculates x^2 and jumps to a stretch that assumes knowledge of x^2; if necessary, calculates x^4 and jumps to a stretch that assumes knowledge of x^4; up to a maximum, maybe x^32 or x^64, and from there uses some simplification to calculate higher powers if needed (like the one you mentioned). It's just confusing; it would take a lot of crazy "goto"s.

  • I updated the code with the improvement you mentioned. It prints the same result.

  • @Jeffersonquesado, is there anything you want added to the answer? Or is it satisfactory?

  • In fact, yes, I felt it lacking at some points. Especially a study of the loss of precision as the polynomial gets sparser, and of the behavior of the long tail of double/float according to the value of x (apart from the fact that some numbers would naturally not fit within the single or double floating point domain). Also the cost of memory access when navigating backwards vs forwards (I think spatial locality favors information ahead rather than behind, but I have no solid basis). Yeah, I'm a real pain XD But congratulations on earning the 50 points from the bounty

  • You earned it with the fullest answer. Ah, I also missed tests with values and polynomials other than the ones provided: tests that you know are corner cases and that, with the wrong treatment, would cause some performance or accuracy problem.

  • There is also the difference between using two arrays (one of coefficients, the other of exponents) vs one array of structures containing both exponent and coefficient. In short, some details I find crucial, even more so when the question asks to take locality of reference into account to presume/measure performance

  • I've improved the discussion of the errors.

