C++ - SIMD code to check if two double values are sufficiently different


Suppose I have two double values, old and new. I would like to implement a vectorized function that returns old if abs(x-y) < p, and new otherwise, where x is the old value and y is the new value.

Here is the code (test.cpp):

    #include <emmintrin.h>
    #include <cstdlib>   // posix_memalign, free
    #include <iostream>

    #define array_length 2

    int main(void) {
        // x = old value, y = new value, res = result
        double *x, *y, *res;
        posix_memalign((void **)&x, 16, sizeof(double) * array_length);
        posix_memalign((void **)&y, 16, sizeof(double) * array_length);
        posix_memalign((void **)&res, 16, sizeof(double) * array_length);

        double p = 1e-4; // precision
        __m128d sp = _mm_set1_pd(p);
        x[0] = 1.5; y[0] = 1.50011; // x - old value, y - new value
        x[1] = 2.;  y[1] = 2.0000001;

        __m128d sx = _mm_load_pd(x);
        __m128d sy = _mm_load_pd(y);

        // sign mask used to compute fabs(): only the sign bit is set
        __m128d sign_mask = _mm_set1_pd(-0.);
        // |x-y|
        __m128d absval = _mm_andnot_pd(sign_mask, _mm_sub_pd(sx, sy));
        // mask of |x-y| < p
        __m128d mask = _mm_cmplt_pd(absval, sp);
        // sres = |x-y| < p ? x : y;
        __m128d sres = _mm_or_pd(
                _mm_and_pd(mask, sx), _mm_andnot_pd(mask, sy));
        _mm_store_pd(res, sres);
        std::cerr << "res=" << res[0] << "," << res[1] << std::endl;

        free(x);
        free(y);
        free(res);
        return 0;
    }

To build:

g++ -std=c++11 -msse4 test.cpp 

We first compute fabs(x-y), compare it with p, and then combine x and y using the obtained mask.

Does anyone see a more efficient way to code this? Thanks.
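For arrays longer than two elements, the same three steps can be wrapped in a loop. The following is only a sketch under assumptions: the names keep_if_close and keep_if_close_sse2 are made up for illustration, unaligned loads and stores are used so the arrays need no special alignment, and a scalar tail handles an odd element count.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cmath>
#include <cstddef>

// Scalar reference: keep the old value x unless y differs from it by at least p.
inline double keep_if_close(double x, double y, double p) {
    return std::fabs(x - y) < p ? x : y;
}

// Hypothetical helper (not from the original post): applies the same
// selection to n doubles, two at a time.
void keep_if_close_sse2(const double* x, const double* y, double* res,
                        std::size_t n, double p) {
    const __m128d sp = _mm_set1_pd(p);
    const __m128d sign_mask = _mm_set1_pd(-0.0);  // only the sign bit is set
    std::size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        const __m128d sx = _mm_loadu_pd(x + i);
        const __m128d sy = _mm_loadu_pd(y + i);
        // |x - y|: clear the sign bit of the difference
        const __m128d absval = _mm_andnot_pd(sign_mask, _mm_sub_pd(sx, sy));
        // all-ones in lanes where |x - y| < p, all-zeros elsewhere
        const __m128d mask = _mm_cmplt_pd(absval, sp);
        // take x where the mask is set, y elsewhere
        const __m128d sres =
            _mm_or_pd(_mm_and_pd(mask, sx), _mm_andnot_pd(mask, sy));
        _mm_storeu_pd(res + i, sres);
    }
    for (; i < n; ++i)  // scalar tail when n is odd
        res[i] = keep_if_close(x[i], y[i], p);
}
```

Calling keep_if_close_sse2(x, y, res, n, 1e-4) fills res with the selected values. If either input is NaN, the comparison is false in both the scalar and the SSE2 path, so the new value y is returned.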

There is a way to make the algorithm a little faster, but it can decrease accuracy, because y + (x - y) is not guaranteed to round back to exactly x:

    // d = x - y
    __m128d diff = _mm_sub_pd(sx, sy);
    // mask of |y - x| < p
    __m128d mask = _mm_cmplt_pd(_mm_andnot_pd(sign_mask, diff), sp);
    // sres = y + ((|y - x| < p) ? (x - y) : 0)
    __m128d sres = _mm_add_pd(sy, _mm_and_pd(mask, diff));
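To see where the accuracy can suffer, here is a scalar model of one lane of both variants. The function names are made up for illustration. As far as I can tell, when x and y have the same sign and lie within a factor of two of each other, the subtraction x - y is exact (Sterbenz's lemma), so both variants return the same value. The difference shows up when the two values have very different magnitudes but are both below p.

```cpp
#include <cmath>

// Scalar model of the original selection: returns x itself when close.
inline double select_exact(double x, double y, double p) {
    return std::fabs(x - y) < p ? x : y;
}

// Scalar model of the faster variant: y + (mask ? (x - y) : 0).
// The rounding of x - y can prevent y + (x - y) from reproducing x.
inline double select_by_add(double x, double y, double p) {
    const double diff = x - y;
    return y + (std::fabs(diff) < p ? diff : 0.0);
}
```

For example, with x = 1e-30, y = 1e-5 and p = 1e-4, x - y rounds to -y, so select_by_add returns 0 while select_exact returns 1e-30. The absolute error is tiny, but x is lost completely. For inputs like 2.0 and 2.0000001 both functions return exactly 2.0.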

Another way is to use AVX and/or single precision, so that each instruction processes more values.
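As a rough sketch of the single-precision route, the same steps with float process four values per 128-bit register instead of two. The function name keep_if_close_ps is made up for illustration, and whether float precision is acceptable depends on the application.

```cpp
#include <xmmintrin.h>  // SSE intrinsics for float

// Hypothetical single-precision counterpart (not from the original post).
// x, y and res must each point to at least four floats.
void keep_if_close_ps(const float* x, const float* y, float* res, float p) {
    const __m128 sp = _mm_set1_ps(p);
    const __m128 sign_mask = _mm_set1_ps(-0.0f);  // only the sign bit is set
    const __m128 sx = _mm_loadu_ps(x);
    const __m128 sy = _mm_loadu_ps(y);
    // |x - y|: clear the sign bit of the difference
    const __m128 absval = _mm_andnot_ps(sign_mask, _mm_sub_ps(sx, sy));
    // all-ones in lanes where |x - y| < p
    const __m128 mask = _mm_cmplt_ps(absval, sp);
    // take x where the mask is set, y elsewhere
    const __m128 sres =
        _mm_or_ps(_mm_and_ps(mask, sx), _mm_andnot_ps(mask, sy));
    _mm_storeu_ps(res, sres);
}
```

With AVX the same pattern should carry over to __m256d and the _mm256_ intrinsics (four doubles per register, built with -mavx); that variant is not shown here.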

