Return NaN coordinate immediately when receiving NaN input #949

kbevers · 2018-04-23T20:41:14Z

Ideally NaNs shouldn't get special treatment but since double to int
conversion is undefined behavior we are forced to treat NaNs as
something special. Therefor we return a NaN coordinate when ever we
encounter a NaN in one of the dimensions of an input coordinate. The
error level is not changed since this all math operations on NaNs should
return a NaN.

This policy of returning NaNs immediately should also eliminate a bunch
of potential double to int conversion bugs.

Alternative version of #943

cffk · 2018-04-24T07:56:38Z

C99 has the function, lround, which rounds a double to a long returning
an "implementation-defined value" (LONG_MIN on my system) if the double
does not fit in a long. Thus if we just replaced

int(floor(double))

with

lround(floor(double))

we would no longer have to include special tests for NaN. Of course for
non-C99 systems we would need to define lround

long lround(double x) {
  x = x < 0 ? ceil(x - 0.5) : floor(x + 0.5);
  return fabs(x) < -(double)LONG_MIN ? (long)x : LONG_MIN;
}

@schwehr Could you verify that lround(NaN) is safe (doesn't cause the
program to abort)?

kbevers · 2018-04-24T08:07:24Z

You mean (int)floor(double), right? We would have to find all occurrences of that to be on the safe side. I can easily grep a handful of those but I don't find all of them, for instance these lines from PJ_isea.c (that sparked the whole issue in the first place):

    rx = floor(x + 0.5);
    ix = (int)rx;
    ry = floor(y + 0.5);
    iy = (int)ry;
    rz = floor(z + 0.5);
    iz = (int)rz;

I am good with the lround solution granted that we can create a safe non-C99 version of it.

cffk · 2018-04-24T08:11:49Z

Yes, I meant (inf)floor(double). (I was thinking C++.) I think that double to int conversions are only needed in a handful of situations: computing a UTM zone, getting the right face of an icosahedron...

kbevers · 2018-04-24T08:57:46Z

I get these cases from grepping a bit:

λ grep "(int)[[:space:]]*floor" *.c
PJ_robin.c:    i = isnan(lp.phi) ? -1 : (int)floor(dphi * C1);
PJ_robin.c:        i = isnan(lp.phi) ? -1 : (int)floor(lp.phi * NODES);
PJ_unitconvert.c:    year = (int)floor(decimalyear);
PJ_unitconvert.c:    int year  = (int)floor( yyyymmdd / 10000 );
PJ_unitconvert.c:    int month = (int)floor((yyyymmdd - year*10000) / 100);
PJ_unitconvert.c:    int day   = (int)floor( yyyymmdd - year*10000 - month*100);
nad_intr.c:     indx.lam = isnan(t.lam) ? 0 : (int)floor(t.lam);
nad_intr.c:     indx.phi = isnan(t.phi) ? 0 : (int)floor(t.phi);
pj_apply_vgridshift.c:        grid_ix = (int) floor(grid_x);
pj_apply_vgridshift.c:        grid_iy = (int) floor(grid_y);

and then the rest of the (int) typecasts:

λ grep -v "(int)[[:space:]]*floor" *.c | grep (int)
PJ_healpix.c:        return pnpoly((int)sizeof(healpixVertsJit)/
PJ_healpix.c:        return pnpoly((int)sizeof(rhealpixVertsJit)/
PJ_horner.c:    int n = (int)horner_number_of_coefficients(order);
PJ_horner.c:        n = 2*(int)order + 2;
PJ_horner.c:    h->order = (int)order;
PJ_isea.c:    ix = (int)rx;
PJ_isea.c:    iy = (int)ry;
PJ_isea.c:    iz = (int)rz;
PJ_isea.c:     * (int)sidelength * 2 + 1 might be just as good
PJ_isea.c:    maxcoord = (int) (sidelength * 2.0 + 0.5);
PJ_isea.c:        sidelength = (int) (pow(g->aperture, g->resolution / 2.0) + 0.5);
PJ_isea.c:    hexes = (int) (pow(g->aperture, g->resolution) + 0.5);
PJ_isea.c:        height = (int) (pow(g->aperture, (g->resolution - 1) / 2.0));
PJ_isea.c:        sn = ((int) di->x) * height;
PJ_isea.c:        sn += ((int) di->y) / height;
PJ_isea.c:        sidelength = (int) (pow(g->aperture, g->resolution / 2.0) + 0.5);
PJ_isea.c:        sn = (int) ((quad - 1) * hexes + sidelength * di->x + di->y + 2);
PJ_isea.c:    hex->x = ((int)v.x << 4) + quad;
PJ_isea.c:    d = (int)v.x;
PJ_isea.c:    i = (int)v.y;
PJ_isea.c:        int offset = (int)(pow(3.0, g->resolution - 1) + 0.5);
PJ_isea.c:    sidelength = (int) (pow(g->aperture, g->resolution / 2.0) + 0.5);
PJ_pipeline.c:    P->opaque->steps = (int)steps;
PJ_pipeline.c:    argc = (int)argc_params (P->params);
PJ_unitconvert.c:    day = (int)(mjd - mjd_iter + 1);
cct.c:            print (PJ_LOG_ERROR, "Read error in record %d\n", (int) o->record_index);
cct.c:            print (PJ_LOG_NONE, "# Record %d UNREADABLE: %s", (int) o->record_index, buf);
cct.c:                   (int) o->record_index, buf, pj_strerrno (proj_errno(P)));
geod_set.c:                     n_S = (int)(geod_S / del_S + .5);
geodesic.c:  q = r == r ? (int)(floor(r / 90 + (real)(0.5))) : 0;
gie.c:    ntrips = (int) (endp==args? 100: fabs(ans));
gie.c:        fprintf (T.fout, "     FAILURE in %s(%d):\n", opt_strip_path (T.curr_file), (int) F->lineno);
gie.c:    fprintf (T.fout, "     FAILURE in %s(%d):\n", opt_strip_path (T.curr_file), (int) F->lineno);
gie.c:        fprintf (T.fout, "     FAILURE in %s(%d):\n     Too few args: %s\n", opt_strip_path (T.curr_file), (int) F->lineno, args);
gie.c:    fprintf (T.fout, "     FAILURE in %s(%d):\n",    opt_strip_path (T.curr_file), (int) F->lineno);
gie.c:            delim, (int) T.operation_lineno, pj_strerrno(proj_errno(T.P)),
gie.c:            delim, (int) T.operation_lineno
gie.c:    int i = (int) proj_atof (args);
gie.c:    ret = (int) pj_atof (err_const);
gie.c:    if (0==fgets (G->next_args, (int) G->next_args_size - 1, G->f))
gie.c:        if (0==fgets (G->next_args, (int) G->next_args_size - 1, G->f))
mk_cheby.c:    if (!(ncu = (int *)vector1(nu + nv, sizeof(int)))) {
nad_init.c:    for( id_end = (int)strlen(ct->id)-1; id_end > 0; id_end-- )
nad_init.c:        swap_words( ct->cvs, 4, (int)a_size * 2 );
nad_init.c:    for( id_end = (int)strlen(ct->id)-1; id_end > 0; id_end-- )
pj_gridinfo.c:                (int)header_size );
pj_malloc.c:    p->baz = pj_calloc (10, sizeof(int));
pj_open_lib.c:            fname[n = (int)strlen(fname)] = DIR_CHAR;
pj_open_lib.c:        fname[n = (int)strlen(fname)] = DIR_CHAR;
pj_param.c:    l = (int) strlen(opt);
pj_pr_list.c:                   l = (int)strlen(t->param) + 1;
pj_pr_list.c:        l = (int)strlen(t->param) + 1;
proj_4D_api.c:    P = pj_init_ctx (ctx, (int) argc, argv);
proj_etmerc.c:        zone = (int)(floor ((adjlon (P->lam0) + M_PI) * 30. / M_PI));
proj_strtod.c:    printf ("res = %20.15g. Rest = [%s],  errno = %d\n", res, endptr, (int) errno);
rtodms.c:       min = (int)fmod(r, 60.);
rtodms.c:       deg = (int)r;

It is not a huge task to go through each of them and change them to behave as we want them to. I can do that one of the coming days.

cffk · 2018-04-24T13:00:01Z

You will note that several of these (int) casts are actually rounds. Thus in geodesic.c

q = r == r ? (int)(floor(r / 90 + (real)(0.5))) : 0

becomes the simpler

q = lround(r / 90);

You still need to worry about how ties are handled (no problem in geodesic.c, however). And in this case, I don't need to worry about what arbitrary value gets returned with a NaN.

cffk · 2018-04-24T17:51:12Z

Here's my implementation of lround...

#include <math.h>
#include <limits.h>

double roundx(double x) {
  /* The handling of corner cases is copied from boost; see
   *   /~https://github.com/boostorg/math/pull/8
   * with improvements to return -0.0 when appropriate */
  double t;
  if (x == 0)
    return x;               /* Retain sign of 0 */
  else if (0 < x && x <  0.5)
    return +0.0;
  else if (0 > x && x > -0.5)
    return -0.0;
  else if (x > 0) {
    t = ceil(x);
    return 0.5 < t - v ? t - 1 : t;
  } else {                  /* Includes NaN */
    t = floor(x);
    return 0.5 < x - t ? t + 1 : t;
  }
}

long lroundx(double x) {
  /* Default value for overflow + NaN + (x == LONG_MIN) */
  long r = LONG_MIN;
  x = roundx(x);
  if (fabs(x) < -(double)LONG_MIN) /* Assume (double)LONG_MIN is exact */
    r = (int)x;
  return r;
}

…n NaN input. Added test to check for those cases.

kbevers · 2018-05-23T16:32:27Z

@cffk I finally got round to fixing the build errors in this. I plan on merging this tomorrow so that it can be included in PROJ 5.1.0RC1 (which I intend to release tomorrow afternoon CEST). If you have the time, could you give this a quick look and check if everything is sane?

There's some stupid things regarding pj_isnan because it is used by gie and hence need to be exported in the Windows DLL. That makes it necessary to declare the function at all times and not only when C99 math is unavailable.

cffk · 2018-05-23T16:59:54Z

Replacing (int)floor(x) by lround(floor(x)) is fine. But there are cases, e.g., maxcoord = (int) (sidelength * 2.0 + 0.5); where the use of lround changes from truncation to rounding. If you know the quantity is non-negative, you can fix by throwing in a call to floor.

kbevers · 2018-05-24T12:44:22Z

With http://c-faq.com/fp/round.html in mind, I think the cases in PJ_isea can handled by removing the + 0.5 and using lround, so that

int offset = (int)(pow(3.0, g->resolution - 1) + 0.5)

becomes

long offset = lround(pow(3.0, g->resolution - 1))

The original intension must have been to round the numbers to the nearest integer value.

cffk · 2018-05-24T12:51:55Z

Indeed.. The problem is now how ties are treated: lround(x) rounds away from zero, floor(x+0.5) rounds up. These are the same if x >= 0. So the exercise is to check each case. (And perhaps in some cases, it's immaterial how ties are treated.)

Originally the code was doing double to int conversions like y = (int)(x + 0.5) which results in rounding when typecasting. In an earlier attempt to avoid buffer overflows in integer typecasts this was changed to y = lround(x + 0.5) which doesn't give the origial results. We fix that here by instead doing y = lround(x) It is safe to so as long as x is positive.

kbevers · 2018-05-24T13:14:06Z

I have checked all of the cases and I believe they are all cases of positive numbers, so this should be safe.

cffk · 2018-05-24T13:33:49Z

OK, looks good.

kbevers · 2018-05-24T13:39:31Z

OK, looks good.

Awesome. Thanks for supervising this!

kbevers force-pushed the return-nans-quickly branch from dd1b2be to a111d43 Compare April 23, 2018 21:07

kbevers mentioned this pull request Apr 23, 2018

Fix float-cast-overflow in PJ_isea.c:hexbin2 #943

Closed

kbevers force-pushed the return-nans-quickly branch 2 times, most recently from f760ded to ff89a5e Compare April 24, 2018 07:51

kbevers force-pushed the return-nans-quickly branch 12 times, most recently from e3df7a1 to d9f0d6c Compare May 8, 2018 06:11

kbevers added 4 commits May 8, 2018 09:13

Add NAN definition to proj_math.h

78f6a82

Add round() and lround() to proj_math.h

987603a

Handle HUGE_VAL input better in fwd and inv functions

8fef212

Replace int typecasts with calls to lround to avoid bad conversions o…

58cbb9f

…n NaN input. Added test to check for those cases.

kbevers force-pushed the return-nans-quickly branch from d9f0d6c to 58cbb9f Compare May 8, 2018 07:16

Merge branch 'master' into return-nans-quickly

f5c8188

kbevers force-pushed the return-nans-quickly branch 2 times, most recently from 7b7804b to dc1802e Compare May 23, 2018 14:02

Add pj_isnan to proj.def to avoid linking errors for some MSVC versions

6db260f

kbevers force-pushed the return-nans-quickly branch from dc1802e to 6db260f Compare May 23, 2018 15:20

kbevers added this to the 5.1.0 milestone May 24, 2018

kbevers merged commit a45ad8c into OSGeo:master May 24, 2018

kbevers mentioned this pull request May 29, 2018

Add test framework, C++11 compilation mode and warning fixes #1021

Merged

kbevers deleted the return-nans-quickly branch July 9, 2018 08:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Return NaN coordinate immediately when receiving NaN input #949

Return NaN coordinate immediately when receiving NaN input #949

kbevers commented Apr 23, 2018

cffk commented Apr 24, 2018

kbevers commented Apr 24, 2018

cffk commented Apr 24, 2018

kbevers commented Apr 24, 2018 •

edited

Loading

cffk commented Apr 24, 2018

cffk commented Apr 24, 2018

kbevers commented May 23, 2018

cffk commented May 23, 2018

kbevers commented May 24, 2018 •

edited

Loading

cffk commented May 24, 2018

kbevers commented May 24, 2018

cffk commented May 24, 2018

kbevers commented May 24, 2018

Return NaN coordinate immediately when receiving NaN input #949

Return NaN coordinate immediately when receiving NaN input #949

Conversation

kbevers commented Apr 23, 2018

cffk commented Apr 24, 2018

kbevers commented Apr 24, 2018

cffk commented Apr 24, 2018

kbevers commented Apr 24, 2018 • edited Loading

cffk commented Apr 24, 2018

cffk commented Apr 24, 2018

kbevers commented May 23, 2018

cffk commented May 23, 2018

kbevers commented May 24, 2018 • edited Loading

cffk commented May 24, 2018

kbevers commented May 24, 2018

cffk commented May 24, 2018

kbevers commented May 24, 2018

kbevers commented Apr 24, 2018 •

edited

Loading

kbevers commented May 24, 2018 •

edited

Loading