Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix missing pj_acquire_lock etc. in proj.def #950

Closed
wants to merge 1 commit into from

Conversation

malytomas
Copy link

make functions
pj_acquire_lock, pj_release_lock and pj_cleanup_lock
available to users of the dll on windows builds.

@kbevers
Copy link
Member

kbevers commented Apr 24, 2018

I don't think this is something we would want. Why would you want to use those locks from outside the library?

Note that the header those functions are defined in will soon be removed from the public API. I don't want people to start using those functions and then remove access them to again a short time after it was made available.

@rouault
Copy link
Member

rouault commented Apr 24, 2018

Agreed with @kbevers . Those are proj internal implementation details.

@vaclavblazek
Copy link

If you happen to have a projection with a shift grid then the shift grid is loaded on the first projection which calls pj_acquire_lock. So far so good. However, the lock initialization is lazy which is not thread safe. Here is the offending line.
So if you launch multiple threads, create a PJ object in each and start projecting points (all nicely wrapped in GDAL's OGRCoordinateTransformation) then multiple threads hit pj_acquire_lock for the first time and chances are your program dies on memory access violation caused by multiple threads entering the critical section because more than one of them see the lock uninitialized.

Manual call of the pj_acquire_lock in the main thread before anything else is run is a kind of fix of this problem.

@rouault
Copy link
Member

rouault commented Apr 25, 2018

For the POSIX implementation making core_lock_created volatile should help a lot, since this would prevent compiler code re-ordering, and due to all the pthread_() things happening, this should also prevent CPU out-of-order execution. Or we may use pthread_once()

For Windows, perhaps InitOnceExecuteOnce() ( https://msdn.microsoft.com/en-us/library/ms683493(VS.85).aspx ) could be asolution

@kbevers
Copy link
Member

kbevers commented Apr 25, 2018

So if you launch multiple threads, create a PJ object in each and start projecting points

Do you experience the same behaviour if you also add a context for each thread?

@rouault
Copy link
Member

rouault commented Apr 25, 2018

@kbevers I doubt having a per-thread context will help: grid loading involves global variable (typically grid_list of pj_gridlist.c)

@kbevers
Copy link
Member

kbevers commented Apr 25, 2018

I doubt having a per-thread context will help: grid loading involves global variable (typically grid_list of pj_gridlist.c)

Could that be moved to the context? If not now maybe in version 7 where we are not restricted by keeping the promises made in proj_api.h.

@vaclavblazek
Copy link

vaclavblazek commented Apr 25, 2018

I guess pthread_once would be proper way to handle recursive lock creation. Not sure about volatile.

Re context-per-thread: as mentioned by @rouault, these are global variables. Not to mention that OGRCoordinateTransformation hides PJ creation in its guts.

HA! I've probably found workaround for our workaround.

projCtx pj_get_default_ctx()
{
    pj_acquire_lock();

So, I can call pj_get_default_ctx() instead of pj_acquire_lock() before threads launched.

Anyway, fixing the lock creation would be nice. Even providing some global initialization function would be enough.

@rouault
Copy link
Member

rouault commented Apr 25, 2018

Could that be moved to the context?

That would be a waste of memory and CPU time to have each context load grids.

@kbevers
Copy link
Member

kbevers commented Apr 25, 2018

So, I can call pj_get_default_ctx() instead of pj_acquire_lock() before threads launched.

Aren't your needs met with pj_ctx_alloc() which internally call pj_get_default_ctx() then:

/~https://github.com/OSGeo/proj.4/blob/968fcad7ae5064e733978f47d5e93a9baf6a5c46/src/pj_ctx.c#L104-L114
Or am I missing something here?

@kbevers
Copy link
Member

kbevers commented Apr 25, 2018

That would be a waste of memory and CPU time to have each context load grids.

Okay. It was actually my impression that was what happen already. Oh well, I am not that familiar with this part of the code.

@vaclavblazek
Copy link

Aren't your needs met with pj_ctx_alloc() which internally call pj_get_default_ctx() then:
I do not need extra context, OGR guts use the default one. I need to ensure that the proj's internal lock is initialized before it is used (indirectly) from different threads.

Anyway, thanks for help. I'll update our workaround, however it would be nice to have this whole machinery working correctly out of the box.

@kbevers
Copy link
Member

kbevers commented Apr 25, 2018

I do not need extra context, OGR guts use the default one. I need to ensure that the proj's internal lock is initialized before it is used (indirectly) from different threads.

Anyway, thanks for help. I'll update our workaround, however it would be nice to have this whole machinery working correctly out of the box.

So this is really an issue of how GDAL uses PROJ? We can't solve that here but I am sure Even is willing to accept a PR that make things work more smoothly in GDAL.

@rouault
Copy link
Member

rouault commented Apr 25, 2018

GDAL uses a proj context per OGRCoordinateTransformation object. I believe the issue is more in the potentially unsafe lock initialization in proj itself. So either proj needs to improve this internally (preferred way), or offer an initialization function

@kbevers
Copy link
Member

kbevers commented Apr 25, 2018

@rouault Okay. Do you want to propose a solution for this? Or at least write up an issue that we can return to later? I don't understand the problem well enough to do so myself I am afraid.

@rouault
Copy link
Member

rouault commented Apr 28, 2018

closing that one since we have opened #954 instead

@rouault rouault closed this Apr 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants