Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn when CRS is not WGS84 #349

Open
hadley opened this issue Jan 6, 2017 · 10 comments
Open

Warn when CRS is not WGS84 #349

hadley opened this issue Jan 6, 2017 · 10 comments

Comments

@hadley
Copy link
Member

hadley commented Jan 6, 2017

library(leaflet)
library(sp)

data("meuse", package = "sp", envir = environment())
sp::coordinates(meuse) <- ~x+y
proj4string(meuse) <- CRS("+init=epsg:28992")
meuse <- spTransform(meuse, CRS("+proj=longlat +datum=WGS84"))

leaflet(meuse) %>% 
  addTiles() %>% 
  addCircles()
@jcheng5
Copy link
Member

jcheng5 commented Jan 6, 2017

Considered and rejected a while back #139

@bhaskarvk
Copy link
Collaborator

Automatic reprojecting is a big NO-NO from the GIS guys. Almost everyone I talked to about this said don't do it. I think the reason is that not all data can be projected to WGS84 losslessly. I will let @tim-salabim @mdsumner add their thoughts.

@hadley hadley closed this as completed Jan 6, 2017
@hadley
Copy link
Member Author

hadley commented Jan 6, 2017

I think leaflet should at least give an informative error message. Re-projecting might be bad, but I think it's worse to silently accept an incompatible CRS.

@bhaskarvk
Copy link
Collaborator

That is a fair point. For all the add* methods, we can a) ensure that the shapefile's CRS is epsg4326 OR in case of just lat/lon values ensure that they are within correct lat/lon bounds.

But a major issue at times is that shapefiles don't always have correct CRS defined. So it might say 4326 but the data might not really be in that CRS. That's what Robert is saying in #139. So even if we check the CRS we are still not really sure whether the data is correct or not. That onus still lies on the user.

@hadley
Copy link
Member Author

hadley commented Jan 6, 2017

Yeah, so we'll warn and if they don't like the warning message, they'll have to correct the metadata.

@bhaskarvk
Copy link
Collaborator

One thing to keep in mind is, that for L.CRS.Simple CRS the input values for vector data can be anything, and don't necessarily have to be in lat/long. For all other projections including the default the input values for vector data have to be in lat/long. So the check needs to explicitly exclude L.CRS.Simple.

@bhaskarvk
Copy link
Collaborator

Also my second point was that the shapefile's CRS may say EPSG4326 (lat/long) in which case we won't warn but the actual data might not be in lat/lon and so it will not render correctly. No way to fix that.

@hadley hadley reopened this Jan 6, 2017
@hadley
Copy link
Member Author

hadley commented Jan 6, 2017

Re-open to remind that we need warnings. I think we should also include an example like above to illustrate what people should do

@hadley hadley changed the title Consider reprojecting to WGS84 automatically Warn when CRS is not WGS84 Jan 6, 2017
@mdsumner
Copy link

mdsumner commented Jan 6, 2017

@bhaskarvk thanks for the ping here, I've tried to be measured in what I write below ;)

We absolutely should just reproject when needed. This is the most important thing that is missing across all of R spatial! GIS generally doesn't have reproducible workflows, so you can easily get in trouble for the rare cases when 1) reprojection is applied to the source data without backup 2) the actual transformation is not reversible. The problem of incorrect metadata is a universal issue we face because so much data is sent around with no unit or crs metadata, it's not a reason to hobble modern tools.

To actually reproject sf uses PROJ.4, but this is also included in the proj4 package in much lighter package (no GDAL or GEOS dependency, but also no higher level wrappers for specific types beyond xy.coords).

One of the most powerful things about GIS[1] is having a single map that you can throw everything into and it just puts it in the right place. This is one of the best features of mapview too, you give it metadata-aware objects and it puts it into its metadata-aware viewport. The coordinate system used is a property of the map, and a usage requirement is that data coming in has the right metadata for itself, something that can be transformed to a common crs if needed. This is used extensively in modern systems like GDAL and Manifold GIS, operations don't choke on data in different coordinate systems, or error and say there's no common crs, or silently ignore missing or different crs, they reproject to a common one and finish the task. If the metadata is not set on one obviously that's an error, Manifold notably forces you (softly) to have confirmed that the metadata was correct when first interpreted.

  1. I have to caveat this, there's no such thing as "GIS" as if it's a single thing, it's a hot-bed of dozens of competing softwares and ideas. R currently has the best overall transcendent capability, but it's fragmented and hardly uniform.

The raster package works like this (for the most part, it will warn when reprojecting when doing extractions), and ggplot2 obviously works like this. It's the big hole in R graphics that the viewports don't have a standard metadata registration for the coordinate system in use. Some packages do, I believe mapproj and oce do have a global registration for the current plot, it's something that R is sorely missing.

If our data structures cannot store long-lat and X-Y side by side that's a different kind of problem, but really there should be no issue with reprojecting when needed. Simple features now can store multiple geometries, and for table-forms of data it's trivial to record the original coordinate values in extra columns (that's something that would be valuable for the actual times that reprojection is lossy).

Personally, I see visualization and extractions and geometric/topological modifications as operations that extend and augment a data set - it's the business of the user to decide how to preserve the original data, and there are many options available to them. Putting in limitations to protect users from themselves based on conservative old ideas will just hold us all back. Failing fast is critical, and the best way to see failure is to visualize it.

The alternative, and I'd argue we don't see this often because it's so painful to do and get right is workflows like this:

## PSEUDOCODE


prj <- "+proj=laea +long_0=147 ..." whatever my favourite projection
a <- readshapefile("a.shp")
b <- readshapefile("b.shp")

## hmm which one needs reprojecting?
## no error, no warning
plot(a); plot(b, add = TRUE)

## easy, just use the common one, but awkward to have to transform everything
plot(spTransform(a, prj)); plot(spTransform(b, prj), add = TRUE)

## here we may need to transform to a third common map
plot(somemap); plot(a, add = TRUE); plot(b, add = TRUE)

Mapview makes this all much better.

## PSEUDOCODE

## common, the right place to change the map is in the first mapview()
mapview(projection = prj) %>% mapview(a) %>% mapview(b)

## or at the end as a global viewport transformation (currently not well supported but logically sound)
mapview() %>% mapview(a) %>% mapview(b) %>% coordsys(projection = prj)

We should be using map projections all the time, it's often the right solution that cuts across a lot of other problems. I think it's absolutely tragic that an understanding of the basics of map projections isn't in the standard arsenal for R users. It really matters, but putting in obstacles is not going to foster greater understanding. When the leaflet-web world catches up to modern GIS and we can control the global map as a choice (not just web Mercator) it will be huge advance for many things we do in spatial.

@tim-salabim
Copy link
Contributor

Another aspect of this just came up in an sf issue. This is regarding the fact that proj4strings cannot be checked literally, only sematically as

"+proj=longlat +ellps=WGS84 +no_defs"
is equivalent (sematically valid) to
"+proj=longlat +datum=WGS84 +no_defs"

So, checking only for "+datum=WGS84" is not enough in some cases.
I think leaflet should either check for all possibly valid longlat expressions or not emit a warning at all. See here for a possible way to check for sematically valid lonlat definitions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants