Enable Sentinel Hub Batch API #52

Closed
3 tasks
echeipesh opened this issue Aug 10, 2020 · 7 comments

Comments

@echeipesh
Contributor

After exploration of #39 it seems like the AWS Sentinel-1 catalogs are not as useful for this project as initially expected.

We may still be able to use them, pending research in #49 ... however, our advisors have suggested looking at Sentinel Hub as a Sentinel-1 data source.

At first glance it looks like a very attractive option.

  • It provides a Sentinel-1 GRD orthorectified Sigma0 product.
  • Following the pricing guide, making requests for an area the size of France would cost about 4,000 PU out of a 70,000 PU monthly allotment.
  • There is a Batch API available: https://docs.sentinel-hub.com/api/latest/api/batch/

Given that flood events are relatively rare and happen over a limited area, our data requirements will likely be quite modest, even allowing for iteration.

This issue is to evaluate the option more carefully:

  • What does integration into our training workflows look like?
  • Throughput is limited; is it sufficient?
  • Does it still make sense to index Sentinel-1 scenes based on this requirement?
@echeipesh echeipesh added this to the GT Spring 8/7 - 8/20/2020 milestone Aug 10, 2020
@echeipesh
Contributor Author

Screen Shot 2020-08-11 at 11 02 31 AM

@echeipesh echeipesh changed the title Evaluate Sentinel Hub Batch API Enable Sentinel Hub Batch API Aug 14, 2020
@CloudNiner
Contributor

[WIP] list of questions to ask as I'm looking at Batch Processing API:

  1. Can we get COGs already reprojected to 4326 in output? Docs state that the output is always local UTM projection because they're conforming to tiling grids. https://docs.sentinel-hub.com/api/latest/api/batch/#processing-results

@CloudNiner
Contributor

CloudNiner commented Aug 20, 2020

Initial Research

Goal

Generate orthorectified sigma0 sentinel 1 grd chips in an s3 bucket under Azavea control via the SentinelHub Batch Process API (beta).

Summary of Findings

The API is straightforward to use, and Batch Process jobs (once submitted) only take a few minutes to complete. The Kansas flood used ~850 processing units out of the 30,000 available per month on our current demo plan.

Results are single-band COGs: a data mask, VV and VH for the S1 GRD orthorectified sigma0 chips. Chips use one of three processing "grids" that output small chips in local UTM. IIUC, we'd have to post-process and merge these output chips into a single 4326 COG for input to Raster Vision.

The evalscript v3 spec allows essentially arbitrary per-pixel processing of the input bands mapped to user specified output bands/tifs/images.

tl;dr @jamesmcclain and/or @echeipesh should take a look at the evalscript v3 and tiling grid specs linked in the paragraph above to determine whether this API still meets our needs. Otherwise, processing is pretty quick, straightforward, and won't too quickly run us out of credits. I could see us wrapping the API calls in a reusable Python or Node CLI in a day or two of effort. We'd get somewhere on the order of 30-40 Batch Process jobs / mo on the current plan assuming average size is about the sen1floods11 Kansas bbox.
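The 30-40 jobs / mo estimate can be sanity-checked with a quick calculation. This is only a sketch: it assumes every job costs about what the Kansas request did (the 856 PU valueEstimate reported later in this comment) against the 30,000 PU demo plan, and the function name is made up for illustration.

```python
MONTHLY_PU_BUDGET = 30_000  # demo plan allotment quoted in this comment
KANSAS_JOB_COST_PU = 856    # valueEstimate reported for the Kansas request


def jobs_per_month(budget_pu, cost_per_job_pu):
    """Whole number of equally sized jobs that fit in the monthly budget."""
    if cost_per_job_pu <= 0:
        raise ValueError("cost_per_job_pu must be positive")
    return budget_pu // cost_per_job_pu


print(jobs_per_month(MONTHLY_PU_BUDGET, KANSAS_JOB_COST_PU))  # 35
```

Larger bounding boxes or longer time ranges would lower that number, so it is an upper-end guess rather than a guarantee.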

Process

SentinelHub Batch Process API Workflow

The Batch API docs (https://docs.sentinel-hub.com/api/latest/api/batch/) have a nice diagram and description of the steps necessary to perform a batch process request.

High level steps:

  1. Get bbox + time range that contains imagery of interest via their STAC search endpoint
  2. Create Batch Process job via API
  3. Analyze Batch Process via API request to determine cost in Processing Units and number of output tiles
  4. Trigger Batch Process job via API
  5. Wait for results; the API can be polled occasionally for status
  6. Inspect results in s3 bucket!
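As a rough map of the steps above onto HTTP calls, here is a minimal Python sketch that only builds the method/URL pairs used in this comment (step 6 is an S3 listing, not an API call). The helper names are hypothetical, and nothing here actually sends a request.

```python
BASE_URL = "https://services.sentinel-hub.com/api/v1"


def auth_headers(token):
    """Headers used by the curl examples in this comment."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


def workflow_requests(request_id):
    """Return (step, method, url) tuples for the API calls in the workflow.

    request_id is the id returned when the Batch Process job is created.
    """
    detail = f"{BASE_URL}/batch/process/{request_id}"
    return [
        ("search", "POST", f"{BASE_URL}/catalog/search"),
        ("create", "POST", f"{BASE_URL}/batch/process/"),
        ("analyse", "POST", f"{detail}/analyse"),
        ("start", "POST", f"{detail}/start"),
        ("status", "GET", detail),
    ]


for step, method, url in workflow_requests("<requestId>"):
    print(f"{step:8s} {method:4s} {url}")
```

Keeping the URLs in one place like this would make it easy to swap in whatever HTTP client a future CLI ends up using.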

User AWS S3 Bucket Configuration

In order to run Batch Process requests, the API needs to be able to put results into a user-owned S3 bucket. I created and configured the bucket noaafloodmapping-sentinelhub-batch-eu-central-1 in the eu-central-1 region of the noaa-flood AWS account according to https://docs.sentinel-hub.com/api/latest/api/batch/#bucket-settings

Details

Before running a Batch Process request, users can verify that imagery of the desired type (in this case sentinel 1 grd) is available via their STAC search endpoint (or you could skip this and just YOLO it). Here we request the approx bbox and time of the sen1floods11 Kansas flood.

## SentinelHub BBox / Time Range Search
# Search SentinelHub by bounding box and datetime range.
# 
# Endpoint docs: https://docs.sentinel-hub.com/api/latest/api/catalog/examples/#search-with-distinct
curl -X "POST" "https://services.sentinel-hub.com/api/v1/catalog/search" \
     -H 'Content-Type: application/json' \
     -H 'Authorization: Bearer <token>' \
     -d $'{
  "limit": 1,
  "fields": {
    "include": [
      "properties.eo:gsd"
    ]
  },
  "datetime": "2019-05-22T00:00:00Z/2019-05-23T00:00:00Z",
  "collections": [
    "sentinel-1-grd"
  ],
  "bbox": [
    -95.712890625,
    38.371808917147554,
    -94.3670654296875,
    40.30885442563764
  ]
}'
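For anyone scripting this search, a small Python sketch of building the same body from two opposite corners might look like the following. It assumes the STAC convention of [west, south, east, north] for bbox and does not handle boxes that cross the antimeridian; the function name is hypothetical.

```python
import json


def catalog_search_body(lon_a, lat_a, lon_b, lat_b, start, end,
                        collection="sentinel-1-grd", limit=1):
    """Build a catalog search body from two opposite bbox corners.

    The corners can be given in any order; they are sorted into
    [west, south, east, north].
    """
    west, east = sorted((lon_a, lon_b))
    south, north = sorted((lat_a, lat_b))
    return {
        "bbox": [west, south, east, north],
        "datetime": f"{start}/{end}",
        "collections": [collection],
        "limit": limit,
    }


body = catalog_search_body(
    -94.3670654296875, 38.371808917147554,
    -95.712890625, 40.30885442563764,
    "2019-05-22T00:00:00Z", "2019-05-23T00:00:00Z",
)
print(json.dumps(body, indent=2))
```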

Next, generate a Batch Process request for orthorectified sigma0 Sentinel-1 GRD tiles covering the sen1floods11 Kansas flood on May 22, 2019, with the approximate bounding box -95.71289,38.37181,-94.36707,40.30885 (west, south, east, north). There are extensive docs on how to write evalscript v3, but in this example we just pass through the bands of interest so that we write three separate single-band tiles to the output: one each for VV, VH and dataMask.

Here's the JSON body for that request:

## Create Batch Process
# https://docs.sentinel-hub.com/api/latest/reference/#operation/createNewBatchProcessingRequest
curl -X "POST" "https://services.sentinel-hub.com/api/v1/batch/process/" \
     -H 'Authorization: Bearer <token>' \
     -H 'Content-Type: application/json' \
     -d $'{
  "tillingGridId": 0,
  "output": {
    "cogOutput": true,
    "defaultTilePath": "s3://noaafloodmapping-sentinelhub-batch-eu-central-1/<requestId>/<tileName>/<outputId>.tiff"
  },
  "description": "Azavea Test: US KS Flood",
  "processRequest": {
    "input": {
      "bounds": {
        "geometry": {
          "type": "Polygon",
          "coordinates": [
            [
              [
                -95.712890625,
                38.371808917147554
              ],
              [
                -94.3670654296875,
                38.371808917147554
              ],
              [
                -94.3670654296875,
                40.30885442563764
              ],
              [
                -95.712890625,
                40.30885442563764
              ],
              [
                -95.712890625,
                38.371808917147554
              ]
            ]
          ]
        },
        "properties": {
          "crs": "http://www.opengis.net/def/crs/EPSG/0/4326"
        }
      },
      "data": [
        {
          "type": "S1GRD",
          "processing": {
            "backCoeff": "SIGMA0_ELLIPSOID",
            "orthorectify": true
          },
          "dataFilter": {
            "timeRange": {
              "to": "2019-05-23T00:00:00Z",
              "from": "2019-05-22T00:00:00Z"
            },
            "polarization": "DV",
            "acquisitionMode": "IW",
            "resolution": "HIGH"
          }
        }
      ]
    },
    "evalscript": "//VERSION=3\\nfunction setup() {\\n  return {\\n    input: [\\"VV\\", \\"VH\\", \\"dataMask\\"],\\n    output: [{\\n      id: \\"VV\\",\\n      bands: 1,\\n      sampleType: \\"FLOAT32\\"\\n      },{\\n      id: \\"VH\\",\\n      bands: 1,\\n      sampleType: \\"FLOAT32\\"\\n      },{\\n      id: \\"MASK\\",\\n      bands: 1,\\n      sampleType: \\"UINT8\\"}\\n    ]\\n  };\\n}\\n\\nfunction evaluatePixel(samples) {\\n  return {\\n    VV: [samples.VV],\\n    VH: [samples.VH],\\n    MASK: [samples.dataMask]\\n  };\\n}",
    "output": {
      "responses": [
        {
          "format": {
            "type": "image/tiff"
          },
          "identifier": "VV"
        },
        {
          "format": {
            "type": "image/tiff"
          },
          "identifier": "VH"
        },
        {
          "format": {
            "type": "image/tiff"
          },
          "identifier": "MASK"
        }
      ]
    }
  },
  "resolution": "10"
}'
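The bounds.geometry polygon in the request above is just the bbox written out as a closed ring. A minimal sketch of generating it, assuming a plain west/south/east/north box in EPSG:4326 (the helper name is made up):

```python
def bbox_to_polygon(west, south, east, north):
    """Return a GeoJSON Polygon for an axis-aligned bounding box.

    The ring starts at the south-west corner and repeats it at the end,
    matching the coordinate order used in the request body above.
    """
    if west >= east or south >= north:
        raise ValueError("expected west < east and south < north")
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {"type": "Polygon", "coordinates": [ring]}


kansas = bbox_to_polygon(
    -95.712890625, 38.371808917147554,
    -94.3670654296875, 40.30885442563764,
)
print(kansas["coordinates"][0])
```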

Next we analyze the request we created which will tell us the cost, with a POST https://services.sentinel-hub.com/api/v1/batch/process/fbabeaa1-6378-4242-8149-9ec0f26de989/analyse.

This request returns a 204, so you have to query the Batch Process detail endpoint at GET https://services.sentinel-hub.com/api/v1/batch/process/fbabeaa1-6378-4242-8149-9ec0f26de989 and wait for status ANALYSIS_DONE. At that point the tileCount and valueEstimate fields will be set; for this request, that's 84 tiles and 856 PU respectively.
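Since the analyse call returns no body, a wrapper would need a small polling loop. Here is one possible sketch: fetch_status is a hypothetical callable that performs the GET and returns the status string, and treating "FAILED" as a terminal status is an assumption on my part rather than something verified in this thread.

```python
import time


def wait_for_status(fetch_status, target, failure=("FAILED",),
                    interval_s=30.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_status() until it returns target.

    Raises RuntimeError on a failure status and TimeoutError if the
    target is not reached within max_polls attempts.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if status == target:
            return status
        if status in failure:
            raise RuntimeError(f"batch request ended with status {status}")
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"status {target} not reached after {max_polls} polls")


# Demo with a canned sequence instead of real HTTP calls.
fake = iter(["CREATED", "ANALYSING", "ANALYSIS_DONE"])
print(wait_for_status(lambda: next(fake), "ANALYSIS_DONE", sleep=lambda s: None))
```

Injecting sleep and fetch_status keeps the loop testable without network access or real waiting.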

If the cost and tile count appear acceptable, the Batch Process job can be started with POST https://services.sentinel-hub.com/api/v1/batch/process/fbabeaa1-6378-4242-8149-9ec0f26de989/start
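"Acceptable" could be encoded as a simple guard before calling start. This sketch assumes the detail response is a dict carrying the status, tileCount and valueEstimate fields mentioned above; the thresholds and function name are placeholders.

```python
def job_is_acceptable(detail, max_pu, max_tiles):
    """Return True if an analysed batch request fits the given limits."""
    if detail.get("status") != "ANALYSIS_DONE":
        # Cost fields are only set once analysis has finished.
        return False
    return (detail["valueEstimate"] <= max_pu
            and detail["tileCount"] <= max_tiles)


kansas_detail = {"status": "ANALYSIS_DONE", "tileCount": 84, "valueEstimate": 856}
print(job_is_acceptable(kansas_detail, max_pu=1000, max_tiles=100))  # True
```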

Again, this returns a 204 and you can query the same GET https://services.sentinel-hub.com/api/v1/batch/process/fbabeaa1-6378-4242-8149-9ec0f26de989 until the status is DONE. At that point you can query the key path you provided in the output.defaultTilePath of the original Batch Process creation request for the generated tiles. In our case, that is:

➜  ~ aws s3 ls --summarize s3://noaafloodmapping-sentinelhub-batch-eu-central-1/fbabeaa1-6378-4242-8149-9ec0f26de989/    <aws:noaa-flood>
                           PRE 15STC_3_0/
                           PRE 15STC_3_1/
                           PRE 15STC_3_2/
                            ...
2020-08-20 15:55:59       3094 request.json

Total Objects: 1
   Total Size: 3094

SUCCESS!!!
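To predict where a given tile lands, the defaultTilePath template can be expanded by hand. This sketch assumes the service substitutes the three placeholders literally and that outputId matches the evalscript output ids (VV, VH, MASK); I have only confirmed the requestId/tileName prefix from the listing above.

```python
TEMPLATE = (
    "s3://noaafloodmapping-sentinelhub-batch-eu-central-1/"
    "<requestId>/<tileName>/<outputId>.tiff"
)


def expand_tile_path(template, request_id, tile_name, output_id):
    """Fill in the <requestId>, <tileName> and <outputId> placeholders."""
    return (
        template.replace("<requestId>", request_id)
        .replace("<tileName>", tile_name)
        .replace("<outputId>", output_id)
    )


for output_id in ("VV", "VH", "MASK"):
    print(expand_tile_path(
        TEMPLATE, "fbabeaa1-6378-4242-8149-9ec0f26de989", "15STC_3_0", output_id
    ))
```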

I made all my requests with the macOS Paw API client. Here's a project that can be loaded if you have the software. If not, let me know; it looks like I can export the project in Postman format.

sentinel-hub.paw.zip

@CloudNiner
Contributor

Two follow-up requests so far from an offline discussion with @echeipesh:

  1. Are the output chips of this example interchangeable with the sen1floods11 chips? In other words, are the VV and VH bands about the same values in both tifs?
  2. Can we quickly generate a Python API client with one of the many api client creator libraries that exist on GitHub?

@CloudNiner
Contributor

CloudNiner commented Aug 21, 2020

1

Are the output chips of this example interchangeable with the sen1floods11 chips. In other words, are the VV and VH bands about the same values in both tifs?

I'm honestly still not sure. I've been able to make the case that they're close with a few assumptions:

  1. You must perform an evalscript operation using the toDB function defined here in order to convert the values in the sentinel hub tifs to decibels. Note that the example uses gamma0 instead of sigma0, but they're both backscatter coefficients so I believe you can perform the decibel conversion on sigma0 as well.
  2. Both the Google Earth Engine S1 data used by sen1floods11 and the SentinelHub S1 data are orthorectified. However, SentinelHub orthorectifies with the Mapzen DEM, while GEE S1 data is orthorectified with the SRTM 30m or ASTER DEM (search page for "orthorectification").
  3. The normalization range of [0, -20] chosen in (1) is correct. It's unclear to me if the "VH Threshold" column in sen1floods11 paper, Figure 3 is what should actually be used for the lower bound of said normalization instead of a fixed -20.
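For comparing chips offline, the decibel conversion and normalization from assumptions (1) and (3) can be sketched in Python. This assumes the conventional 10·log10 definition of decibels and a fixed [-20, 0] dB range; it is not a copy of SentinelHub's toDB function, and the floor value used to avoid log(0) is my own choice.

```python
import math


def to_db(linear, floor=1e-10):
    """Convert a linear backscatter value (e.g. sigma0) to decibels."""
    return 10.0 * math.log10(max(linear, floor))


def normalize_db(db, low=-20.0, high=0.0):
    """Clamp a decibel value to [low, high] and rescale it to [0, 1]."""
    clamped = min(max(db, low), high)
    return (clamped - low) / (high - low)


for sigma0 in (1.0, 0.1, 0.01):
    db = to_db(sigma0)
    print(f"sigma0={sigma0:<5} -> {db:6.1f} dB -> {normalize_db(db):.2f}")
```

If the per-event "VH Threshold" turns out to be the right lower bound, it could be passed as low instead of the fixed -20.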

Holy crap research in this area must be frustrating with multiple different data sources that are all the same but slightly different.

2

Can we quickly generate a Python API client with one of the many api client creator libraries that exist on GitHub?

Probably yes. SentinelHub provides an OpenAPI 3.0.2 specification for v1.0 of the SentinelHub API which includes the Batch Processing endpoints. I was able to build a python client that includes the Batch Process endpoints with:

swagger-codegen generate -i ~/Downloads/openapi.v1.yaml -l python -o ./sentinel_hub_python_client

There's no param checking or anything for the POST Create Batch Process endpoint and the work of doing stuff like reading and formatting an evalscript from some file is left as an exercise for the reader.
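The evalscript formatting part is less painful than the hand-escaped curl body suggests, because json.dumps takes care of the newlines and quotes. A minimal sketch, assuming the script is kept as a normal multi-line string (it could equally be read from a file with pathlib.Path(...).read_text()); the function name is hypothetical:

```python
import json

EVALSCRIPT = """//VERSION=3
function setup() {
  return {
    input: ["VV", "VH", "dataMask"],
    output: [
      { id: "VV", bands: 1, sampleType: "FLOAT32" },
      { id: "VH", bands: 1, sampleType: "FLOAT32" },
      { id: "MASK", bands: 1, sampleType: "UINT8" }
    ]
  };
}

function evaluatePixel(samples) {
  return {
    VV: [samples.VV],
    VH: [samples.VH],
    MASK: [samples.dataMask]
  };
}
"""


def evalscript_fragment(evalscript, output_ids):
    """Build the evalscript/output part of processRequest as a dict."""
    return {
        "evalscript": evalscript,
        "output": {
            "responses": [
                {"identifier": oid, "format": {"type": "image/tiff"}}
                for oid in output_ids
            ]
        },
    }


fragment = evalscript_fragment(EVALSCRIPT, ["VV", "VH", "MASK"])
body = json.dumps(fragment)  # newlines and quotes are escaped for us
print(body[:80])
```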

Another option is to write the endpoints we need as a contribution to the SentinelHub python library. I verified that they'd eventually want these endpoints in this library in sentinel-hub/sentinelhub-py#136

@batic

batic commented Feb 13, 2021

I have found this issue following a thread of breadcrumbs from a sentinelhub-py issue.

I just wanted to let you know that Sentinel Hub now offers Sentinel-1 orthorectification with Copernicus DEM (10m resolution within EEA, 30m worldwide) as well as radiometric terrain correction. See processing options or inspect it on EOBrowser

@jamesmcclain
Contributor

I have found this issue following a thread of breadcrumbs from a sentinelhub-py issue.

I just wanted to let you know that Sentinel Hub now offers Sentinel-1 orthorectification with Copernicus DEM (10m resolution within EEA, 30m worldwide) as well as radiometric terrain correction. See processing options or inspect it on EOBrowser

This is a nice find. Thank you!
