PyRasterFrames enables access and processing of geospatial raster data in PySpark DataFrames.
The quickest way to get started is to pip install the pyrasterframes package.
pip install pyrasterframes
You can then access a PySpark SparkSession using the local[*] master in your Python interpreter as follows.
from pyrasterframes.utils import create_rf_spark_session
spark = create_rf_spark_session()
Then you can read a raster and do some work with it.
from pyrasterframes.rasterfunctions import *
from pyspark.sql.functions import lit
# Read a MODIS surface reflectance granule
df = spark.read.raster('')
# Add 3 element-wise, show some rows of the dataframe
df.select(rf_local_add(df.proj_raster, lit(3))).show(5, False)
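The "add 3 element-wise" step applies the same operation independently to every cell of a tile. As a rough illustration of that idea only, here is a plain-Python sketch that models a tile as a list of rows. The `local_add` helper is hypothetical and is not part of PyRasterFrames; it assumes a tile with no missing (NoData) cells.

```python
# Plain-Python illustration of a "local" (element-wise) operation.
# This is NOT RasterFrames code: a tile is modeled as a list of rows,
# and `local_add` is a hypothetical helper written for this example.

def local_add(tile, scalar):
    """Return a new tile with `scalar` added to every cell."""
    return [[cell + scalar for cell in row] for row in tile]

tile = [[1, 2, 3],
        [4, 5, 6]]

shifted = local_add(tile, 3)
print(shifted)  # [[4, 5, 6], [7, 8, 9]]
```

The input tile is left unchanged; a new tile of the same shape is returned, which mirrors how a DataFrame column expression produces a new column rather than modifying the existing one.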
Reach out to us on Gitter!
Issue tracking is through GitHub.
Community contributions are always welcome. To get started, please review our contribution guidelines, code of conduct, and developer's guide. Reach out to us on Gitter so the community can help you get started!
For best results, we suggest using conda and the conda-forge channel to install the compiled dependencies before installing the remaining packages. Assuming you're in the same directory as this file:
conda create -n rasterframes python=3.7
conda activate rasterframes
conda install -c conda-forge --file ./requirements-condaforge.txt
Then you can install the source dependencies:
pip install -e .
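After installing, it can be useful to confirm that the packages are visible to the interpreter in the active environment. The following is a minimal sketch using only the standard library; the `is_installed` helper is a hypothetical name introduced for this example, and the check only tells you whether a package can be located, not whether it works.

```python
import importlib.util


def is_installed(package_name):
    """Return True if `package_name` can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


# Report whether each top-level package is visible to this interpreter.
for name in ("pyspark", "pyrasterframes"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a package is reported as missing, check that the conda environment you created is the one currently activated.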