- Update S3ClientConfig to pass in the configuration for allowing unsigned requests, under the boolean flag `unsigned` (see the sketch after this group of entries).
- Improve the performance of S3Reader when utilized with `pytorch.load` by incorporating support for the `readinto` method.
- Add support for passing an optional custom endpoint to the S3LightningCheckpoint constructor.
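A minimal usage sketch for the entries above. The `unsigned` flag and the custom-endpoint option come from the changelog entries themselves; the keyword names `s3client_config` and `endpoint`, the `s3torchconnector.lightning` import path, and the URIs are assumptions for illustration.

```python
from s3torchconnector import S3MapDataset, S3ClientConfig
from s3torchconnector.lightning import S3LightningCheckpoint

# Allow unsigned (anonymous) requests, e.g. for a public bucket.
config = S3ClientConfig(unsigned=True)
dataset = S3MapDataset.from_prefix(
    "s3://my-public-bucket/train/",   # placeholder URI
    region="us-east-1",
    s3client_config=config,           # keyword name assumed
)

# Pass an optional custom endpoint to the Lightning checkpoint plugin.
checkpoint = S3LightningCheckpoint(
    region="us-east-1",
    endpoint="https://s3.us-east-1.amazonaws.com",  # keyword name assumed
)
```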
- Expose a new class, S3ClientConfig, with `throughput_target_gbps` and `part_size` parameters of the inner S3 client (a configuration sketch follows this release's entries).
- Fully separate Rust logs from Python logs. Logs from Rust components, used for debugging purposes, are configured through the environment variables S3_TORCH_CONNECTOR_DEBUG_LOGS and S3_TORCH_CONNECTOR_LOGS_DIR_PATH.
- Add support for PyTorch Lightning checkpoints.
- Fix a deadlock when enabling CRT debug logs. Remove the former experimental method `_enable_debug_logging()`.
- Refactor User-Agent setup for extensibility.
- Update the Lightning User-Agent prefix to `s3torchconnector/{__version__} (lightning; {lightning.__version__})`.
- No breaking changes.
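A minimal sketch of the S3ClientConfig and logging options introduced in this release. The environment variable names come from the entries above; the values shown, the `s3client_config` keyword, and the numeric settings are assumptions for illustration.

```python
import os

# Rust-side debug logs are controlled via environment variables; setting them
# before importing the connector is the safest ordering (values are placeholders).
os.environ["S3_TORCH_CONNECTOR_DEBUG_LOGS"] = "debug"
os.environ["S3_TORCH_CONNECTOR_LOGS_DIR_PATH"] = "/tmp/s3torchconnector-logs"

from s3torchconnector import S3MapDataset, S3ClientConfig

# Tune the inner S3 client through S3ClientConfig.
config = S3ClientConfig(
    throughput_target_gbps=25.0,   # illustrative value
    part_size=8 * 1024 * 1024,     # 8 MiB, illustrative value
)

dataset = S3MapDataset.from_prefix(
    "s3://my-bucket/train/",
    region="eu-west-1",
    s3client_config=config,        # keyword name assumed
)
```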
- Support for Python 3.12.
- Additional logging when constructing Datasets, and when making requests to S3.
- Provide tooling for running benchmarks for the S3 Connector for PyTorch.
- Update crates and Mountpoint dependencies.
- [Experimental] Allow passing in the S3 endpoint URL to Dataset constructors (see the sketch after this release's entries).
- HeadObject is no longer called when constructing datasets with `from_prefix` and when seeking relative to the end of file.
- No breaking changes.
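A minimal sketch of the experimental endpoint option above; the `endpoint` keyword name, the URL, and the URI are assumptions for illustration.

```python
from s3torchconnector import S3IterableDataset

# [Experimental] Point a dataset at a specific S3 endpoint URL.
dataset = S3IterableDataset.from_prefix(
    "s3://my-bucket/images/",
    region="us-west-2",
    endpoint="https://s3.us-west-2.amazonaws.com",  # keyword name assumed
)

for obj in dataset:
    print(obj.key)  # each item exposes the object's key and its data stream
```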
- Update crates and Mountpoint dependencies.
- No breaking changes.
- Update crates and Mountpoint dependencies.
- Expose a logging method for enabling debug logs of the inner dependencies.
- No breaking changes.
- Update crates and Mountpoint dependencies.
- Avoid excessive memory consumption when utilizing S3MapDataset. Issue #89.
- Run all tests against S3 and S3 Express.
- No breaking changes.
- The Amazon S3 Connector for PyTorch now supports S3 Express One Zone directory buckets (see the sketch below).
- No breaking changes.
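A minimal sketch of reading from an S3 Express One Zone directory bucket; the bucket name follows the directory-bucket naming scheme and is a placeholder.

```python
from s3torchconnector import S3MapDataset

# Directory buckets use the `bucket-base-name--azid--x-s3` naming scheme;
# the name below is a placeholder.
dataset = S3MapDataset.from_prefix(
    "s3://my-training-data--usw2-az1--x-s3/train/",
    region="us-west-2",
)
```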
- The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3.
- S3IterableDataset and S3MapDataset, which allow building either an iterable-style or map-style dataset from your data stored in S3, by specifying an S3 URI (a bucket and optional prefix) and the region the bucket is in (a combined usage sketch follows this list).
- Support for multiprocess data loading for the above datasets.
- S3Checkpoint, an interface for saving and loading model checkpoints directly to and from an S3 bucket.
- No breaking changes.
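A minimal end-to-end sketch of the features listed above, assuming `from_prefix` constructors on the datasets and `writer`/`reader` context managers on S3Checkpoint; bucket names and URIs are placeholders.

```python
import torch
from torch.utils.data import DataLoader
from s3torchconnector import S3IterableDataset, S3MapDataset, S3Checkpoint

REGION = "us-east-1"

# Build map-style and iterable-style datasets from an S3 prefix.
map_dataset = S3MapDataset.from_prefix("s3://my-bucket/train/", region=REGION)
iterable_dataset = S3IterableDataset.from_prefix("s3://my-bucket/train/", region=REGION)

# Multiprocess data loading works with the standard PyTorch DataLoader.
loader = DataLoader(map_dataset, num_workers=4)

# Save and load a model checkpoint directly to and from S3.
checkpoint = S3Checkpoint(region=REGION)
model = torch.nn.Linear(10, 10)
with checkpoint.writer("s3://my-bucket/checkpoints/model.pt") as writer:
    torch.save(model.state_dict(), writer)
with checkpoint.reader("s3://my-bucket/checkpoints/model.pt") as reader:
    model.load_state_dict(torch.load(reader))
```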