Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes issue with non-seekable sources breaking StreamEncryptor and StreamDecryptor handling #14

Merged
merged 14 commits into from
Sep 28, 2017
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
76 changes: 76 additions & 0 deletions src/aws_encryption_sdk/internal/utils/streams.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
"""Helper stream utility objects for AWS Encryption SDK."""
import io

from aws_encryption_sdk.exceptions import ActionNotAllowedError


class PassThroughStream(object):
"""Provides a pass-through interface on top of a stream object.

:param source_stream: File-like object
"""

def __init__(self, source_stream):
"""Prepares the passthroughs."""
self._source_stream = source_stream
self._duplicate_api()

def _duplicate_api(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might just be that Python isn't my daily driver, but I want to ask about a few things:

  1. Intentionally dropping private-hinted methods: If the language idiom is that privacy is a hint, not a requirement, isn't there a risk that this will break callers of wrapped streams in funky ways? (I know this is also marked as internal, but if it's hints all the way down...)
  2. Naming is a bit squishy: you refer to attributes on the LHS and methods on the RHS, but it looks like you're getting everything not prefixed with _, not only methods
  3. Again this might be at least partially an idiom thing, but rather than trying to assume the methods and attributes of the source_stream, would it make sense to have PassThroughStream itself inherit from the base stream type (if such a thing practically exists) and have each of the defined stream methods delegate to the underlying source_stream? Or, if not, should it instead copy everything?

For example: say we get a stream object CoolStream that is itself customized and has a (for example) _report_bytes_() API that is called by the local implementation of write(). PassThroughStream copies write() but not _report_bytes(), so when write() is invoked, it unexpectedly fails. Alternatively, if PassThroughStream is say an IOBase (for example; might not be the right choice) that has all the standard stream methods, PassThroughStream.write() can call source_stream.write() and treat the wrapped stream as more of a black box. I think that would drop some edge cases but preserve intent/functionality here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the attributes/methods. Fixed that.

The problem with just subclassing io.BaseIO is that these are really meant to front file-like objects, not just streams (fixed that terminology in the comments too). File-like objects (of which streams are a subset) have a consistent API but it is also quite large, so I would rather avoid manually duplicating it.

Effectively what this is doing is duplicating the public API of _source_stream onto the instance of PassThroughStream in the form of pointers to the attributes on _source_stream. This makes it so that any public interactions with the instance of PassThroughStream behave exactly as they would if they were made to _source_stream directly. Behavior of calls inside _source_stream are unaffected, as they are still happening within the context of that (unmodified) object.

Aside from the obvious overrides, the only place where this should introduce any unexpected behavior is if someone is making calls to private-hinted attributes on _source_stream, and I think that is a safe place to draw the line, and it matches with the convention laid out in PEP8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per offline discussion: replacing PassThroughStream with ProxyObject from wrapt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry for my confusion on the API duplication, I did not remember that interaction correctly. wrapt is a good find, that turned out as a nice clean drop-in.

"""Maps the source stream API onto this object."""
source_attributes = set([
method for method in dir(self._source_stream)
if not method.startswith('_')
])
self_attributes = set(dir(self))
for attribute in source_attributes.difference(self_attributes):
setattr(self, attribute, getattr(self._source_stream, attribute))


class ROStream(PassThroughStream):
"""Provides a read-only interface on top of a stream object.

Used to provide MasterKeyProviders with read-only access to plaintext.

:param source_stream: File-like object
"""

def write(self, b): # pylint: disable=unused-argument
"""Blocks calls to write.

:raises ActionNotAllowedError: when called
"""
raise ActionNotAllowedError('Write not allowed on ROStream objects')


class TeeStream(PassThroughStream):
"""Provides a ``tee``-like interface on top of a stream object, which collects read bytes
into a local :class:`io.BytesIO`.

:param source_stream: File-like object
"""

def __init__(self, source_stream):
"""Creates the local tee stream."""
self.tee = io.BytesIO()
super(TeeStream, self).__init__(source_stream)

def read(self, b=None):
"""Reads data from source, copying it into ``tee`` before returning.

:param int b: number of bytes to read
"""
data = self._source_stream.read(b)
self.tee.write(data)
return data
29 changes: 29 additions & 0 deletions test/functional/test_f_aws_encryption_sdk_client.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
from aws_encryption_sdk.identifiers import Algorithm, EncryptionKeyType, WrappingAlgorithm
from aws_encryption_sdk.internal.crypto.wrapping_keys import WrappingKey
from aws_encryption_sdk.internal.formatting.encryption_context import serialize_encryption_context
from aws_encryption_sdk.internal.utils import ROStream
from aws_encryption_sdk.key_providers.base import MasterKeyProviderConfig
from aws_encryption_sdk.key_providers.raw import RawMasterKeyProvider
from aws_encryption_sdk.materials_managers import DecryptionMaterialsRequest, EncryptionMaterialsRequest
Expand Down Expand Up @@ -534,3 +535,31 @@ def test_encrypt_source_length_enforcement_legacy_support():
key_provider=key_provider,
source_length=int(len(VALUES['plaintext_128']) / 2)
)


class NoSeekBytesIO(io.BytesIO):
"""``io.BytesIO`` that blocks ``seek()`` and ``tell()``."""

def seekable(self):
return False

def seek(self, offset, whence=0):
raise NotImplementedError('seek is blocked')

def tell(self):
raise NotImplementedError('tell is blocked')


def test_decrypt_no_seek_input():
"""Test that StreamDecryptor can decrypt an input stream that is not seekable."""
key_provider = fake_kms_key_provider()
ciphertext, _header = aws_encryption_sdk.encrypt(
source=VALUES['plaintext_128'],
key_provider=key_provider,
encryption_context=VALUES['encryption_context']
)
ciphertext_no_seek = NoSeekBytesIO(ciphertext)
aws_encryption_sdk.decrypt(
source=ciphertext_no_seek,
key_provider=key_provider
)