-
-
Notifications
You must be signed in to change notification settings - Fork 953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the file argument to UploadFile required #1413
Conversation
file: typing.BinaryIO | ||
headers: "Headers" | ||
|
||
def __init__( | ||
self, | ||
filename: str, | ||
file: typing.Optional[typing.BinaryIO] = None, | ||
file: typing.BinaryIO, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also make this strictly file: SpooledTemporaryFile
since that's the only way it's ever used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typing.BinaryIO
makes more sense to me.
It's an implementation detail that we happen to use SpooledTemporaryFile
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typing-wise, isn't there a better type for this situation? Also... It's not an implementation detail, it's written on our docs that the file is a SpooledTemporaryFile
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now there's three options:
- Use
IO[bytes]
- Use
SpooledTemporaryFile
directly - Write a protocol for just the methods we need
Source: python/typing#829 (comment)
I don't think we should choose (2) and I think we should change our docs. Let's not lock ourselves into concrete implementations if we don't need to.
I think (1) has issues (as mentioned in that discussion and others) but is widely used so it's pretty defensible if a user had to add a # type: ignore
to that parameter if they pass in some less common file-like object.
(3) is the correct option and would look like:
class FileLike(Protocol):
def write(self, __data: bytes) -> None: ...
def read(self, __n: int) -> bytes: ...
def seek(self, __pos: int) -> None: ...
def close(self) -> None: ...
I think the main con here is not the LOC or complexity, the main con is that if we want to start using other methods/attributes of IO
in the future it would technically be a breaking change. But maybe that's a good thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should change our docs
That's enough for me. 👍
file: typing.BinaryIO | ||
headers: "Headers" | ||
|
||
def __init__( | ||
self, | ||
filename: str, | ||
file: typing.Optional[typing.BinaryIO] = None, | ||
file: typing.BinaryIO, | ||
content_type: str = "", | ||
*, | ||
headers: "typing.Optional[Headers]" = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we're changing the (internal-ish) interface at this point, should we switch it around so that file
is the only required argument?...
def __init__(
self,
file: typing.Optional[typing.BinaryIO],
*,
filename: str = "",
content_type: str = "",
headers: "typing.Optional[Headers]" = None,
):
self.file = file
self.filename = filename or getattr(file, 'name', '')
self.content_type = content_type
self.headers = headers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep makes sense. How do you feel about:
def __init__(
self,
file: typing.Optional[typing.BinaryIO],
*,
filename: typing.Optional[str] = None,
content_type: typing.Optional[str] = None,
headers: typing.Optional[Headers] = None,
):
...
That way you can differentiate between a filename or content-type that was an empty string or one that was missing altogether (I know ""
may not be valid within the context of a file name on disk and such, but it's still possible in other contexts).
self.filename = filename or getattr(file, 'name', '')
Hmm about this. I feel like it could lead to confusion. I would rather UploadFile
just blindly accept it as a parameter and assign it. I think users already have access to .file
so if they want the filename of the actual file they should do UploadFile.file.name
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UploadedFile.filename
was sanitized, what guarantee we provide that file.name
also is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think it would be best to just make it a constructor parameter and not fall back to file.name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, do we want filename
and content_type
at all here?
They're derived from the headers, so why not switch to...
class UploadFile:
def __init__(
self,
file: typing.Optional[typing.BinaryIO],
*,
headers: typing.Optional[Headers] = None,
):
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote would be to make content_type
a property and keep filename.
The main reason for this is a practical one: extracting the filename from the headers requires knowledge that the request was originally a multipart/form-data request (that is, as far as I know, the only way a client would include that in the content-disposition
header). If we want UploadFile
to be decoupled from form parsing, I don't think it should assume that the headers came from a multipart request.
Also not every implementer of BinaryIO
has a .file
attribute (e.g. BytesIO
does not). So if we don't pass in that information at all then users might not be able to access it (I know in practice we always pass in a SpooledTemporaryFile, but the goal here is for that to be an implementation detail as much as possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay - that's not unreasonable.
The one other practical case where I could see UploadFile
being used is for raw binary uploads.
Eg. PUT https://www.example.com/uploads/my-file-upload.txt
In that kinda use-case it's useful having filename
be something that can be provided explicitly, so that it can be derived from the URL path, rather than the HTTP headers.
So yeah, sounds good to me. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep that is a use case I personally really want: /~https://github.com/adriangb/xpresso/blob/c4455b29d323df1c59808ccafb306b9e7bbd80d1/xpresso/binders/_extractors/body/file.py#L40
I'm glad you brought it up and not me, I didn't want to introduce my biases 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I already pushed these changes / implementation @tomchristie, so I think this is ready for one last round of review for any typos/nits and then maybe a ✅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi all.
It would be nice if the release could include the removal of content_type
from the UploadFile, as this is a breaking change.
…gb/starlette into make-file-strict-argument
@tomchristie might be nice to get this into 0.18.0 since it has some breaking functionality and seems to be ready to go |
I think the only thing we'd need to change is the use of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing missing here is change our docs, as per #1413 (comment).
We need to check if this breaks FastAPI |
I was able to run FastAPI's test suite with no errors on this branch as long as I ignored new deprecation warnings (which is unrelated to this PR). |
@Kludex you'll need to re-approve to get this merged |
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
Does this PR also solves the first point on here #1888 (comment) by any chance? |
I don’t think so because nothing is a context manager still. #1903 does though |
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
From #1406 (comment).
I'm sticking to a subset of the changes: just making the
file
parameter required and having the form parser construct the file. We can do headers,size
or other changes separately.This makes the form parser responsible for creating the file.
Since UploadFile no longer has the ability to set the spooled max size, I am also adding an argument to
request.form()
to set that.