Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Fix cached dependencies when using a dependency in Security() and other places (e.g. Depends()) with different OAuth2 scopes #2945

Merged
merged 2 commits into from
Aug 23, 2022

Conversation

laggardkernel
Copy link
Contributor

@laggardkernel laggardkernel commented Mar 13, 2021

class Dependant:
    def __init__(...):
        ...
        # Save the cache key at creation to optimize performance
        self.cache_key = (self.call, tuple(sorted(set(self.security_scopes or []))))

To make .security_scopes contribute to .cache_key, parameter security_scopes must be passed to Dependant.__init__(). We forgot to do it in this way. Reassignment after init dependant_instance.security_scopes = ["dummy"] doesn't change the .cache_key anymore. Main related logic is at get_sub_dependant().

This makes Security with different scopes (OAuth2 scopes) share the same cache. If we have Security(func) and Security(func, scopes=['scope1']) in a dependency tree. Only one of them will be really called.

The bug could be traced back to bff5dbb on 6/6/19 when the dependency_cache is first introduced. Seems a serious bug for anyone using Security in dependencies, better fix it soon.

@codecov
Copy link

codecov bot commented Mar 13, 2021

Codecov Report

Merging #2945 (3399de4) into master (982911f) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master     #2945   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          533       533           
  Lines        13746     13755    +9     
=========================================
+ Hits         13746     13755    +9     
Impacted Files Coverage Δ
fastapi/dependencies/utils.py 100.00% <100.00%> (ø)
tests/test_dependency_cache.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@github-actions
Copy link
Contributor

📝 Docs preview for commit d8b63cdaebaf9c37b195bfa0bca54f3b9b0c8fa2 at: https://604c78b2648d6ca4039cc914--fastapi.netlify.app

@laggardkernel laggardkernel changed the title Fix Dependant cache_key for Security [Bugfix] Fix Dependant cache_key for Security Mar 27, 2021
@laggardkernel
Copy link
Contributor Author

laggardkernel commented Jun 15, 2021

I've also considered implementing Dependant.cache_key as a property. But current fix is enough

  • it keeps the designed made by the author to pass scopes during __init__()
  • Dependant is a class for internal use, and Dependant.__init__() is only called few times in the source code

@laggardkernel
Copy link
Contributor Author

laggardkernel commented Jun 16, 2021

Updated once after tweaking the related test. Also rebased the branch to current master in this force push.

@tiangolo tiangolo changed the title [Bugfix] Fix Dependant cache_key for Security 🐛 Fix cached dependencies when using a dependency in Security() and other places (e.g. Depends()) with different OAuth2 scopes Aug 23, 2022
@github-actions
Copy link
Contributor

📝 Docs preview for commit 3399de4 at: https://6304d5c087148506d673b746--fastapi.netlify.app

@tiangolo
Copy link
Member

Thanks for the detailed explanation and for your contribution @laggardkernel! 🙇 🚀 🍰

@qmorek
Copy link

qmorek commented Sep 4, 2022

Hi guys,
it looks that changes from this PR broke my privs checking mechanism.
Simplest code to demonstrate it is below:

from fastapi import FastAPI, Depends, Security
from fastapi.security import SecurityScopes


def get_obj(
    required_scopes: SecurityScopes,
):
    print(f'get_obj called with scopes: "{required_scopes.scope_str}"')
    return 'object'


def check_user_role_for_obj(
    required_scopes: SecurityScopes,
    obj=Depends(get_obj),
):
    print(f'check_user_role_for_obj called for {obj} with scopes:, "{required_scopes.scope_str}"')
    return 'object role'


app = FastAPI()


@app.get(
    '/',
    dependencies=[Security(check_user_role_for_obj, scopes=['read_scope'])],
)
def get_obj_with_scopes(
    obj=Depends(get_obj),
):
    return obj


@app.put(
    '/',
    dependencies=[Security(check_user_role_for_obj, scopes=['edit_scope'])],
)
def put_obj_with_scopes(
    obj=Depends(get_obj),
):
    return obj


@app.get(
    '/child_obj',
    dependencies=[Security(check_user_role_for_obj, scopes=['read_child_scope'])],
)
def get_obj_with_scopes():
    return 'child object'

Here is output from calling get_obj_with_scopes endpoint:

INFO:     127.0.0.1:58537 - "GET / HTTP/1.1" 200 OK
get_obj called with scopes: "read_scope"
check_user_role_for_obj called for object with scopes:, "read_scope"
get_obj called with scopes: ""

as you can see get_obj was called twice....
I understand that this change was necessary as I was also suprised that I can not use different scopes for the same dependencies, but thought that it is by desing ;]
Maybe my understanding of this is not correct, but in such case get_obj itself is only used with Depends, so no specific scopes are requested for it, so maybe it should be cached?
Especially that I don't even need/use required_scopes: SecurityScopes in get_obj function, so why differentiate cache key by it? Maybe it should depend if get_obj has SecurityScopes in its parameters list?

I am just throwing it for consideration.
As for now I will fix my FastAPI version to 0.79.1 and maybe I need to rethink how my privs model is designed.
Good Job guys!

@nparley
Copy link

nparley commented Sep 5, 2022

Using a database session inside a Security and Depends also will open two sessions after this PR. Thus if the security method if returning a sqlalchemy the object return from the two methods will be attached to different sessions.

@qmorek
Copy link

qmorek commented Sep 20, 2022

Hi @tiangolo,
maybe my case is kind specific to how I designed my privileges model, but creating multiple db sessions may lead to unexpected errors... I am not sure if I fully understand how dependency cache is working, but maybe it should be divided to 2 parts first scopes dependant and another not?
Can you please have a look on this?

@laggardkernel
Copy link
Contributor Author

laggardkernel commented Sep 21, 2022

To understand dependency cache, we may start from what features are added by FastAPI upon Starlette.

  • When a request is received from the client at the server, it's received as bytes
  • The ASGI server parse the bytes to environ, recv(), send() (ASGI standard)
  • Starlette abstract the environ dict and recv() as a Request, which is a more developer-friendly Python object.

In the view/router function, we parse the Request (e.g. it's query param, json body, form body, etc), get the info we want, and compose a Response object.

FastAPI is capable to let developers declare the query param, or form, or json body as arguments of the view function. It helps us use the more finer part of Request directly without using the whole Request.

The introduction of dependency (Depends, Security) makes FastAPI more powerful. We can declare some composite value based on the primary data (query param, json body, form element, etc). e.g.

async def sum_c(a: int = 0, b: int = 0):
    return a + b

@app.get("/sum")
async def calc_sum(c: int = Depends(sum_c)):
    return c

In the above example, the view func doesn't accept query param a or b, but a composite param named c, which is the sum of a + b.

To avoid duplicated, high cost calculation in some Depends, FastAPI caches the result.

  • For Depends, the cache key is based on the func wrapped by Depends
  • For Security (subclass of Depends), the cache key is based on the wrapped func and param scopes. (Formerly scopes are ignored cause a bug in the implementation, and this PR fixed it.)

nparley qmorek For your questions, you may separate db session initialization out of existing Securitys as a standalone Depends, and re-used it as an argument in the existing multiple Securitys.
This makes the new Depends, namely db session initialization process, has its own cache key, being executed only once, and not affected by scopes of Security anymore.
(Nope. I forgot that scopes is inherited in the dependency chain. Two Depends under two different Security have different cache key.)

@qmorek
Copy link

qmorek commented Sep 21, 2022

Hi @laggardkernel,
thank you very much for your answer.

could you eplain what you mean by:

you may separate db session initialization out of existing Securitys as a standalone Depends, and re-used it as an argument in the existing multiple Securitys.

@qmorek
Copy link

qmorek commented Sep 21, 2022

I took example from: https://fastapi.tiangolo.com/advanced/security/oauth2-scopes/#dependency-tree-and-scopes
and modified it slightly (paswords are in cleartext, but it is not important here) with db dependency as in:
/~https://github.com/tiangolo/full-stack-fastapi-postgresql/blob/master/%7B%7Bcookiecutter.project_slug%7D%7D/backend/app/app/api/deps.py
and
/~https://github.com/tiangolo/full-stack-fastapi-postgresql/blob/master/%7B%7Bcookiecutter.project_slug%7D%7D/backend/app/app/api/api_v1/endpoints/items.py

from typing import Dict
from datetime import datetime, timedelta
from typing import List, Union

from fastapi import Depends, FastAPI, HTTPException, Security, status
from fastapi.security import (
    OAuth2PasswordBearer,
    OAuth2PasswordRequestForm,
    SecurityScopes,
)
from jose import JWTError, jwt
from pydantic import BaseModel, ValidationError

# to get a string like this run:
# openssl rand -hex 32
SECRET_KEY = "09d25e094faa6ca2556c818166b7a9563b93f7099f6f0f4caa6cf63b88e8d3e7"
ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_MINUTES = 30


fake_dict_db = {
    "users": {
        "johndoe": {
            "username": "johndoe",
            "full_name": "John Doe",
            "email": "johndoe@example.com",
            "cleartext_password": "password",
            "disabled": False,
        },
        "alice": {
            "username": "alice",
            "full_name": "Alice Chains",
            "email": "alicechains@example.com",
            "cleartext_password": "password",
            "disabled": True,
        },
    },
    "items": [
        {
            "name": "First"
        },
        {
            "name": "Second"
        }
    ]
}


class Token(BaseModel):
    access_token: str
    token_type: str


class TokenData(BaseModel):
    username: Union[str, None] = None
    scopes: List[str] = []


class User(BaseModel):
    username: str
    email: Union[str, None] = None
    full_name: Union[str, None] = None
    disabled: Union[bool, None] = None


class UserInDB(User):
    cleartext_password: str

oauth2_scheme = OAuth2PasswordBearer(
    tokenUrl="token",
    scopes={"me": "Read information about the current user.", "items": "Read items."},
)

app = FastAPI()


def verify_password(plain_password, db_password):
    return plain_password == db_password


def get_user(db, username: str):
    users_db = db['users']
    if username in users_db:
        user_dict = users_db[username]
        return UserInDB(**user_dict)


def authenticate_user(fake_db, username: str, password: str):
    user = get_user(fake_db, username)
    if not user:
        return False
    if not verify_password(password, user.cleartext_password):
        return False
    return user


def create_access_token(data: dict, expires_delta: Union[timedelta, None] = None):
    to_encode = data.copy()
    if expires_delta:
        expire = datetime.utcnow() + expires_delta
    else:
        expire = datetime.utcnow() + timedelta(minutes=15)
    to_encode.update({"exp": expire})
    encoded_jwt = jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)
    return encoded_jwt


def get_db() -> Dict:
    print('get_db called')
    return fake_dict_db


async def get_current_user(
    security_scopes: SecurityScopes,
    db: Dict = Depends(get_db),
    token: str = Depends(oauth2_scheme),
):
    print('required scopes are:', security_scopes.scope_str)
    if security_scopes.scopes:
        authenticate_value = f'Bearer scope="{security_scopes.scope_str}"'
    else:
        authenticate_value = f"Bearer"
    credentials_exception = HTTPException(
        status_code=status.HTTP_401_UNAUTHORIZED,
        detail="Could not validate credentials",
        headers={"WWW-Authenticate": authenticate_value},
    )
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        username: str = payload.get("sub")
        if username is None:
            raise credentials_exception
        token_scopes = payload.get("scopes", [])
        token_data = TokenData(scopes=token_scopes, username=username)
    except (JWTError, ValidationError):
        raise credentials_exception
    user = get_user(db, username=token_data.username)
    if user is None:
        raise credentials_exception
    for scope in security_scopes.scopes:
        if scope not in token_data.scopes:
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
                detail="Not enough permissions",
                headers={"WWW-Authenticate": authenticate_value},
            )
    return user


async def get_current_active_user(
    current_user: User = Security(get_current_user, scopes=["me"])
):
    if current_user.disabled:
        raise HTTPException(status_code=400, detail="Inactive user")
    return current_user


@app.post("/token", response_model=Token)
async def login_for_access_token(
    db: Dict = Depends(get_db),
    form_data: OAuth2PasswordRequestForm = Depends(),
):
    user = authenticate_user(db, form_data.username, form_data.password)
    if not user:
        raise HTTPException(status_code=400, detail="Incorrect username or password")
    access_token_expires = timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
    access_token = create_access_token(
        data={"sub": user.username, "scopes": form_data.scopes},
        expires_delta=access_token_expires,
    )
    return {"access_token": access_token, "token_type": "bearer"}


@app.get("/users/me/", response_model=User)
async def read_users_me(current_user: User = Depends(get_current_active_user)):
    return current_user


@app.get("/users/me/items/")
async def read_own_items(
    db: Dict = Depends(get_db),
    current_user: User = Security(get_current_active_user, scopes=["items"])
):
    return {
        'items': db['items'],
        'current_user': current_user
    }


@app.get("/status/")
async def read_system_status(current_user: User = Depends(get_current_user)):
    return {"status": "ok"}

if you run this example code and call endpoint: /users/me/items/

you will see in console:

get_db called
get_db called
required scopes are: items me

so as you see get_db dependency is called twice...

Maybe my modifications are not correct, but it consistent with project template creted by tiangolo: /~https://github.com/tiangolo/full-stack-fastapi-postgresql
so in /users/me/items/ endpoint I added db dependency Depends(get_db) as I need to get items from fake db...

I am aware that it was created before this bug was discovered, so may also not be correct in this matter, but I would love to know how to do this correctly....

@laggardkernel
Copy link
Contributor Author

@qmorek Sorry, my former solution is wrong. I forgot something, the scopes is inherited by children in the dependency tree. In your example,

@app.get("/users/me/items/")
async def read_own_items(
    db: Dict = Depends(get_db),
    current_user: User = Security(get_current_active_user, scopes=["items"])
):

async def get_current_active_user(
    current_user: User = Security(get_current_user, scopes=["me"])
):

async def get_current_user(
    security_scopes: SecurityScopes,
    db: Dict = Depends(get_db),
    token: str = Depends(oauth2_scheme),
):

Depends(get_db) param in read_own_items() and Depend(get_db) from get_current_user() <- get_current_active_user() <- read_own_items() has different cache keys. The later inherits scopes "items" and "me" as part of the cache key. It makes get_db() executed twice during one request.

confirmed by printing dependency_cache

Two different cache keys

  • (<function get_db at 0x10eeebf70>, ())
  • (<function get_db at 0x10eeebf70>, ('items', 'me'))
get_db called: 2
dependency_cache: {(<function get_db at 0x10eeebf70>, ()): {'items': [{'name': 'First'},
                                                    {'name': 'Second'}],
                                          'users': {'alice': {'cleartext_password': 'password',
                                                              'disabled': True,
                                                              'email': 'alicechains@example.com',
                                                              'full_name': 'Alice '
                                                                           'Chains',
                                                              'username': 'alice'},
                                                    'johndoe': {'cleartext_password': 'password',
                                                                'disabled': False,
                                                                'email': 'johndoe@example.com',
                                                                'full_name': 'John '
                                                                             'Doe',
                                                                'username': 'johndoe'}}},
 (<function get_db at 0x10eeebf70>, ('items', 'me')): {'items': [{'name': 'First'},
                                                                 {'name': 'Second'}],
                                                       'users': {'alice': {'cleartext_password': 'password',
                                                                           'disabled': True,
                                                                           'email': 'alicechains@example.com',
                                                                           'full_name': 'Alice '
                                                                                        'Chains',
                                                                           'username': 'alice'},
                                                                 'johndoe': {'cleartext_password': 'password',
                                                                             'disabled': False,
                                                                             'email': 'johndoe@example.com',
                                                                             'full_name': 'John '
                                                                                          'Doe',
                                                                             'username': 'johndoe'}}}}

@tiangolo It seems making scopes part of cache key may have a side effect because of the inheritance and merging of scopes in the dependency tree.
But this very inheritance also makes sense when we do this out of security concerns. What's your idea?

@laggardkernel
Copy link
Contributor Author

@tiangolo I have a solution but not sure if this contradicts the original security design.

  • scopes is still passed down in the dependency tree
  • but scopes is taken into cache key if the child node is still Security while not Depends in the dependency tree

@qmorek
Copy link

qmorek commented Dec 8, 2022

@tiangolo would you be able to have a look on this please?

@nyte-owl
Copy link

This issue of scopes affecting Depends cache_key bit me as well in #9513. Curious what the solution to this could be as it doesn't seem intended. 🤔

@DurandA
Copy link

DurandA commented Jun 27, 2023

I am also affected by this issue and opened a discussion (#6024).

@tiangolo I have a solution but not sure if this contradicts the original security design.

* `scopes` is still passed down in the dependency tree

* but `scopes` is taken into cache key if the child node is still `Security` while not `Depends` in the dependency tree

@laggardkernel This is the way I wanted to implement a fix when poking with the cache code. I could also make a PR if we get some indication from @tiangolo that this approach is acceptable. Also we could be on the conservative side and introduce a new parameter to Security to allow dropping the scopes of non-security Depends in cache keys. I am however not sure of how this parameter should be passed down the dependency chain.

@DurandA
Copy link

DurandA commented Jul 3, 2023

I created #9790 to fix this issue. @laggardkernel Let me know what you think of it.

@Barnyclark
Copy link

Is there a timeline for this to be PR to be merged? This bug is causing a few issues with running FastAPI in production because it halves our available db connections - every request requires two db sessions.

@Barnyclark
Copy link

@DurandA we have been doing some more testing on this and have found that if you set up the DB using middleware you don't have this issue. This is because although you still have two entries in the cache for get_db they both point to the same db connection - e.g:

@app.middleware("http")
async def db_session_middleware(request: Request, call_next):
    response = Response("Internal server error", status_code=500)
    try:
        request.state.db = SessionLocal()
        response = await call_next(request)
    finally:
        request.state.db.close()
    return response
def get_db(request: Request):
    return request.state.db

This approach also limits the amount of refactoring needed because you are still using get_db as a dependancy.

@DurandA
Copy link

DurandA commented Aug 7, 2023

@DurandA we have been doing some more testing on this and have found that if you set up the DB using middleware you don't have this issue.

Thank you for sharing your solution. IMHO, this is clean enough to be put in production while waiting for an update on this. The only downside I see is that a session is established regardless of whether a DB session is required by the request.

@DurandA
Copy link

DurandA commented Aug 8, 2023

The only downside I see is that a session is established regardless of whether a DB session is required by the request.

Hi @brian-montgomery. You provided an interesting solution to this problem in a comment here but it somehow disappeared. Was there any particular issue with your proposal?

@YuriiMotov
Copy link
Contributor

We continue to discuss the mentioned problem (about dependency caching) caused by this PR.
Could you take a look and give your opinion?
#6024 (comment)

@laggardkernel , @qmorek , @nyte-owl , @Barnyclark

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants