Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: cache functions and fake relations in filters #127

Merged
merged 8 commits into from
Jun 5, 2023
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .bumpversion.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.27.0
current_version = 0.28.0

[bumpversion:file:pyproject.toml]

Expand Down
5 changes: 4 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,10 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

# [0.27.0] - YYYY-MM-DD
# [0.28.0] - 2023-05-08
- [PR 125](/~https://github.com/salesforce/django-declarative-apis/pull/127) Add function and fake relations caching in filters

# [0.27.0] - 2023-04-27
- [PR 125](/~https://github.com/salesforce/django-declarative-apis/pull/125) Support resources that are not django models

# [0.26.0] - 2023-03-08
Expand Down
96 changes: 65 additions & 31 deletions django_declarative_apis/machinery/filtering.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,11 +11,12 @@
import types

from django.conf import settings
from django.core.exceptions import FieldDoesNotExist
from django.db import models
from django.db.models import ManyToOneRel
from django.core.exceptions import FieldDoesNotExist
from django.db.models.fields.reverse_related import ForeignObjectRel

logger = logging.getLogger()
logger = logging.getLogger(__name__)

NEVER = 0
ALWAYS = 1
Expand All @@ -38,7 +39,7 @@ def expandable(model_class=None, display_key=None, inst_field_name=None):
if model_class and display_key:
try:
model_class._meta.get_field(display_key)
except models.FieldDoesNotExist as e: # noqa
except FieldDoesNotExist:
raise ValueError(f"{display_key} is not a field on {model_class.__name__}")
return _ExpandableForeignKey(display_key, model_class, inst_field_name)

Expand Down Expand Up @@ -67,6 +68,36 @@ def _get_unexpanded_field_value(inst, field_name, field_type):
return {display_key: getattr(obj, display_key)}


def _is_relation(field):
return field.is_relation or getattr(field, "is_fake_relation", False)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this adds support for fake relationships. In example CharFields acting as foreign keys.



def _is_reverse_relation(field):
return isinstance(field, ForeignObjectRel)


def _cache_related_instance(inst, field_name, model_cache):
try:
field_meta = inst._meta.get_field(field_name)
except (AttributeError, FieldDoesNotExist):
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why we're returning None here, shouldn't we bubble up the error? The user is trying to cache a field that doesn't exist

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that field_name is a consumer defined string for specifying the output field name after filtering it. So it can have aj different name of the actual field. For example:

FILTERS = {
    'my_increased_counter': lambda inst: inst.counter.count + 1
}

Here field_name would be my_increased_counter which doesn't exists in the model.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but shouldn't we alert the user that the field doesn't exist somehow?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code won't break if the field doesn't exists. It just won't get cached.

Or do mean alerting the user that using my_increased_counter as field name he won't get caching? In that case I'm not sure how much useful it is but yeah we could add some logging

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what you're saying now. This PR looks good, approved!

if not _is_relation(field_meta) or _is_reverse_relation(field_meta):
return # no `attname` in reverse relations
val_pk = getattr(inst, field_meta.attname)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we just wrap this in a try catch block and log the error or reraise the error?

Copy link
Collaborator Author

@lukka5 lukka5 Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good point. I had run over the code again to be sure and the problem is that some fields have attname but are not relations so we cannot cache them

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok.

val_cls = field_meta.related_model
cache_key = _make_model_cache_key(val_cls, val_pk)
if cache_key in model_cache:
logger.debug("ev=model_cache, status=hit, key=%s", cache_key)
related_instance = model_cache[cache_key]
else:
logger.debug("ev=model_cache, status=miss, key=%s", cache_key)
related_instance = getattr(inst, field_name)
if not isinstance(related_instance, models.Model):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems very generic, can't we use the instance type of val_cls

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. yeah, I like it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added it

return
model_cache[cache_key] = related_instance
setattr(inst, field_name, related_instance)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idea here is to pre-populate the related instance so if the function decides to access it or use it (normally it does) we don't do another query to the db. In example:

FILTERS = {
    'counter': lambda inst: inst.counter.count + 1
}

When that function filter is called counter has already been populated by a counter from cache or stored for later used by another filter function.



def _get_filtered_field_value( # noqa: C901
inst,
field_name,
Expand All @@ -75,12 +106,15 @@ def _get_filtered_field_value( # noqa: C901
expand_this,
expand_children,
filter_cache,
model_cache,
):
# get the value from inst
if field_type == NEVER:
return None

if isinstance(field_type, types.FunctionType):
if is_caching_enabled():
_cache_related_instance(inst, field_name, model_cache)
val = field_type(inst)
elif isinstance(field_type, _ExpandableForeignKey):
if expand_this:
Expand All @@ -93,7 +127,7 @@ def _get_filtered_field_value( # noqa: C901
if isinstance(inst, (models.Model)):
try:
field_meta = inst._meta.get_field(field_name)
if field_meta.is_relation:
if _is_relation(field_meta):
val_pk = getattr(inst, field_meta.attname)
val_cls = field_meta.related_model
val_expand_children = expand_children.get(field_name, {})
Expand All @@ -110,7 +144,7 @@ def _get_filtered_field_value( # noqa: C901
pass

val = getattr(inst, field_name)
except (AttributeError, FieldDoesNotExist) as e: # noqa
except (AttributeError, FieldDoesNotExist):
return None

if isinstance(val, models.Manager):
Expand All @@ -122,11 +156,7 @@ def _get_filtered_field_value( # noqa: C901
val, (list, tuple, models.Model, models.query.QuerySet)
):
val = _apply_filters_to_object(
val,
filter_def,
expand_children=expand_children,
klass=val.__class__,
filter_cache=filter_cache,
val, filter_def, expand_children, val.__class__, filter_cache, model_cache
)

if (
Expand All @@ -144,18 +174,17 @@ def _make_filter_cache_key(expand_children, klass, pk):
return (str(expand_children), klass, str(pk))


def _make_model_cache_key(klass, pk):
return (klass, str(pk))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will this key look like if we have to manually inspect it in cache? Can we have a more human readable cache key here, maybe something like {klass.__cache__.__name}:{str(pk)}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nevermind, i see it matches the style of the cache key of the other cache



# TODO: make this method less complex and remove the `noqa`
def _apply_filters_to_object( # noqa: C901
inst,
filter_def,
expand_children=None,
klass=None,
filter_cache=None,
inst, filter_def, expand_children, klass, filter_cache, model_cache
):
is_cacheable = False
caching_enabled = getattr(settings, "DDA_FILTER_MODEL_CACHING_ENABLED", False)
if (
caching_enabled
is_caching_enabled()
and isinstance(inst, (models.Model,))
and filter_cache is not None
):
Expand All @@ -173,9 +202,10 @@ def _apply_filters_to_object( # noqa: C901
_apply_filters_to_object(
item,
filter_def,
expand_children=expand_children,
klass=item.__class__,
filter_cache=filter_cache,
expand_children,
item.__class__,
filter_cache,
model_cache,
)
for item in inst
]
Expand All @@ -184,9 +214,10 @@ def _apply_filters_to_object( # noqa: C901
k: _apply_filters_to_object(
v,
filter_def,
expand_children=expand_children,
klass=v.__class__,
filter_cache=filter_cache,
expand_children,
v.__class__,
filter_cache,
model_cache,
)
for k, v in inst.items()
}
Expand All @@ -200,9 +231,10 @@ def _apply_filters_to_object( # noqa: C901
result = _apply_filters_to_object(
inst,
filter_def,
expand_children=expand_children,
klass=base_class,
filter_cache=filter_cache,
expand_children,
base_class,
filter_cache,
model_cache,
)
break

Expand All @@ -215,11 +247,7 @@ def _apply_filters_to_object( # noqa: C901
result = defaultdict(list)
for base in klass.__bases__:
filtered_ancestor = _apply_filters_to_object(
inst,
filter_def,
expand_children=expand_children,
klass=base,
filter_cache=filter_cache,
inst, filter_def, expand_children, base, filter_cache, model_cache
)
if filtered_ancestor:
result.update(filtered_ancestor)
Expand All @@ -243,6 +271,7 @@ def _apply_filters_to_object( # noqa: C901
expand_this=field_name in expand_children,
expand_children=expand_children.get(field_name, {}),
filter_cache=filter_cache,
model_cache=model_cache,
)

if value is not None and value != DEFAULT_UNEXPANDED_VALUE:
Expand All @@ -269,6 +298,10 @@ def _compile_expansion(expand_fields):
return top


def is_caching_enabled():
return getattr(settings, "DDA_FILTER_MODEL_CACHING_ENABLED", False)


def apply_filters_to_object(inst, filter_def, expand_header=""):
if expand_header:
expand_dict = _compile_expansion(expand_header.split(","))
Expand All @@ -280,4 +313,5 @@ def apply_filters_to_object(inst, filter_def, expand_header=""):
expand_children=expand_dict,
klass=inst.__class__,
filter_cache={},
model_cache={},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decided to add a new cache instead of reusing filter_cache since they key/values types are completely different from filter cache.

Filter cache using is storing

  • key: (expandables, class, pk)
  • value: defaultdict(...) with filtered/formatted results

Model cache:

  • key: (class, pk)
  • value: model instances

)
2 changes: 1 addition & 1 deletion docs/source/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@
# built documents.

# The full version, including alpha/beta/rc tags.
release = '0.27.0' # set by bumpversion
release = '0.28.0' # set by bumpversion

# The short X.Y version.
version = release.rsplit('.', 1)[0]
Expand Down
2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "django-declarative-apis"
version = "0.27.0" # set by bumpversion
version = "0.28.0" # set by bumpversion
description = "Simple, readable, declarative APIs for Django"
readme = "README.md"
dependencies = [
Expand Down
10 changes: 10 additions & 0 deletions tests/filters.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,12 +36,22 @@
"parent_field": expandable(model_class=models.ParentModel),
"parents": expandable(model_class=models.ParentModel),
},
}

INEFFICIENT_FILTERS = {
models.InefficientLeaf: {"id": ALWAYS},
models.InefficientBranchA: {"leaf": ALWAYS},
models.InefficientBranchB: {"leaf": ALWAYS},
models.InefficientRoot: {"branch_a": ALWAYS, "branch_b": ALWAYS},
}

INEFFICIENT_FUNCTION_FILTERS = {
models.InefficientLeaf: {"id": ALWAYS},
models.InefficientBranchA: {"leaf": lambda inst: inst.leaf},
models.InefficientBranchB: {"leaf": lambda inst: inst.leaf},
models.InefficientRoot: {"branch_a": ALWAYS, "branch_b": ALWAYS},
}

RENAMED_EXPANDABLE_MODEL_FIELDS = {
str: ALWAYS,
int: ALWAYS,
Expand Down
47 changes: 36 additions & 11 deletions tests/machinery/test_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -717,23 +717,48 @@ def setUp(self):
)
self.root_id = root.id

def test_filter_query_cache_reduces_queries(self):
def test_filter_cache_reduces_queries(self):
root = models.InefficientRoot.objects.get(id=self.root_id)
with self.assertNumQueries(4):
filtering.apply_filters_to_object(root, filters.INEFFICIENT_FILTERS)

root = models.InefficientRoot.objects.get(id=self.root_id)
with self.assertNumQueries(3), override_settings(
DDA_FILTER_MODEL_CACHING_ENABLED=True, DDA_FILTER_CACHE_DEBUG_LOG=True
):
filtering.apply_filters_to_object(root, filters.INEFFICIENT_FILTERS)

def test_model_cache_reduces_queries(self):
root = models.InefficientRoot.objects.get(id=self.root_id)
with self.assertNumQueries(4):
filtering.apply_filters_to_object(
root,
filters.DEFAULT_FILTERS,
root, filters.INEFFICIENT_FUNCTION_FILTERS
)

root = models.InefficientRoot.objects.get(id=self.root_id)
with self.assertNumQueries(3):
with override_settings(
DDA_FILTER_MODEL_CACHING_ENABLED=True, DDA_FILTER_CACHE_DEBUG_LOG=True
):
filtering.apply_filters_to_object(
root,
filters.DEFAULT_FILTERS,
)
with self.assertNumQueries(3), override_settings(
DDA_FILTER_MODEL_CACHING_ENABLED=True, DDA_FILTER_CACHE_DEBUG_LOG=True
):
filtering.apply_filters_to_object(
root, filters.INEFFICIENT_FUNCTION_FILTERS
)


class FilterCachingFakeRelationTestCase(FilterCachingTestCase):
def setUp(self):
leaf = models.InefficientLeaf.objects.create(id=1)
branch_a = models.InefficientBranchA.objects.create(id=1, leaf=leaf)
branch_b = models.InefficientBranchB.objects.create(id=1, leaf=leaf)
root = models.InefficientRoot.objects.create(
id=4, branch_a=branch_a, branch_b=branch_b
)
self.root_id = root.id
self._mark_as_fake_relation(branch_a._meta.get_field("leaf"))
self._mark_as_fake_relation(branch_b._meta.get_field("leaf"))

def _mark_as_fake_relation(self, field):
field.is_relation = False
field.is_fake_relation = True


class ResourceUpdateEndpointDefinitionTestCase(
Expand Down