
chore: Update DAOs to use singular deletion method instead of bulk #24894

Conversation

@jfrag1 (Member) commented Aug 5, 2023

SUMMARY

This change shouldn't have any impact on the behavior of the application outside of maybe a small performance hit when deleting many objects at once. Note additionally that this is the default method of deletion used by the BaseDAO.

The motivation behind the change is that at Preset we'd like the ability to utilize the after_delete SQLAlchemy listener hooks for these objects, and these aren't triggered for bulk deletes.
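The distinction described above can be sketched with a minimal, self-contained SQLAlchemy example (the `Slice` model here is a hypothetical stand-in, not Superset's actual model): mapper-level `after_delete` listeners fire for ORM deletes via `session.delete()`, but not for bulk `Query.delete()`.

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Slice(Base):
    """Stand-in for a Superset model; not the real Slice."""
    __tablename__ = "slices"
    id = Column(Integer, primary_key=True)
    name = Column(String)

deleted_ids = []

@event.listens_for(Slice, "after_delete")
def _capture(mapper, connection, target):
    # Fires only for ORM unit-of-work deletes.
    deleted_ids.append(target.id)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Slice(id=1), Slice(id=2), Slice(id=3)])
    session.commit()

    # Bulk delete: bypasses the unit of work, so the listener does NOT fire.
    session.query(Slice).filter(Slice.id == 1).delete(synchronize_session=False)
    session.commit()
    assert deleted_ids == []

    # Singular deletes: the listener fires once per object.
    for obj in session.query(Slice).filter(Slice.id.in_([2, 3])).all():
        session.delete(obj)
    session.commit()
```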

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov codecov bot commented Aug 5, 2023

Codecov Report

Merging #24894 (e52cff5) into master (7397ab3) will increase coverage by 0.03%.
Report is 4 commits behind head on master.
The diff coverage is 100.00%.

❗ Current head e52cff5 differs from pull request most recent head 72dadf9. Consider uploading reports for the commit 72dadf9 to get more accurate results

@@            Coverage Diff             @@
##           master   #24894      +/-   ##
==========================================
+ Coverage   68.96%   68.99%   +0.03%     
==========================================
  Files        1906     1906              
  Lines       74122    74107      -15     
  Branches     8208     8208              
==========================================
+ Hits        51116    51132      +16     
+ Misses      20883    20852      -31     
  Partials     2123     2123              
Flag Coverage Δ
hive 54.18% <20.00%> (+0.01%) ⬆️
mysql 79.21% <100.00%> (+<0.01%) ⬆️
postgres 79.31% <100.00%> (?)
presto 54.08% <20.00%> (+0.01%) ⬆️
python 83.37% <100.00%> (+0.08%) ⬆️
sqlite 77.91% <100.00%> (+0.02%) ⬆️
unit 55.06% <20.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Files Changed Coverage Δ
superset/daos/chart.py 92.72% <100.00%> (ø)
superset/daos/dashboard.py 96.68% <100.00%> (ø)
superset/daos/dataset.py 93.38% <100.00%> (+2.21%) ⬆️

... and 9 files with indirect coverage changes


@pull-request-size pull-request-size bot added size/M and removed size/S labels Aug 5, 2023
datasets = (
    db.session.query(SqlaTable)
    .filter(SqlaTable.table_name.in_(self.fixture_tables_names))
    .all()
)
assert datasets == []
for dataset in deleted_datasets:
    setattr(dataset, "_deleted", True)
Member Author:

This test started failing since the cleanup was trying to delete already-deleted datasets. This was fine for some reason before with the bulk-deletes.

@@ -41,16 +41,14 @@ class ChartDAO(BaseDAO[Slice]):

@classmethod
def delete(cls, items: Slice | list[Slice], commit: bool = True) -> None:
    item_ids = [item.id for item in get_iterable(items)]
Member:

@jfrag1 if we're removing the bulk delete logic, i.e., deleting outside of the ORM, then I think there's merit in removing the entire function and falling back to BaseDAO.delete.

Member Author:

I didn't do this here because of this logic which is specific to charts:

for item in get_iterable(items):
    item.dashboards = []
    db.session.merge(item)

@john-bodley (Member) commented Aug 7, 2023:

@jfrag1 the reason this logic is required is likely because for non-SQLAlchemy ORM operations, i.e., bulk deletion, the ON DELETE CASCADE isn't configured. If one were to simply remove the bulk deletion logic, then the BaseDAO.delete method should suffice.
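One way this plays out at the ORM level (a sketch with hypothetical stand-in models, not Superset's actual schema): when the unit of work deletes an object, rows in a many-to-many `secondary` table are removed automatically, which is why the manual `item.dashboards = []` clearing is only needed for deletion paths that bypass the ORM.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Hypothetical association table mirroring dashboard<->slice links.
dashboard_slices = Table(
    "dashboard_slices",
    Base.metadata,
    Column("dashboard_id", ForeignKey("dashboards.id"), primary_key=True),
    Column("slice_id", ForeignKey("slices.id"), primary_key=True),
)

class Dashboard(Base):
    __tablename__ = "dashboards"
    id = Column(Integer, primary_key=True)

class Slice(Base):
    __tablename__ = "slices"
    id = Column(Integer, primary_key=True)
    dashboards = relationship(Dashboard, secondary=dashboard_slices)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    dash = Dashboard(id=1)
    chart = Slice(id=1, dashboards=[dash])
    session.add_all([dash, chart])
    session.commit()

    # ORM delete: the unit of work removes the association rows itself,
    # so no manual chart.dashboards = [] is needed.
    session.delete(chart)
    session.commit()
    remaining = session.execute(select(dashboard_slices)).all()
```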

@john-bodley (Member) commented Aug 9, 2023:

@jfrag1 I authored #24938 and #24939 which should help provide some clarity, i.e., that said code block is redundant as we're leaning on the database schema.

Member Author:

Updated the chart/dashboard DAOs to fall back to BaseDAO.delete.

@@ -185,17 +185,15 @@ def update_charts_owners(model: Dashboard, commit: bool = True) -> Dashboard:

@classmethod
def delete(cls, items: Dashboard | list[Dashboard], commit: bool = True) -> None:
    item_ids = [item.id for item in get_iterable(items)]
Member:

See my previous comment.

Member Author:

Same here; there's some logic which is specific to dashboards:

connection = db.session.connection()
mapper = next(iter(cls.model_cls.registry.mappers)) # type: ignore

for item in get_iterable(items):
Member:

Per the PR description this method actually should handle bulk deletion correctly.

Member Author:

There are other custom after_delete listeners we'd like to trigger beyond just security_manager.dataset_after_delete.

datasets = (
    db.session.query(SqlaTable)
    .filter(SqlaTable.table_name.in_(self.fixture_tables_names))
    .all()
)
assert datasets == []
for dataset in deleted_datasets:
    setattr(dataset, "_deleted", True)
Member:

This approach doesn't seem right, i.e., we shouldn't need to be setting/getting hidden attributes. If I were you, I would re-evaluate the failing tests and see what the issue is. Maybe the results weren't flushed or committed?

Member Author:

Updated with a better method of checking whether the object was deleted
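One way to make that check without hidden attributes (a sketch with a hypothetical stand-in model, not Superset's actual SqlaTable): re-query the database after the delete and assert the row is gone, rather than tagging instances with a private `_deleted` flag.

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class SqlaTableStub(Base):
    """Hypothetical stand-in for Superset's SqlaTable."""
    __tablename__ = "tables"
    id = Column(Integer, primary_key=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(SqlaTableStub(id=42))
    session.commit()

    session.delete(session.get(SqlaTableStub, 42))
    session.commit()

    # Authoritative deletion check: the row is gone from the database.
    remaining = session.get(SqlaTableStub, 42)

assert remaining is None
```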

mapper = next(iter(cls.model_cls.registry.mappers))  # type: ignore

for item in get_iterable(items):
    security_manager.dataset_after_delete(mapper, connection, item)


Is this logic still valid to be removed?


Should there be any test failing if we remove this? I think we should add a test here to catch whether the vm is removed once the dataset is deleted.

Member Author:

Yes, when doing singular deletes this is triggered via this listener: /~https://github.com/apache/superset/blob/master/superset/connectors/sqla/models.py#L1585. The logic was only there because it was not triggered by the bulk delete method.

@jfrag1 jfrag1 requested a review from john-bodley August 7, 2023 17:27
@eschutho eschutho merged commit 4a59a26 into apache:master Aug 18, 2023
@eschutho eschutho deleted the jack/do-single-deletes-rather-than-bulk-in-daos branch August 18, 2023 00:00
@michael-s-molina michael-s-molina added v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch and removed v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch labels Aug 18, 2023
jinghua-qa pushed a commit to preset-io/superset that referenced this pull request Aug 18, 2023
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024