Add new option to run just new jobs or jobs without history #831

nille02 · 2024-11-10T12:34:22Z

My problem is that I have some url.yaml files with hundreds to thousands of jobs. A full run can take hours, so I generally do one only once a month or even less often.
Before this option, I checked for the new IDs and ran them manually, but that was quite a chore, so I looked for a way to avoid it.

My only issue is that get_history_data() is quite slow, but I guess SQLite is the reason?
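A minimal sketch of the idea behind the option, selecting only jobs that have no stored history yet. The `Job` and `MemoryStorage` classes and the `jobs_without_history()` helper are hypothetical stand-ins for illustration, not urlwatch's actual classes, and the `has_history_data()` signature shown here is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical minimal job: a stable identifier plus the watched URL.
    guid: str
    url: str


class MemoryStorage:
    """Hypothetical in-memory stand-in for a cache storage backend."""

    def __init__(self, history):
        # Maps a job guid to the list of snapshots stored for it.
        self._history = history

    def has_history_data(self, guid):
        # Assumed signature: True if at least one snapshot exists.
        return bool(self._history.get(guid))


def jobs_without_history(jobs, storage):
    """Return the jobs that have never produced a stored snapshot."""
    return [job for job in jobs if not storage.has_history_data(job.guid)]


if __name__ == '__main__':
    storage = MemoryStorage({'a': ['snapshot-1'], 'b': []})
    jobs = [Job('a', 'https://example.org/a'),
            Job('b', 'https://example.org/b'),
            Job('c', 'https://example.org/c')]
    for job in jobs_without_history(jobs, storage):
        print('would run:', job.url)
```

Only the jobs returned by the filter would need to be run, which avoids a full pass over every entry in a large job list.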

thp

See comments.

lib/urlwatch/command.py

lib/urlwatch/config.py

lib/urlwatch/storage.py

nille02 · 2024-11-26T19:20:07Z

I added the Redis version of has_history_data() as well, but it just wraps get_history_data() since Redis is already fast.

For this benchmark, I used a new cache.db and a Redis server running on a Debian stable VM.

Redis          Benchmark: 864 Jobs took 0:00:00.680740
sqllite        Benchmark: 864 Jobs took 0:00:30.835610
sqllite Cached Benchmark: 864 Jobs took 0:00:00.322071

The measured time covers only the loop over all jobs that calls has_history_data() or get_history_data().
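A rough sketch of how such a loop could be timed, comparing an existence check against a full history fetch in SQLite. The table layout, the index, and the `make_db()`, `has_history()`, `get_history()` and `benchmark()` helpers are assumptions made for illustration; they are not urlwatch's actual schema or storage code, and the numbers this prints will not match the ones above.

```python
import sqlite3
from datetime import datetime


def make_db(n_jobs=100, snapshots_per_job=5):
    # Hypothetical schema: one row per stored snapshot of a job.
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE history (guid TEXT, timestamp INTEGER, data TEXT)')
    conn.execute('CREATE INDEX history_guid ON history (guid)')
    rows = [(f'job-{i}', t, 'x' * 1000)
            for i in range(n_jobs) for t in range(snapshots_per_job)]
    conn.executemany('INSERT INTO history VALUES (?, ?, ?)', rows)
    conn.commit()
    return conn


def get_history(conn, guid):
    # Full fetch: loads every snapshot body for the job.
    rows = conn.execute(
        'SELECT timestamp, data FROM history WHERE guid = ?', (guid,)).fetchall()
    return {timestamp: data for timestamp, data in rows}


def has_history(conn, guid):
    # Existence check: stops at the first matching row and loads no data.
    row = conn.execute(
        'SELECT 1 FROM history WHERE guid = ? LIMIT 1', (guid,)).fetchone()
    return row is not None


def benchmark(label, check, conn, guids):
    # Time only the loop over all jobs, as in the measurement above.
    start = datetime.now()
    hits = sum(1 for guid in guids if check(conn, guid))
    elapsed = datetime.now() - start
    print(f'{label:<14} Benchmark: {len(guids)} Jobs took {elapsed}')
    return hits


if __name__ == '__main__':
    conn = make_db()
    guids = [f'job-{i}' for i in range(100)] + ['new-job']
    benchmark('full fetch', get_history, conn, guids)
    benchmark('exists check', has_history, conn, guids)
```

The existence check answers the only question the option needs (is there any history at all?) without transferring snapshot contents, which is one plausible reason a dedicated has_history_data() can be faster than reusing get_history_data().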

thp

See comments.

lib/urlwatch/config.py

lib/urlwatch/storage.py

thp

Works for me. Please also add a changelog entry to the PR.

nille02 · 2025-01-01T13:58:13Z

Just an entry in CHANGELOG.md? If so, it's done.

And I hope you all had a happy new year.

thp

Looks good, thanks for the updates!

thp · 2025-01-05T09:50:49Z

@nille02 Can you fix up the last remaining code style issues?

----------------------------- Captured stdout call -----------------------------
/home/runner/work/urlwatch/urlwatch/lib/urlwatch/command.py:151:36: E226 missing whitespace around arithmetic operator
/home/runner/work/urlwatch/urlwatch/lib/urlwatch/storage.py:608:89: E231 missing whitespace after ','
/home/runner/work/urlwatch/urlwatch/lib/urlwatch/storage.py:609:115: W504 line break after binary operator
/home/runner/work/urlwatch/urlwatch/lib/urlwatch/storage.py:610:121: E261 at least two spaces before inline comment
=========================== short test summary info ============================

After that, this is ready to merge :)
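For reference, a small sketch of what the four reported checks mean and how each is usually fixed. The flagged lines in command.py and storage.py are not shown in this thread, so the statements below are made-up examples and `style_examples()` is a hypothetical function, not code from the PR.

```python
def style_examples(count, a, b):
    # E226: missing whitespace around arithmetic operator
    # flagged: total = count+1
    total = count + 1

    # E231: missing whitespace after ','
    # flagged: pair = (a,b)
    pair = (a, b)

    # W504: line break after binary operator
    # flagged:
    #     is_ready = (total > 0 and
    #                 a is not None)
    # fixed by moving the operator to the start of the continuation line:
    is_ready = (total > 0
                and a is not None)

    # E261: at least two spaces before inline comment
    # flagged: return total, pair, is_ready # result
    return total, pair, is_ready  # result
```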

nille02 · 2025-01-05T11:10:33Z

Should be fixed now

thp · 2025-01-05T13:13:32Z

There's still 3 more:

/home/runner/work/urlwatch/urlwatch/lib/urlwatch/storage.py:608:89: E231 missing whitespace after ','
/home/runner/work/urlwatch/urlwatch/lib/urlwatch/storage.py:608:108: W291 trailing whitespace
/home/runner/work/urlwatch/urlwatch/lib/urlwatch/storage.py:609:114: W291 trailing whitespace

(you can run this locally to verify it's fixed, no need to wait for CI -- use pytest -v after installing dependencies, see https://github.com/thp/urlwatch/blob/master/.github/workflows/unit-tests.yml)

nille02 · 2025-01-05T14:17:28Z

Now it should be done.

I did try pytest, but I got a different error, and I guess it's due to Windows. So I had to skip the documentation test, but pep8 is happy now.

thp · 2025-01-05T18:48:39Z

Thanks for working on this and seeing it through! Merged :)

nille02 marked this pull request as ready for review November 10, 2024 12:34

thp requested changes Nov 16, 2024

nille02 force-pushed the run-jobs branch 3 times, most recently from ae415cc to c3e02a9 Compare November 25, 2024 08:13

thp reviewed Nov 26, 2024

lib/urlwatch/storage.py (outdated)

nille02 force-pushed the run-jobs branch from c46dd6a to b1df74e Compare November 26, 2024 15:39

nille02 requested a review from thp November 26, 2024 19:20

thp requested changes Dec 10, 2024

lib/urlwatch/config.py (outdated)

lib/urlwatch/storage.py (outdated)

lib/urlwatch/storage.py (outdated)

lib/urlwatch/storage.py (outdated)

nille02 requested a review from thp December 11, 2024 11:36

thp requested changes Dec 21, 2024

nille02 force-pushed the run-jobs branch from 1665be2 to 87f5310 Compare January 1, 2025 13:56

thp approved these changes Jan 5, 2025

nille02 force-pushed the run-jobs branch from 87f5310 to 3c7e1fa Compare January 5, 2025 11:05

Add new option to run just new jobs or jobs without history

a0c1cc0

New command-line option `--prepare-jobs` to initialize new jobs or jobs without history Fix some typos Fix Again

nille02 force-pushed the run-jobs branch from 3c7e1fa to a0c1cc0 Compare January 5, 2025 14:04

thp merged commit b601917 into thp:master Jan 5, 2025
6 checks passed

nille02 deleted the run-jobs branch January 5, 2025 19:30
