-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
LockStore: fix acquiring a lock via LockStore.try_acquire_lock
#12832
LockStore: fix acquiring a lock via LockStore.try_acquire_lock
#12832
Conversation
bf1843f
to
b565acf
Compare
It turns out that the test I posted is poorly written: it doesn't run the three tasks to completion. Previously all three tasks would acquire the worker lock and be waiting for a database transaction to release the lock at the end of the test case. After the fix, at the end of
Python's GC runs and throws a We can ensure all three tasks have finished like so: --- a/tests/storage/databases/main/test_lock.py
+++ b/tests/storage/databases/main/test_lock.py
@@ -12,7 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-from twisted.internet import defer
+from twisted.internet import defer, reactor
+from twisted.internet.base import ReactorBase
from twisted.internet.defer import Deferred
from synapse.server import HomeServer
@@ -51,9 +52,9 @@ class LockTestCase(unittest.HomeserverTestCase):
in_lock -= 1
# Start 3 tasks.
- defer.ensureDeferred(task())
- defer.ensureDeferred(task())
- defer.ensureDeferred(task())
+ task1 = defer.ensureDeferred(task())
+ task2 = defer.ensureDeferred(task())
+ task3 = defer.ensureDeferred(task())
# All three are now waiting for the database.
# Give the reactor a kick so that the database transaction returns.
@@ -61,6 +62,18 @@ class LockTestCase(unittest.HomeserverTestCase):
release_lock.callback(None)
+ # Run the tasks to completion.
+ # To work around `Linearizer`s using a different reactor to sleep when
+ # contended (#12841), we call `runUntilCurrent` on
+ # `twisted.internet.reactor`, which is a different reactor to that used
+ # by the homeserver.
+ assert isinstance(reactor, ReactorBase)
+ self.get_success(task1)
+ reactor.runUntilCurrent()
+ self.get_success(task2)
+ reactor.runUntilCurrent()
+ self.get_success(task3)
+
# At most one task should have held the lock at a time.
self.assertEqual(max_in_lock, 1) but note that the "All three are now waiting for the database." comment is no longer accurate. |
4dde928
to
778bc7b
Compare
nb: currently blocks #12484 |
This commit proves that there is a bug in LockStore.try_acquire_lock, and provides a test case that must pass. Signed-off-by: Sumner Evans <sumner@beeper.com> Co-authored-by: Sean Quah <seanq@matrix.org>
This fixes the "test_acquire_contention" test case added in the previous commit. Signed-off-by: Sumner Evans <sumner@beeper.com>
Signed-off-by: Sumner Evans <sumner@beeper.com>
778bc7b
to
b029f3a
Compare
This depends on the new newly-fixed locking mechanism from matrix-org#12832 to prevent races. Signed-off-by: Sumner Evans <sumner@beeper.com>
This depends on the new newly-fixed locking mechanism from matrix-org#12832 to prevent races. Signed-off-by: Sumner Evans <sumner@beeper.com>
This depends on the new newly-fixed locking mechanism from matrix-org#12832 to prevent races. Signed-off-by: Sumner Evans <sumner@beeper.com>
This depends on the new newly-fixed locking mechanism from matrix-org#12832 to prevent races. Signed-off-by: Sumner Evans <sumner@beeper.com>
This depends on the new newly-fixed locking mechanism from matrix-org#12832 to prevent races. Signed-off-by: Sumner Evans <sumner@beeper.com>
Synapse 1.61.0 (2022-06-14) =========================== This release removes support for the non-standard feature known both as 'groups' and as 'communities', which have been superseded by *Spaces*. See [the upgrade notes](/~https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1610) for more details. Improved Documentation ---------------------- - Mention removed community/group worker endpoints in [upgrade.md](/~https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1610s). Contributed by @olmari. ([\matrix-org#13023](matrix-org#13023)) Synapse 1.61.0rc1 (2022-06-07) ============================== Features -------- - Add new `media_retention` options to the homeserver config for routinely cleaning up non-recently accessed media. ([\matrix-org#12732](matrix-org#12732), [\matrix-org#12972](matrix-org#12972), [\matrix-org#12977](matrix-org#12977)) - Experimental support for [MSC3772](matrix-org/matrix-spec-proposals#3772): Push rule for mutually related events. ([\matrix-org#12740](matrix-org#12740), [\matrix-org#12859](matrix-org#12859)) - Update to the `check_event_for_spam` module callback: Deprecate the current callback signature, replace it with a new signature that is both less ambiguous (replacing booleans with explicit allow/block) and more powerful (ability to return explicit error codes). ([\matrix-org#12808](matrix-org#12808)) - Add storage and module API methods to get monthly active users (and their corresponding appservices) within an optionally specified time range. ([\matrix-org#12838](matrix-org#12838), [\matrix-org#12917](matrix-org#12917)) - Support the new error code `ORG.MATRIX.MSC3823.USER_ACCOUNT_SUSPENDED` from [MSC3823](matrix-org/matrix-spec-proposals#3823). ([\matrix-org#12845](matrix-org#12845), [\matrix-org#12923](matrix-org#12923)) - Add a configurable background job to delete stale devices. ([\matrix-org#12855](matrix-org#12855)) - Improve URL previews for pages with empty elements. ([\matrix-org#12951](matrix-org#12951)) - Allow updating a user's password using the admin API without logging out their devices. Contributed by @jcgruenhage. ([\matrix-org#12952](matrix-org#12952)) Bugfixes -------- - Always send an `access_token` in `/thirdparty/` requests to appservices, as required by the [Application Service API specification](https://spec.matrix.org/v1.1/application-service-api/#third-party-networks). ([\matrix-org#12746](matrix-org#12746)) - Implement [MSC3816](matrix-org/matrix-spec-proposals#3816): sending the root event in a thread should count as having 'participated' in it. ([\matrix-org#12766](matrix-org#12766)) - Delete events from the `federation_inbound_events_staging` table when a room is purged through the admin API. ([\matrix-org#12784](matrix-org#12784)) - Fix a bug where we did not correctly handle invalid device list updates over federation. Contributed by Carl Bordum Hansen. ([\matrix-org#12829](matrix-org#12829)) - Fix a bug which allowed multiple async operations to access database locks concurrently. Contributed by @sumnerevans @ Beeper. ([\matrix-org#12832](matrix-org#12832)) - Fix an issue introduced in Synapse 0.34 where the `/notifications` endpoint would only return notifications if a user registered at least one pusher. Contributed by Famedly. ([\matrix-org#12840](matrix-org#12840)) - Fix a bug where servers using a Postgres database would fail to backfill from an insertion event when MSC2716 is enabled (`experimental_features.msc2716_enabled`). ([\matrix-org#12843](matrix-org#12843)) - Fix [MSC3787](matrix-org/matrix-spec-proposals#3787) rooms being omitted from room directory, room summary and space hierarchy responses. ([\matrix-org#12858](matrix-org#12858)) - Fix a bug introduced in Synapse 1.54.0 which could sometimes cause exceptions when handling federated traffic. ([\matrix-org#12877](matrix-org#12877)) - Fix a bug introduced in Synapse 1.59.0 which caused room deletion to fail with a foreign key violation error. ([\matrix-org#12889](matrix-org#12889)) - Fix a long-standing bug which caused the `/messages` endpoint to return an incorrect `end` attribute when there were no more events. Contributed by @Vetchu. ([\matrix-org#12903](matrix-org#12903)) - Fix a bug introduced in Synapse 1.58.0 where `/sync` would fail if the most recent event in a room was a redaction of an event that has since been purged. ([\matrix-org#12905](matrix-org#12905)) - Fix a potential memory leak when generating thumbnails. ([\matrix-org#12932](matrix-org#12932)) - Fix a long-standing bug where a URL preview would break if the image failed to download. ([\matrix-org#12950](matrix-org#12950)) Improved Documentation ---------------------- - Fix typographical errors in documentation. ([\matrix-org#12863](matrix-org#12863)) - Fix documentation incorrectly stating the `sendToDevice` endpoint can be directed at generic workers. Contributed by Nick @ Beeper. ([\matrix-org#12867](matrix-org#12867)) Deprecations and Removals ------------------------- - Remove support for the non-standard groups/communities feature from Synapse. ([\matrix-org#12553](matrix-org#12553), [\matrix-org#12558](matrix-org#12558), [\matrix-org#12563](matrix-org#12563), [\matrix-org#12895](matrix-org#12895), [\matrix-org#12897](matrix-org#12897), [\matrix-org#12899](matrix-org#12899), [\matrix-org#12900](matrix-org#12900), [\matrix-org#12936](matrix-org#12936), [\matrix-org#12966](matrix-org#12966)) - Remove contributed `kick_users.py` script. This is broken under Python 3, and is not added to the environment when `pip install`ing Synapse. ([\matrix-org#12908](matrix-org#12908)) - Remove `contrib/jitsimeetbridge`. This was an unused experiment that hasn't been meaningfully changed since 2014. ([\matrix-org#12909](matrix-org#12909)) - Remove unused `contrib/experiements/cursesio.py` script, which fails to run under Python 3. ([\matrix-org#12910](matrix-org#12910)) - Remove unused `contrib/experiements/test_messaging.py` script. This fails to run on Python 3. ([\matrix-org#12911](matrix-org#12911)) Internal Changes ---------------- - Test Synapse against Complement with workers. ([\matrix-org#12810](matrix-org#12810), [\matrix-org#12933](matrix-org#12933)) - Reduce the amount of state we pull from the DB. ([\matrix-org#12811](matrix-org#12811), [\matrix-org#12964](matrix-org#12964)) - Try other homeservers when re-syncing state for rooms with partial state. ([\matrix-org#12812](matrix-org#12812)) - Resume state re-syncing for rooms with partial state after a Synapse restart. ([\matrix-org#12813](matrix-org#12813)) - Remove Mutual Rooms' ([MSC2666](matrix-org/matrix-spec-proposals#2666)) endpoint dependency on the User Directory. ([\matrix-org#12836](matrix-org#12836)) - Experimental: expand `check_event_for_spam` with ability to return additional fields. This enables spam-checker implementations to experiment with mechanisms to give users more information about why they are blocked and whether any action is needed from them to be unblocked. ([\matrix-org#12846](matrix-org#12846)) - Remove `dont_notify` from the `.m.rule.room.server_acl` rule. ([\matrix-org#12849](matrix-org#12849)) - Remove the unstable `/hierarchy` endpoint from [MSC2946](matrix-org/matrix-spec-proposals#2946). ([\matrix-org#12851](matrix-org#12851)) - Pull out less state when handling gaps in room DAG. ([\matrix-org#12852](matrix-org#12852), [\matrix-org#12904](matrix-org#12904)) - Clean-up the push rules datastore. ([\matrix-org#12856](matrix-org#12856)) - Correct a type annotation in the URL preview source code. ([\matrix-org#12860](matrix-org#12860)) - Update `pyjwt` dependency to [2.4.0](/~https://github.com/jpadilla/pyjwt/releases/tag/2.4.0). ([\matrix-org#12865](matrix-org#12865)) - Enable the `/account/whoami` endpoint on synapse worker processes. Contributed by Nick @ Beeper. ([\matrix-org#12866](matrix-org#12866)) - Enable the `batch_send` endpoint on synapse worker processes. Contributed by Nick @ Beeper. ([\matrix-org#12868](matrix-org#12868)) - Don't generate empty AS transactions when the AS is flagged as down. Contributed by Nick @ Beeper. ([\matrix-org#12869](matrix-org#12869)) - Fix up the variable `state_store` naming. ([\matrix-org#12871](matrix-org#12871)) - Faster room joins: when querying the current state of the room, wait for state to be populated. ([\matrix-org#12872](matrix-org#12872)) - Avoid running queries which will never result in deletions. ([\matrix-org#12879](matrix-org#12879)) - Use constants for EDU types. ([\matrix-org#12884](matrix-org#12884)) - Reduce database load of `/sync` when presence is enabled. ([\matrix-org#12885](matrix-org#12885)) - Refactor `have_seen_events` to reduce memory consumed when processing federation traffic. ([\matrix-org#12886](matrix-org#12886)) - Refactor receipt linearization code. ([\matrix-org#12888](matrix-org#12888)) - Add type annotations to `synapse.logging.opentracing`. ([\matrix-org#12894](matrix-org#12894)) - Remove PyNaCl occurrences directly used in Synapse code. ([\matrix-org#12902](matrix-org#12902)) - Bump types-jsonschema from 4.4.1 to 4.4.6. ([\matrix-org#12912](matrix-org#12912)) - Rename storage classes. ([\matrix-org#12913](matrix-org#12913)) - Preparation for database schema simplifications: stop reading from `event_edges.room_id`. ([\matrix-org#12914](matrix-org#12914)) - Check if we are in a virtual environment before overriding the `PYTHONPATH` environment variable in the demo script. ([\matrix-org#12916](matrix-org#12916)) - Improve the logging when signature checks on events fail. ([\matrix-org#12925](matrix-org#12925)) # -----BEGIN PGP SIGNATURE----- # # iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmKoaa0QHGVyaWtAbWF0 # cml4Lm9yZwAKCRClQuTtGw+sCVn3B/sF8hdhrZ7hWW40ST3eG9cEKNFrj/xZXiaI # zho3ryrxaQF68BSKot15AvZSprEdwBXWrb8WeTjyw+QH7vTKrCQDZ0p7qubn10Z7 # BuKq9hyYjyCLjBZrgy8d4U3Y8gsSByuO59YKHNLn+UTJLOs5GTH8Wprwh4mpU3Jl # +o+cC+lMSVcyZij2hihFymcSxWq/I9WL0dsjRif8x0BUQwRXwmXc6+mhlgLBe2Zs # 2dUouzJ8NVZcjfWvsg4noXPrNQ/IiyCVZlSIgaDftDIxVSPk5/rXiUUNex8Tn1+I # TnOgnXhpOQD1vwVGYS8LrcfA0ubSili7xUJ8k2e5TkCjpkaVnYNu # =q+7N # -----END PGP SIGNATURE----- # gpg: Signature made Tue Jun 14 11:57:49 2022 BST # gpg: using RSA key 053191DFF4670330465227F7A542E4ED1B0FAC09 # gpg: issuer "erik@matrix.org" # gpg: WARNING: server 'dirmngr' is older than us (2.2.34 < 2.3.6) # gpg: Note: Outdated servers may lack important security fixes. # gpg: Note: Use the command "gpgconf --kill all" to restart them. # gpg: Can't check signature: No public key # Conflicts: # docs/workers.md # synapse/handlers/message.py # synapse/handlers/pagination.py # synapse/push/bulk_push_rule_evaluator.py # synapse/push/push_rule_evaluator.py # synapse/rest/client/receipts.py # synapse/storage/databases/main/appservice.py # tests/push/test_push_rule_evaluator.py
This PR fixes acquiring a lock via
LockStore.try_acquire_lock
which is currently broken.Linearizer
toLockStore
to make the test pass.Outstanding questions:
This change causes a regression inLockTestCase.test_drop
and I'm not quite sure what to do about it. It's not obvious where the error is coming from (seems to be in the internals oftwisted
which seems concerning):old error
Note to Reviewer
If you checkout the branch at the commit where I add the initial test case, and run the test, it will fail.
After fast-forwarding to the HEAD of the branch, running the same tests will pass.
Sign-off
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)