Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit nwk event on concentratorIndCb #992

Merged
merged 3 commits into from
Mar 28, 2024

Conversation

deviantintegral
Copy link
Contributor

I ran into another situation where zigbee2mqtt had a stale network address, similar to Koenkk/zigbee2mqtt#15874, but with https://www.zigbee2mqtt.io/devices/43076.html#enbrighten-43076 instead of the Sonoff ZBMINI.

I think this was triggered when I had circuits off as I was installing a new relay. I flipped the breakers several times in the process, so many devices were restarting over a 1 hour period.

I turned on debug logging and sniffed traffic, and saw the switch sending updates that weren't making it up the stack to zigbee2mqtt. It would respond to group broadcasts.

if (!device) {
debug.log(
`'${dataType}' data is from unknown device with address '${dataPayload.address}', ` +
`skipping...`
);
return;
}
is where the messages were in herdsman, but at that point none of the data in the method had the IEEE address to check against.

As well, I could see the device sending many route request and link status packets which seemed much higher than normal. While I could see what looked to be On / Off events triggered at the switch in Wireshark, for some reason it wasn't able to decrypt them normally.

SCR-20240325-tndz

In Herdsman, I saw that there were many messages being sent with the command concentratorIndCb. That data contained both the IEEE and the short address, so it seemed like a good place to check for network address changes.

With this PR, the switch came back online shortly after starting zigbee2mqtt. The "Many-to-one" route requests stopped being spammed.

What I'm not sure is:

  1. I haven't found yet a clear reference on what concentratorIndCb is. So, I'm not sure if this is the right place to improve or fix this.
  2. If this is, we could perhaps collapse this else statement with the one above.

@deviantintegral
Copy link
Contributor Author

Tests are failing due to decreasing coverage - no new failures introduced. I can update tests once I have confirmation the change itself is OK.

@Koenkk
Copy link
Owner

Koenkk commented Mar 26, 2024

Nice finding! Once tests are updated this can be merged.

@deviantintegral
Copy link
Contributor Author

I added some comments explaining why we're mixing and matching commands and events, and added a test. My original question about collapsing the if doesn't matter because the packet fields are different for the two commands.

Aside, this was the first time I'd encountered "concentrator" and was having trouble finding good upstream docs for it. I think this chat does a reasonably OK job of explaining it. I totally thought it was hallucinating the callback and that it was mixing up JavaScript, but that does appear to be the case. That gave me enough breadcrumbs to find some zstack code that referenced it.

https://chat.openai.com/share/2ac8ee11-31f4-48bc-9f6d-5fb3f3b3c874

@Nerivec
Copy link
Collaborator

Nerivec commented Mar 28, 2024

@Koenkk Looking at this and the mentioned discussion, I'm wondering if this is the famous address conflict that you guys are seeing like this with zStack, except it has a more "generic" callback than ember and just reports as "a new route from concentrator"?
According to the spec, a network address change is not really supposed to happen otherwise.

Network short addresses SHALL be chosen at random. The random address assigned SHALL conform to the NIST testing regimen described in reference [B10]. When a device joins the network using MAC association, its parent SHALL choose a random address that does not already appear in any entry in the parent’s NIB. Under stochastic addressing, once a device has been assigned an address, it has no reason to relinquish that address and SHOULD retain it unless it receives an indication that its address is in conflict with that of another device on the network. Furthermore, devices MAY self-assign random addresses under stochastic addressing and retain them, as in the case of joining a network using the rejoin command frame (see section 3.6.1.6.1.2). The Zigbee coordinator, which has no parent, SHALL always have the address 0x0000.

  • 3.6.1.8 Stochastic Address Assignment Mechanism

Note: That kind of problem should be entirely solvable when we implement the full ZDO spec, since we can then trigger a proper network address request by IEEE when a device gets in that state to figure out if it's really gone, or if it just changed network address and "we missed it".

@Koenkk
Copy link
Owner

Koenkk commented Mar 28, 2024

when a device gets in that state to figure out if it's really gone, or if it just changed network address and "we missed it".

This is what zstack already does:

// Figure out once if the network address has been changed.

@Koenkk Koenkk merged commit 10e42fd into Koenkk:master Mar 28, 2024
1 check passed
@Koenkk
Copy link
Owner

Koenkk commented Mar 28, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants