-
Notifications
You must be signed in to change notification settings - Fork 299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intel10G: better initialization / closing #360
Conversation
Looks promising! Do you see why SnabbBot failed this on the Intel selftest? |
Test case is looking really good with so many iterations! I tested this branch (on |
the This is typically what happens when some allocated resource taps out and the packet flow grinds to a halt, meaning that i'm still not releasing or reusing everything on each reconfig. I think this test run churns packets faster than what i've seen manually, thus maybe in my test i didn't manage to totally leak out and with 'just' 100 iterations of 0.25sec each. I'll try to make it happen earlier changing the preallocated amounts, time of iterations, size of packets, etc. to find exactly what resource is leaking. |
@@ -54,9 +54,14 @@ uint32_t volatile *map_pci_resource(int fd) | |||
} | |||
} | |||
|
|||
void close_pci_resource(int fd) | |||
void close_pci_resource(int fd, uint32_t volatile *addr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been clarifying volatile
and related issues with Mike Pall on the LuaJIT list.
Conclusions:
There is a risk that the JIT optimizations will eliminate or reorder loads and stores.
volatile
is not the solution: LuaJIT ignores that. Probably we should not use it: I see a risk that we will expect it to behave in a way that it does not.
Compiler barriers are a solution. For example, immediately before reading or writing a memory mapped register we could always call a lib.c
function:
void compiler_barrier() {}
and LuaJIT will flush loads/stores from registers to memory before making an FFI call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know, this could really be relevant to the bugs that you are seeing. The first times the NIC is initialised the code will be running interpreted. Once the init code gets hot and compiled then the behaviour could change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the volatile
keword, it's only there to be consistent with the return value of map_pci_resource()
. I agree that it's superfluous and would be better removed. Even in C it's dubious that would be any benefit as an argument or return value.
About barriers in initialization code, most of the places where order is important, the Intel docs advise to insert a wait, sometimes a few microseconds, sometimes up to a millisecond. In those cases we use C.usleep()
, which should have the barrier effect as well. (and it's hard to imagine some reordering/pending write to last as long as a microsecond).
Where we could be missing some defensive coding is more in the data handling itself. Maybe the M_sf:sync_transmit ()
race is related to this?
Finally, I don't think the initialization code is ever compiled. Not even in the manyreconf()
calls it tightly enough. What would be a simple way to check that?
The last point in particular made all cards on both davos and grindelwald pass all tests. I couldn't test on chur because it's all tied up in some long-term process.
No more "flaky hardware" excuses!