intel10g: Disable DCA relaxed ordering in SF mode #523

Closed
wants to merge 1 commit into from

lukego
Member

@lukego lukego commented Jun 22, 2015

Apply the same DCA settings for SF/PF (Single Function) mode as for VF (Virtual Function) mode. This prevents heavy packet payload corruption on at least one server.

This is a draft fix for the issue reported by Alex Gall on snabb-devel.

I am not sure the details are exactly right: by my reading of the data sheet this is actually clearing a reserved bit. @javierguerragiraldez what do you think?

@lukego
Member Author

lukego commented Jun 22, 2015

(I'm not sure why this failed CI. It works when I cherry-pick it onto the vpn branch where the root problem is found. Some revision likely needed anyway.)

@javierguerragiraldez
Contributor

Well, the bit definition says only "Reserved. Must be set to 0." while the initial value is 1b, so it seems this should be done anyway.

Section 4.6.7 "Receive initialization" ends with "Set bit 16 of the CTRL_EXT register and clear bit 12 of the DCA_RXCTRL[n] register[n]." as the last steps to initialize the receive path.

I guess we haven't seen this issue before because it's somewhat hardware dependent, and it's only missing in the non-VMDq setup.
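
For readers following along, the two final steps quoted from section 4.6.7 can be sketched as plain bit operations. This is only an illustration in Python, not the intel10g driver code: `FakeRegisters`, `set_bits`, `clear_bits` and `finish_receive_init` are hypothetical names, and the register file is an in-memory dictionary rather than real hardware. The only details taken from the discussion are the bit positions (bit 16 of CTRL_EXT, bit 12 of DCA_RXCTRL[n]) and the fact that bit 12 starts out as 1.

```python
CTRL_EXT_BIT16 = 1 << 16    # bit 16 of CTRL_EXT, to be set
DCA_RXCTRL_BIT12 = 1 << 12  # bit 12 of DCA_RXCTRL[n], to be cleared


class FakeRegisters:
    """Hypothetical in-memory stand-in for a NIC register file."""

    def __init__(self, initial):
        self.values = dict(initial)

    def read(self, name):
        return self.values[name]

    def write(self, name, value):
        # Registers are 32 bits wide, so truncate anything above that.
        self.values[name] = value & 0xFFFFFFFF


def set_bits(regs, name, mask):
    """Read-modify-write: turn on every bit in mask."""
    regs.write(name, regs.read(name) | mask)


def clear_bits(regs, name, mask):
    """Read-modify-write: turn off every bit in mask."""
    regs.write(name, regs.read(name) & ~mask)


def finish_receive_init(regs, queue):
    """Apply the last two receive-initialization steps for one queue."""
    set_bits(regs, "CTRL_EXT", CTRL_EXT_BIT16)
    clear_bits(regs, f"DCA_RXCTRL[{queue}]", DCA_RXCTRL_BIT12)
```

The point of the sketch is that the same two steps would run regardless of whether the NIC is used in SF or VF mode, which is what the patch aims for.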

@alexandergall
Copy link
Contributor

Bingo. This fixes my problem :) It is puzzling why a reserved bit would have such an effect and why the default value is not 0. OTOH, the data sheet is explicit about what to do :/ @lukego it seems to me that labeling this "disable relaxed ordering" is a misnomer, since that's not what it does, AFAICT. There are other flags that deal with relaxed ordering for descriptor write back and data.

Maybe the initialization sequence needs to be ironed out a bit? And I would really like to enable DCA properly, because it looks like it could be very relevant for us. In fact, I wonder why nobody has done this yet?

lukego added a commit to lukego/snabb that referenced this pull request Jun 23, 2015
Fix a problem where packet payload could be corrupted on receive on
certain hardware. There was an exotic register flag that needs to be
cleared and this was taken care of in VF mode (when using VMDq) but not
in SF mode (when using a raw NIC). Now both cases work the same.

Reported here:
https://groups.google.com/d/msg/snabb-devel/sP3wJ-8fEEA/O1WMAG96SlQJ

May also resolve a previously reported problem:
https://groups.google.com/forum/#!topic/snabb-devel/MrzImre1gbM

Obsoletes snabbco#523.
@lukego
Member Author

lukego commented Jun 23, 2015

It would be neat to have a benchmark that is CPU bound and does only simple packet forwarding. That could be used to test optimizations of the basic I/O facilities.

In the case of DCA I believe that this is legacy technology from older CPUs that had an external PCIe controller (before Sandy Bridge). I believe that the successor DDIO works automatically provided that the CPU processing traffic is the same one that the NIC is attached to (now that each CPU has a private PCIe controller). Anecdotally we have seen ~30% performance difference between using the CPU that the NIC is attached to vs. a second CPU, and this may well be the effect of DDIO.

I would actually love to optimize the "wrong NUMA" setup and have a benchmark for that too. I am sure that it will be common for NUMA affinity to be misconfigured in practice e.g. when working with complicated cloud computing middleware.
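
As a rough illustration of how the "right CPU vs. wrong CPU" case could be detected on Linux, here is a small Python sketch. It relies on the sysfs files `bus/pci/devices/<address>/numa_node` and `devices/system/node/node<N>/cpulist`; the function names and the `sysfs_root` parameter are hypothetical and exist only so the sketch can be tried against a fake directory tree. It is not part of Snabb.

```python
from pathlib import Path


def parse_cpulist(text):
    """Parse a sysfs CPU list such as '0-5,12-17' into a set of integers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def nic_numa_node(pci_address, sysfs_root="/sys"):
    """Return the NUMA node the PCI device reports (-1 means unknown)."""
    path = Path(sysfs_root) / "bus/pci/devices" / pci_address / "numa_node"
    return int(path.read_text().strip())


def cpu_is_local_to_nic(cpu, pci_address, sysfs_root="/sys"):
    """True if cpu is on the NIC's NUMA node, None if the node is unknown."""
    node = nic_numa_node(pci_address, sysfs_root)
    if node < 0:
        return None
    cpulist = Path(sysfs_root) / "devices/system/node" / f"node{node}" / "cpulist"
    return cpu in parse_cpulist(cpulist.read_text())
```

A benchmark harness could call `cpu_is_local_to_nic` before a run and record the result, so that "wrong NUMA" runs are labeled as such rather than discovered afterwards.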

Closing this issue now in favor of #524.

@lukego lukego closed this Jun 23, 2015
@alexandergall
Contributor

Interesting. Up to now, I had also assumed that DDIO was there and enabled, although it confused me that the data sheet has absolutely nothing to say about it (at least I can't find anything). The section on DCA made me believe that DDIO might not actually be supported, but if it's the way you say, all is fine :)

eugeneia pushed a commit to eugeneia/snabb that referenced this pull request Jul 9, 2015
@lukego lukego deleted the intel10g-sf-dca-norelax branch August 29, 2016 19:14
mwiget pushed a commit to mwiget/snabb that referenced this pull request Oct 29, 2016