-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Reformat of TensorRT to use subgraph API #14040
Conversation
*/ | ||
MXNET_DLL int MXExecutorGetOptimizedSymbol(ExecutorHandle handle, | ||
SymbolHandle *out); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to remove an API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually created this API just for the previous TensorRT implementation, I'm not sure it is used anywhere else. It could actually be used in case you are calling a subgraph backend with variable environment. Do you want to keep it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It affects semantic versioning if we remove it (it can break compilation on downstream projects) so if there's not a strong reason to remove it we should leave it in for that reason. Can it still preform a useful function (for example showing the graph after optimization)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, should I make it more general then ? (It was only working using TrtGraphExecutor which doesn't exist anymore)
} | ||
} | ||
|
||
inline std::string StringDType(int dtype) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still needed, don't see any references to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debug function, nice catch, will remove
if (trt_builder->platformHasFastFp16()) { | ||
trt_builder->setFp16Mode(true); | ||
} else { | ||
LOG(INFO) << "WARNING: TensorRT can't use fp16 on this plateform"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plateform -> platform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also we're logging INFO level logs but have WARNING in the message. I'd remove the warning from the message and set log level to warning. (this is a common issue in our codebase).
return false; | ||
} | ||
} | ||
if (isTRTCompatible(new_node)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be simplified to
return isTRTCompatible(new_node);
if (o.index == e->index && o.node.get() == e->node.get()) { | ||
e->index = i; | ||
e->node = subgraph_node; | ||
// TODO(cfujitsang): For futur support this would fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
futur -> future.
} | ||
|
||
nnvm::NodePtr CreateSubgraphNode(const nnvm::Symbol &sym, | ||
const int subgraph_id = 0) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible can we avoid default values in overriding functions.
ONNX-Tensorrt might fail to build as it adds a new default target that requires a header not in a standard search path. There's a few ways to work around the problem, we could just build the library and not that tool in CI, or we could include the header folder location in the search path like so: /~https://github.com/apache/incubator-mxnet/pull/13906/files#diff-56133c25b5a238b76f54c0928f05a8e6 It should allow the TensorRT build to pass CI if we make that change in this PR. |
Seems that my modification in CutGraphInputs is breaking some of the default attributes inference functions, I think my modification make sense (duplicating inputs in the subgraph appear to me like a bug). I can modify |
CI should be in reasonably good shape now. Looks like there's some linting issues in a few headers:
|
Great to see this is happening. I have two high-level comments:
|
|
@Caenorst The point is adding zero overhead in terms of user experience of using TensorRT in MXNet. I don't think you need to pass the param data explicitly beforehand. They can all be collected at the first forward call for each subgraph. You can take a look how this is done in TVM. On the high level, in the forward function of a tensorrt subgraph op, you have a list of inputs which contains both input data and params. You just need to differentiate param names from input data names. This can be achieved by analyzing whether a data node's Regarding the dependence on protobuf, onnx, and onnx-tensorrt, here is my two cents, I personally don't think introducing so many third-party dependencies is a good idea, because it results in an extra layer of complexity in terms of implementation, maintenance, build, and deployment, and the conversion process is not manageable by this community. If there is a direct solution of creating a bridge from nnvm to tensorrt, why not use it? |
@reminisce |
What do you mean "mix context"? We only have one context which gpu in this case. Regarding releasing the weight params after they are copied to tensorrt engines, I think it's doable by removing the corresponding |
@reminisce |
This is not true. When binding completes, you have all the weights on GPU. |
Are you saying that there are operators supported by TensorRT but creating them is not available through the C++ APIs? Have those operators been added to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying that there are operators supported by TensorRT but creating them is not available through the C++ APIs? Have those operators been added to the
unconditionalTRTop
? The TRT C++ API also supports plugin interface.
They are added as IPlugin (so not natively supported by TensorRT), but of course, you could do try to do your own Plugins. For the moment I don't think any of those ops are supported by my implementation, but I don't think it will be hard to do, it's just a long term argument
} | ||
} | ||
|
||
inline std::string StringDType(int dtype) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debug function, nice catch, will remove
*/ | ||
MXNET_DLL int MXExecutorGetOptimizedSymbol(ExecutorHandle handle, | ||
SymbolHandle *out); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, should I make it more general then ? (It was only working using TrtGraphExecutor which doesn't exist anymore)
@Caenorst Could you look fix merge conflicts and retrigger CI? |
Yes, that's what I meant, weights have to be copied from CPU to GPU, so they have to be originally on CPU. Actually they have to be copied first in the ONNX model |
@KellenSunderland and @anirudh2290 could you please review this PR again. Thanks ! |
@karan6181 Taking a look this week. |
Hey @Caenorst Sorry can you rebase this once more? There's no reported conflicts from Github but for some reason when running in CI it's not picking up some changes which means the R tests will fail (they're trying to use an old domain). PR looks good to me though, will merge as soon as it's rebased. |
ccae63f
to
5418349
Compare
@Caenorst Thank you for making the changes. @KellenSunderland is this good to go? |
@mxnet-label-bot add [pr-awaiting-testing] |
LGTM, did some testing on a home machine. |
Small update: still debugging a crash showing up during teardown. Seems like a problem relating to the order of static scoped objects being deconstructed. |
Are you able to reproduce this locally @Caenorst? I've run through quite a few times now and haven't hit the issue. |
Scratch that, able to reproduce now. Digging a bit further in ... |
Ok I see what's happening. Looks like we've done some refactoring to help prevent shutdown engine crashes but still have some fixed to apply before it's working 100%. I believe this PR should solve the issue we're seeing in the test: #14591 Would you be able to test with that commit applied? |
Ok issue that was tripping up our tests has been fixed. Would you be able to do another rebase (sorry) and then we should be able to merge. |
@Caenorst thanks for the contribution, gentle ping to rebase so we can merge this PR. |
Rebase done. Sorry for the delay. |
No worries about the delay. My team's been testing the TRT integration out and are seeing better than expected speedups so far. Let's merge this change. I'm adding a todo for myself to update all documentation relating to the API change in our TRT tutorials. We've also found a few minor bugs we'll hope to PR soon. |
@KellenSunderland @Caenorst , the tutorial for tensorrt is out of the date and I have had a few user asking us why it's not working anymore https://mxnet.incubator.apache.org/versions/master/tutorials/tensorrt/inference_with_trt.html |
@ThomasDelteil the PR for updating it is on its way: https://mxnet.incubator.apache.org/versions/master/tutorials/tensorrt/inference_with_trt.html Meanwhile here is the new way to use TensorRT:
And then you can use |
Thanks @Caenorst for the quick update! |
Yes, sorry let me update that PR to target the 1.5 branch.
…On Fri, Jun 7, 2019 at 10:45 AM Thomas Delteil ***@***.***> wrote:
Thanks @Caenorst </~https://github.com/Caenorst> for the quick update!
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#14040>,
or mute the thread
</~https://github.com/notifications/unsubscribe-auth/ABYZGE4XNDH2C4DFF2WSHDDPZKNCVANCNFSM4GTVIXMA>
.
|
Description
This PR modify TensorRT usage by relying on the Subgraph API, the backend is named 'TensorRT'
Checklist
Essentials
Changes
Comments