rm convertToSSA API,test=huawei_ascend_npu test=nvidia_tensorrt test=verisilicon_timvx #8988
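PR #8967, which fixed the disordered topological sort, has already been merged. The purpose of this PR is to test removing all use of the convertToSSA API. If this PR merges cleanly, it confirms that the earlier fix works.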
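Besides removing the calls to the convertToSSA API and the related CMake definitions, this PR also modifies type_target_cast_pass.
Simply deleting convertToSSA causes the ernie_gen model to fail in the xpu_dasou CI, for the following reason: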
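The model's eighth input, placeholder_7, is a model input and is therefore a tensor on the host. In a sub-block, however, an assign op reassigns this variable, which turns placeholder_7 into a tensor on the XPU. On the model's second run, input initialization uses memcpy to write through the pointer held by the placeholder_7 tensor, and this fails. Only xpu_memcpy can be used at that point.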
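A simplified demo is shown in the figure below:
feed is the input of an op, the op writes its result back to feed, and then other ops run. Assuming this model runs on the XPU, the resulting intermediate representation is as follows:
As can be seen, the feed var has changed from a host tensor into an XPU tensor. So on the model's second run, input initialization can only use xpu_memcpy.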
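The failure mode described above can be reproduced with a toy model. This is only a sketch: `Tensor`, `host_memcpy`, and `run_once` are hypothetical names made up for illustration, not Paddle-Lite APIs, and the "write back" is reduced to flipping a target tag.

```python
from dataclasses import dataclass


@dataclass
class Tensor:
    """Toy tensor that only records where its buffer lives (hypothetical)."""
    name: str
    target: str  # "host" or "xpu"


def host_memcpy(dst: Tensor) -> None:
    """Stand-in for a plain host memcpy used to initialize input data."""
    if dst.target != "host":
        raise RuntimeError(
            f"{dst.name} lives on {dst.target}; a host memcpy is not valid here"
        )


def run_once(scope: dict) -> None:
    """One model run: initialize feed on the host, then run the ops."""
    feed = scope["feed"]
    host_memcpy(feed)      # input initialization
    # An op running on the XPU writes its result back into `feed`,
    # so the variable now refers to device memory.
    feed.target = "xpu"
```

Calling `run_once` twice on the same scope succeeds the first time and raises on the second, which mirrors the second-run failure seen with ernie_gen.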
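One workable solution is to also insert an io copy op on the output side. (This creates a new variable, similar to the new variables generated by the original convertToSSA API.)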
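A minimal sketch of that idea on a toy op list follows. It is not the actual type_target_cast_pass code: the `Op` class, the `insert_output_io_copy` function, the `"<name>/<target>_out"` naming scheme, and the choice to tag the copy op with the destination target are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Toy op: a type, the target it runs on, and variable names (hypothetical)."""
    type: str
    target: str
    inputs: list
    outputs: list


def insert_output_io_copy(ops, var_target):
    """Redirect cross-target outputs to a new variable and copy back.

    `var_target` maps variable names to their declared target and is
    updated in place with any newly created variables.
    """
    new_ops = []
    for op in ops:
        outputs, trailing = [], []
        for name in op.outputs:
            declared = var_target.get(name)
            if declared is not None and declared != op.target:
                # The op would overwrite a variable that lives elsewhere:
                # write to a fresh variable on the op's target instead.
                tmp = f"{name}/{op.target}_out"  # assumed naming scheme
                var_target[tmp] = op.target
                outputs.append(tmp)
                # Then copy the result back into the original variable.
                trailing.append(Op("io_copy", declared, [tmp], [name]))
            else:
                var_target.setdefault(name, op.target)
                outputs.append(name)
        new_ops.append(Op(op.type, op.target, list(op.inputs), outputs))
        new_ops.extend(trailing)
    return new_ops
```

For example, with `var_target = {"feed": "host"}` and an XPU `assign` op that writes to `feed`, the rewritten list contains the `assign` writing to `feed/xpu_out` followed by an `io_copy` back into `feed`, so `feed` keeps its host placement across runs.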