[WebAssembly] Support assembly parsing for new EH #108668

Merged on Sep 17, 2024 (13 commits)

aheejin (Member) commented Sep 14, 2024

This adds assembly parsing support for the new EH (exnref) proposal.

try_table parsing is a little tricky because catch clause lists use () and the multivalue block return types also use (). This handles all combinations below:

  • No return type (void) + no catch list
  • No return type (void) + catch list
  • Single return type + no catch list
  • Single return type + catch list
  • Multivalue return type + no catch list
  • Multivalue return type + catch list

This does not include AsmTypeCheck support yet, which is why this adds a new test file and uses --no-type-check on the command line. After the type checker is added in a follow-up, I plan to merge https://github.com/llvm/llvm-project/blob/main/llvm/test/MC/WebAssembly/eh-assembly-legacy.s with this file. (Turning on -mattr=+exception-handling adds support for all legacy and new EH instructions in the assembly. -wasm-enable-exnref in llc only controls which instructions to generate; it doesn't affect llvm-mc or assembly parsing.)
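
To illustrate the disambiguation described above, here is a minimal, self-contained sketch (not the patch's code). It assumes a hypothetical pre-split token list instead of LLVM's lexer; `Tokens`, `BlockTypeKind`, `TryTableShape`, `peekCatchList`, and `classify` are names made up for this example. The only idea taken from the patch is the one-token lookahead: a `(` followed by an identifier starting with `catch` begins a catch list, and any other `(` begins a multivalue type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token stream: each token is just its spelling.
using Tokens = std::vector<std::string>;

enum class BlockTypeKind { Void, Single, Multivalue };

struct TryTableShape {
  BlockTypeKind Type;
  bool HasCatchList;
};

// True if Toks[I] is "(" and the token after it starts with "catch".
static bool peekCatchList(const Tokens &Toks, std::size_t I) {
  return I + 1 < Toks.size() && Toks[I] == "(" &&
         Toks[I + 1].rfind("catch", 0) == 0;
}

// Classifies the operands that follow the `try_table` mnemonic.
static TryTableShape classify(const Tokens &Toks) {
  std::size_t I = 0;
  BlockTypeKind Type = BlockTypeKind::Void;

  // Advances I past the next ")" (or to the end if there is none).
  auto SkipParens = [&] {
    while (I < Toks.size() && Toks[I] != ")")
      ++I;
    if (I < Toks.size())
      ++I;
  };

  if (I < Toks.size() && Toks[I] == "(" && !peekCatchList(Toks, I)) {
    // "(" not followed by catch*: a signature of the form (...) -> (...).
    Type = BlockTypeKind::Multivalue;
    SkipParens(); // parameter list
    if (I < Toks.size() && Toks[I] == "->")
      ++I;
    SkipParens(); // result list
  } else if (I < Toks.size() && Toks[I] != "(") {
    // A bare identifier such as i32 or f32: a single return type.
    Type = BlockTypeKind::Single;
    ++I;
  }
  // Whatever follows the (possibly absent) type is either a catch list
  // or nothing.
  return {Type, peekCatchList(Toks, I)};
}
```

For instance, the tokens `f32 ( catch_all 0 )` classify as a single return type with a catch list, and an empty token list classifies as void with no catch list.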

This adds assembly parsing support for the new EH (exnref) proposal.

`try_table` parsing is a little tricky because catch clause lists use
`()` and the multivalue block return types also use `()`. This handles
all combinations below:
- No return type (void) + no catch list
- No return type (void) + catch list
- Single return type + no catch list
- Single return type + catch list
- Multivalue return type + no catch list
- Multivalue return type + catch list

This does not include AsmTypeCheck support yet, which is why this adds a
new test file and uses `--no-type-check` on the command line. After the
type checker is added in a follow-up, I plan to merge this file with the
existing
https://github.com/llvm/llvm-project/blob/main/llvm/test/MC/WebAssembly/eh-assembly.s.
(Turning on `-mattr=+exception-handling` adds support for all
legacy and new EH instructions in the assembly. `-wasm-enable-exnref`
in `llc` only controls which instructions to generate; it doesn't
affect `llvm-mc` or assembly parsing.)
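
For reference, each clause in a catch list has one of four shapes: `(catch TAG N)`, `(catch_ref TAG N)`, `(catch_all N)`, or `(catch_all_ref N)`, where `N` is an integer destination. The following is a rough standalone sketch of parsing such a list, assuming the input has already been split into string tokens. `CatchClause`, `Tokens`, and this `parseCatchList` are hypothetical names for illustration and do not use LLVM's lexer or operand classes. As in LLVM's parser convention, the function returns `true` on error.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

struct CatchClause {
  std::string Kind; // "catch", "catch_ref", "catch_all", or "catch_all_ref"
  std::string Tag;  // empty for catch_all / catch_all_ref
  long Dest;        // integer destination written in the clause
};

static bool isCatchKind(const std::string &S) {
  return S == "catch" || S == "catch_ref" || S == "catch_all" ||
         S == "catch_all_ref";
}

static bool takesTag(const std::string &Kind) {
  return Kind == "catch" || Kind == "catch_ref";
}

static bool isInteger(const std::string &S) {
  return !S.empty() && S.find_first_not_of("0123456789") == std::string::npos;
}

// Parses zero or more parenthesized catch clauses from the front of Toks.
// Returns true on error; on success, Out holds the parsed clauses.
static bool parseCatchList(const Tokens &Toks, std::vector<CatchClause> &Out) {
  std::vector<CatchClause> List;
  std::size_t I = 0;
  while (I < Toks.size() && Toks[I] == "(") {
    ++I; // consume "("
    if (I >= Toks.size() || !isCatchKind(Toks[I]))
      return true;
    CatchClause C{Toks[I], "", 0};
    ++I;
    if (takesTag(C.Kind)) {
      // Only catch and catch_ref carry a tag.
      if (I >= Toks.size())
        return true;
      C.Tag = Toks[I++];
    }
    if (I >= Toks.size() || !isInteger(Toks[I]))
      return true;
    C.Dest = std::stol(Toks[I++]);
    if (I >= Toks.size() || Toks[I] != ")")
      return true;
    ++I; // consume ")"
    List.push_back(C);
  }
  Out = List;
  return false;
}
```

With this sketch, the tokens of `(catch __cpp_exception 0) (catch_all_ref 3)` produce two clauses, the first with a tag and the second without, while an unknown keyword such as `catch_foo` is reported as an error.
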
@aheejin aheejin requested a review from dschuff September 14, 2024 01:51
@llvmbot llvmbot added backend:WebAssembly mc Machine (object) code labels Sep 14, 2024
llvmbot (Member) commented Sep 14, 2024

@llvm/pr-subscribers-mc

Author: Heejin Ahn (aheejin)

Changes

This adds assembly parsing support for the new EH (exnref) proposal.

try_table parsing is a little tricky because catch clause lists use () and the multivalue block return types also use (). This handles all combinations below:

  • No return type (void) + no catch list
  • No return type (void) + catch list
  • Single return type + no catch list
  • Single return type + catch list
  • Multivalue return type + no catch list
  • Multivalue return type + catch list

This does not include AsmTypeCheck support yet, which is why this adds a new test file and uses --no-type-check on the command line. After the type checker is added in a follow-up, I plan to merge this file with the existing
https://github.com/llvm/llvm-project/blob/main/llvm/test/MC/WebAssembly/eh-assembly.s. (Turning on -mattr=+exception-handling adds support for all legacy and new EH instructions in the assembly. -wasm-enable-exnref in llc only controls which instructions to generate; it doesn't affect llvm-mc or assembly parsing.)


Full diff: https://github.com/llvm/llvm-project/pull/108668.diff

2 Files Affected:

  • (modified) llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp (+137-10)
  • (added) llvm/test/MC/WebAssembly/eh-assembly-new.s (+146)
diff --git a/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp b/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
index 5299e6ea06f0bd..03ea5b09c4fd4a 100644
--- a/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
+++ b/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
@@ -69,12 +69,23 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
     std::vector<unsigned> List;
   };
 
+  struct CaLOpElem {
+    int64_t Opcode;
+    const MCExpr *Tag;
+    int64_t Dest;
+  };
+
+  struct CaLOp {
+    std::vector<CaLOpElem> List;
+  };
+
   union {
     struct TokOp Tok;
     struct IntOp Int;
     struct FltOp Flt;
     struct SymOp Sym;
     struct BrLOp BrL;
+    struct CaLOp CaL;
   };
 
   WebAssemblyOperand(SMLoc Start, SMLoc End, TokOp T)
@@ -85,12 +96,16 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
       : Kind(Float), StartLoc(Start), EndLoc(End), Flt(F) {}
   WebAssemblyOperand(SMLoc Start, SMLoc End, SymOp S)
       : Kind(Symbol), StartLoc(Start), EndLoc(End), Sym(S) {}
-  WebAssemblyOperand(SMLoc Start, SMLoc End)
-      : Kind(BrList), StartLoc(Start), EndLoc(End), BrL() {}
+  WebAssemblyOperand(SMLoc Start, SMLoc End, BrLOp B)
+      : Kind(BrList), StartLoc(Start), EndLoc(End), BrL(B) {}
+  WebAssemblyOperand(SMLoc Start, SMLoc End, CaLOp C)
+      : Kind(CatchList), StartLoc(Start), EndLoc(End), CaL(C) {}
 
   ~WebAssemblyOperand() {
     if (isBrList())
       BrL.~BrLOp();
+    if (isCatchList())
+      CaL.~CaLOp();
   }
 
   bool isToken() const override { return Kind == Token; }
@@ -153,7 +168,15 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
   }
 
   void addCatchListOperands(MCInst &Inst, unsigned N) const {
-    // TODO
+    assert(N == 1 && isCatchList() && "Invalid CatchList!");
+    Inst.addOperand(MCOperand::createImm(CaL.List.size()));
+    for (auto Ca : CaL.List) {
+      Inst.addOperand(MCOperand::createImm(Ca.Opcode));
+      if (Ca.Opcode == wasm::WASM_OPCODE_CATCH ||
+          Ca.Opcode == wasm::WASM_OPCODE_CATCH_REF)
+        Inst.addOperand(MCOperand::createExpr(Ca.Tag));
+      Inst.addOperand(MCOperand::createImm(Ca.Dest));
+    }
   }
 
   void print(raw_ostream &OS) const override {
@@ -174,7 +197,7 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
       OS << "BrList:" << BrL.List.size();
       break;
     case CatchList:
-      // TODO
+      OS << "CaList:" << CaL.List.size();
       break;
     }
   }
@@ -228,6 +251,7 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     Loop,
     Try,
     CatchAll,
+    TryTable,
     If,
     Else,
     Undefined,
@@ -304,6 +328,8 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
       return {"try", "end_try/delegate"};
     case CatchAll:
       return {"catch_all", "end_try"};
+    case TryTable:
+      return {"try_table", "end_try_table"};
     case If:
       return {"if", "end_if"};
     case Else:
@@ -571,6 +597,7 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     // proper nesting.
     bool ExpectBlockType = false;
     bool ExpectFuncType = false;
+    bool ExpectCatchList = false;
     std::unique_ptr<WebAssemblyOperand> FunctionTable;
     if (Name == "block") {
       push(Block);
@@ -593,12 +620,19 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     } else if (Name == "catch_all") {
       if (popAndPushWithSameSignature(Name, Try, CatchAll))
         return true;
+    } else if (Name == "try_table") {
+      push(TryTable);
+      ExpectBlockType = true;
+      ExpectCatchList = true;
     } else if (Name == "end_if") {
       if (pop(Name, If, Else))
         return true;
     } else if (Name == "end_try") {
       if (pop(Name, Try, CatchAll))
         return true;
+    } else if (Name == "end_try_table") {
+      if (pop(Name, TryTable))
+        return true;
     } else if (Name == "delegate") {
       if (pop(Name, Try))
         return true;
@@ -622,7 +656,18 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
       ExpectFuncType = true;
     }
 
-    if (ExpectFuncType || (ExpectBlockType && Lexer.is(AsmToken::LParen))) {
+    // Returns true if the next tokens are a catch clause
+    auto PeekCatchList = [&]() {
+      if (Lexer.isNot(AsmToken::LParen))
+        return false;
+      AsmToken NextTok = Lexer.peekTok();
+      return NextTok.getKind() == AsmToken::Identifier &&
+             NextTok.getIdentifier().starts_with("catch");
+    };
+
+    // Parse a multivalue block type
+    if (ExpectFuncType ||
+        (Lexer.is(AsmToken::LParen) && ExpectBlockType && !PeekCatchList())) {
       // This has a special TYPEINDEX operand which in text we
       // represent as a signature, such that we can re-build this signature,
       // attach it to an anonymous symbol, which is what WasmObjectWriter
@@ -648,6 +693,23 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
           Loc.getLoc(), Loc.getEndLoc(), WebAssemblyOperand::SymOp{Expr}));
     }
 
+    // If we are expecting a catch clause list, try to parse it here.
+    //
+    // If there is a multivalue block return type before this catch list, it
+    // should have been parsed above. If there is no return type before
+    // encountering this catch list, this means the type is void.
+    // The case when there is a single block return value and then a catch list
+    // will be handled below in the 'while' loop.
+    if (ExpectCatchList && PeekCatchList()) {
+      if (ExpectBlockType) {
+        ExpectBlockType = false;
+        addBlockTypeOperand(Operands, NameLoc, WebAssembly::BlockType::Void);
+      }
+      if (parseCatchList(Operands))
+        return true;
+      ExpectCatchList = false;
+    }
+
     while (Lexer.isNot(AsmToken::EndOfStatement)) {
       auto &Tok = Lexer.getTok();
       switch (Tok.getKind()) {
@@ -661,7 +723,15 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
           if (BT == WebAssembly::BlockType::Invalid)
             return error("Unknown block type: ", Id);
           addBlockTypeOperand(Operands, NameLoc, BT);
+          ExpectBlockType = false;
           Parser.Lex();
+          // Now that we've parsed a single block return type, if we are
+          // expecting a catch clause list, try to parse it.
+          if (ExpectCatchList && PeekCatchList()) {
+            if (parseCatchList(Operands))
+              return true;
+            ExpectCatchList = false;
+          }
         } else {
           // Assume this identifier is a label.
           const MCExpr *Val;
@@ -703,8 +773,8 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
       }
       case AsmToken::LCurly: {
         Parser.Lex();
-        auto Op =
-            std::make_unique<WebAssemblyOperand>(Tok.getLoc(), Tok.getEndLoc());
+        auto Op = std::make_unique<WebAssemblyOperand>(
+            Tok.getLoc(), Tok.getEndLoc(), WebAssemblyOperand::BrLOp{});
         if (!Lexer.is(AsmToken::RCurly))
           for (;;) {
             Op->BrL.List.push_back(Lexer.getTok().getIntVal());
@@ -724,10 +794,18 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
           return true;
       }
     }
-    if (ExpectBlockType && Operands.size() == 1) {
-      // Support blocks with no operands as default to void.
+
+    // If we are still expecting to parse a block type or a catch list at this
+    // point, we set them to the default/empty state.
+
+    // Support blocks with no operands as default to void.
+    if (ExpectBlockType)
       addBlockTypeOperand(Operands, NameLoc, WebAssembly::BlockType::Void);
-    }
+    // If no catch list has been parsed, add an empty catch list operand.
+    if (ExpectCatchList)
+      Operands.push_back(std::make_unique<WebAssemblyOperand>(
+          NameLoc, NameLoc, WebAssemblyOperand::CaLOp{}));
+
     if (FunctionTable)
       Operands.push_back(std::move(FunctionTable));
     Parser.Lex();
@@ -752,6 +830,55 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     return false;
   }
 
+  bool parseCatchList(OperandVector &Operands) {
+    auto Op = std::make_unique<WebAssemblyOperand>(
+        Lexer.getTok().getLoc(), SMLoc(), WebAssemblyOperand::CaLOp{});
+    SMLoc EndLoc;
+
+    while (Lexer.is(AsmToken::LParen)) {
+      if (expect(AsmToken::LParen, "("))
+        return true;
+
+      auto CatchStr = expectIdent();
+      if (CatchStr.empty())
+        return true;
+      int64_t CatchOpcode =
+          StringSwitch<int64_t>(CatchStr)
+              .Case("catch", wasm::WASM_OPCODE_CATCH)
+              .Case("catch_ref", wasm::WASM_OPCODE_CATCH_REF)
+              .Case("catch_all", wasm::WASM_OPCODE_CATCH_ALL)
+              .Case("catch_all_ref", wasm::WASM_OPCODE_CATCH_ALL_REF)
+              .Default(-1);
+      if (CatchOpcode == -1)
+        return error(
+            "Expected catch/catch_ref/catch_all/catch_all_ref, instead got: " +
+            CatchStr);
+
+      const MCExpr *Tag = nullptr;
+      if (CatchOpcode == wasm::WASM_OPCODE_CATCH ||
+          CatchOpcode == wasm::WASM_OPCODE_CATCH_REF) {
+        if (Parser.parseExpression(Tag))
+          return error("Cannot parse symbol: ", Lexer.getTok());
+      }
+
+      auto &DestTok = Lexer.getTok();
+      if (DestTok.isNot(AsmToken::Integer))
+        return error("Expected integer constant, instead got: ", DestTok);
+      int64_t Dest = DestTok.getIntVal();
+      Parser.Lex();
+
+      EndLoc = Lexer.getTok().getEndLoc();
+      if (expect(AsmToken::RParen, ")"))
+        return true;
+
+      Op->CaL.List.push_back({CatchOpcode, Tag, Dest});
+    }
+
+    Op->EndLoc = EndLoc;
+    Operands.push_back(std::move(Op));
+    return false;
+  }
+
   bool CheckDataSection() {
     if (CurrentState != DataSection) {
       auto WS = cast<MCSectionWasm>(getStreamer().getCurrentSectionOnly());
diff --git a/llvm/test/MC/WebAssembly/eh-assembly-new.s b/llvm/test/MC/WebAssembly/eh-assembly-new.s
new file mode 100644
index 00000000000000..8069b666d73e53
--- /dev/null
+++ b/llvm/test/MC/WebAssembly/eh-assembly-new.s
@@ -0,0 +1,146 @@
+# RUN: llvm-mc -triple=wasm32-unknown-unknown -mattr=+exception-handling --no-type-check < %s | FileCheck %s
+
+  .tagtype  __cpp_exception i32
+  .tagtype  __c_longjmp i32
+  .functype  eh_test () -> ()
+  .functype  foo () -> ()
+
+eh_test:
+  # try_table with all four kinds of catch clauses
+  block exnref
+    block
+      block () -> (i32, exnref)
+        block i32
+          try_table (catch __cpp_exception 0) (catch_ref __c_longjmp 1) (catch_all 2) (catch_all_ref 3)
+            i32.const 0
+            throw     __cpp_exception
+          end_try_table
+        end_block
+        drop
+      end_block
+      drop
+      drop
+    end_block
+  end_block
+  drop
+
+  # You can use the same kind of catch clause more than once
+  block
+    block exnref
+      block
+        try_table (catch_all 0) (catch_all_ref 1) (catch_all 2)
+          call  foo
+        end_try_table
+      end_block
+    end_block
+    drop
+  end_block
+
+  # Two catch clauses targeting the same block
+  block
+    try_table (catch_all 0) (catch_all 0)
+    end_try_table
+  end_block
+
+  # try_table with a return type
+  block
+    try_table f32 (catch_all 0)
+      f32.const 0.0
+    end_try_table
+    drop
+  end_block
+
+  # try_table with a multivalue type return
+  block
+    try_table () -> (i32, f32) (catch_all 0)
+      i32.const 0
+      f32.const 0.0
+    end_try_table
+    drop
+    drop
+  end_block
+
+  # catch-less try_tables
+  try_table
+    call  foo
+  end_try_table
+
+  try_table i32
+    i32.const 0
+  end_try_table
+  drop
+
+  try_table () -> (i32, f32)
+    i32.const 0
+    f32.const 0.0
+  end_try_table
+  drop
+  drop
+
+  end_function
+
+# CHECK-LABEL: eh_test:
+# CHECK-NEXT:    block           exnref
+# CHECK-NEXT:    block
+# CHECK-NEXT:    block           () -> (i32, exnref)
+# CHECK-NEXT:    block           i32
+# CHECK-NEXT:    try_table        (catch __cpp_exception 0) (catch_ref __c_longjmp 1) (catch_all 2) (catch_all_ref 3)
+# CHECK:         i32.const       0
+# CHECK-NEXT:    throw           __cpp_exception
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+
+# CHECK:         block
+# CHECK-NEXT:    block           exnref
+# CHECK-NEXT:    block
+# CHECK-NEXT:    try_table        (catch_all 0) (catch_all_ref 1) (catch_all 2)
+# CHECK:         call    foo
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+
+# CHECK:         block
+# CHECK-NEXT:    try_table        (catch_all 0) (catch_all 0)
+# CHECK:         end_try_table
+# CHECK-NEXT:    end_block
+
+# CHECK:         block
+# CHECK-NEXT:    try_table       f32 (catch_all 0)
+# CHECK:         f32.const       0x0p0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+
+# CHECK:         block
+# CHECK-NEXT:    try_table       () -> (i32, f32) (catch_all 0)
+# CHECK:         i32.const       0
+# CHECK-NEXT:    f32.const       0x0p0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+
+# CHECK:         try_table
+# CHECK-NEXT:    call    foo
+# CHECK-NEXT:    end_try_table
+
+# CHECK:         try_table       i32
+# CHECK-NEXT:    i32.const       0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+
+# CHECK:         try_table       () -> (i32, f32)
+# CHECK-NEXT:    i32.const       0
+# CHECK-NEXT:    f32.const       0x0p0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    drop
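
As a reading aid for `addCatchListOperands` in the diff above, here is a small standalone sketch of the operand layout it produces: first the number of clauses, then for each clause its opcode, a tag expression only for `catch` and `catch_ref`, and finally the destination. The types below (`CatchOpcode`, `CatchElem`, `Operand`, `flattenCatchList`) are hypothetical stand-ins for this example; they are not LLVM's `MCOperand` or the `wasm::WASM_OPCODE_*` constants, and the enum values are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the catch opcodes (values are illustrative).
enum CatchOpcode : int64_t { Catch, CatchRef, CatchAll, CatchAllRef };

struct CatchElem {
  CatchOpcode Opcode;
  std::string Tag; // used only by Catch and CatchRef
  int64_t Dest;
};

// A toy operand: either an immediate or a symbol name.
struct Operand {
  bool IsImm;
  int64_t Imm;
  std::string Sym;
};

// Flattens a catch list into: count, then (opcode, [tag], dest) per clause.
static std::vector<Operand> flattenCatchList(const std::vector<CatchElem> &List) {
  std::vector<Operand> Ops;
  Ops.push_back({true, static_cast<int64_t>(List.size()), ""});
  for (const CatchElem &Ca : List) {
    Ops.push_back({true, Ca.Opcode, ""});
    // Only catch and catch_ref carry a tag operand.
    if (Ca.Opcode == Catch || Ca.Opcode == CatchRef)
      Ops.push_back({false, 0, Ca.Tag});
    Ops.push_back({true, Ca.Dest, ""});
  }
  return Ops;
}
```

Under this layout, the four-clause `try_table` at the top of the test file would flatten to 11 operands (1 count, 3 each for `catch` and `catch_ref`, 2 each for `catch_all` and `catch_all_ref`), and a catch-less `try_table` to a single operand holding the count 0.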

llvmbot (Member) commented Sep 14, 2024

@llvm/pr-subscribers-backend-webassembly

Author: Heejin Ahn (aheejin)

Changes

This adds assembly parsing support for the new EH (exnref) proposal.

try_table parsing is a little tricky because catch clause lists use () and the multivalue block return types also use (). This handles all combinations below:

  • No return type (void) + no catch list
  • No return type (void) + catch list
  • Single return type + no catch list
  • Single return type + catch list
  • Multivalue return type + no catch list
  • Multivalue return type + catch list

This does not include AsmTypeCheck support yet. That's the reason why this adds a new test file and use --no-type-check in the command line. After the type checker is added as a follow-up, I plan to merge this file with the existing
/~https://github.com/llvm/llvm-project/blob/main/llvm/test/MC/WebAssembly/eh-assembly.s. (Turning on -mattr=+exception-handling adds support for all legacy and new EH instructions in the assembly. -wasm-enable-exnref in llc only controls which instructions to generate and it doesn't affect llvm-mc and assembly parsing.)


Full diff: /~https://github.com/llvm/llvm-project/pull/108668.diff

2 Files Affected:

  • (modified) llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp (+137-10)
  • (added) llvm/test/MC/WebAssembly/eh-assembly-new.s (+146)
diff --git a/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp b/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
index 5299e6ea06f0bd..03ea5b09c4fd4a 100644
--- a/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
+++ b/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
@@ -69,12 +69,23 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
     std::vector<unsigned> List;
   };
 
+  struct CaLOpElem {
+    int64_t Opcode;
+    const MCExpr *Tag;
+    int64_t Dest;
+  };
+
+  struct CaLOp {
+    std::vector<CaLOpElem> List;
+  };
+
   union {
     struct TokOp Tok;
     struct IntOp Int;
     struct FltOp Flt;
     struct SymOp Sym;
     struct BrLOp BrL;
+    struct CaLOp CaL;
   };
 
   WebAssemblyOperand(SMLoc Start, SMLoc End, TokOp T)
@@ -85,12 +96,16 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
       : Kind(Float), StartLoc(Start), EndLoc(End), Flt(F) {}
   WebAssemblyOperand(SMLoc Start, SMLoc End, SymOp S)
       : Kind(Symbol), StartLoc(Start), EndLoc(End), Sym(S) {}
-  WebAssemblyOperand(SMLoc Start, SMLoc End)
-      : Kind(BrList), StartLoc(Start), EndLoc(End), BrL() {}
+  WebAssemblyOperand(SMLoc Start, SMLoc End, BrLOp B)
+      : Kind(BrList), StartLoc(Start), EndLoc(End), BrL(B) {}
+  WebAssemblyOperand(SMLoc Start, SMLoc End, CaLOp C)
+      : Kind(CatchList), StartLoc(Start), EndLoc(End), CaL(C) {}
 
   ~WebAssemblyOperand() {
     if (isBrList())
       BrL.~BrLOp();
+    if (isCatchList())
+      CaL.~CaLOp();
   }
 
   bool isToken() const override { return Kind == Token; }
@@ -153,7 +168,15 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
   }
 
   void addCatchListOperands(MCInst &Inst, unsigned N) const {
-    // TODO
+    assert(N == 1 && isCatchList() && "Invalid CatchList!");
+    Inst.addOperand(MCOperand::createImm(CaL.List.size()));
+    for (auto Ca : CaL.List) {
+      Inst.addOperand(MCOperand::createImm(Ca.Opcode));
+      if (Ca.Opcode == wasm::WASM_OPCODE_CATCH ||
+          Ca.Opcode == wasm::WASM_OPCODE_CATCH_REF)
+        Inst.addOperand(MCOperand::createExpr(Ca.Tag));
+      Inst.addOperand(MCOperand::createImm(Ca.Dest));
+    }
   }
 
   void print(raw_ostream &OS) const override {
@@ -174,7 +197,7 @@ struct WebAssemblyOperand : public MCParsedAsmOperand {
       OS << "BrList:" << BrL.List.size();
       break;
     case CatchList:
-      // TODO
+      OS << "CaList:" << CaL.List.size();
       break;
     }
   }
@@ -228,6 +251,7 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     Loop,
     Try,
     CatchAll,
+    TryTable,
     If,
     Else,
     Undefined,
@@ -304,6 +328,8 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
       return {"try", "end_try/delegate"};
     case CatchAll:
       return {"catch_all", "end_try"};
+    case TryTable:
+      return {"try_table", "end_try_table"};
     case If:
       return {"if", "end_if"};
     case Else:
@@ -571,6 +597,7 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     // proper nesting.
     bool ExpectBlockType = false;
     bool ExpectFuncType = false;
+    bool ExpectCatchList = false;
     std::unique_ptr<WebAssemblyOperand> FunctionTable;
     if (Name == "block") {
       push(Block);
@@ -593,12 +620,19 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     } else if (Name == "catch_all") {
       if (popAndPushWithSameSignature(Name, Try, CatchAll))
         return true;
+    } else if (Name == "try_table") {
+      push(TryTable);
+      ExpectBlockType = true;
+      ExpectCatchList = true;
     } else if (Name == "end_if") {
       if (pop(Name, If, Else))
         return true;
     } else if (Name == "end_try") {
       if (pop(Name, Try, CatchAll))
         return true;
+    } else if (Name == "end_try_table") {
+      if (pop(Name, TryTable))
+        return true;
     } else if (Name == "delegate") {
       if (pop(Name, Try))
         return true;
@@ -622,7 +656,18 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
       ExpectFuncType = true;
     }
 
-    if (ExpectFuncType || (ExpectBlockType && Lexer.is(AsmToken::LParen))) {
+    // Returns true if the next tokens are a catch clause
+    auto PeekCatchList = [&]() {
+      if (Lexer.isNot(AsmToken::LParen))
+        return false;
+      AsmToken NextTok = Lexer.peekTok();
+      return NextTok.getKind() == AsmToken::Identifier &&
+             NextTok.getIdentifier().starts_with("catch");
+    };
+
+    // Parse a multivalue block type
+    if (ExpectFuncType ||
+        (Lexer.is(AsmToken::LParen) && ExpectBlockType && !PeekCatchList())) {
       // This has a special TYPEINDEX operand which in text we
       // represent as a signature, such that we can re-build this signature,
       // attach it to an anonymous symbol, which is what WasmObjectWriter
@@ -648,6 +693,23 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
           Loc.getLoc(), Loc.getEndLoc(), WebAssemblyOperand::SymOp{Expr}));
     }
 
+    // If we are expecting a catch clause list, try to parse it here.
+    //
+    // If there is a multivalue block return type before this catch list, it
+    // should have been parsed above. If there is no return type before
+    // encountering this catch list, this means the type is void.
+    // The case when there is a single block return value and then a catch list
+    // will be handled below in the 'while' loop.
+    if (ExpectCatchList && PeekCatchList()) {
+      if (ExpectBlockType) {
+        ExpectBlockType = false;
+        addBlockTypeOperand(Operands, NameLoc, WebAssembly::BlockType::Void);
+      }
+      if (parseCatchList(Operands))
+        return true;
+      ExpectCatchList = false;
+    }
+
     while (Lexer.isNot(AsmToken::EndOfStatement)) {
       auto &Tok = Lexer.getTok();
       switch (Tok.getKind()) {
@@ -661,7 +723,15 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
           if (BT == WebAssembly::BlockType::Invalid)
             return error("Unknown block type: ", Id);
           addBlockTypeOperand(Operands, NameLoc, BT);
+          ExpectBlockType = false;
           Parser.Lex();
+          // Now that we've parsed a single block return type, if we are
+          // expecting a catch clause list, try to parse it.
+          if (ExpectCatchList && PeekCatchList()) {
+            if (parseCatchList(Operands))
+              return true;
+            ExpectCatchList = false;
+          }
         } else {
           // Assume this identifier is a label.
           const MCExpr *Val;
@@ -703,8 +773,8 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
       }
       case AsmToken::LCurly: {
         Parser.Lex();
-        auto Op =
-            std::make_unique<WebAssemblyOperand>(Tok.getLoc(), Tok.getEndLoc());
+        auto Op = std::make_unique<WebAssemblyOperand>(
+            Tok.getLoc(), Tok.getEndLoc(), WebAssemblyOperand::BrLOp{});
         if (!Lexer.is(AsmToken::RCurly))
           for (;;) {
             Op->BrL.List.push_back(Lexer.getTok().getIntVal());
@@ -724,10 +794,18 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
           return true;
       }
     }
-    if (ExpectBlockType && Operands.size() == 1) {
-      // Support blocks with no operands as default to void.
+
+    // If we are still expecting to parse a block type or a catch list at this
+    // point, we set them to the default/empty state.
+
+    // Support blocks with no operands as default to void.
+    if (ExpectBlockType)
       addBlockTypeOperand(Operands, NameLoc, WebAssembly::BlockType::Void);
-    }
+    // If no catch list has been parsed, add an empty catch list operand.
+    if (ExpectCatchList)
+      Operands.push_back(std::make_unique<WebAssemblyOperand>(
+          NameLoc, NameLoc, WebAssemblyOperand::CaLOp{}));
+
     if (FunctionTable)
       Operands.push_back(std::move(FunctionTable));
     Parser.Lex();
@@ -752,6 +830,55 @@ class WebAssemblyAsmParser final : public MCTargetAsmParser {
     return false;
   }
 
+  bool parseCatchList(OperandVector &Operands) {
+    auto Op = std::make_unique<WebAssemblyOperand>(
+        Lexer.getTok().getLoc(), SMLoc(), WebAssemblyOperand::CaLOp{});
+    SMLoc EndLoc;
+
+    while (Lexer.is(AsmToken::LParen)) {
+      if (expect(AsmToken::LParen, "("))
+        return true;
+
+      auto CatchStr = expectIdent();
+      if (CatchStr.empty())
+        return true;
+      int64_t CatchOpcode =
+          StringSwitch<int64_t>(CatchStr)
+              .Case("catch", wasm::WASM_OPCODE_CATCH)
+              .Case("catch_ref", wasm::WASM_OPCODE_CATCH_REF)
+              .Case("catch_all", wasm::WASM_OPCODE_CATCH_ALL)
+              .Case("catch_all_ref", wasm::WASM_OPCODE_CATCH_ALL_REF)
+              .Default(-1);
+      if (CatchOpcode == -1)
+        return error(
+            "Expected catch/catch_ref/catch_all/catch_all_ref, instead got: " +
+            CatchStr);
+
+      const MCExpr *Tag = nullptr;
+      if (CatchOpcode == wasm::WASM_OPCODE_CATCH ||
+          CatchOpcode == wasm::WASM_OPCODE_CATCH_REF) {
+        if (Parser.parseExpression(Tag))
+          return error("Cannot parse symbol: ", Lexer.getTok());
+      }
+
+      auto &DestTok = Lexer.getTok();
+      if (DestTok.isNot(AsmToken::Integer))
+        return error("Expected integer constant, instead got: ", DestTok);
+      int64_t Dest = DestTok.getIntVal();
+      Parser.Lex();
+
+      EndLoc = Lexer.getTok().getEndLoc();
+      if (expect(AsmToken::RParen, ")"))
+        return true;
+
+      Op->CaL.List.push_back({CatchOpcode, Tag, Dest});
+    }
+
+    Op->EndLoc = EndLoc;
+    Operands.push_back(std::move(Op));
+    return false;
+  }
+
   bool CheckDataSection() {
     if (CurrentState != DataSection) {
       auto WS = cast<MCSectionWasm>(getStreamer().getCurrentSectionOnly());
diff --git a/llvm/test/MC/WebAssembly/eh-assembly-new.s b/llvm/test/MC/WebAssembly/eh-assembly-new.s
new file mode 100644
index 00000000000000..8069b666d73e53
--- /dev/null
+++ b/llvm/test/MC/WebAssembly/eh-assembly-new.s
@@ -0,0 +1,146 @@
+# RUN: llvm-mc -triple=wasm32-unknown-unknown -mattr=+exception-handling --no-type-check < %s | FileCheck %s
+
+  .tagtype  __cpp_exception i32
+  .tagtype  __c_longjmp i32
+  .functype  eh_test () -> ()
+  .functype  foo () -> ()
+
+eh_test:
+  # try_table with all four kinds of catch clauses
+  block exnref
+    block
+      block () -> (i32, exnref)
+        block i32
+          try_table (catch __cpp_exception 0) (catch_ref __c_longjmp 1) (catch_all 2) (catch_all_ref 3)
+            i32.const 0
+            throw     __cpp_exception
+          end_try_table
+        end_block
+        drop
+      end_block
+      drop
+      drop
+    end_block
+  end_block
+  drop
+
+  # You can use the same kind of catch clause more than once
+  block
+    block exnref
+      block
+        try_table (catch_all 0) (catch_all_ref 1) (catch_all 2)
+          call  foo
+        end_try_table
+      end_block
+    end_block
+    drop
+  end_block
+
+  # Two catch clauses targeting the same block
+  block
+    try_table (catch_all 0) (catch_all 0)
+    end_try_table
+  end_block
+
+  # try_table with a return type
+  block
+    try_table f32 (catch_all 0)
+      f32.const 0.0
+    end_try_table
+    drop
+  end_block
+
+  # try_table with a multivalue type return
+  block
+    try_table () -> (i32, f32) (catch_all 0)
+      i32.const 0
+      f32.const 0.0
+    end_try_table
+    drop
+    drop
+  end_block
+
+  # catch-less try_tables
+  try_table
+    call  foo
+  end_try_table
+
+  try_table i32
+    i32.const 0
+  end_try_table
+  drop
+
+  try_table () -> (i32, f32)
+    i32.const 0
+    f32.const 0.0
+  end_try_table
+  drop
+  drop
+
+  end_function
+
+# CHECK-LABEL: eh_test:
+# CHECK-NEXT:    block           exnref
+# CHECK-NEXT:    block
+# CHECK-NEXT:    block           () -> (i32, exnref)
+# CHECK-NEXT:    block           i32
+# CHECK-NEXT:    try_table        (catch __cpp_exception 0) (catch_ref __c_longjmp 1) (catch_all 2) (catch_all_ref 3)
+# CHECK:         i32.const       0
+# CHECK-NEXT:    throw           __cpp_exception
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+
+# CHECK:         block
+# CHECK-NEXT:    block           exnref
+# CHECK-NEXT:    block
+# CHECK-NEXT:    try_table        (catch_all 0) (catch_all_ref 1) (catch_all 2)
+# CHECK:         call    foo
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    end_block
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+
+# CHECK:         block
+# CHECK-NEXT:    try_table        (catch_all 0) (catch_all 0)
+# CHECK:         end_try_table
+# CHECK-NEXT:    end_block
+
+# CHECK:         block
+# CHECK-NEXT:    try_table       f32 (catch_all 0)
+# CHECK:         f32.const       0x0p0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+
+# CHECK:         block
+# CHECK-NEXT:    try_table       () -> (i32, f32) (catch_all 0)
+# CHECK:         i32.const       0
+# CHECK-NEXT:    f32.const       0x0p0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    end_block
+
+# CHECK:         try_table
+# CHECK-NEXT:    call    foo
+# CHECK-NEXT:    end_try_table
+
+# CHECK:         try_table       i32
+# CHECK-NEXT:    i32.const       0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+
+# CHECK:         try_table       () -> (i32, f32)
+# CHECK-NEXT:    i32.const       0
+# CHECK-NEXT:    f32.const       0x0p0
+# CHECK-NEXT:    end_try_table
+# CHECK-NEXT:    drop
+# CHECK-NEXT:    drop
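
The six combinations exercised above hinge on one decision: whether a `(` after `try_table` opens a multivalue signature or a catch clause. The following is a minimal C++ sketch of that lookahead, not the code in this PR. `TryTableShape`, `startsCatchClause`, and `classifyTryTable` are hypothetical names, and the input is assumed to be pre-split string tokens rather than real lexer tokens.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not an LLVM data structure.
struct TryTableShape {
  enum class Result { Void, Single, Multi } ResultKind = Result::Void;
  bool HasCatchList = false;
};

// Returns true if Toks[I] opens a catch clause, i.e. "(" followed by an
// identifier that starts with "catch" (catch, catch_ref, catch_all, ...).
static bool startsCatchClause(const std::vector<std::string> &Toks,
                              std::size_t I) {
  return I + 1 < Toks.size() && Toks[I] == "(" &&
         Toks[I + 1].rfind("catch", 0) == 0;
}

// Classifies the operand tokens that follow the `try_table` mnemonic.
TryTableShape classifyTryTable(const std::vector<std::string> &Toks) {
  TryTableShape Shape;
  std::size_t I = 0;
  if (I < Toks.size() && Toks[I] == "(" && !startsCatchClause(Toks, I)) {
    // "(" not followed by a catch keyword: treat it as a multivalue
    // signature such as `() -> (i32, f32)`. Skip to the arrow, then to the
    // ")" that closes the result list.
    Shape.ResultKind = TryTableShape::Result::Multi;
    while (I < Toks.size() && Toks[I] != "->")
      ++I;
    while (I < Toks.size() && Toks[I] != ")")
      ++I;
    if (I < Toks.size())
      ++I; // consume the closing ")"
  } else if (I < Toks.size() && Toks[I] != "(") {
    // A bare type name such as `f32`.
    Shape.ResultKind = TryTableShape::Result::Single;
    ++I;
  }
  // Whatever remains is either a catch list or nothing.
  Shape.HasCatchList = startsCatchClause(Toks, I);
  return Shape;
}
```

One token of lookahead past the `(` is enough here because a signature's parameter list can only begin with `)` or a type name, never with a `catch` keyword.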

aheejin added a commit to aheejin/llvm-project that referenced this pull request Sep 15, 2024
The plan was to make `eh-assembly.s` contain both the legacy and the new
tests, but the new tests require `--no-type-check` because the type
checker for the new EH is in progress. In case this drags on further
than expected, this renames the current file to `-legacy.s` in order to
follow the current naming scheme in `test/CodeGen/WebAssembly`.

After landing this first, `eh-assembly-new.s` in llvm#108668 will be renamed
to `eh-assembly.s`.
aheejin added a commit that referenced this pull request Sep 16, 2024
Member

@dschuff dschuff left a comment

Any thoughts on what might be the most likely error cases to arise in practice? I noticed there are no tests of error cases; I don't think I'd necessarily want to try to be really comprehensive on covering every possible error case, but if we think some categories are more likely, it might be good to have something.

struct CaLOpElem {
  int64_t Opcode;
  const MCExpr *Tag;
  int64_t Dest;
Member

I assume Dest here is the branching depth of the targeted block. Does it have exactly the same meaning (in the context of the asm parser) as the BrList operands? If so, maybe it should have the same type?

Actually the level of abstraction here seems slightly funny because the tag is represented as an MCExpr* (rather than a raw int) but the destination is a raw int. But I guess that makes sense in that MC explicitly models the tags as MCExprs but doesn't model basic blocks; and BrList already works this way.

Member Author

I made them int64_t only because the token accessor returns int64_t:

int64_t getIntVal() const {
  assert(Kind == Integer && "This token isn't an integer!");
  return IntVal.getZExtValue();
}

And yeah I agree giving them the real type is better. Changed Opcode to uint8_t and Dest to unsigned to reflect the real binary spec.
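
As a rough illustration of the narrowed field types, the sketch below mirrors the struct with `uint8_t` and `unsigned` members and shows one way a parsed 64-bit integer could be range-checked before being stored as a branch depth. `CatchClause` and `toBranchDepth` are hypothetical names, the forward-declared `MCExpr` only stands in for the LLVM class, and the range check itself is an assumption rather than something this PR is stated to do.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

class MCExpr; // stand-in forward declaration; the real class lives in LLVM

// Hypothetical mirror of a catch-list element with the narrowed field types.
struct CatchClause {
  uint8_t Opcode;    // catch kind, a single byte in the binary encoding
  const MCExpr *Tag; // tag expression (assumed null when the clause has no tag)
  unsigned Dest;     // relative depth of the targeted block
};

// Narrows a parsed 64-bit integer to a branch depth, rejecting values that
// are negative or do not fit in `unsigned`.
std::optional<unsigned> toBranchDepth(int64_t Val) {
  if (Val < 0 ||
      Val > static_cast<int64_t>(std::numeric_limits<unsigned>::max()))
    return std::nullopt;
  return static_cast<unsigned>(Val);
}
```

Returning `std::optional` lets a caller turn an out-of-range depth into a parse error instead of silently truncating it.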

Member Author

aheejin commented Sep 17, 2024

> Any thoughts on what might be the most likely error cases to arise in practice? I noticed there are no tests of error cases; I don't think I'd necessarily want to try to be really comprehensive on covering every possible error case, but if we think some categories are more likely, it might be good to have something.

Added some error cases.
I also tried to test an invalid tag, but that falls under AsmTypeCheck.

@@ -378,7 +378,7 @@ void WebAssemblyInstPrinter::printCatchList(const MCInst *MI, unsigned OpNo,
       const MCSymbolRefExpr *TagExpr = nullptr;
       const MCSymbolWasm *TagSym = nullptr;
       assert(Op.isExpr());
-      TagExpr = dyn_cast<MCSymbolRefExpr>(Op.getExpr());
+      TagExpr = cast<MCSymbolRefExpr>(Op.getExpr());
Member Author

drive-by fix

@aheejin aheejin merged commit defb8fb into llvm:main Sep 17, 2024
8 checks passed
@aheejin aheejin deleted the eh_asmparser branch September 17, 2024 18:35
hamphet pushed a commit to hamphet/llvm-project that referenced this pull request Sep 18, 2024
This adds assembly parsing support for the new EH (exnref) proposal.

`try_table` parsing is a little tricky because catch clause lists use
`()` and the multivalue block return types also use `()`. This handles
all combinations below:
- No return type (void) + no catch list
- No return type (void) + catch list
- Single return type + no catch list
- Single return type + catch list
- Multivalue return type + no catch list
- Multivalue return type + catch list

This does not include AsmTypeCheck support yet. That's the reason why
this adds a new test file and use `--no-type-check` in the command line.
After the type checker is added as a follow-up, I plan to merge
/~https://github.com/llvm/llvm-project/blob/main/llvm/test/MC/WebAssembly/eh-assembly-legacy.s
with this file. (Turning on `-mattr=+exception-handling` adds support
for all legacy and new EH instructions in the assembly.
`-wasm-enable-exnref` in `llc` only controls which instructions to
generate and it doesn't affect `llvm-mc` and assembly parsing.)
tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
Labels
backend:WebAssembly mc Machine (object) code
3 participants