TBDGen: Introduce option to emit API descriptor as supplementary output #68994

Merged
merged 3 commits into from
Oct 5, 2023

Contributor

@tshortli tshortli commented Oct 5, 2023

An "API descriptor" file is JSON describing the externally accessible symbols of a module and metadata associated with those symbols like availability and SPI status. This output was previously only generated by the swift-api-extract alias of swift-frontend, which is designed to take an already built module as input. Post-processing a built module to extract this information is inefficient because the module and the module's dependencies need to be deserialized in order to visit the entire AST. We can generate this output more efficiently as a supplementary output of the -emit-module job that originally produced the module (since the AST is already available in-memory). The new -emit-api-descriptor flag can be used to request this output.

The output of -emit-api-descriptor differs from the output of swift-api-extract run on an existing module in a couple of important ways:

  • The value for the file key in the descriptor JSON is now the path to the source file that defines the declaration responsible for the symbol. In swift-api-extract mode, the value for this key is the path to the module or swiftinterface, which is unavailable during an -emit-module job since the module is usually not being emitted to its final installed location.
  • Some additional symbols may be included in the API descriptor JSON because more of the AST is available when emitting the module.

Resolves rdar://110916764
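
For illustration only, the per-symbol `file` value described above could be consumed as follows. This is a minimal Python sketch, not part of the change: the `file`, `linkage`, and `introduced` keys appear in this PR, but the `globals` array, the `name` key, the sample paths, and the helper names `iter_records` and `symbols_by_file` are assumptions made for the example. The walk is deliberately schema-agnostic so that it does not depend on the descriptor's real top-level layout.

```python
import json
from collections import defaultdict


def iter_records(node):
    """Yield every JSON object that carries a "file" key, at any nesting depth."""
    if isinstance(node, dict):
        if "file" in node:
            yield node
        for value in node.values():
            yield from iter_records(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_records(item)


def symbols_by_file(descriptor_text):
    """Group symbol names by the source file recorded for them."""
    grouped = defaultdict(list)
    for record in iter_records(json.loads(descriptor_text)):
        # "name" is an assumed key; fall back to a placeholder if it is absent.
        grouped[record["file"]].append(record.get("name", "<unnamed>"))
    return dict(grouped)


# Hypothetical descriptor contents; only the key names "file", "linkage",
# and "introduced" are taken from the PR.
SAMPLE = """
{
  "globals": [
    {"name": "_symbolA", "file": "/src/A.swift", "linkage": "exported", "introduced": "10.10"},
    {"name": "_symbolB", "file": "/src/B.swift", "linkage": "exported"}
  ]
}
"""

result = symbols_by_file(SAMPLE)
```

With source paths in the `file` key, a consumer can attribute each exported symbol to the file that declares it, which a module or swiftinterface path cannot provide.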

An "API descriptor" file is JSON describing the externally accessible symbols
of a module and metadata associated with those symbols like availability and
SPI status. This output was previously only generated by the
`swift-api-extract` alias of `swift-frontend`, which is designed to take an
already built module as input. Post-processing a built module to extract this
information is inefficient because the module and the module's dependencies
need to be deserialized in order to visit the entire AST. We can generate this
output more efficiently as a supplementary output of the -emit-module job that
originally produced the module (since the AST is already available in-memory).
The -emit-api-descriptor flag can be used to request this output.

This change lays the groundwork by introducing frontend flags. Follow up
changes are needed to make API descriptor emission during -emit-module
functional.

Part of rdar://110916764.
…ule.

Make the changes to APIGenRecorder that are necessary to make it capable of
emitting API descriptors during -emit-module jobs. The output in this mode
differs from the output when run on an existing module in a couple of important
ways:

- The value for the `file` key in the descriptor JSON is now the path to the
  source file that defines the declaration responsible for the symbol. In
  `swift-api-extract` mode, the value for this key is the path to the module or
  swiftinterface, which is unavailable during an -emit-module job since the module
  is usually not being emitted to its final installed location.
- Some additional symbols may be included in the API descriptor JSON because
  more of the AST is available when emitting the module.

Resolves rdar://110916764

Instead, use the `%validate-json` lit substitution to validate and format the
API descriptor file before running it through FileCheck. This allows us to
avoid needing to introduce a dedicated frontend option just to control whether
the output of -emit-api-descriptor is pretty printed.
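
As a rough sketch of what a validate-and-format step buys the test, the following Python parses the descriptor (failing on malformed JSON) and re-emits it with one key per line, which is the shape that line-oriented `CHECK-NEXT` patterns need. How `%validate-json` is actually defined in the lit configuration is not shown in this PR; the function name `validate_and_format` and the four-space indent are assumptions.

```python
import json


def validate_and_format(text):
    """Parse JSON text (raising ValueError if it is malformed) and pretty print it."""
    return json.dumps(json.loads(text), indent=4)


# Compact input, as a tool might write it when no pretty printing is requested.
compact = '{"linkage": "exported", "introduced": "10.10"}'
formatted = validate_and_format(compact)
print(formatted)
```

Because formatting happens in the test pipeline, the compiler itself can write compact JSON and needs no extra option for test readability.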
Contributor Author

tshortli commented Oct 5, 2023

@swift-ci please test

Contributor

@zixu-w zixu-w left a comment

Thanks Allan! 🎉

> Some additional symbols may be included in the API descriptor JSON because more of the AST is available when emitting the module.

Do you have ideas of roughly what are the kinds of these additional symbols that were not available from serializing the module? I see in the test case one additional symbol for method lookup function for a class.

Contributor

@cachemeifyoucan cachemeifyoucan left a comment

LGTM

Contributor Author

tshortli commented Oct 5, 2023

> Do you have ideas of roughly what are the kinds of these additional symbols that were not available from serializing the module? I see in the test case one additional symbol for method lookup function for a class.

Yeah, the method lookup function is the example that prompted me to call this out. I haven't identified other examples, but that was enough to indicate to me that there could also be other examples that the test cases don't demonstrate.

@@ -98,6 +98,7 @@ TYPE("pch", PCH, "pch", "")
TYPE("none", Nothing, "", "")

TYPE("abi-baseline-json", SwiftABIDescriptor, "abi.json", "")
TYPE("api-json", SwiftAPIDescriptor, "", "")
Contributor

The extension right now for partial SDKDB files is .sdkdb. Should this be listed in the extension field?

Contributor Author

I could not identify a reason it was important to specify an extension here. There are some utilities in the compiler that take a filename and look up its corresponding file_types entry using the contents of this table, but AFAIK we don't need to do that with an API descriptor, since the path to one is always specified explicitly in the arguments to the frontend. We could certainly add .sdkdb here, but it feels slightly over-specified to me: ultimately this output is just JSON to the compiler, which doesn't otherwise know about SDKDB as a concept.

@artemcm @xymus @nkcsgexi Am I missing any reasons this needs to be specified?

Contributor

Yeah, I don't remember any specific reasons for being explicit about file extensions other than inferring output kinds. However, it seems to be good documentation, no?

Contributor

I think I named it originally partial.sdkdb for the intermediate JSON format. Up for a better name.

Contributor Author

Given what I've heard, I think we do not need to encode the suffix of this file in the Swift compiler. From the compiler's point of view it's just another type of supplementary file that it knows how to write to a given path when requested, and the filename can be opaque to the compiler.
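
To make the trade-off in this thread concrete, here is a simplified Python model of an extension-keyed lookup over a table shaped like the `TYPE(...)` entries in the diff. It is a sketch, not the compiler's implementation: the function `lookup_by_filename` and the file names are hypothetical, and only the four rows visible in the hunk are included.

```python
# (name, id, extension) rows mirroring the TYPE(...) entries shown in the diff.
FILE_TYPES = [
    ("pch", "PCH", "pch"),
    ("none", "Nothing", ""),
    ("abi-baseline-json", "SwiftABIDescriptor", "abi.json"),
    ("api-json", "SwiftAPIDescriptor", ""),
]


def lookup_by_filename(filename):
    """Return the ID of the entry whose extension matches the filename, or None."""
    best = None
    for _name, type_id, ext in FILE_TYPES:
        # Entries with an empty extension can never be inferred from a filename.
        if ext and filename.endswith("." + ext):
            # Prefer the longest matching extension.
            if best is None or len(ext) > len(best[1]):
                best = (type_id, ext)
    return best[0] if best else None


abi_kind = lookup_by_filename("MyModule.abi.json")
sdkdb_kind = lookup_by_filename("MyModule.sdkdb")
```

In this model the `api-json` row is never returned by a filename lookup. That only matters for outputs whose kind has to be inferred from a path; an output whose path is always passed explicitly does not need the inference.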

// CHECK-SPI-NEXT: "linkage": "exported",
// CHECK-SPI-NEXT: "introduced": "10.10"
// CHECK-SPI-NEXT: },
Contributor

It is concerning when a public symbol disappears. Does the symbol _main exist in the binary file?

Contributor Author

_main was erroneously added before because the swift-frontend invocation did not specify -parse-as-library (normally a library would not have a _main entry point).

@tshortli tshortli merged commit e01f234 into swiftlang:main Oct 5, 2023
@tshortli tshortli deleted the api-extract-supplementary-output branch October 5, 2023 23:45
Contributor Author

tshortli commented Oct 6, 2023

See swiftlang/swift-driver#1460 for the driver.
