Skip to content

Fast dependency scanning for Swift #28515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Apr 25, 2020

Conversation

DougGregor
Copy link
Member

@DougGregor DougGregor commented Dec 1, 2019

Implement a new "fast" dependency scanning option,
-scan-dependencies, in the Swift frontend that determines all
of the source file and module dependencies for a given set of
Swift sources. It covers all forms of dependencies:

  1. Swift (serialized) module files, by reading the module header
  2. Swift interface files, by parsing the source code to find imports
  3. Swift source modules, by parsing the source code to find imports
  4. Clang modules, using Clang's fast dependency scanning tool
  5. Bridging headers, including the source files and modules they depend on

A single -scan-dependencies operation maps out the full
dependency graph for the given Swift source files, including all
of the Swift and Clang modules that may need to be built, such
that all of the work can be scheduled up front by the Swift
driver or any other build system that understands this
option. The dependency graph is emitted as JSON, which can be
consumed by these other tools. Swift code with Codable
data structures is used for validating this JSON, and can
(eventually) be picked up by other tools.

This operation also supports -emit-dependencies, the flag that
produces make-style dependencies to determine all of the input
files that can affect the compilation, making it usable as a fast
dependency scanner even for clients not interesting in prebuilding
modules.

In a source file that imports just Cocoa on macOS, fast
dependency scanning with this patch is ~11x faster than
the current dependency scanner (with -resolve-imports), taking
it from 5.7s (and writing 48MB of Clang modules) down to .49s.

@DougGregor
Copy link
Member Author

@Bigcheese this totally doesn't work yet, but it's getting closer to mapping out the various Clang + Swift module builds that are involved in a Swift compile. You can see my super hacky interaction with Clang's fast dependency scanner in ClangModuleDependencyScanner.cpp.

@DougGregor
Copy link
Member Author

Hey, it works a little bit now with a Clang change to handle @import directives in Clang's dependency scanner.

@DougGregor
Copy link
Member Author

The necessary Clang changes are at swiftlang/llvm-project#392

@DougGregor DougGregor force-pushed the fast-dependency-scanning branch from 86e6426 to 52869a3 Compare December 4, 2019 01:06
@slavapestov
Copy link
Contributor

Would this allow the frontend to get out of the business of building interface files altogether?

@DougGregor
Copy link
Member Author

Would this allow the frontend to get out of the business of building interface files altogether?

Yes, that's the intent. The driver would coordinate the interface file builds so the frontend wouldn't have to do it.

@DougGregor
Copy link
Member Author

I collapsed everything because it doesn't make sense for this to ride on top of -emit-imported-modules; rather, that flag should (separately) go away. Instead, implement a new flag (-scan-dependencies) whose primary output is the JSON dependency graph and which will eventually also emit normal makefile dependencies (.d file) as well, for clients that need it.

@jerrymarino
Copy link

So this should this allow us to directly run the frontend in external build systems directly? Seems like we can use the new .d files

@DougGregor
Copy link
Member Author

So this should this allow us to directly run the frontend in external build systems directly? Seems like we can use the new .d files

We'd prefer that external build systems go through the driver in general, because the frontend interface is the supported way to call the compiler. However, we can expose this as a driver option once it's been tested a bit better. I would certainly appreciate if others would test it out and see what I've missed.

@DougGregor DougGregor force-pushed the fast-dependency-scanning branch from 93aef39 to 64aad80 Compare March 29, 2020 05:38
@DougGregor
Copy link
Member Author

Brought it up to date with master. It no longer needs cherry-picks on top of Clang

@DougGregor
Copy link
Member Author

@swift-ci please smoke test

2 similar comments
@DougGregor
Copy link
Member Author

@swift-ci please smoke test

@DougGregor
Copy link
Member Author

@swift-ci please smoke test

@davidungar
Copy link
Contributor

Lovely, readable code. But, why were the storage classes factored out, instead of just having a base ModuleDependencies class and two subclasses? Is there not always a one-to-one relationship to the storage? I'm missing something here...

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't for correctness, but overall it looks pretty good. I threw in a few suggestions that could incrementally improve the clarity, IMO. The most important of those would be the places where a "why" comment could help.

const std::string compiledModulePath;

/// The set of modules on which this module depends.
std::vector<std::string> moduleDependencies;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, now I see that a ModuleDependencies of some sort links a module that uses things to modules that define things. Hmmm.... it's the reverse of what I think I'd need to do dependency propagation, which goes from def to use. This class goes from use to def. A "why" comment explaining why this choice of direction might be nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It describes "what has to be built before me". I'll clarify this in the overall file/class comment.

StringRef moduleName, bool isUnderlyingClangModule,
ModuleDependenciesCache &cache) {
for (auto &loader : getImpl().ModuleLoaders) {
if (isUnderlyingClangModule &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "why" comment could be helpful here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code stinks; it'll be removed when we refactor more.

}
}

bool ModuleDependenciesCache::hasDependencies(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would hasDependenciesOnOtherModules be clearer or just redundant? I'm not sure...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant, IMO.

}
}

llvm::ErrorOr<StringRef> ClangModuleDependenciesCacheImpl::getImportHackFile() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImportHackFile?! I hadn't seen that term before. Is it one of our standard terms? Where one go to find it? Maybe it's in a comment on the function declaration & I missed it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang will provide us with an API that makes this unnecessary. It's a HACK.

return cache.findDependencies(moduleName, ModuleDependenciesKind::Clang);
}

bool ClangImporter::addBridgingHeaderDependencies(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know me; I would prefer to break up this function into more understandable (for me) pieces.

ClangImporterOptions importerOpts;

// Retrieve the bridging header.
std::string bridgingHeader = *targetModule.getBridgingHeader();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?


// Determine the command-line arguments for dependency scanning.
auto &ctx = Impl.SwiftContext;
std::vector<std::string> commandLineArgs =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if all these consts are getting tedious. As you know, the idea is that if there are more than a few lines after the assignment, if something is const the first assignment is enough to know where it's coming from without having to inspect all uses. (Esp. in C++ where one can pass a reference, sigh.) Forgive me for stating the obvious.

shouldImplicityImportSwiftOnoneSupportModule(CompilerInvocation &Invocation) {
if (Invocation.getImplicitModuleImportKind() !=
SourceFile::ImplicitModuleImportKind::Stdlib)
bool CompilerInvocation::shouldImportSwiftONoneSupport() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment on lines +1259 to +1257
bool EvaluateConditionals =
Action == FrontendOptions::ActionType::EmitImportedModules
|| Action == FrontendOptions::ActionType::ScanDependencies;
Instance.performParseOnly(EvaluateConditionals,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow in here and through 1271 is confusing. The predicate here is FrontendOptions::shouldActionOnlyParse(Action) and I would expect a return right after line 1262.
But instead, it falls through to 1271: if (Action == FrontendOptions::ActionType::Parse) and it that is true it returns. Which leaves me pondering: what's the difference between the two predicates? Is one a subset of the other? There must be a clearer way to write this. Maybe also do a return right here?
OK, I see it, now. Will leave the above confused ramblings in FWIW.

The whole business starting at 1255 could use a comment to the effect of do the the right about of parsing, etc. for the mode, or some such. (I would factor it out into a function with that name.)

Then, a blank line after 1258 would make it clear that the comment above it only applies to that one line.
(Not your fault!)

Then, a "why" comment might be nice about why conditionals are only being evaluated in these two cases.

I guess what's going on is that this function could be split in a "parse" phase, through 1272 and an after-parse phase, and that demarcation would clarify what's going on. (Not your fault, but this change just makes it incrementally worse, like an amoeba that's overdue for fission.)

Implement a new "fast" dependency scanning option,
`-scan-dependencies`, in the Swift frontend that determines all
of the source file and module dependencies for a given set of
Swift sources. It covers four forms of modules:

1) Swift (serialized) module files, by reading the module header
2) Swift interface files, by parsing the source code to find imports
3) Swift source modules, by parsing the source code to find imports
4) Clang modules, using Clang's fast dependency scanning tool

A single `-scan-dependencies` operation maps out the full
dependency graph for the given Swift source files, including all
of the Swift and Clang modules that may need to be built, such
that all of the work can be scheduled up front by the Swift
driver or any other build system that understands this
option. The dependency graph is emitted as JSON, which can be
consumed by these other tools.
If requested, produce the normal make-style dependency file from the
fast dependency scanner, using the same dependency-tracking
infrastructure.
When there is a bridging header associated with the module, scan and record
its dependencies. Note them in a separate structure to capture the specific
dependencies of the bridging header.
@DougGregor DougGregor force-pushed the fast-dependency-scanning branch from 9febd18 to 5351d54 Compare April 24, 2020 19:59
@DougGregor
Copy link
Member Author

Rebasing to fix merge conflicts, adding some comments, tweaking tests... and let's get this landed.

@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor DougGregor changed the title [WIP] Dependency scanning for Swift Dependency scanning for Swift Apr 24, 2020
@DougGregor DougGregor changed the title Dependency scanning for Swift Fast dependency scanning for Swift Apr 24, 2020
@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor
Copy link
Member Author

@swift-ci please test platform Windows

1 similar comment
@DougGregor
Copy link
Member Author

@swift-ci please test platform Windows

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 5351d54

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 5351d54

@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor DougGregor merged commit 172c047 into swiftlang:master Apr 25, 2020
@DougGregor DougGregor deleted the fast-dependency-scanning branch April 25, 2020 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants