Skip to content

Fix a rarely occurring crash when processing certain symbols for many platforms #1227

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

d-ronnqvist
Copy link
Contributor

Bug/issue #, if applicable: rdar://151854292

Summary

This fixes a very rare crash that sometimes happened when processing symbol graphs for many platforms for both Swift and Objective-C where one or more symbols were the source is an anonymous structure with members nested inside a C union:

union Outer {
    struct {
        uint32_t member;
    } inner;
};

Dependencies

None

Testing

  • Define a C union with one or more inner anonymous structures with one or more members. For example:

    #import <stdint.h>
    
    union Outer {
        struct {
            uint32_t member1;
            uint32_t member2;
            uint32_t member3;
        } inner1;
        struct {
            uint32_t member1;
            uint32_t member2;
            uint32_t member3;
        } inner2;
    };
  • Extract symbol graphs files for both Swift and C. Repeat for multiple platforms (for example, iOS, macOS, etc.)

  • Build documentation and pass all of the extracted Swift and C symbol graph files for all platforms.

    • DocC shouldn't crash with a "Symbol with identifier '...' has no reference" fatal error.
      Note: This failure depends on the non-deterministic order that files are read so you may need to repeat this (docc convert) step repeatedly. Even with optimal conditions, this crash doesn't reproduce most of the time.

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • [] Updated documentation if necessary Not necessary.

… when processing members of structs nested in unions for many platforms.

rdar://151854292
@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

// If both identifiers are in the same language, they are the same symbol.
$0.symbol!.identifier.interfaceLanguage == symbol.identifier.interfaceLanguage
// If both have the same path components and kind, their differences don't matter for link resolution purposes.
|| ($0.symbol!.pathComponents == symbol.pathComponents && $0.symbol!.kind.identifier == symbol.kind.identifier)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This || expression sometimes find the language representation of a symbol that's different from the language of the current symbol graph file. This in turn results in cloning a language specific symbol that's in the "wrong" language which removes the only language from that node. Later, when computing the paths for all symbols, the cloned node without any language is skipped, resulting in a symbol that never receives a reference.


let paths = tree.caseInsensitiveDisambiguatedPaths()
XCTAssertEqual(paths[outerContainerID], "/ModuleName/Outer")
XCTAssertEqual(paths[innerContainerID], "/ModuleName/Outer/__Unnamed_struct_inner") // FIXME: If the Swift name starts with "__" we should prefer the C/Objective-C name
Copy link
Contributor Author

@d-ronnqvist d-ronnqvist May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is existing, unchanged, behavior so I didn't attempt to fix it in this PR.

Copy link
Contributor

@patshaughnessy patshaughnessy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code change seems reasonable; however, I can't reproduce a failure without this fix, using either the new unit test or other more realistic data. I'd feel better about this change if we could write a unit test that reliably fails 100% of the time without the fix, and then passes with the fix.

The problem seems to be that the test symbol data in testMissingMemberOfAnonymousStructInsideUnion never triggers the failure scenario in the original code:

$0.symbol!.identifier.interfaceLanguage == symbol.identifier.interfaceLanguage
|| ($0.symbol!.pathComponents == symbol.pathComponents && $0.symbol!.kind.identifier == symbol.kind.identifier)

To see the failure (and to see the fix with this change), we need a unit test that:

  • includes a symbol first in this loop that has a different interface language,
  • but has matching path components and kind values.

Is there a way to craft the symbols in the unit test to hit this condition and cause the crash without the fix? That would increase our confidence that the fix is correct.

@QuietMisdreavus
Copy link
Contributor

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor

David's OOO until next week, so i've updated this PR with Pat's suggestion.

In my testing, it turns out the original test wasn't triggering the original issue, nor did it reflect real-world symbol graphs. When i build symbol graphs for the test code in the PR description, the Swift symbol graphs do not contain a symbol for the __Unnamed_struct_inner1 types, due to the prefixed underscores. Instead, it does include the property inner1, with a reference to the hidden type in its declaration. (And, as David indicated in his code comments, without the member1 grandchild symbols due to the hidden unnamed struct symbol.) I modified the test to reflect this situation, and after a few thousand repetitions in Xcode i hit the failure case. When i put David's change back in, i was not able to hit the failure case after 5000 repetitions.

As i was typing this up, the Linux CI came up with a failure in the same place i saw beforehand, so i think there's something else at play here...

@QuietMisdreavus
Copy link
Contributor

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor

I've pushed a change to rework the PR and fix the underlying issue that was still being triggered. Here's what i found out:

In the unit test scenario, if a Swift symbol graph is processed before a Clang symbol graph, and then the grandchild relationship in the Clang graph is processed before its parent's relationship, it creates a scenario where the Clang-only grandchild symbol is added to the then-combined middle node. Then, the middle node's memberOf relationship is processed, causing it to be cloned to become a child of the Clang-only union symbol. In the cloning code added in #1203, this causes the grandchild symbol to be re-added to the Clang clone of the middle symbol, tripping the "adding a node to multiple parents" assertion added in that PR.

To resolve this issue, i reworked the cloning logic to also recursively clone the node's children based on passed-in logic (specifically the symbol-graph check). I also added in the concept of "transplanting" a node from one parent to another, i.e. removing it as child of its current parent to add it to a different parent. This is done in the case in this test, when an existing child's languages no longer overlap with its parent. This creates a valid hierarchy which is able to successfully render without tripping any more assertions. This also changes the scope of this PR considerably. 😅

With this change in place, i was able to verify that running the included test several dozen thousand times continues to pass without hitting any assertions. I'm more confident in this version because it properly handles cloned nodes' children in a way that keeps the hierarchy valid.

@QuietMisdreavus QuietMisdreavus force-pushed the fix-missing-member-of-inner-anonymous-union-crash branch from f42f7ed to 42ab08a Compare May 30, 2025 23:13
@QuietMisdreavus
Copy link
Contributor

@swift-ci Please test

@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

@d-ronnqvist
Copy link
Contributor Author

In my testing, it turns out the original test wasn't triggering the original issue, nor did it reflect real-world symbol graphs. When i build symbol graphs for the test code in the PR description, the Swift symbol graphs do not contain a symbol for the __Unnamed_struct_inner1 types, due to the prefixed underscores. Instead, it does include the property inner1, with a reference to the hidden type in its declaration. (And, as David indicated in his code comments, without the member1 grandchild symbols due to the hidden unnamed struct symbol.)

Great catch. I think I misread the shrunk down inputs when I created the test case. In the shrunk down symbol graph file I see the var inner: Outer.__Unnamed_struct_inner { get set } symbol but not the structure definition, just like you updated the test to be.

To resolve this issue, i reworked the cloning logic to also recursively clone the node's children based on passed-in logic (specifically the symbol-graph check). I also added in the concept of "transplanting" a node from one parent to another, i.e. removing it as child of its current parent to add it to a different parent. This is done in the case in this test, when an existing child's languages no longer overlap with its parent. This creates a valid hierarchy which is able to successfully render without tripping any more assertions. This also changes the scope of this PR considerably. 😅

I reworked that logic once more to both be more correct by:

  • allowing the recursive copies to look up the right language representation of their symbols

and to be easier to reason about by:

  • doing all modifications (languages, counterparts, etc.) inside the deepCopy(...) function so that the reader doesn't need to jump back and forth between the caller, the callee, and the passed in closure.
  • inlining the shouldClone... logic and rewriting it to be based it entirely on languages so that the action of placing the node in the language specific subhierarchy is right next to the decision about where in that subhierarchy it belongs.
  • keeping DisambiguationContainer as an add-only type, so that we don't need to think about removals throughout the rest of the implementation

With this change in place, i was able to verify that running the included test several dozen thousand times continues to pass without hitting any assertions. I'm more confident in this version because it properly handles cloned nodes' children in a way that keeps the hierarchy valid.

I ran the latest version a couple of thousand times today using both the shrunk down inputs in the test and the full symbol graphs from the original project where this issue was found.

Copy link
Contributor

@QuietMisdreavus QuietMisdreavus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your rewrite looks good! I was also able to run the test several thousand times without failure, so i think we finally got it this time. 🤞

@d-ronnqvist d-ronnqvist merged commit fc79e78 into swiftlang:main Jun 3, 2025
2 checks passed
@d-ronnqvist d-ronnqvist deleted the fix-missing-member-of-inner-anonymous-union-crash branch June 3, 2025 08:19
d-ronnqvist added a commit to d-ronnqvist/swift-docc that referenced this pull request Jun 3, 2025
… platforms (swiftlang#1227)

* Fix a rare crash that sometimes (depending on file read order) happen when processing members of structs nested in unions for many platforms.

rdar://151854292

* update test to correctly trigger underlying issue

* recursively clone children and transplant those with mismatched languages

* Simplify and correct deep cloning of path hierarchy nodes

---------

Co-authored-by: Vera Mitchell <[email protected]>
d-ronnqvist added a commit that referenced this pull request Jun 3, 2025
… platforms (#1227) (#1229)

* Fix a rare crash that sometimes (depending on file read order) happen when processing members of structs nested in unions for many platforms.

rdar://151854292

* update test to correctly trigger underlying issue

* recursively clone children and transplant those with mismatched languages

* Simplify and correct deep cloning of path hierarchy nodes

---------

Co-authored-by: Vera Mitchell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants