fix x86 crash #5081
Codecov Report
@@ Coverage Diff @@
## master #5081 +/- ##
=======================================
Coverage 75.66% 75.66%
=======================================
Files 993 993
Lines 178157 178157
Branches 19125 19125
=======================================
+ Hits 134800 134805 +5
+ Misses 38136 38134 -2
+ Partials 5221 5218 -3
Clean fix!
@@ -65,9 +65,9 @@ IPredictor IModelCombiner.CombineModels(IEnumerable<IPredictor> models)
foreach (var t in tree.TrainedEnsemble.Trees)
{
var bytes = new byte[t.SizeInBytes()];
int position = -1;
Interesting. Can you please explain this issue a bit more? Why does this happen on x86 but not on x64? This is managed memory. Why is it corrupting the unmanaged heap?
Yes, this is an index-out-of-range bug that corrupts memory, but here the index is out of range at the front of the array: the code starts using the byte array from index -1.
The corrupted byte array is allocated in managed memory but accessed as unmanaged memory (through a pointer inside a C# fixed block), as shown here:
https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.FastTree/TreeEnsemble/InternalRegressionTree.cs#L101
https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.FastTree/Utils/ToByteArrayExtensions.cs#L113
When position is -1, the pointer ((int*)(pBuffer + position)) addresses memory before the start of the array, which it should not touch.
I'm still not sure why this issue does not reproduce on x64. In theory it can corrupt memory on x64 as well.
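To make the failure mode concrete, here is a minimal sketch of a bounds-checked counterpart to the pointer write described above. It is not the code in ToByteArrayExtensions.cs: `SafeByteWriter`, `WriteInt32`, and `TryWriteAt` are hypothetical names chosen for illustration. The point is only that a checked write rejects a start position of -1, whereas a raw pointer write inside a fixed block has no such check and silently writes one byte before the array data.

```csharp
using System;
using System.Buffers.Binary;

public static class SafeByteWriter
{
    // Hypothetical bounds-checked write of an int into a buffer.
    // AsSpan(position) validates the start index, so a negative position
    // throws ArgumentOutOfRangeException instead of writing before the array.
    public static void WriteInt32(int value, byte[] buffer, ref int position)
    {
        BinaryPrimitives.WriteInt32LittleEndian(buffer.AsSpan(position), value);
        position += sizeof(int);
    }

    // Returns true if writing one int into an 8-byte buffer,
    // starting at startPosition, stays in range.
    public static bool TryWriteAt(int startPosition)
    {
        var buffer = new byte[8];
        int position = startPosition;
        try
        {
            WriteInt32(42, buffer, ref position);
            return true;
        }
        catch (ArgumentException)
        {
            // ArgumentOutOfRangeException derives from ArgumentException.
            return false;
        }
    }
}
```

With this sketch, `SafeByteWriter.TryWriteAt(0)` succeeds and `SafeByteWriter.TryWriteAt(-1)` reports failure, which is how the bug would have surfaced had the write been bounds-checked.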
SZArrays on x64 have a 4-byte padding between the length and the start of the data.
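A rough way to picture why the padding matters is a toy model of the array object layout. This is a sketch under stated assumptions, not a measurement of the runtime: it assumes a 4-byte length field sits before the data, with 4 bytes of padding in between on x64 (as stated above) and no padding on x86. `ArrayLayoutModel` and `FieldAtDataOffset` are hypothetical names.

```csharp
using System;

public static class ArrayLayoutModel
{
    // Assumed layout of a single-dimensional byte[] object:
    //   [method table pointer][4-byte length][padding][data...]
    // paddingBytes is 4 on x64 (per the comment above) and is
    // ASSUMED to be 0 on x86. offset is relative to the first data byte.
    public static string FieldAtDataOffset(int offset, int paddingBytes)
    {
        if (offset >= 0) return "data";
        if (-offset <= paddingBytes) return "padding";
        if (-offset <= paddingBytes + 4) return "length";
        return "method table pointer or beyond";
    }
}
```

Under these assumptions, `FieldAtDataOffset(-1, 4)` lands in the padding (harmless on x64), while `FieldAtDataOffset(-1, 0)` lands in the length field, which would explain why the stray byte only does visible damage on x86.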
I suspected this was the issue from reading above, but special thanks to @stephentoub for finding the actual code that proves it 👍
@eerhardt also confirmed this in email.
@sharwell, @stephentoub and @eerhardt Thanks for shedding more light on the issue.
Fixes #1216.
TreeEnsembleCombiner has a bug that causes an out-of-range write to a byte array and corrupts the heap.