[CIR][CUDA] Register __global__ functions #1441
voidPtrTy = PointerType::get(voidTy);
voidPtrPtrTy = PointerType::get(voidPtrTy);
intTy = typeSizeInfo.getIntType(&getContext());
charTy = typeSizeInfo.getCharType(&getContext());
Suggestions:
- Make `CIRDataLayout` have a `typeSizeInfo` data member caching it.
- Remove all the variables and (a) just instantiate the types whenever these variables are being used (there aren't many of them) or (b) ask `typeSizeInfo` from `CIRDataLayout`.
Some lowering test cases don't have a `TypeSizeInfoAttr` in them, so I added an if-statement to guard against this situation.
Were those test cases written directly in CIR? If so, you should just add the attribute to them.
There are 42 of them, so that doesn't seem quite feasible to me.
Shall we leave it as it is for now, and tidy these test cases up later?
All of them fail? You need to update only the ones that fail. Unfortunately we don't want to build up even more technical debt!
Yeah, all 42 fail, as they just say `module {` without any attributes. I'll add the attribute to them then.
Edit: Now it's done, and 69 test files have been changed to add the attribute.
734e244 to de26454
One more round of review!
@@ -112,6 +113,8 @@ class StructLayoutMap {

CIRDataLayout::CIRDataLayout(mlir::ModuleOp modOp) : layout{modOp} {
  reset(modOp.getDataLayoutSpec());
  typeSizeInfo = mlir::cast<TypeSizeInfoAttr>(
      modOp->getAttr(cir::CIRDialect::getTypeSizeInfoAttrName()));
It's best we don't force these attributes to always be present in every test; that's too taxing. I believe we can look at this problem another way. Sorry that I didn't catch this the first time and you had to change the tests. Instead, if no `typeSizeInfo` is available, create one here; either hardcode it or query the `layout` for good proxies. Suggestions: for `size_t` use the pointer type size, and the other ones can just be hardcoded to 8 and 32.
#cir.type_size_info<
char = 8,
int = 32, size_t = 64>}
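
To make the suggested fallback concrete, here is a minimal, self-contained sketch in plain C++ (no MLIR dependency). Everything in it is a hypothetical stand-in rather than the actual ClangIR API: `TypeSizeInfo` models the three widths carried by `#cir.type_size_info`, `ModuleInfo` models a module whose attribute may be missing, and `pointerBits` stands for whatever the data layout would report as the pointer width.

```cpp
#include <optional>

// Hypothetical stand-in for the widths stored in #cir.type_size_info.
struct TypeSizeInfo {
  unsigned charBits;
  unsigned intBits;
  unsigned sizeTBits;
};

// Hypothetical stand-in for a module: the attribute is optional, and
// pointerBits models the pointer width a data layout query would return.
struct ModuleInfo {
  std::optional<TypeSizeInfo> typeSizeInfo;
  unsigned pointerBits;
};

// Use the attribute when it is present; otherwise synthesize one:
// char and int are hardcoded to 8 and 32, size_t follows the pointer width.
TypeSizeInfo getOrCreateTypeSizeInfo(const ModuleInfo &mod) {
  if (mod.typeSizeInfo)
    return *mod.typeSizeInfo;
  return TypeSizeInfo{8, 32, mod.pointerBits};
}
```

With this shape, a test module that only says `module {` would still get usable widths, while a module that carries the attribute keeps its own values.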
This is part 2 of CUDA lowering. Still more to come!

This PR generates `__cuda_register_globals` for functions only, without touching variables.

It also fixes two discrepancies mentioned in Part 1, namely:
- Now CIR will not generate registration code if there's nothing to register;
- `__cuda_fatbin_wrapper` now becomes a constant.
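
The control flow described here (register each `__global__` function, and emit nothing when there is nothing to register) can be modeled with a small plain C++ sketch. It is only an illustration of the behavior, not the lowering pass itself: `KernelInfo`, `RegistrationLog`, `registerFunction`, and `emitRegisterGlobals` are all hypothetical names, and `registerFunction` merely records a name where the real code would emit a call into the CUDA runtime.

```cpp
#include <string>
#include <vector>

// Hypothetical record describing one __global__ function.
struct KernelInfo {
  std::string hostStubName; // host-side stub symbol
  std::string deviceName;   // device-side kernel name
};

// Hypothetical stand-in for the effect of registration: it just
// remembers which device names were registered.
struct RegistrationLog {
  std::vector<std::string> registered;
};

// Models one per-kernel registration call.
void registerFunction(RegistrationLog &log, const KernelInfo &kernel) {
  log.registered.push_back(kernel.deviceName);
}

// Models building the register-globals function. Returns false and
// does nothing when there are no kernels, so no registration code
// would be generated in that case.
bool emitRegisterGlobals(const std::vector<KernelInfo> &kernels,
                         RegistrationLog &log) {
  if (kernels.empty())
    return false;
  for (const KernelInfo &kernel : kernels)
    registerFunction(log, kernel);
  return true;
}
```

Variables are deliberately absent from the sketch, matching the scope of this PR.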