rdspring1
Collaborator

  • Create dynamic shared memory pointers and allocate at beginning of kernel
  • Calculate total dynamic shared memory size and pass to launch configuration
  • Track dynamic shared memory nodes in GPU Lower
  • Check if dynamic shared memory size is acceptable for current GPU
  • Test multiple dynamic shared memory buffers with different data types
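The size and limit steps in the list above can be illustrated with a minimal host-side sketch. This is not the code from this PR: `SmemBuffer`, `SmemLayout`, `computeLayout`, and `fitsOnDevice` are hypothetical names, and aligning each buffer to its element size is an assumption about what "align by data type" means. The per-block limit is passed in as a plain number here; in a real launch it would presumably come from the device properties.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one dynamic shared memory buffer.
struct SmemBuffer {
  std::size_t element_size;  // sizeof the buffer's data type, in bytes
  std::size_t num_elements;  // number of elements requested
};

// Hypothetical result of laying out all buffers in one allocation.
struct SmemLayout {
  std::vector<std::size_t> offsets;  // byte offset of each buffer
  std::size_t total_bytes = 0;       // size to pass in the launch configuration
};

// Round `offset` up to the next multiple of `alignment`.
inline std::size_t alignUp(std::size_t offset, std::size_t alignment) {
  assert(alignment > 0);
  return (offset + alignment - 1) / alignment * alignment;
}

// Place buffers back to back, aligning each one to its element size
// (assumption: element-size alignment is sufficient for every data type).
inline SmemLayout computeLayout(const std::vector<SmemBuffer>& buffers) {
  SmemLayout layout;
  std::size_t cursor = 0;
  for (const auto& b : buffers) {
    cursor = alignUp(cursor, b.element_size);
    layout.offsets.push_back(cursor);
    cursor += b.element_size * b.num_elements;
  }
  layout.total_bytes = cursor;
  return layout;
}

// True when the total request fits within the given per-block limit.
inline bool fitsOnDevice(const SmemLayout& layout, std::size_t limit_bytes) {
  return layout.total_bytes <= limit_bytes;
}
```

With three buffers of 4-byte, 8-byte, and 2-byte elements, the second buffer starts at a padded offset rather than directly after the first, which is why the total can exceed the plain sum of the buffer sizes.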

@rdspring1 rdspring1 force-pushed the rds_smem_dynamic_20_8_18 branch 3 times, most recently from d2089d9 to 867322c Compare August 19, 2020 01:20
Owner

@csarofeen csarofeen left a comment

Overall I think this PR is really good. I just have some minor comments and some questions for @tlemo.

@rdspring1 rdspring1 force-pushed the rds_smem_dynamic_20_8_18 branch from 89ba159 to ed56014 Compare August 19, 2020 18:47
@rdspring1
Collaborator Author

I incorporated the changes from #302 into this PR. It resolves the conflicts between the two PRs. @FDecaYed

@rdspring1 rdspring1 force-pushed the rds_smem_dynamic_20_8_18 branch 3 times, most recently from c00b004 to 9e00143 Compare August 21, 2020 20:39
@rdspring1 rdspring1 force-pushed the rds_smem_dynamic_20_8_18 branch 4 times, most recently from f34596f to 3e00fcf Compare August 24, 2020 20:00
Check if shared memory usage is within limits for current GPU

Gather buffers in Single Pass

Use single dynamic smem for reduction/broadcast workspace

Align dynamic shared memory by data type
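
As a rough illustration of the last two commit titles, the sketch below carves typed pointers out of one raw byte buffer, aligning each pointer for its data type. It runs on the host with an ordinary array standing in for the single dynamic shared memory allocation; `carve` is a hypothetical helper, not a function from this PR, and using `alignof(T)` as the alignment is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: hand out a typed pointer from a shared byte pool.
// `base` is assumed to be aligned for every type carved from it.
template <typename T>
T* carve(unsigned char* base, std::size_t& cursor, std::size_t count) {
  const std::size_t alignment = alignof(T);
  // Bump the cursor to the next multiple of the type's alignment.
  cursor = (cursor + alignment - 1) / alignment * alignment;
  T* ptr = reinterpret_cast<T*>(base + cursor);
  // Reserve space for `count` elements of T.
  cursor += sizeof(T) * count;
  return ptr;
}
```

After all buffers are carved, the final cursor value is the number of bytes the single allocation has to provide.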
@rdspring1 rdspring1 force-pushed the rds_smem_dynamic_20_8_18 branch from 3e00fcf to 9fca930 Compare August 24, 2020 20:13
@rdspring1 rdspring1 merged commit 3136899 into 20_8_18_devel Aug 24, 2020
@csarofeen csarofeen deleted the rds_smem_dynamic_20_8_18 branch June 9, 2021 13:46
4 participants