Support for Cuda Separate Compilation #508

Conversation

YoshikawaMasashi

Related to #505.

This PR changes the code to support CUDA separate compilation.
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#using-separate-compilation-in-cuda

Before this change, cc-rs runs these commands:

$ nvcc {some option} -o {out_dir}/a.o -c a.cu
$ nvcc {some option} -o {out_dir}/b.o -c b.cu
$ ar crs {out_dir}/libkernel.o.a {out_dir}/a.o {out_dir}/b.o

But these commands fail with an unresolved extern function error.
To perform separate compilation, we need to compile with the -dc option and link the object files with the -dlink option.

After this change, cc-rs runs these commands:

$ nvcc {some option} -dc -o {out_dir}/a.o -c a.cu
$ nvcc {some option} -dc -o {out_dir}/b.o -c b.cu
$ nvcc {some option} -dlink {out_dir}/a.o {out_dir}/b.o -o {out_dir}/link.o
$ ar crs {out_dir}/libkernel.o.a {out_dir}/a.o {out_dir}/b.o {out_dir}/link.o
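
In other words, the build gains a device-link step in the middle. The following is only a rough sketch of that flow in Rust, not the actual src/lib.rs change (the function name, nvcc options, and error handling here are simplified assumptions):

use std::path::{Path, PathBuf};
use std::process::Command;

// Compile each .cu file with -dc (relocatable device code), device-link the
// resulting objects with -dlink, then archive everything together.
fn build_cuda(files: &[&str], out_dir: &Path) -> std::io::Result<()> {
    let mut objects: Vec<PathBuf> = Vec::new();
    for file in files {
        let obj = out_dir.join(Path::new(file).with_extension("o"));
        let status = Command::new("nvcc")
            .args(["-dc", "-o"])
            .arg(&obj)
            .arg("-c")
            .arg(file)
            .status()?;
        assert!(status.success(), "nvcc -dc failed for {}", file);
        objects.push(obj);
    }

    // Device-link step: resolves cross-file __device__ calls.
    let link_obj = out_dir.join("__cc_internal_link.o");
    let status = Command::new("nvcc")
        .arg("-dlink")
        .args(&objects)
        .arg("-o")
        .arg(&link_obj)
        .status()?;
    assert!(status.success(), "nvcc -dlink failed");

    // Archive the per-file objects plus the device-link object.
    let status = Command::new("ar")
        .arg("crs")
        .arg(out_dir.join("libkernel.o.a"))
        .args(&objects)
        .arg(&link_obj)
        .status()?;
    assert!(status.success(), "ar failed");
    Ok(())
}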

code

build.rs (note that println!("cargo:rustc-link-lib=cudadevrt"); is needed for separate compilation)

extern crate cc;

fn main() {
    cc::Build::new()
        .cuda(true)
        .file("a.cu")
        .file("b.cu")
        .compile("kernel.o");

    println!("cargo:rustc-link-search=native=/usr/local/cuda/lib64");
    println!("cargo:rustc-link-lib=cudart");
    println!("cargo:rustc-link-lib=cudadevrt");
}

a.cu

#include "b.h"

__device__ double a_func(double x) {
    return b_func(x) + 1;
}


extern "C" {
    __global__ void
    batch_a_func(
        const double* x, double* y
    ) {
        int i = threadIdx.x;

        y[i] = a_func(x[i]);
    }
}

b.cu

#include "b.h"

__device__ double b_func(double x) {
    return x + 1;
}

b.h

__device__ double b_func(double x);
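
For reference, here is a hedged sketch (not part of this PR) of how the compiled kernel could be driven from Rust through the CUDA runtime API. The extern declarations, Dim3 struct, and memcpy constants below are hand-written assumptions rather than generated bindings, error checking is omitted, and it relies on nvcc emitting a host-side stub with the kernel's name whose address can be passed to cudaLaunchKernel; the libraries it needs are the ones already requested in the build.rs above.

main.rs (illustrative only)

use std::ffi::c_void;
use std::os::raw::c_int;

// Mirrors CUDA's dim3 (three unsigned ints), passed by value.
#[repr(C)]
#[derive(Clone, Copy)]
struct Dim3 {
    x: u32,
    y: u32,
    z: u32,
}

extern "C" {
    // Host-side stub emitted by nvcc for the extern "C" __global__ kernel.
    fn batch_a_func(x: *const f64, y: *mut f64);

    // CUDA runtime API from libcudart (signatures written by hand here).
    fn cudaMalloc(ptr: *mut *mut c_void, size: usize) -> c_int;
    fn cudaMemcpy(dst: *mut c_void, src: *const c_void, size: usize, kind: c_int) -> c_int;
    fn cudaLaunchKernel(
        func: *const c_void,
        grid: Dim3,
        block: Dim3,
        args: *mut *mut c_void,
        shared_mem: usize,
        stream: *mut c_void,
    ) -> c_int;
    fn cudaDeviceSynchronize() -> c_int;
    fn cudaFree(ptr: *mut c_void) -> c_int;
}

const CUDA_MEMCPY_HOST_TO_DEVICE: c_int = 1;
const CUDA_MEMCPY_DEVICE_TO_HOST: c_int = 2;

fn main() {
    let host_x = vec![0.0f64, 1.0, 2.0, 3.0];
    let mut host_y = vec![0.0f64; host_x.len()];
    let bytes = host_x.len() * std::mem::size_of::<f64>();

    unsafe {
        // Allocate device buffers and copy the input over.
        let mut dev_x: *mut c_void = std::ptr::null_mut();
        let mut dev_y: *mut c_void = std::ptr::null_mut();
        cudaMalloc(&mut dev_x, bytes);
        cudaMalloc(&mut dev_y, bytes);
        cudaMemcpy(dev_x, host_x.as_ptr() as *const c_void, bytes, CUDA_MEMCPY_HOST_TO_DEVICE);

        // cudaLaunchKernel takes an array of pointers to the argument values.
        let mut arg_x = dev_x as *const f64;
        let mut arg_y = dev_y as *mut f64;
        let mut args = [
            &mut arg_x as *mut _ as *mut c_void,
            &mut arg_y as *mut _ as *mut c_void,
        ];

        // One block with one thread per element, matching the kernel's indexing.
        let kernel: unsafe extern "C" fn(*const f64, *mut f64) = batch_a_func;
        cudaLaunchKernel(
            kernel as usize as *const c_void,
            Dim3 { x: 1, y: 1, z: 1 },
            Dim3 { x: host_x.len() as u32, y: 1, z: 1 },
            args.as_mut_ptr(),
            0,
            std::ptr::null_mut(),
        );
        cudaDeviceSynchronize();

        cudaMemcpy(host_y.as_mut_ptr() as *mut c_void, dev_y, bytes, CUDA_MEMCPY_DEVICE_TO_HOST);
        cudaFree(dev_x);
        cudaFree(dev_y);
    }

    // a_func(x) = b_func(x) + 1 = x + 2, so this should print [2.0, 3.0, 4.0, 5.0].
    println!("{:?}", host_y);
}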

@alexcrichton
Member

Thanks for this!

This seems pretty reasonable to me, but TBH I know next-to-nothing about CUDA. Would it be possible to add a new builder on CI which tests this since the support is becoming somewhat nontrivial?

src/lib.rs Outdated
}
cmd.arg("-o");
let out_dir = self.get_out_dir()?;
let out_dir = out_dir.join("link.o");
Member

Could "link.o" shared between here and below be refactored to a common definition? Additionally could it be named something like __cc_internal_link.o to try to avoid name clashes with normal objects?

Author

Thanks, I renamed it to __cc_internal_link.o.
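
The suggested refactor might look something like this (an illustrative sketch with hypothetical names, not the actual diff in this PR):

// Hypothetical names for illustration; the real change may differ.
const CUDA_DLINK_OBJECT: &str = "__cc_internal_link.o";

fn cuda_dlink_path(out_dir: &std::path::Path) -> std::path::PathBuf {
    out_dir.join(CUDA_DLINK_OBJECT)
}

Both the -dlink invocation and the final archive step would then refer to the same definition instead of repeating the literal name.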

@YoshikawaMasashi
Author

To test CUDA compilation, we need nvcc, the CUDA compiler. And to test execution of the compiled code, we need an environment with an NVIDIA GPU.

nvcc is included in the CUDA Toolkit, the SDK for CUDA, and we can use it via the nvidia/cuda Docker image.
This SDK may be added to GitHub Actions' virtual environments in the future. (PR)

However, an NVIDIA GPU is not available in GitHub Actions' virtual environment right now. We can only use a 2-core CPU, 7 GB of RAM, and 14 GB of SSD disk space, so it is difficult to test execution of CUDA code...

To run CUDA execution tests in CI, other OSS projects use third-party CI services such as CircleCI, but there is a cost to using GPUs on CircleCI...

@alexcrichton
Member

Oh I'd be fine not actually executing the CUDA code, it should be ok to just get everything to a linked state I think? Would that be possible to do on CI today? (just exercising this library, which builds, rather than end-to-end testing)

@YoshikawaMasashi
Author

We can also build a CI job that has nvcc (the CUDA compiler) using GitHub Actions.
For example, this YAML performs cargo test in an environment that has nvcc:

name: CI
on: [push, pull_request]

jobs:
  cuda_test:
    name: CUDA Test
    runs-on: ubuntu-latest
    container: nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
    steps:
    - uses: actions/checkout@master
    - run: nvcc --version # cuda install check
    - run: apt-get update && apt-get install -y curl
    - name: Install rustup
      run: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
    - name: Install Rust (rustup)
      run: source $HOME/.cargo/env && rustup update stable --no-self-update && rustup default stable
      shell: bash
    - name: rustup target
      run: source $HOME/.cargo/env && rustup target add x86_64-unknown-linux-gnu
      shell: bash
    - name: Install g++-multilib
      run: |
        set -e
        # Remove the ubuntu-toolchain-r/test PPA, which is added by default.
        # Some packages were removed, and this is causing the g++multilib
        # install to fail. Similar issue:
        # https://github.com/scikit-learn/scikit-learn/issues/13928.
        sudo add-apt-repository --remove ppa:ubuntu-toolchain-r/test
        sudo apt-get install g++-multilib
      if: matrix.build == 'linux32'
    - name: cargo build
      run: source $HOME/.cargo/env && cargo build
      shell: bash
    - name: cargo test 
      run: source $HOME/.cargo/env && cargo test
      shell: bash 

Of course, we have to write test code to perform the CUDA compile test. I am working on that, and I need some time to understand the current cc-rs test code and write the CUDA test code.
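
For example, a CUDA compile test might look roughly like the sketch below. This is only an illustration, not the actual cc-rs test harness; it assumes nvcc is on PATH, writes throwaway sources to a temp directory, and drives cc::Build directly instead of going through a cargo build script:

// A rough sketch only; error handling and cleanup are minimal.
#[test]
fn cuda_separate_compilation_builds() {
    let out = std::env::temp_dir().join("cc_cuda_dc_test");
    std::fs::create_dir_all(&out).unwrap();

    // Two translation units that only link with relocatable device code (-dc/-dlink).
    std::fs::write(
        out.join("b.cu"),
        "__device__ double b_func(double x) { return x + 1; }\n",
    )
    .unwrap();
    std::fs::write(
        out.join("a.cu"),
        "__device__ double b_func(double x);\n\
         extern \"C\" __global__ void batch_a_func(const double* x, double* y) {\n\
         y[threadIdx.x] = b_func(x[threadIdx.x]) + 1;\n\
         }\n",
    )
    .unwrap();

    cc::Build::new()
        .cuda(true)
        .file(out.join("a.cu"))
        .file(out.join("b.cu"))
        .out_dir(&out)
        .target("x86_64-unknown-linux-gnu")
        .host("x86_64-unknown-linux-gnu")
        .opt_level(0)
        .debug(false)
        .compile("kernel");

    // If separate compilation worked, the archive should exist in the out dir.
    assert!(out.join("libkernel.a").exists());
}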

@alexcrichton
Member

Yeah that looks perfect! I'd be happy to have that on CI here

dot-asm mentioned this pull request on Jul 25, 2021
@YoshikawaMasashi
Author

This PR is no longer needed (#612). Thank you :)
