VCSM Improvements - dmabuf support #1806

Closed
6by9 opened this issue Jan 19, 2017 · 6 comments
Comments

@6by9
Contributor

6by9 commented Jan 19, 2017

(This is more a placeholder for a job that needs doing than a bug)

VCSM ideally wants to be able to support importing and exporting dmabufs to make multimedia work sensibly with the upstream GL/DRM/KMS stuff.
For import:

  • the buffers must be contiguous in memory
  • VC side needs to be passed the physical address to wrap into a MEM_HANDLE.
  • VCSM to hold the reference to the buffer to stop it being released underneath VC.
  • need a callback mechanism from VC on releasing the last GPU side reference count on the buffer to release the dmabuf reference.
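
The import requirements above amount to a small ownership protocol. A minimal sketch of just the bookkeeping, in plain C: it makes no real VCSM, VCHI or dma-buf calls, every name in it is hypothetical, and treating "contiguous" as "exactly one scatterlist entry" is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an imported buffer; not the real driver structures. */
struct imported_buf {
    uint32_t phys_addr;       /* address VC would wrap into a MEM_HANDLE */
    size_t   size;
    bool     dmabuf_ref_held; /* stands in for VCSM's reference on the dmabuf */
    unsigned vc_refs;         /* stands in for the GPU-side reference count */
};

/* Import: reject non-contiguous buffers, then take the dmabuf reference
 * before VC is told about the memory. Returns 0 on success, -1 on error. */
static int model_import(struct imported_buf *b, uint32_t phys_addr,
                        size_t size, unsigned sg_entries)
{
    assert(b != NULL);
    if (sg_entries != 1 || size == 0)
        return -1;              /* assumption: one entry == contiguous */
    b->phys_addr = phys_addr;
    b->size = size;
    b->dmabuf_ref_held = true;
    b->vc_refs = 1;             /* VC now holds a handle on the buffer */
    return 0;
}

/* VC takes an additional reference on an already imported buffer. */
static void model_vc_acquire(struct imported_buf *b)
{
    assert(b->vc_refs > 0);
    b->vc_refs++;
}

/* VC drops a reference. When the last one goes, the "callback" fires and
 * the dmabuf reference is released. Returns true if that happened. */
static bool model_vc_release(struct imported_buf *b)
{
    assert(b->vc_refs > 0);
    if (--b->vc_refs > 0)
        return false;
    b->dmabuf_ref_held = false;
    return true;
}
```

The point of the model is the ordering: the dmabuf reference is taken before VC sees the address, and is only dropped from the path that VC triggers when its own count reaches zero.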

On exporting:

  • VCSM already has the physical address and size of the buffer, so wrapping that into a dmabuf shouldn't be that tricky.
  • reference counting needs to be considered, as the VCSM allocation must not be released on VC until the last dmabuf reference is released.
  • how does cache handling fit into dmabuf cleanly? VCSM never uses the caching aliases on VC, so it should only be ARM caches that need to be considered.
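
On the cache question, the generic dma-buf userspace interface offers `DMA_BUF_IOCTL_SYNC` (from `<linux/dma-buf.h>`) for bracketing CPU access to an mmap'ed buffer, which the kernel routes to the exporter's `begin_cpu_access` / `end_cpu_access` hooks. The sketch below shows the client side only; whether a VCSM exporter would implement those hooks this way is an assumption, and the helper name `dmabuf_cpu_sync` is made up for this example.

```c
#include <errno.h>
#include <linux/dma-buf.h>
#include <sys/ioctl.h>

/* Bracket CPU access to a dmabuf mapping.
 * start != 0 marks the beginning of the access, start == 0 the end.
 * write != 0 requests read/write access, otherwise read-only.
 * Returns 0 on success, -1 with errno set on failure. */
static int dmabuf_cpu_sync(int dmabuf_fd, int start, int write)
{
    struct dma_buf_sync sync = { 0 };
    int ret;

    sync.flags = (start ? DMA_BUF_SYNC_START : DMA_BUF_SYNC_END) |
                 (write ? DMA_BUF_SYNC_RW : DMA_BUF_SYNC_READ);

    /* Retry if a signal interrupts the ioctl. */
    do {
        ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
    } while (ret == -1 && errno == EINTR);

    return ret;
}
```

A client would call `dmabuf_cpu_sync(fd, 1, 1)` before touching the mapping and `dmabuf_cpu_sync(fd, 0, 1)` afterwards, leaving the exporter to decide which ARM cache operations are needed.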
6by9 changed the title from "VCSM Improvements" to "VCSM Improvements - dmabuf support" on Jan 19, 2017
@6by9
Contributor Author

6by9 commented Sep 13, 2017

Importing of dmabufs done.

https://github.com/6by9/yavta is a simplish demo app that is plumbing V4L2 (driver MUST use the videobuf2-dma-contig allocator) into MMAL.
https://github.com/6by9/drm-v4l2-test is similar in that it tries to connect to the brave new GL world, with DRM allocating a CMA buffer, and that being imported and used with a MMAL source component.

@usedbytes

Are there any plans for dma-buf export?

Or alternatively, I'd be happy to hear any suggestions for alternative ways to manage my use case: Sharing a GLES FBO cross-process.

I'm currently using EGL_IMAGE_BRCM_VCSM to create an FBO, render to it, then map on the CPU. I'd like to share that FBO to another process.

I don't think the GLES driver supports EGL_EXT_image_dma_buf_import (correct me if I'm wrong), and EGL_IMAGE_BRCM_VCSM seems to only support creating VCSM buffers, not importing them (so I can't import a dma-buf to VCSM then give that to GLES).

Can VCSM handles be shared directly cross-process? I had a try, but it doesn't look like handles can be used in a different process.

@6by9
Contributor Author

6by9 commented Sep 18, 2018

One of the many things in progress is a rewrite of vcsm that uses CMA instead of gpu_mem. All handles then become dmabuf fds, so can be shared around in the normal manner. It's using the same hooks to import into vcsm at the moment.
It's 90% there, but other priorities have pushed it to the back burner.
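
Sharing an fd "in the normal manner" between unrelated processes usually means passing it over a Unix domain socket with `SCM_RIGHTS`. A minimal sketch using only standard POSIX socket calls; the helper names `send_fd` and `recv_fd` are made up for this example, and nothing in it is specific to VCSM or dmabuf.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send `fd` over the connected Unix domain socket `sock`.
 * Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd)
{
    char byte = 0;                      /* at least one data byte must be sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from `sock`. Returns the new fd, or -1 on failure. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets its own descriptor number referring to the same underlying buffer, and the kernel keeps the buffer alive until every holder has closed its fd.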

The GLES driver isn't being developed further. I believe you are correct that it doesn't support EGL_EXT_image_dma_buf_import.

Current VCSM handles are linked to the allocating PID.
There is vcsm_malloc_share. I haven't worked through the full details of the driver (https://github.com/raspberrypi/linux/blob/rpi-4.14.y/drivers/char/broadcom/vc_sm/vmcs_sm.c#L1688), so can't say what the correct procedure is for importing a shared handle into the new process context.

@usedbytes

Thanks for the super speedy reply, and thanks for the pointer to the driver code. I did call vcsm_malloc_share, but I had no idea what I was meant to do after that :-) I'll see what I can figure out from the driver.

The VCSM rewrite sounds good, though I guess I should probably try and switch my stack to V4L2 + VC4/Mesa before too long, as that seems to be the right "future" direction.

@6by9
Contributor Author

6by9 commented Sep 18, 2018

I suspect that a vcsm_malloc_share handle can just be used with the normal vcsm calls for mapping, locking, unlocking, etc, but don't know for definite.

Switching to vc4/mesa would be sensible if you can.
Switching to V4L2 may not be required depending on your use case, and it'll only cover the camera and codecs. At least with MMAL you can import dmabufs from other places for zero copy, and it has a larger range of processing steps than our current intent with V4L2.

@usedbytes

usedbytes commented Sep 18, 2018

I figured out vcsm_malloc_share - I was using it the wrong way round before. I thought calling it in the "exporting" process would give me a valid handle to use in the "importing" process, but it is the opposite. In the "importing" process, you call it with a handle from the "exporting" process and are given a local handle to use in the current process. More like a "vcsm_malloc_import" perhaps.

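Process 1 (the "exporting" process):

hnd = vcsm_malloc(size, "shared");  // allocate the buffer here
// hnd == 32768 (example value; pass it to process 2 by any IPC)

Process 2 (the "importing" process):

hnd = vcsm_malloc_share(32768);     // handle value received from process 1
// hnd == some local alias for 32768, valid only in this process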

My use case is camera -> GPU -> other process. Your drm-v4l2-test example gets me some of it (dmabuf import to MMAL), I'd just need to figure out how to use VC4 and make it texture-from and render-to dmabufs, which I suppose should be that EGL dma_buf_import extension.

Update: gist for VCSM sharing, in case anyone stumbles across this and wants to do the same: https://gist.github.com/usedbytes/0bf272864bfbc8b561c10e48d590116f

@6by9 6by9 closed this as completed Nov 6, 2023