Oops with vchiq_prepare_bulk_data (tc358743 + OMX JPEG encoder) #4669

Closed
mdevaev opened this issue Nov 5, 2021 · 73 comments

Comments

@mdevaev
Contributor

mdevaev commented Nov 5, 2021

Describe the bug
The latest 5.10 kernel crashes when using OMX JPEG encoding with a TC358743 at a 720p video source resolution.

To reproduce
Take TC358743 and ustreamer.

git clone https://github.com/pikvm/ustreamer
cd ustreamer
make WITH_OMX=1

Add cma=128M to cmdline.txt and gpu_mem=256 to config.txt and try this command:

ustreamer --encoder=omx --dv-timings --persistent -m uyvy

You will see the crash.

System

  • Raspberry Pi 4 B+ 4G
  • Arch Linux ARM / PiKVM OS
  • vcgencmd version:
    Nov  2 2021 13:22:15
    Copyright (c) 2012 Broadcom
    version e50fe24ee2b6974f3ba6615ba0f1d8f45c485f69 (clean) (release) (start)
    
  • Linux pikvm 5.10.76-4-raspberrypi4-ARCH #1 SMP Fri Nov 5 08:11:58 MSK 2021 armv7l GNU/Linux

Logs
OOPS 1

[  853.623484] ------------[ cut here ]------------
[  853.628118] kernel BUG at kernel/dma/mapping.c:208!
[  853.633006] Internal error: Oops - BUG: 0 [#1] SMP ARM
Entering kdb (current=0xc3823c00, pid 1113) on processor 0 Oops: (null)
due to oops @ 0xc02a4f2c
CPU: 0 PID: 1113 Comm: ILCS_HOST Tainted: G         C        5.10.76-4-raspberrypi4-ARCH #1
Hardware name: BCM2711
PC is at dma_unmap_sg_attrs+0x40/0x44
LR is at 0x0
pc : [<c02a4f2c>]    lr : [<00000000>]    psr: 20080013
sp : c48c9d64  ip : 00000080  fp : 00000000
r10: 000000fc  r9 : 00000000  r8 : 00000000
r7 : 00000000  r6 : 00000006  r5 : c1a530f4  r4 : f0e2d0c8
r3 : 10801080  r2 : 10801080  r1 : 10801080  r0 : c1e69a10
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 30c5383d  Table: 04139c00  DAC: fffffffd
CPU: 0 PID: 1113 Comm: ILCS_HOST Tainted: G         C        5.10.76-4-raspberrypi4-ARCH #1
Hardware name: BCM2711
[<c020f0d8>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14)
[<c020ae58>] (show_stack) from [<c0e08748>] (dump_stack+0xa4/0xc4)
[<c0e08748>] (dump_stack) from [<c0301eec>] (kdb_main_loop+0x4bc/0x9a0)
[<c0301eec>] (kdb_main_loop) from [<c0304fa0>] (kdb_stub+0x258/0x458)
[<c0304fa0>] (kdb_stub) from [<c02f9ff4>] (kgdb_cpu_enter+0x3e0/0x6fc)
[<c02f9ff4>] (kgdb_cpu_enter) from [<c02fa9b4>] (kgdb_handle_exception+0xe0/0x13c)
[<c02fa9b4>] (kgdb_handle_exception) from [<c020e794>] (kgdb_notify+0x28/0x48)
[<c020e794>] (kgdb_notify) from [<c02503bc>] (notify_die+0x90/0xd0)
[<c02503bc>] (notify_die) from [<c020af8c>] (die+0x130/0x34c)
[<c020af8c>] (die) from [<c0200b94>] (__und_svc_finish+0x0/0x2c)
Exception stack(0xc48c9cd0 to 0xc48c9d18)
9cc0:                                     c1e69a10 10801080 10801080 10801080
9ce0: f0e2d0c8 c1a530f4 00000006 00000000 00000000 00000000 000000fc 00000000
9d00: 00000080 c48c9d64 00000000 c02a4f2c 20080013 ffffffff
[<c0200b94>] (__und_svc_finish) from [<c02a4f2c>] (dma_unmap_sg_attrs+0x40/0x44)
[<c02a4f2c>] (dma_unmap_sg_attrs) from [<c0c6509c>] (cleanup_pagelistinfo+0x98/0xb4)
[<c0c6509c>] (cleanup_pagelistinfo) from [<c0c656c0>] (vchiq_prepare_bulk_data+0x190/0x6dc)
[<c0c656c0>] (vchiq_prepare_bulk_data) from [<c0c5f0e4>] (vchiq_bulk_transfer+0x258/0x3e4)
[<c0c5f0e4>] (vchiq_bulk_transfer) from [<c0c63530>] (vchiq_ioctl+0x724/0x17fc)
[<c0c63530>] (vchiq_ioctl) from [<c044b198>] (sys_ioctl+0x34c/0x904)
[<c044b198>] (sys_ioctl) from [<c0200040>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc48c9fa8 to 0xc48c9ff0)
9fa0:                   b6b4b538 afefab1c 00000009 c014c406 afefab1c 000b900e
9fc0: b6b4b538 afefab1c ac4ba020 00000036 00000000 3f155c40 ae504ce0 00000000
9fe0: b6b4b240 afefab0c b6b38f44 b6c609fc
[0]kdb>

OOPS 2

[  145.337313] ------------[ cut here ]------------
[  145.341993] kernel BUG at include/linux/scatterlist.h:95!
[  145.347427] Internal error: Oops - BUG: 0 [#1] SMP ARM
Entering kdb (current=0xc3c1da00, pid 755) on processor 0 Oops: (null)
due to oops @ 0xc0c65774
CPU: 0 PID: 755 Comm: ILCS_HOST Tainted: G         C        5.10.76-4-raspberrypi4-ARCH #1
Hardware name: BCM2711
PC is at vchiq_prepare_bulk_data+0x244/0x6dc
LR is at 0x0
pc : [<c0c65774>]    lr : [<00000000>]    psr: 00080013
sp : c4699d90  ip : 00000001  fp : c1934378
r10: 000006dc  r9 : 00000003  r8 : 00033700
r7 : f0e570e0  r6 : 00000035  r5 : f0e571b0  r4 : 00000035
r3 : 00001000  r2 : f0e571c8  r1 : 00000000  r0 : 3b713b9f
Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 30c5383d  Table: 03bb43c0  DAC: fffffffd
CPU: 0 PID: 755 Comm: ILCS_HOST Tainted: G         C        5.10.76-4-raspberrypi4-ARCH #1
Hardware name: BCM2711
[<c020f0d8>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14)
[<c020ae58>] (show_stack) from [<c0e08748>] (dump_stack+0xa4/0xc4)
[<c0e08748>] (dump_stack) from [<c0301eec>] (kdb_main_loop+0x4bc/0x9a0)
[<c0301eec>] (kdb_main_loop) from [<c0304fa0>] (kdb_stub+0x258/0x458)
[<c0304fa0>] (kdb_stub) from [<c02f9ff4>] (kgdb_cpu_enter+0x3e0/0x6fc)
[<c02f9ff4>] (kgdb_cpu_enter) from [<c02fa9b4>] (kgdb_handle_exception+0xe0/0x13c)
[<c02fa9b4>] (kgdb_handle_exception) from [<c020e794>] (kgdb_notify+0x28/0x48)
[<c020e794>] (kgdb_notify) from [<c02503bc>] (notify_die+0x90/0xd0)
[<c02503bc>] (notify_die) from [<c020af8c>] (die+0x130/0x34c)
[<c020af8c>] (die) from [<c0200b94>] (__und_svc_finish+0x0/0x2c)
Exception stack(0xc4699d00 to 0xc4699d48)
9d00: 3b713b9f 00000000 f0e571c8 00001000 00000035 f0e571b0 00000035 f0e570e0
9d20: 00033700 00000003 000006dc c1934378 00000001 c4699d90 00000000 c0c65774
9d40: 00080013 ffffffff
[<c0200b94>] (__und_svc_finish) from [<c0c65774>] (vchiq_prepare_bulk_data+0x244/0x6dc)
[<c0c65774>] (vchiq_prepare_bulk_data) from [<c0c5f0e4>] (vchiq_bulk_transfer+0x258/0x3e4)
[<c0c5f0e4>] (vchiq_bulk_transfer) from [<c0c63530>] (vchiq_ioctl+0x724/0x17fc)
[<c0c63530>] (vchiq_ioctl) from [<c044b198>] (sys_ioctl+0x34c/0x904)
[<c044b198>] (sys_ioctl) from [<c0200040>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc4699fa8 to 0xc4699ff0)
9fa0:                   b6b4c538 afefab1c 00000009 c014c406 afefab1c 0002a00e
9fc0: b6b4c538 afefab1c ad2ff020 00000036 00000000 3f155be0 ae504c70 00000000
9fe0: b6b4c240 afefab0c b6b39f44 b6c619fc
[0]kdb>
@mdevaev changed the title from "Oops with vchiq_prepare_bulk_data / DMA" to "Oops with vchiq_prepare_bulk_data (tc358743 + OMX JPEG encoder)" on Nov 5, 2021
@mdevaev
Contributor Author

mdevaev commented Nov 5, 2021

I have updated the description and localized the problem. It occurs when we try to use OMX JPEG encoding when capturing video from TC358743.

@pelwell
Contributor

pelwell commented Nov 5, 2021

The crash logs suggest the failures are the consequences of memory corruption.

@mdevaev
Contributor Author

mdevaev commented Nov 5, 2021

It doesn't show up on 5.10.52, maybe it's some new bug? On the new kernel, failure occurs 100% of the time when using OMX.

@6by9
Contributor

6by9 commented Nov 5, 2021

Tripped when encoding JPEG with ustreamer
vcdbg log assert shows:

2485196.711: assert( vchiq: invalid pagelist at dad47000 - length 10801080, type 1080 (bulk TX) ) failed; ../../../../../interface/vchiq_arm/vchiq_2835_vc.c::vchiq_transfer_bulk line 152 rev e50fe24
vcdbg_ctx_get_dump_stack: dump_stack failed

Kernel oops

[ 6769.026131] 8<--- cut here ---
[ 6769.030113] Unable to handle kernel paging request at virtual address 10801080
[ 6769.031060] pgd = 838bd1c0
[ 6769.031891] [10801080] *pgd=80000000004003, *pmd=00000000
[ 6769.032725] Internal error: Oops: 206 [#1] SMP ARM
[ 6769.033538] Modules linked in: cmac bnep hci_uart btbcm bluetooth ecdh_generic ecc 8021q garp stp llc tc358743 snd_soc_hdmi_codec raspberrypi_hwmon brcmfmac brcmutil cfg80211 vc4 rfkill cec drm_kms_helper v3d gpu_sched bcm2835_v4l2(C) bcm2835_codec(C) drm videobuf2_vmalloc bcm2835_unicam i2c_brcmstb i2c_mux_pinctrl i2c_mux dwc2 v4l2_dv_timings roles v4l2_fwnode bcm2835_isp(C) bcm2835_mmal_vchiq(C) rpivid_hevc(C) v4l2_mem2mem videobuf2_dma_contig drm_panel_orientation_quirks videobuf2_memops videobuf2_v4l2 vc_sm_cma(C) videobuf2_common snd_soc_core snd_compress i2c_bcm2835 snd_pcm_dmaengine snd_pcm snd_timer videodev snd syscopyarea sysfillrect sysimgblt fb_sys_fops backlight mc uio_pdrv_genirq uio nvmem_rmem i2c_dev ip_tables x_tables ipv6
[ 6769.037938] CPU: 3 PID: 80 Comm: vchiq-slot/0 Tainted: G         C        5.10.76-v7l+ #1477
[ 6769.038808] Hardware name: BCM2711
[ 6769.039685] PC is at swiotlb_bounce+0x58/0x1f8
[ 6769.040564] LR is at 0xa6b20
[ 6769.041458] pc : [<c02a55f8>]    lr : [<000a6b20>]    psr: 60000013
[ 6769.042352] sp : c1af9d40  ip : 00000024  fp : c1af9d84
[ 6769.043236] r10: 00000000  r9 : c8aa6b20  r8 : d4890000
[ 6769.044138] r7 : 00000fe0  r6 : a6b20020  r5 : ffffffff  r4 : 00000c8a
[ 6769.045038] r3 : 37f71080  r2 : d8890000  r1 : c13d3880  r0 : a6b20020
[ 6769.045942] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 6769.046851] Control: 30c5383d  Table: 0384fdc0  DAC: 55555555
[ 6769.047768] Process vchiq-slot/0 (pid: 80, stack limit = 0xdc64cf67)
[ 6769.048708] Stack: (0xc1af9d40 to 0xc1afa000)
[ 6769.049639] 9d40: c1aa2e10 14890000 00000fe0 c13d3880 c120aed4 00000000 c1af9d94 00000c8a
[ 6769.050590] 9d60: ffffffff a6b20020 00000c8a 14890000 00000000 00000000 c1af9dac c1af9d88
[ 6769.051556] 9d80: c02a6258 c02a55ac 00000fe0 00000002 dad47100 c1aa2e10 14890000 00000000
[ 6769.052522] 9da0: c1af9e34 c1af9db0 c02a36fc c02a617c 00000fe0 00000002 00000000 c1af9dc8
[ 6769.053504] 9dc0: 14890000 00000000 00000002 c137c750 00000000 00000004 14890000 00000000
[ 6769.054481] 9de0: 14890000 00000000 00000000 0000001f c137c740 fffff000 00000fe0 00000000
[ 6769.055476] 9e00: 14890000 00000000 80000013 dad473e8 ffffffff 00000000 c1426960 c58b9000
[ 6769.056457] 9e20: 0000001f dad47000 c1af9e54 c1af9e38 c02a1bf0 c02a34cc 00000000 c0b9587c
[ 6769.057442] 9e40: c58b9050 dad473e8 c1af9eac c1af9e58 c09edbd0 c02a1b98 00000000 c0270aa0
[ 6769.058409] 9e60: c58b9000 c0b90214 00000000 00000008 c1af9e94 c1af9e80 c0b90214 dad47084
[ 6769.059385] 9e80: c58b91a0 c58b90e8 00000008 dad18978 c58b9100 c58b9000 c140641c dad00194
[ 6769.060368] 9ea0: c1af9f74 c1af9eb0 c09e8334 c09edb38 c020bbb4 c020ca78 c1205048 00000000
[ 6769.061359] 9ec0: c1a51480 c02446e8 c09e7350 c140641c c1af9f14 c0ea8f14 c0ea8f38 c0ea8f60
[ 6769.062356] 9ee0: c0ea8e20 c0ea9370 c0ea939c c0e55548 c1406514 c0ea8404 c58b91a0 c140643c
[ 6769.063352] 9f00: c14064ac c1406470 dad00194 c0c6c73c c142675c dad00020 c0ea7ff0 006059b8
[ 6769.064348] 9f20: 0000000e 00000057 c1320e58 00000010 00000000 c22fae80 c0270254 c1af9f3c
[ 6769.065348] 9f40: c1af9f3c c08c6bac c1af9f74 c1a51480 c2315700 00000000 c1af8000 c09e7350
[ 6769.066335] 9f60: c140641c c190dc1c c1af9fac c1af9f78 c0245b8c c09e735c c1a514a4 c1a514a4
[ 6769.067330] 9f80: ffffe000 c2315700 c0245a1c 00000000 00000000 00000000 00000000 00000000
[ 6769.068320] 9fa0: 00000000 c1af9fb0 c02000ec c0245a28 00000000 00000000 00000000 00000000
[ 6769.069305] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 6769.070293] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 6769.071260] Backtrace: 
[ 6769.072243] [<c02a55a0>] (swiotlb_bounce) from [<c02a6258>] (swiotlb_tbl_sync_single+0xe8/0x114)
[ 6769.073254]  r10:00000000 r9:00000000 r8:14890000 r7:00000c8a r6:a6b20020 r5:ffffffff
[ 6769.074260]  r4:00000c8a
[ 6769.075265] [<c02a6170>] (swiotlb_tbl_sync_single) from [<c02a36fc>] (dma_direct_unmap_sg+0x23c/0x344)
[ 6769.076291]  r7:00000000 r6:14890000 r5:c1aa2e10 r4:dad47100
[ 6769.077317] [<c02a34c0>] (dma_direct_unmap_sg) from [<c02a1bf0>] (dma_unmap_sg_attrs+0x64/0x70)
[ 6769.078369]  r10:dad47000 r9:0000001f r8:c58b9000 r7:c1426960 r6:00000000 r5:ffffffff
[ 6769.079414]  r4:dad473e8
[ 6769.080450] [<c02a1b8c>] (dma_unmap_sg_attrs) from [<c09edbd0>] (vchiq_complete_bulk+0xa4/0x304)
[ 6769.081490]  r4:dad473e8
[ 6769.082496] [<c09edb2c>] (vchiq_complete_bulk) from [<c09e8334>] (slot_handler_func+0xfe4/0x1710)
[ 6769.083521]  r10:dad00194 r9:c140641c r8:c58b9000 r7:c58b9100 r6:dad18978 r5:00000008
[ 6769.084505]  r4:c58b90e8
[ 6769.085459] [<c09e7350>] (slot_handler_func) from [<c0245b8c>] (kthread+0x170/0x174)
[ 6769.086412]  r10:c190dc1c r9:c140641c r8:c09e7350 r7:c1af8000 r6:00000000 r5:c2315700
[ 6769.087369]  r4:c1a51480
[ 6769.088322] [<c0245a1c>] (kthread) from [<c02000ec>] (ret_from_fork+0x14/0x28)
[ 6769.089308] Exception stack(0xc1af9fb0 to 0xc1af9ff8)
[ 6769.090281] 9fa0:                                     00000000 00000000 00000000 00000000
[ 6769.091274] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 6769.092266] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 6769.093248]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0245a1c
[ 6769.094239]  r4:c2315700
[ 6769.095211] Code: e003039c e5912000 e50b1038 e59b7004 (e7923003) 
[ 6769.096232] ---[ end trace 99e68b5b56a11422 ]---

Edit: Firmware looks to have died, but hasn't flushed the cache so vcdbg log msg tail end is corrupted.

@pelwell
Contributor

pelwell commented Nov 5, 2021

[ 6769.030113] Unable to handle kernel paging request at virtual address 10801080

Yep - that looks like (black) pixel data in a place where no pixel data should be.
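To see why 0x10801080 reads as pixel data: UYVY packs two pixels into the four bytes U, Y0, V, Y1, and limited-range black is Y = 0x10 with U = V = 0x80. A minimal sketch of that packing, assuming a little-endian CPU (as on the Pi); the helper name `uyvy_pair_as_word` is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: the 32-bit value a little-endian CPU reads
 * from the four bytes U, Y0, V, Y1 of one UYVY pixel pair. */
uint32_t uyvy_pair_as_word(uint8_t u, uint8_t y0, uint8_t v, uint8_t y1)
{
    return (uint32_t)u | ((uint32_t)y0 << 8) |
           ((uint32_t)v << 16) | ((uint32_t)y1 << 24);
}
```

`uyvy_pair_as_word(0x80, 0x10, 0x80, 0x10)` evaluates to 0x10801080, the same value that shows up as the faulting address in the oops and as the "length" in the vcdbg assert.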

@6by9
Contributor

6by9 commented Nov 5, 2021

Rebuilt my kernel at 2a29770 (5.10.77) and top of tree firmware (with a few local mods). Crashed.

Reset my firmware back ("17 and 10 different commits each, respectively", top commit from origin is bcdfe55d5). I'll bisect it.

@6by9
Contributor

6by9 commented Nov 5, 2021

Working at 1920x1080 seems to trigger things more frequently. (I was using 1280x720 before).

I've just had a

[   29.183297] bcm2835-dma fe007b00.dma: swiotlb buffer is full (sz: -1249856128 bytes), total 32768 (slots), used 6 (slots)

logged as well, which has also killed everything.

What I thought was working with 5.10.77 and my old firmware does fail at 1080p.

I'm reverting back to the firmware from 5.10.52, but one thought I do have is that the buffers allocated by bcm2835-unicam and consumed by the H264 encoder do NOT have to be aligned to a multiple of 16 lines, whilst the IL JPEG DOES need 16 line alignment. It's not that you're asking it to copy more data than is allocated?

Firmware from 5.10.52 has just crashed with the kernel logging swiotlb buffer is full again.

@pelwell
Contributor

pelwell commented Nov 5, 2021

sz: -1249856128 is more evidence of corruption (0xb580b580), as I suspect is any mention of swiotlb.
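The arithmetic behind that reading: the size was logged as a signed 32-bit number, and the same bit pattern viewed as unsigned is 0xb580b580, which repeats every two bytes with 0x80 in the same byte positions as the 0x10801080 value in the earlier oops. A small sketch of the conversion (the helper name is illustrative only):

```c
#include <stdint.h>

/* Illustrative helper: recover the raw 32-bit pattern behind a size
 * that was logged as a signed integer. Converting a negative value to
 * uint32_t is well defined in C (reduction modulo 2^32). */
uint32_t size_bits(int32_t logged_size)
{
    return (uint32_t)logged_size;
}
```

`size_bits(-1249856128)` gives 0xb580b580.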

@6by9
Contributor

6by9 commented Nov 5, 2021

raspberrypi/rpi-firmware@c59a637 seems to be the first to really trip this, so it would imply something changed with #4641. Slightly odd as that didn't touch memory allocation or buffer usage, but there you go.

@mdevaev
Contributor Author

mdevaev commented Nov 6, 2021

@6by9 I have already run into alignment problems and looked at how the yavta and gstreamer code handles them. At least it worked before the update.

@mdevaev
Contributor Author

mdevaev commented Nov 9, 2021

Please let me know if I can help with testing or anything else.

@mdevaev
Contributor Author

mdevaev commented Nov 17, 2021

By the way, how would you feel about adding JPEG encoding to the new V4L2 M2M interface? Since you're dropping the proprietary libraries, it would be great to have a replacement for this feature. My idea was to add /dev/video21 for ril.image_encode. I will be happy to write a draft implementation and a patch if you are interested in taking this upstream.

I saw that libcamera uses libjpeg to encode images, and on the RPi forum I was advised to encode on the CPU cores. But as you saw, 1080p encoding consumes a huge amount of CPU resources and takes more time, which is a disaster for PiKVM. We really need hardware encoding to keep latency close to zero.

PS: I'm sorry to ask this here. It's just that if M2M works and is accepted into the RPi upstream, I can forget about the problems of OMX :) Unless the crash described is related to something lower-level and will still require a fix.

@6by9
Contributor

6by9 commented Nov 17, 2021

The encoder role in bcm2835-codec already supports MJPEG via video_encode.

image_encode messes around with EXIF headers and similar, which are all irrelevant under V4L2 as there are no controls to pass them in. All V4L2 is really after is the basic encoded image, which MJPEG gives you. It also does rate control rather than manually messing with quantisation parameters.

I realise that you were running multiple image_encode components in parallel as feeding in YUYV (or similar) requires a format conversion first, so effectively you were queuing those up. The structure of the MJPEG codec may not give the required performance there, but I couldn't say without trying it. There is an optimisation that could be made for MJPEG if it was close.

V4L2 does avoid VCHIQ bulk transfers, so should avoid this issue.

@mdevaev
Contributor Author

mdevaev commented Nov 17, 2021

Thanks for the explanation.

I couldn't get MJPEG from the encoder in version 5.10.52. Did it appear later? Whatever encoding parameters I used, the output was still H264. v4l2-ctl --list-formats also showed that /dev/video11 supports only H264.

As for the speed - in any case, I'll try and compare. Even if V4L2 is a bit slower, I expect a speed boost (or compensation) due to the possibility of using DMA, since with OMX I couldn't make it work.

@mdevaev
Contributor Author

mdevaev commented Nov 23, 2021

Okay, it really is. Even if I set the capture buffer format as V4L2_PIX_FMT_MJPEG, the encoder still produces H264. This is even true on new kernels. How do I achieve MJPEG? What parameters should I use?

I made a special version of ustreamer (the m2m branch). To run it, simply use ustreamer --encoder=v4l2 --workers=1 --verbose. The size of the packets makes it clear that the output is H264.

https://github.com/pikvm/ustreamer/blob/m2m/src/ustreamer/m2m.c#L152

@6by9
Contributor

6by9 commented Nov 23, 2021

gst-launch-1.0 -vvv -e videotestsrc ! video/x-raw,width=640,height=480,format=I420 ! v4l2jpegenc ! filesink location=foo.mjpeg
The output file starts

 ff d8 ff db 00 84 00 10 0b 0c 0e 0c 0a 10 0e 0d

which certainly looks like JPEG / MJPEG to me. H264 would start with a start code (00 00 00 01).
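The same check can be done in code rather than by eye. A minimal sketch that classifies a buffer by its leading bytes: ff d8 is the JPEG start-of-image marker and 00 00 00 01 is the start code quoted above. The three-byte 00 00 01 form is also accepted here because Annex B streams may use it. The enum and function names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum stream_kind { STREAM_UNKNOWN, STREAM_JPEG, STREAM_H264_ANNEXB };

/* Illustrative helper: classify a buffer by its first bytes.
 * JPEG starts with the SOI marker ff d8; an Annex B H264 stream starts
 * with a 00 00 00 01 (or 00 00 01) start code. */
enum stream_kind sniff_stream(const uint8_t *buf, size_t len)
{
    static const uint8_t soi[] = { 0xff, 0xd8 };
    static const uint8_t sc4[] = { 0x00, 0x00, 0x00, 0x01 };
    static const uint8_t sc3[] = { 0x00, 0x00, 0x01 };

    if (len >= sizeof(soi) && memcmp(buf, soi, sizeof(soi)) == 0)
        return STREAM_JPEG;
    if (len >= sizeof(sc4) && memcmp(buf, sc4, sizeof(sc4)) == 0)
        return STREAM_H264_ANNEXB;
    if (len >= sizeof(sc3) && memcmp(buf, sc3, sizeof(sc3)) == 0)
        return STREAM_H264_ANNEXB;
    return STREAM_UNKNOWN;
}
```

Running this on the first dequeued capture buffer is a quick way to tell which codec an encoder node actually produced.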

If the driver doesn't like something about the format passed with VIDIOC_S_FMT, the spec requires it to correct it to something that it does like and return that. It is not permitted to fail the VIDIOC_S_FMT call unless the type is invalid.
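Because VIDIOC_S_FMT may adjust the request instead of failing, the caller has to compare the pixel format that comes back with the one it asked for. A sketch of that comparison, kept free of kernel headers so it builds anywhere: `FOURCC` mirrors the kernel's `v4l2_fourcc()` packing, and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Local copy of the v4l2_fourcc() packing so the sketch builds anywhere. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define FMT_MJPEG FOURCC('M', 'J', 'P', 'G') /* V4L2_PIX_FMT_MJPEG */
#define FMT_H264  FOURCC('H', '2', '6', '4') /* V4L2_PIX_FMT_H264 */

/* Illustrative: write the four characters of a fourcc into out[5]. */
void fourcc_to_str(uint32_t fourcc, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fourcc >> (8 * i)) & 0xff);
    out[4] = '\0';
}

/* Illustrative: nonzero if the driver kept the requested format.
 * `returned` is the pixelformat field read back after VIDIOC_S_FMT. */
int format_was_kept(uint32_t requested, uint32_t returned)
{
    return requested == returned;
}
```

If `format_was_kept(FMT_MJPEG, returned)` is false, logging `fourcc_to_str(returned, ...)` shows what the driver substituted, which would surface a silent fallback to H264.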

@mdevaev
Contributor Author

mdevaev commented Nov 23, 2021

Maybe something is wrong with my code, because even judging by the start bytes it produces H264. I'll build gstreamer and try it.

@mdevaev
Contributor Author

mdevaev commented Nov 23, 2021

Okay, in my case, v4l2jpegenc is unavailable, since apparently gst checks the capabilities of the encoder and disables the unavailable component. At the same time, I have v4l2jpegdec and v4l2h264enc available. Do I need some kind of latest firmware version? Or is it enabled by some kernel options?

[root@pikvm ~]# vcgencmd version
Nov  2 2021 13:22:15
Copyright (c) 2012 Broadcom
version e50fe24ee2b6974f3ba6615ba0f1d8f45c485f69 (clean) (release) (start)
[root@pikvm ~]# uname -a
Linux pikvm 5.10.76-4-raspberrypi4-ARCH #1 SMP Fri Nov 5 08:11:58 MSK 2021 armv7l GNU/Linux

@6by9
Contributor

6by9 commented Nov 23, 2021

I was just using a Bullseye image and checked with gst-inspect which components starting v4l2 existed. IIRC Bullseye is using GStreamer 1.18.

@mdevaev
Contributor Author

mdevaev commented Nov 23, 2021

1.18 too, but I'm using Arch. I understand that the problem is probably in my kernel, I just want to figure out what exactly I have to enable/tune/modify in it to get a jpeg.

@mdevaev
Contributor Author

mdevaev commented Nov 23, 2021

It's very strange. I just installed Bullseye and tried gstreamer. The result is the same as for Arch. Maybe the problem is something specific to a particular revision of the board?

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 5.10.63-v7l+ #1459 SMP Wed Oct 6 16:41:57 BST 2021 armv7l GNU/Linux

pi@raspberrypi:~ $ vcgencmd version
Oct 29 2021 10:47:33
Copyright (c) 2012 Broadcom
version b8a114e5a9877e91ca8f26d1a5ce904b2ad3cf13 (clean) (release) (start)

pi@raspberrypi:~ $ sudo cat /proc/device-tree/model
Raspberry Pi 4 Model B Rev 1.4

pi@raspberrypi:~ $ gst-inspect-1.0 | grep v4l2
video4linux2:  v4l2src: Video (video4linux2) Source
video4linux2:  v4l2sink: Video (video4linux2) Sink
video4linux2:  v4l2radio: Radio (video4linux2) Tuner
video4linux2:  v4l2deviceprovider (GstDeviceProviderFactory)
video4linux2:  v4l2jpegdec: V4L2 JPEG Decoder
video4linux2:  v4l2h264dec: V4L2 H264 Decoder
video4linux2:  v4l2h264enc: V4L2 H.264 Encoder
video4linux2:  v4l2convert: V4L2 Video Converter
video4linux2:  v4l2video18convert: V4L2 Video Converter

@mdevaev
Contributor Author

mdevaev commented Nov 24, 2021

I tested a fresh Bullseye install on Raspberry Pi 4 boards with 2, 4 and 8 GB of RAM in 32-bit mode. I also tried the 4 GB board in 64-bit mode. I tried both the kernel from the repository and the one from rpi-update. The JPEG encoder was not available in any of these configurations.

Maybe I should load some modules or use special kernel parameters?

@mdevaev
Contributor Author

mdevaev commented Nov 24, 2021

I found out that start_x=1 is required to enable the JPEG encoder. Why?

PS: I'll explore the JPEG encoder a bit more and, once I have completely migrated my software, the OMX issues can be closed with wontfix %)

@6by9
Contributor

6by9 commented Nov 24, 2021

Encoding used to be only available when used with the camera, hence start_x=1 (or start_debug=1). H264 encode obviously got added to the base image at some point.
I can tweak the firmware so that it doesn't advertise MJPEG unless using the correct variant.

@mdevaev
Contributor Author

mdevaev commented Nov 24, 2021

Maybe it's worth adding MJPEG encoding to the base image, as was done for H264? Using it with the tc358743 is a fairly common case, and it would be strange to require start_x just for that.

@mdevaev
Contributor Author

mdevaev commented Nov 24, 2021

The JPEG encoder is working, but I can't change the image quality in any way - it's terrible. I made a patch to configure MMAL_PARAMETER_JPEG_Q_FACTOR, but it didn't affect anything at all.

image

Interestingly, gstreamer also gives such a picture:

image

Should I configure something else so that MMAL doesn't ignore the q-factor? Maybe this is because it is MJPG/ril.video_encode rather than JPEG/ril.image_encode?

@6by9
Contributor

6by9 commented Nov 24, 2021

MJPEG is configured via bitrate, not via a Q factor.

@6by9
Contributor

6by9 commented Dec 14, 2021

Your invalid length issue sort of confirms that you aren't handling buffer sizes correctly.

4147200 = 1920 * 1080 * 2
4177920 = 1920 * 1088 * 2

image_encode requires the height to be aligned to a multiple of 16 lines. The Unicam driver doesn't.
You've amended the format that is passed to IL/MMAL, so it will be happy, but the size of the buffer that it is now looking at exceeds the actual allocation. When passed to IL it asks for a memcpy of the buffer, and that will try copying additional data, assuming that the pages are mapped.

You can use VIDIOC_CREATE_BUFS instead of VIDIOC_REQBUFS to request V4L2 to allocate buffers that are larger than necessary.
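The two sizes above follow from padding the height to a multiple of 16 lines at 2 bytes per pixel. A minimal sketch of that calculation, assuming a packed 16-bit format such as UYVY with no extra stride padding; the function names are illustrative:

```c
#include <stddef.h>

/* Round v up to the next multiple of a (a must be a power of two). */
size_t align_up(size_t v, size_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Illustrative: bytes needed for a packed 16-bit-per-pixel frame
 * (UYVY/YUYV) once the height is padded to a multiple of 16 lines. */
size_t padded_frame_size(size_t width, size_t height)
{
    return width * align_up(height, 16) * 2;
}
```

For 1920x1080 this gives 4177920 bytes against an unpadded 4147200, a difference of 30720 bytes (8 lines). For 1280x720 the padded and unpadded sizes are equal, because 720 is already a multiple of 16.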

@mdevaev
Contributor Author

mdevaev commented Dec 14, 2021

Your invalid length issue sort of confirms that you aren't handling buffer sizes correctly.

My code already takes into account the need for alignment, and this works fine with V4L2_MEMORY_MMAP. I used the same logic that is already implemented for alignment/cropping with DECODER role. Is that not enough?

You can use VIDIOC_CREATE_BUFS instead of VIDIOC_REQBUFS to request V4L2 to allocate buffers that are larger than necessary.

Ideally, I would like the userspace code not to know about the specifics of working with buffers.

PS: Your idea of multiplying by -1 did not help, and there is another problem with DMA for video_encode, for which I made a separate issue.

@mdevaev mdevaev closed this as completed Dec 14, 2021
@mdevaev mdevaev reopened this Dec 14, 2021

@6by9
Contributor

6by9 commented Dec 14, 2021

REQBUFS will give you a buffer that is the size that that V4L2 device wants.
You mmap it based on that size.
You then tell another component to copy that plus a bit more - what does it get in that extra bit of memory? If you go across a page boundary then you've potentially told it to copy memory that has no page table entry.
You may get away with it, or the following bit of memory may be used for something vital.
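To put numbers on the page-boundary point, here is a small sketch that counts how many pages a copy reaches beyond the mapping. It assumes 4096-byte pages (an assumption; check the page size on the target kernel), and the function names are illustrative.

```c
#include <stddef.h>

/* Number of pages needed to hold `bytes` (page_size must be nonzero). */
size_t pages_for(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Illustrative: how many whole pages a copy of `copy_len` bytes reaches
 * beyond a mapping that was created for `mapped_len` bytes. */
size_t pages_past_mapping(size_t mapped_len, size_t copy_len, size_t page_size)
{
    size_t have = pages_for(mapped_len, page_size);
    size_t need = pages_for(copy_len, page_size);

    return need > have ? need - have : 0;
}
```

With the sizes quoted earlier in the thread, a 4147200-byte mapping covers 1013 pages while a 4177920-byte copy needs 1020, so the copy would touch 7 pages that were never part of the mapping.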

When importing DMABUFs the kernel can check the size of the underlying allocation to ensure that userspace doesn't allocate X, and then ask another subsystem to import X+Y. That's why it complains.

JPEG and H264 encoding natively need to work in macroblocks, hence frequently having the requirement for the height (and stride) to be a multiple of 16.
YUYV formats are converted, but relaxing the restrictions on those only gets very messy. I420/YV12 get presented directly to the hardware, and it does manipulate the extra padding pixels as it is required by the JPEG spec (the edge pixel gets repeated in each direction).

@mdevaev
Contributor Author

mdevaev commented Dec 14, 2021

I see, thank you for the explanation. Could you also say something about the DMA/video_encode crash and the second problem with qfactor?

@mdevaev
Contributor Author

mdevaev commented Dec 23, 2021

@6by9 sup?

@mdevaev
Contributor Author

mdevaev commented Jan 8, 2022

@6by9 Could we somehow solve the problem with IJG? Negative values don't work, and this is a blocker for our further work on the migration to Bullseye.

@6by9
Contributor

6by9 commented Jan 28, 2022

Firmware patch created adding MMAL_PARAMETER_JPEG_IJG_SCALING.
It should ripple out through rpi-update soon.

@mdevaev
Contributor Author

mdevaev commented Jan 28, 2022

Thank you! I'll check it.

@mdevaev
Contributor Author

mdevaev commented Jan 29, 2022

Will the definition of MMAL_PARAMETER_JPEG_IJG_SCALING be added to mmal-parameters.h like the other similar parameters? I don't see it right now.

popcornmix added a commit to raspberrypi/firmware that referenced this issue Feb 2, 2022
kernel: Patching lan78xx for SOF_TIMESTAMPING_TX_SOFTWARE support
See: raspberrypi/linux#4856

kernel: i2c: bcm2835: Make clock-stretch timeout configurable
See: raspberrypi/linux#4855

kernel: drm/vc4: Add DRM 210101010 RGB formats for hvs5.
See: raspberrypi/linux#4859

kernel: vc4-kms-dpi overlay updates
See: raspberrypi/linux#4860

kernel: Add Support for the Geekworm MZP280 DPI Display
See: raspberrypi/linux#4853

kernel: DRM: Clean up handling of panel orientation
See: raspberrypi/linux#4862

kernel: Add support for the MAX30102 heart rate and blood oxygen sensor
See: raspberrypi/linux#4535

firmware: mmal: Add mapping for IL OMX_IndexParamBrcmEnableIJGTableScaling param
See: raspberrypi/linux#4669

userland: Handle overlay parameters embedded in overlay_map.dtb
See: raspberrypi/linux#4860
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this issue Feb 2, 2022
firmware: mmal: Add mapping for IL OMX_IndexParamBrcmEnableIJGTableScaling param
See: raspberrypi/linux#4669

@popcornmix
Collaborator

It's pushed to rpi-update (but that doesn't affect kernel tree).

@6by9
Contributor

6by9 commented Feb 2, 2022

Added to userland - raspberrypi/userland@8fa944c

Until it's used in the kernel there is little point in adding it there.

@mdevaev
Contributor Author

mdevaev commented Feb 2, 2022

Got it

@mdevaev
Contributor Author

mdevaev commented Feb 3, 2022

@6by9 Something doesn't seem to be working. I fixed my patch to the latest kernel version and installed a new firmware. The return value of changing the IJG param is -3:

        u32 enable = 1;
        int x = vchiq_mmal_port_parameter_set(ctx->dev->instance,
                          &ctx->component->output[0],
                          MMAL_PARAMETER_JPEG_IJG_SCALING,
                          &enable,
                          sizeof(enable));

MMAL_PARAMETER_JPEG_IJG_SCALING is 0x10061, same as userspace header (checked).

Also tried &ctx->component->control without any luck.

[root@pikvm(g:m2m) g:/]# vcgencmd version
Feb  1 2022 13:21:08
Copyright (c) 2012 Broadcom
version 2a652e2a1ddcf0f358bc6278802ff7d8f1397c2e (clean) (release) (start_x)

Am I doing something wrong?

mkreisl added a commit to xbianonpi/xbian-package-firmware that referenced this issue Feb 5, 2022
- firmware: ldconfig: Discard subsequent chunks from a truncated line
  See: #1669

- firmware: cec: Fail set_passive_mode when running with kms

- firmware: Firmware: Remove PWM/audio traits for CM4

- firmware: usb: Fix non-BCM2711 MSD support
  See: raspberrypi/usbboot#102

- firmware: arm-loader: Fix kernel8.img selection on 2837 with arm_64bit=1
  See: #1671

- firmware: improve firmware camera detection

- firmware: arm_loader: Load vl805 overlay on CM4
  See: https://forums.raspberrypi.com/viewtopic.php?t=326088

- firmware: gencmdserv: Add mailbox interface to gencmd

- firmware: arm_loader: Only clip min/max to the same value for turbo clocks

- firmware: dtoverlay: Don't mix non-fatal errors and offsets
  See: #1686

- firmware: platform: Limit max clock-id to CLOCK_VEC for now
  See: #1688

- firmware: mmal: Add mapping for IL OMX_IndexParamBrcmEnableIJGTableScaling param
  See: raspberrypi/linux#4669
@mdevaev
Contributor Author

mdevaev commented Feb 14, 2022

Sup?

@6by9
Contributor

6by9 commented Feb 18, 2022

I had got the mapping wrong and hadn't passed the port parameter through. Firmware patch being reviewed now.
I've confirmed that the diff to Raspistill

diff --git a/host_applications/linux/apps/raspicam/RaspiStill.c b/host_applications/linux/apps/raspicam/RaspiStill.c
index a88bcc8..7d6c000 100644
--- a/host_applications/linux/apps/raspicam/RaspiStill.c
+++ b/host_applications/linux/apps/raspicam/RaspiStill.c
@@ -1093,6 +1093,14 @@ static MMAL_STATUS_T create_encoder_component(RASPISTILL_STATE *state)
       goto error;
    }
 
+   status = mmal_port_parameter_set_boolean(encoder_output, MMAL_PARAMETER_JPEG_IJG_SCALING, 1);
+
+   if (status != MMAL_SUCCESS)
+   {
+      vcos_log_error("Unable to set JPEG IJG scaling");
+      goto error;
+   }
+

sets the right flag in the firmware, and it is then passed into the encoder.

@mdevaev
Contributor Author

mdevaev commented Feb 18, 2022

Thank you! I will be waiting for the firmware update.

@mdevaev
Contributor Author

mdevaev commented Feb 28, 2022

Any news about it? When will this fix be released?

@pelwell
Contributor

pelwell commented Mar 1, 2022

I'm not sure what the holdup with the firmware is, but there should be a new release today or tomorrow.

@pelwell
Contributor

pelwell commented Mar 1, 2022

rpi-firmware has been bumped - sudo rpi-update will get you a build with 6by9's patch.

@mdevaev
Contributor Author

mdevaev commented Mar 1, 2022

It seems it's working now, thank you.

The next question: could you enable image encoder without start_x=1? You did this for H264, and it is necessary for the image encoder to work out of the box.
