Skip to content

VC4: KMS/FKMS Fix weird hang with frozen display and spining cursor #1840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Electron752
Copy link
Contributor

The issue with the spinning cursor is that most distros(Debian Gnome/XFCE4) can flood the driver with cursor updates. The userland software never expects the cursor updates to be synchronized with the vblank. If they are synchronized, then the display will hang for very long periods of time.

The workaround on both kms and fkms is to update the cursor without any vblank synchronization.

I tested both KMS and FKMS on Debian sid with both the XFCE4 desktop and the Gnome desktop.

Steve Glendinning and others added 30 commits February 6, 2017 18:13
smsc95xx is adjusting truesize when it shouldn't, and following a recent patch from Eric this is now triggering warnings.

This patch stops smsc95xx from changing truesize.

Signed-off-by: Steve Glendinning <[email protected]>
Without this patch, removing a device tree overlay can crash here.

Signed-off-by: Phil Elwell <[email protected]>
See commit dae803e -- the warning is
expected sometimes when using CMA.  However, that commit still spams
my kernel log with these warnings.

Signed-off-by: Eric Anholt <[email protected]>
The old arch-specific IRQ macros included a dsb to ensure the
write to clear the mailbox interrupt completed before returning
from the interrupt. The BCM2836 irqchip driver needs the same
precaution to avoid spurious interrupts.

Spurious interrupts are still possible for other reasons,
though, so trap them early.
Add a duplicate irq range with an offset on the hwirq's so the
driver can detect that enable_fiq() is used.
Tested with downstream dwc_otg USB controller driver.

Signed-off-by: Noralf Trønnes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Stephen Warren <[email protected]>
Although the GPIO controller can generate three interrupts (four counting
the common one), the device tree files currently only specify two. In the
absence of the third, simply don't register that interrupt (as opposed to
registering 0), which has the effect of making it impossible to generate
interrupts for GPIOs 46-53 which, since they share pins with the SD card
interface, is unlikely to be a problem.
The spi-bcm2835 driver automatically uses GPIO chip-selects due to
some unreliability of the native ones. In doing so it chooses the
same pins as the native chip-selects would use, but the existing
code always uses pins 7 and 8, wherever the SPI function is mapped.

Search the pinctrl group assigned to the driver for pins that
correspond to native chip-selects, and use those for GPIO chip-
selects.

Signed-off-by: Phil Elwell <[email protected]>
Select software CS in bcm2708_common.dtsi, and disable the automatic
conversion in the driver to allow hardware CS to be re-enabled with an
overlay.

See: raspberrypi#1547

Signed-off-by: Phil Elwell <[email protected]>
The VideoCore bootloader passes in Serial number and
Revision number through Device Tree. Make these available to
userspace through /proc/cpuinfo.

Mainline status:

There is a commit in linux-next that standardize passing the serial
number through Device Tree (string: /serial-number):
ARM: 8355/1: arch: Show the serial number from devicetree in cpuinfo

There was an attempt to do the same with the revision number, but it
didn't get in:
[PATCH v2 1/2] arm: devtree: Set system_rev from DT revision

Signed-off-by: Noralf Trønnes <[email protected]>
Load driver early since at least bcm2708_fb doesn't support deferred
probing and even if it did, we don't want the video driver deferred.
Support the legacy DMA API which is needed by bcm2708_fb.
Don't mask out channel 2.

Signed-off-by: Noralf Trønnes <[email protected]>
Without this alias, Device Tree won't cause the driver
to be loaded.

See: raspberrypi#1510
The Raspberry Pi firmware looks at the RSTS register to know which
partition to boot from. The reboot syscall command
LINUX_REBOOT_CMD_RESTART2 supports passing in a string argument.

Add support for passing in a partition number 0..63 to boot from.
Partition 63 is a special partiton indicating halt.
If the partition doesn't exist, the firmware falls back to partition 0.

Signed-off-by: Noralf Trønnes <[email protected]>
so that special/critical clocks can get enabled early on
in the boot process avoiding the risk of disabling a clock,
pll_divider or pll when a claiming driver fails to install
propperly - maybe it needs to defer.

Signed-off-by: Martin Sperl <[email protected]>
Avoids the 0x40000 cycles of warmup again if firmware has already used it
Signed-off-by: popcornmix <[email protected]>
Signed-off-by: Noralf Trønnes <[email protected]>

bcm2709: Drop platform smp and timer init code

irq-bcm2836 handles this through these functions:
bcm2835_init_local_timer_frequency()
bcm2836_arm_irqchip_smp_init()

Signed-off-by: Noralf Trønnes <[email protected]>

bcm270x: Use watchdog for reboot/poweroff

The watchdog driver already has support for reboot/poweroff.
Make use of this and remove the code from the platform files.

Signed-off-by: Noralf Trønnes <[email protected]>
Signed-off-by: popcornmix <[email protected]>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <[email protected]>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <[email protected]>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: raspberrypi/firmware#21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb23.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <[email protected]>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <[email protected]>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <[email protected]>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <[email protected]>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <[email protected]>
Signed-off-by: popcornmix <[email protected]>

bcm2708_fb : Implement blanking support using the mailbox property interface

bcm2708_fb: Add pan and vsync controls

bcm2708_fb: DMA acceleration for fb_copyarea

Based on http://www.raspberrypi.org/phpBB3/viewtopic.php?p=62425#p62425
Also used Simon's dmaer_master module as a reference for tweaking DMA
settings for better performance.

For now busylooping only. IRQ support might be added later.
With non-overclocked Raspberry Pi, the performance is ~360 MB/s
for simple copy or ~260 MB/s for two-pass copy (used when dragging
windows to the right).

In the case of using DMA channel 0, the performance improves
to ~440 MB/s.

For comparison, VFP optimized CPU copy can only do ~114 MB/s in
the same conditions (hindered by reading uncached source buffer).

Signed-off-by: Siarhei Siamashka <[email protected]>

bcm2708_fb: report number of dma copies

Add a counter (exported via debugfs) reporting the
number of dma copies that the framebuffer driver
has done, in order to help evaluate different
optimization strategies.

Signed-off-by: Luke Diamand <[email protected]>

bcm2708_fb: use IRQ for DMA copies

The copyarea ioctl() uses DMA to speed things along. This
was busy-waiting for completion. This change supports using
an interrupt instead for larger transfers. For small
transfers, busy-waiting is still likely to be faster.

Signed-off-by: Luke Diamand <[email protected]>

bcm2708: Make ioctl logging quieter

video: fbdev: bcm2708_fb: Don't panic on error

No need to panic the kernel if the video driver fails.
Just print a message and return an error.

Signed-off-by: Noralf Trønnes <[email protected]>

fbdev: bcm2708_fb: Add ARCH_BCM2835 support

Add Device Tree support.
Pass the device to dma_alloc_coherent() in order to get the
correct bus address on ARCH_BCM2835.
Use the new DMA legacy API header file.
Including <mach/platform.h> is not necessary.

Signed-off-by: Noralf Trønnes <[email protected]>

BCM270x_DT: Add bcm2708-fb device

Add bcm2708-fb to Device Tree and don't add the
platform device when booting in DT mode.

Signed-off-by: Noralf Trønnes <[email protected]>
Electron752 and others added 14 commits February 6, 2017 18:14
In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <[email protected]>
IRQ-CPU mapping is round robined on ARM64 to increase
concurrency and allow multiple interrupts to be serviced
at a time.  This reduces the need for FIQ.

Signed-off-by: Michael Zoran <[email protected]>
If it breaks on anybody, they can use the standard device tree
overlays to switch back to the dwc2 driver.

Signed-off-by: Michael Zoran <[email protected]>
The introduction of CM3 makes gadget mode on 2709 a useful option,
so enable the building of the required modules. Note that these
modules are not loaded by default and must be enabled with a DT
overlay.

Signed-off-by: Phil Elwell <[email protected]>
The CM1 dtb contains an empty audio_pins node, but no reference to it.
Adding the usual pinctrl reference from the audio node enables the
audremap overlay (and others) to easily turn on audio.

Signed-off-by: Phil Elwell <[email protected]>
Usage: `dtoverlay=adau7002-simple`
As there is now an overlay requiring it, build the codec module.

Signed-off-by: Phil Elwell <[email protected]>
Some example usage:

SPI0.0
dtparam=spi=on
dtoverlay=mcp3008:spi0-0-present

SPI0.1
dtparam=spi=on
dtoverlay=mcp3008:spi0-1-present

SPI0.0 and SPI0.1
dtparam=spi=on
dtoverlay=mcp3008:spi0-0-present,spi0-1-present

SPI1.0
dtparam=spi=on
dtoverlay=spi1-1cs
dtoverlay=mcp3008:spi1-0-present

SPI1.2
dtparam=spi=on
dtoverlay=spi1-1cs:cs0_pin=16
dtoverlay=mcp3008:spi1-0-present

SPI1.0 and SPI1.1
dtoverlay=spi1-2cs
dtoverlay=mcp3008:spi1-0-present,spi1-1-present

Changing the speed

SPI0.0
dtparam=spi=on
dtoverlay=mcp3008:spi0-0-present,spi0-0-speed=2000000
The open function was spamming syslog every time
called, so have removed call completely.
The current implementation can hang if the userland sends
a lot of cursor updates. This is because the cursor updates
are waiting for the vblank.

This is unnecessary and unexpected, so simply swap out the frame
buffer without waiting.

Signed-off-by: Michael Zoran <[email protected]>
The current implementation can hang if the userland sends
a lot of cursor updates. This is because the cursor updates
are waiting for the vblank.

This is unnecessary and unexpected, so simply swap out the frame
buffer without waiting.

Signed-off-by: Michael Zoran <[email protected]>
@popcornmix
Copy link
Collaborator

As @anholt maintains these files, I'll only accept patches if he accepts them. I don't want to diverge from his versions of files.
Probably better to report issues to https://github.com/anholt/linux/issues and we can then get fixes from @anholt.

@Electron752
Copy link
Contributor Author

I understand your point of view and it makes sense.

I reported the issue of the hangs awhile back on his page and didn't get much response. I also directly e-mailed him the KMS fix yesterday and he said it was my version of linux that's broken although I'm not sure what the recommended alternative to Raspbian is since both XFCE4 and Gnome are very broken.

I think their is a school of thought that downstream is only for Raspbian and upstream is for other distributions but only when booting with u-boot. That's a bit unfortunate since some of the enhancements in downstream work great with other distributions.

Perhaps we can keep the issue active for a few days or so and maybe someone else can chime in to confirm the issue. Or perhaps @anholt can respond as to why I'm seeing the weird hangs and why the patch I've provided won't work.

@ED6E0F17
Copy link

I may have hit this issue on Debian Stretch (arm v7), but I do not intend to test the patch until the Anholt backports have been merged.

@Electron752
Copy link
Contributor Author

I would like to see the backports merged as well. Running KMS on the LCD panel is cool. I really have no idea why that merge is being delayed. I don't understand why having a few issues is holding it up, especially since a few similar version was merged into 4.4.

@ktb83
Copy link

ktb83 commented Feb 11, 2017

@Electron752 Wow, very interesting. I'm definitely taking out the 64 bit build again to give this a shot. You really got Gnome 3.x running (even if it's rough)?

@Electron752
Copy link
Contributor Author

With the patch, Gnome isn't really rough at all. Alot of things seem to run faster with the patch. I think it's because the user land software will sometimes flood the driver with cursor updates. Waiting for the vblank on every cursor update is way too slow.

I have no idea why this issue isn't being seen on Raspbian.

@Electron752
Copy link
Contributor Author

As a side point, the bottle neck with running that kind of software appears to be more disk access speed and amount of RAM in the system. I see things will start filling up the swap file even with casual use.

@ktb83
Copy link

ktb83 commented Feb 13, 2017

Confirmed. This makes Xfce usable for me. I do get a warning from line 198 in vc4_firmware_kms.c:

lines 198-199:

	WARN_ON_ONCE(fb->pitches[0] != state->crtc_w * 4);
	WARN_ON_ONCE(fb->bits_per_pixel != 32);

@ED6E0F17
Copy link

With a very quick test of the Anholt backports and this, everything seems to be working well. I do get:

[  119.211359] ------------[ cut here ]------------
[  119.211439] WARNING: CPU: 2 PID: 454 at /linux/drivers/gpu/drm/vc4/vc4_plane.c:721 vc4_plane_async_set_fb+0xa4/0xa8
[  119.211456] Modules linked in:
[  119.211491] CPU: 2 PID: 454 Comm: Xorg Not tainted 4.9.9-armv7+ #36
[  119.211507] Hardware name: BCM2835
[  119.211552] [<80110a48>] (unwind_backtrace) from [<8010cacc>] (show_stack+0x20/0x24)
[  119.211584] [<8010cacc>] (show_stack) from [<804bee6c>] (dump_stack+0xc4/0x108)
[  119.211622] [<804bee6c>] (dump_stack) from [<80121438>] (__warn+0xf8/0x110)
[  119.211653] [<80121438>] (__warn) from [<80121520>] (warn_slowpath_null+0x30/0x38)
[  119.211683] [<80121520>] (warn_slowpath_null) from [<805ad628>] (vc4_plane_async_set_fb+0xa4/0xa8)
[  119.211714] [<805ad628>] (vc4_plane_async_set_fb) from [<805ad794>] (vc4_update_plane+0x168/0x170)
[  119.211745] [<805ad794>] (vc4_update_plane) from [<8059bd8c>] (__setplane_internal+0x190/0x238)
[  119.211775] [<8059bd8c>] (__setplane_internal) from [<8059bf54>] (drm_mode_cursor_universal+0x120/0x1b4)
[  119.211805] [<8059bf54>] (drm_mode_cursor_universal) from [<8059c074>] (drm_mode_cursor_common+0x8c/0x194)
[  119.211836] [<8059c074>] (drm_mode_cursor_common) from [<8059c740>] (drm_mode_cursor2_ioctl+0x18/0x1c)
[  119.211868] [<8059c740>] (drm_mode_cursor2_ioctl) from [<8057e718>] (drm_ioctl+0x214/0x478)
[  119.211905] [<8057e718>] (drm_ioctl) from [<802ae0f8>] (do_vfs_ioctl+0xac/0x810)
[  119.211936] [<802ae0f8>] (do_vfs_ioctl) from [<802ae8d8>] (SyS_ioctl+0x7c/0x8c)
[  119.211965] [<802ae8d8>] (SyS_ioctl) from [<80108480>] (ret_fast_syscall+0x0/0x1c)
[  119.211984] ---[ end trace 002118488d65a68a ]---

Copy link
Contributor

@anholt anholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I think currently we never have a single update that does both cursor BO and position, I think I'd like us to be protected from that, so:

  • only check fb != plane_fb, rather than position
  • Drop the return 0.

Also, we need to make sure the new FB gets referenced by the plane state, so it sticks on future plane updates. drm_atomic_set_fb_for_plane(plane->state, fb); just before the vc4_plane_async_set_fb() call should do that.

With those changes, could you submit the patch to the DRI mailing list? I'd love to have you get credit for this patch upstream.

@Electron752
Copy link
Contributor Author

Sure, I can probably submit a patch. It's probably better that it goes upstream anyway.

It might take me a day or two to have it ready.

@popcornmix
Copy link
Collaborator

Ping me here once upstream has accepted patch and I'll merge/cherry-pick it to our tree.

@Electron752
Copy link
Contributor Author

Electron752 commented Mar 2, 2017

@anholt / @popcornmix : The KMS fix has been accepted in upstream, but it's going to be awhile for it to propagate through all the branches. It's currently in drm-misc-next, so I'm guessing it won't be available until 4.12 at least.

Any thoughts on what to do about fkms since it has the same issue? fkms is downstream only so I don't have anything to submit to upstream. I've been told it's the xorg/xserver that's animating the spinning cursor, so I don't think this is an issue with Raspbian at this time. But I'm not sure what will happen when/if things move to a newer debian base.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.