Skip to content

Conversation

sumedhbala-delphix
Copy link
Contributor

Merge of Master into stage via 'git reset --hard origin/master'

ianmay81 and others added 30 commits June 22, 2022 11:55
Ignore: yes
Signed-off-by: Ian May <[email protected]>
Ignore: yes
Signed-off-by: Ian May <[email protected]>
Ignore: yes
Signed-off-by: William Breathitt Gray <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1895991
Properties: no-test-build
Signed-off-by: William Breathitt Gray <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
Ignore: yes
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Ignore: yes
Signed-off-by: Kleber Sacilotto de Souza <[email protected]>
Signed-off-by: Kleber Sacilotto de Souza <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1898781

AWS F1 (x86_64) instance types need the fpga-mgr module.

Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Stefan Bader <[email protected]>
Acked-by: Colin Ian King <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1893817

Using write-combine is crucial for performance of PCI devices where
significant amounts of transactions go over PCI BARs.

arm64 supports write-combine PCI mappings, so the appropriate define
has been added which will expose write-combine mappings under sysfs
for prefetchable PCI resources.

Signed-off-by: Clint Sbisa <[email protected]>
Reference: https://lore.kernel.org/linux-pci/[email protected]/
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Stefan Bader <[email protected]>
Acked-by: Andrea Righi <[email protected]>
Acked-by: Colin Ian King <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
Ignore: yes
Signed-off-by: Andrea Righi <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1900671
Properties: no-test-build
Signed-off-by: Andrea Righi <[email protected]>
The Intel BlueZ project recommends in [1] to disable highspeed support
as part of the fixes for the security issues. This does the required
changes.

[1] https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00435.html

CVE-2020-24490
CVE-2020-12351
CVE-2020-12352

NOTE: apply the same change that has been applied to the generic kernel.
In amd64 bluetooth is disabled in the annotations file we need to
specify '-' for CONFIG_BT, while arm64 has bluetooth enabled, so it
needs to be 'n'.

Signed-off-by: Andrea Righi <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

The new functions use device_{online,offline}() which are userspace safe.

This is in preparation to move cpu_{up, down} kernel users to use a safer
interface that is not racy with userspace.

Suggested-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Qais Yousef <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

(cherry picked from commit 93ef142)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

The Nitro Enclaves driver handles the enclave lifetime management. This
includes enclave creation, termination and setting up its resources such
as memory and CPU.

An enclave runs alongside the VM that spawned it. It is abstracted as a
process running in the VM that launched it. The process interacts with
the NE driver, that exposes an ioctl interface for creating an enclave
and setting up its resources.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* No changes.

v7 -> v8

* Add NE custom error codes for user space memory regions not backed by
  pages multiple of 2 MiB, invalid flags and enclave CID.
* Add max flag value for enclave image load info.

v6 -> v7

* Clarify in the ioctls documentation that the return value is -1 and
  errno is set on failure.
* Update the error code value for NE_ERR_INVALID_MEM_REGION_SIZE as it
  gets in user space as value 25 (ENOTTY) instead of 515. Update the
  NE custom error codes values range to not be the same as the ones
  defined in include/linux/errno.h, although these are not propagated
  to user space.

v5 -> v6

* Fix typo in the description about the NE CPU pool.
* Update documentation to kernel-doc format.
* Remove the ioctl to query API version.

v4 -> v5

* Add more details about the ioctl calls usage e.g. error codes, file
  descriptors used.
* Update the ioctl to set an enclave vCPU to not return a file
  descriptor.
* Add specific NE error codes.

v3 -> v4

* Decouple NE ioctl interface from KVM API.
* Add NE API version and the corresponding ioctl call.
* Add enclave / image load flags options.

v2 -> v3

* Remove the GPL additional wording as SPDX-License-Identifier is
  already in place.

v1 -> v2

* Add ioctl for getting enclave image load metadata.
* Update NE_ENCLAVE_START ioctl name to NE_START_ENCLAVE.
* Add entry in Documentation/userspace-api/ioctl/ioctl-number.rst for NE
  ioctls.
* Update NE ioctls definition based on the updated ioctl range for major
  and minor.

Reviewed-by: Alexander Graf <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Alexandru Vasile <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(backported from commit 15b760c)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

The Nitro Enclaves (NE) driver communicates with a new PCI device, that
is exposed to a virtual machine (VM) and handles commands meant for
handling enclaves lifetime e.g. creation, termination, setting memory
regions. The communication with the PCI device is handled using a MMIO
space and MSI-X interrupts.

This device communicates with the hypervisor on the host, where the VM
that spawned the enclave itself runs, e.g. to launch a VM that is used
for the enclave.

Define the MMIO space of the NE PCI device, the commands that are
provided by this device. Add an internal data structure used as private
data for the PCI device driver and the function for the PCI device
command requests handling.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* Fix indent for the NE PCI device command types enum.

v7 -> v8

* No changes.

v6 -> v7

* Update the documentation to include references to the NE PCI device id
  and MMIO bar.

v5 -> v6

* Update documentation to kernel-doc format.

v4 -> v5

* Add a TODO for including flags in the request to the NE PCI device to
  set a memory region for an enclave. It is not used for now.

v3 -> v4

* Remove the "packed" attribute and include padding in the NE data
  structures.

v2 -> v3

* Remove the GPL additional wording as SPDX-License-Identifier is
  already in place.

v1 -> v2

* Update path naming to drivers/virt/nitro_enclaves.
* Update NE_ENABLE_OFF / NE_ENABLE_ON defines.

Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Alexandru-Catalin Vasile <[email protected]>
Signed-off-by: Alexandru Ciobotaru <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 0a44561)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

The Nitro Enclaves driver keeps an internal info per each enclave.

This is needed to be able to manage enclave resources state, enclave
notifications and have a reference of the PCI device that handles
command requests for enclave lifetime management.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* Add data structure to keep references to both Nitro Enclaves misc and
  PCI devices.

v7 -> v8

* No changes.

v6 -> v7

* Update the naming and add more comments to make more clear the logic
  of handling full CPU cores and dedicating them to the enclave.

v5 -> v6

* Update documentation to kernel-doc format.
* Include in the enclave memory region data structure the user space
  address and size for duplicate user space memory regions checks.

v4 -> v5

* Include enclave cores field in the enclave metadata.
* Update the vCPU ids data structure to be a cpumask instead of a list.

v3 -> v4

* Add NUMA node field for an enclave metadata as the enclave memory and
  CPUs need to be from the same NUMA node.

v2 -> v3

* Remove the GPL additional wording as SPDX-License-Identifier is
  already in place.

v1 -> v2

* Add enclave memory regions and vcpus count for enclave bookkeeping.
* Update ne_state comments to reflect NE_START_ENCLAVE ioctl naming
  update.

Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Alexandru-Catalin Vasile <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 1df6248)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

The Nitro Enclaves PCI device is used by the kernel driver as a means of
communication with the hypervisor on the host where the primary VM and
the enclaves run. It handles requests with regard to enclave lifetime.

Setup the PCI device driver and add support for MSI-X interrupts.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* Init the reference to the ne_pci_dev in the ne_devs data structure.

v7 -> v8

* Add NE PCI driver shutdown logic.

v6 -> v7

* No changes.

v5 -> v6

* Update documentation to kernel-doc format.

v4 -> v5

* Remove sanity checks for situations that shouldn't happen, only if
  buggy system or broken logic at all.

v3 -> v4

* Use dev_err instead of custom NE log pattern.
* Update NE PCI driver name to "nitro_enclaves".

v2 -> v3

* Remove the GPL additional wording as SPDX-License-Identifier is
  already in place.
* Remove the WARN_ON calls.
* Remove linux/bug include that is not needed.
* Update static calls sanity checks.
* Remove "ratelimited" from the logs that are not in the ioctl call
  paths.
* Update kzfree() calls to kfree().

v1 -> v2

* Add log pattern for NE.
* Update PCI device setup functions to receive PCI device data structure and
  then get private data from it inside the functions logic.
* Remove the BUG_ON calls.
* Add teardown function for MSI-X setup.
* Update goto labels to match their purpose.
* Implement TODO for NE PCI device disable state check.
* Update function name for NE PCI device probe / remove.

Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Alexandru-Catalin Vasile <[email protected]>
Signed-off-by: Alexandru Ciobotaru <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 89308c1)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

The Nitro Enclaves PCI device exposes a MMIO space that this driver
uses to submit command requests and to receive command replies e.g. for
enclave creation / termination or setting enclave resources.

Add logic for handling PCI device command requests based on the given
command type.

Register an MSI-X interrupt vector for command reply notifications to
handle this type of communication events.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* No changes.

v7 -> v8

* Update function signature for submit request and retrive reply
  functions as they only returned 0, no error code.
* Include command type value in the error logs of ne_do_request().

v6 -> v7

* No changes.

v5 -> v6

* Update documentation to kernel-doc format.

v4 -> v5

* Remove sanity checks for situations that shouldn't happen, only if
  buggy system or broken logic at all.

v3 -> v4

* Use dev_err instead of custom NE log pattern.
* Return IRQ_NONE when interrupts are not handled.

v2 -> v3

* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Remove "ratelimited" from the logs that are not in the ioctl call
  paths.

v1 -> v2

* Add log pattern for NE.
* Remove the BUG_ON calls.
* Update goto labels to match their purpose.
* Add fix for kbuild report:
  https://lore.kernel.org/lkml/202004231644.xTmN4Z1z%[email protected]/

Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Alexandru-Catalin Vasile <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit ad2b698)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

In addition to the replies sent by the Nitro Enclaves PCI device in
response to command requests, out-of-band enclave events can happen e.g.
an enclave crashes. In this case, the Nitro Enclaves driver needs to be
aware of the event and notify the corresponding user space process that
abstracts the enclave.

Register an MSI-X interrupt vector to be used for this kind of
out-of-band events. The interrupt notifies that the state of an enclave
changed and the driver logic scans the state of each running enclave to
identify for which this notification is intended.

Create an workqueue to handle the out-of-band events. Notify user space
enclave process that is using a polling mechanism on the enclave fd.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* Use the reference to the pdev directly from the ne_pci_dev instead of
  the one from the enclave data structure.

v7 -> v8

* No changes.

v6 -> v7

* No changes.

v5 -> v6

* Update documentation to kernel-doc format.

v4 -> v5

* Remove sanity checks for situations that shouldn't happen, only if
  buggy system or broken logic at all.

v3 -> v4

* Use dev_err instead of custom NE log pattern.
* Return IRQ_NONE when interrupts are not handled.

v2 -> v3

* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Remove "ratelimited" from the logs that are not in the ioctl call
  paths.

v1 -> v2

* Add log pattern for NE.
* Update goto labels to match their purpose.

Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Alexandru-Catalin Vasile <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit e5d616d)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

The Nitro Enclaves driver provides an ioctl interface to the user space
for enclave lifetime management e.g. enclave creation / termination and
setting enclave resources such as memory and CPU.

This ioctl interface is mapped to a Nitro Enclaves misc device.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* Use the ne_devs data structure to get the refs for the NE misc device
  in the NE PCI device driver logic.

v7 -> v8

* Add define for the CID of the primary / parent VM.
* Update the NE PCI driver shutdown logic to include misc device
  deregister.

v6 -> v7

* Set the NE PCI device the parent of the NE misc device to be able to
  use it in the ioctl logic.
* Update the naming and add more comments to make more clear the logic
  of handling full CPU cores and dedicating them to the enclave.

v5 -> v6

* Remove the ioctl to query API version.
* Update documentation to kernel-doc format.

v4 -> v5

* Update the size of the NE CPU pool string from 4096 to 512 chars.

v3 -> v4

* Use dev_err instead of custom NE log pattern.
* Remove the NE CPU pool init during kernel module loading, as the CPU
  pool is now setup at runtime, via a sysfs file for the kernel
  parameter.
* Add minimum enclave memory size definition.

v2 -> v3

* Remove the GPL additional wording as SPDX-License-Identifier is
  already in place.
* Remove the WARN_ON calls.
* Remove linux/bug and linux/kvm_host includes that are not needed.
* Remove "ratelimited" from the logs that are not in the ioctl call
  paths.
* Remove file ops that do nothing for now - open and release.

v1 -> v2

* Add log pattern for NE.
* Update goto labels to match their purpose.
* Update ne_cpu_pool data structure to include the global mutex.
* Update NE misc device mode to 0660.
* Check if the CPU siblings are included in the NE CPU pool, as full CPU
  cores are given for the enclave(s).

Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit bd47c99)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1903087

Add ioctl command logic for enclave VM creation. It triggers a slot
allocation. The enclave resources will be associated with this slot and
it will be used as an identifier for triggering enclave run.

Return a file descriptor, namely enclave fd. This is further used by the
associated user space enclave process to set enclave resources and
trigger enclave termination.

The poll function is implemented in order to notify the enclave process
when an enclave exits without a specific enclave termination command
trigger e.g. when an enclave crashes.

Changelog

v9 -> v10

* Update commit message to include the changelog before the SoB tag(s).

v8 -> v9

* Use the ne_devs data structure to get the refs for the NE PCI device.

v7 -> v8

* No changes.

v6 -> v7

* Use the NE misc device parent field to get the NE PCI device.
* Update the naming and add more comments to make more clear the logic
  of handling full CPU cores and dedicating them to the enclave.

v5 -> v6

* Update the code base to init the ioctl function in this patch.
* Update documentation to kernel-doc format.

v4 -> v5

* Release the reference to the NE PCI device on create VM error.
* Close enclave fd on copy_to_user() failure; rename fd to enclave fd
  while at it.
* Remove sanity checks for situations that shouldn't happen, only if
  buggy system or broken logic at all.
* Remove log on copy_to_user() failure.

v3 -> v4

* Use dev_err instead of custom NE log pattern.
* Update the NE ioctl call to match the decoupling from the KVM API.
* Add metadata for the NUMA node for the enclave memory and CPUs.

v2 -> v3

* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Update kzfree() calls to kfree().
* Remove file ops that do nothing for now - open.

v1 -> v2

* Add log pattern for NE.
* Update goto labels to match their purpose.
* Remove the BUG_ON calls.

Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Alexandru Vasile <[email protected]>
Signed-off-by: Andra Paraschiv <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 38907e1)
Signed-off-by: Kamal Mostafa <[email protected]>
Acked-by: Acked-by: Kleber Sacilotto de Souza <[email protected]>
Acked-by: Acked-by: William Breathitt Gray <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
delphix-devops-bot pushed a commit that referenced this pull request May 20, 2025
sebroy pushed a commit that referenced this pull request Jun 25, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 1, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 25, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 26, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 27, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 28, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 29, 2025
prakashsurya pushed a commit that referenced this pull request Jul 29, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 30, 2025
delphix-devops-bot pushed a commit that referenced this pull request Jul 31, 2025
prakashsurya pushed a commit that referenced this pull request Jul 31, 2025
delphix-devops-bot pushed a commit that referenced this pull request Aug 1, 2025
delphix-devops-bot pushed a commit that referenced this pull request Aug 2, 2025
delphix-devops-bot pushed a commit that referenced this pull request Aug 3, 2025
delphix-devops-bot pushed a commit that referenced this pull request Aug 4, 2025
prakashsurya pushed a commit that referenced this pull request Aug 4, 2025
delphix-devops-bot pushed a commit that referenced this pull request Aug 7, 2025
delphix-devops-bot pushed a commit that referenced this pull request Aug 8, 2025
delphix-devops-bot pushed a commit that referenced this pull request Sep 9, 2025
delphix-devops-bot pushed a commit that referenced this pull request Sep 26, 2025
BugLink: https://bugs.launchpad.net/bugs/2115678

[ Upstream commit 88f7f56d16f568f19e1a695af34a7f4a6ce537a6 ]

When a bio with REQ_PREFLUSH is submitted to dm, __send_empty_flush()
generates a flush_bio with REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC,
which causes the flush_bio to be throttled by wbt_wait().

An example from v5.4, similar problem also exists in upstream:

    crash> bt 2091206
    PID: 2091206  TASK: ffff2050df92a300  CPU: 109  COMMAND: "kworker/u260:0"
     #0 [ffff800084a2f7f0] __switch_to at ffff80004008aeb8
     #1 [ffff800084a2f820] __schedule at ffff800040bfa0c4
     #2 [ffff800084a2f880] schedule at ffff800040bfa4b4
     #3 [ffff800084a2f8a0] io_schedule at ffff800040bfa9c4
     #4 [ffff800084a2f8c0] rq_qos_wait at ffff8000405925bc
     #5 [ffff800084a2f940] wbt_wait at ffff8000405bb3a0
     #6 [ffff800084a2f9a0] __rq_qos_throttle at ffff800040592254
     #7 [ffff800084a2f9c0] blk_mq_make_request at ffff80004057cf38
     #8 [ffff800084a2fa60] generic_make_request at ffff800040570138
     #9 [ffff800084a2fae0] submit_bio at ffff8000405703b4
    #10 [ffff800084a2fb50] xlog_write_iclog at ffff800001280834 [xfs]
    #11 [ffff800084a2fbb0] xlog_sync at ffff800001280c3c [xfs]
    #12 [ffff800084a2fbf0] xlog_state_release_iclog at ffff800001280df4 [xfs]
    #13 [ffff800084a2fc10] xlog_write at ffff80000128203c [xfs]
    #14 [ffff800084a2fcd0] xlog_cil_push at ffff8000012846dc [xfs]
    #15 [ffff800084a2fda0] xlog_cil_push_work at ffff800001284a2c [xfs]
    #16 [ffff800084a2fdb0] process_one_work at ffff800040111d08
    #17 [ffff800084a2fe00] worker_thread at ffff8000401121cc
    #18 [ffff800084a2fe70] kthread at ffff800040118de4

After commit 2def284 ("xfs: don't allow log IO to be throttled"),
the metadata submitted by xlog_write_iclog() should not be throttled.
But due to the existence of the dm layer, throttling flush_bio indirectly
causes the metadata bio to be throttled.

Fix this by conditionally adding REQ_IDLE to flush_bio.bi_opf, which makes
wbt_should_throttle() return false to avoid wbt_wait().

Signed-off-by: Jinliang Zheng <[email protected]>
Reviewed-by: Tianxiang Peng <[email protected]>
Reviewed-by: Hao Peng <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
CVE-2025-38063
Signed-off-by: Manuel Diewald <[email protected]>
Signed-off-by: Mehmet Basaran <[email protected]>
delphix-devops-bot pushed a commit that referenced this pull request Sep 26, 2025
BugLink: https://bugs.launchpad.net/bugs/2119603

[ Upstream commit 0153f36041b8e52019ebfa8629c13bf8f9b0a951 ]

When the XDP program is loaded, the XDP callback adds new Tx queues.
This means that the callback must update the Tx scheduler with the new
queue number. In the event of a Tx scheduler failure, the XDP callback
should also fail and roll back any changes previously made for XDP
preparation.

The previous implementation had a bug that not all changes made by the
XDP callback were rolled back. This caused the crash with the following
call trace:

[  +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5
[  +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [#1] SMP NOPTI
[  +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ #14 PREEMPT(voluntary)
[  +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
[  +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice]

[...]

[  +0.002715] Call Trace:
[  +0.002452]  <IRQ>
[  +0.002021]  ? __die_body.cold+0x19/0x29
[  +0.003922]  ? die_addr+0x3c/0x60
[  +0.003319]  ? exc_general_protection+0x17c/0x400
[  +0.004707]  ? asm_exc_general_protection+0x26/0x30
[  +0.004879]  ? __ice_update_sample+0x39/0xe0 [ice]
[  +0.004835]  ice_napi_poll+0x665/0x680 [ice]
[  +0.004320]  __napi_poll+0x28/0x190
[  +0.003500]  net_rx_action+0x198/0x360
[  +0.003752]  ? update_rq_clock+0x39/0x220
[  +0.004013]  handle_softirqs+0xf1/0x340
[  +0.003840]  ? sched_clock_cpu+0xf/0x1f0
[  +0.003925]  __irq_exit_rcu+0xc2/0xe0
[  +0.003665]  common_interrupt+0x85/0xa0
[  +0.003839]  </IRQ>
[  +0.002098]  <TASK>
[  +0.002106]  asm_common_interrupt+0x26/0x40
[  +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690

Fix this by performing the missing unmapping of XDP queues from
q_vectors and setting the XDP rings pointer back to NULL after all those
queues are released.
Also, add an immediate exit from the XDP callback in case of ring
preparation failure.

Fixes: efc2214 ("ice: Add support for XDP")
Reviewed-by: Dawid Osuchowski <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Michal Kubiak <[email protected]>
Reviewed-by: Aleksandr Loktionov <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Jesse Brandeburg <[email protected]>
Tested-by: Saritha Sanigani <[email protected]> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
CVE-2025-38127
Signed-off-by: Manuel Diewald <[email protected]>
Signed-off-by: Mehmet Basaran <[email protected]>
delphix-devops-bot pushed a commit that referenced this pull request Sep 26, 2025
delphix-devops-bot pushed a commit that referenced this pull request Sep 27, 2025
delphix-devops-bot pushed a commit that referenced this pull request Sep 28, 2025
delphix-devops-bot pushed a commit that referenced this pull request Sep 29, 2025
delphix-devops-bot pushed a commit that referenced this pull request Sep 30, 2025
delphix-devops-bot pushed a commit that referenced this pull request Oct 1, 2025
delphix-devops-bot pushed a commit that referenced this pull request Oct 2, 2025
delphix-devops-bot pushed a commit that referenced this pull request Oct 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.