Skip to content

RFC: Test of zswap deferred initialisation #3600

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10,000 commits into from

Conversation

pelwell
Copy link
Contributor

@pelwell pelwell commented May 5, 2020

This PR is the result of some experimentation to try and defer the cost
of enabling the ZSWAP configuration option until zswap is actually
enabled. There is no guarantee it will get merged, but some feedback
that zswap still works and that the overhead is acceptable would make
it more likely.

See also the discussion at #3432.

[ From the first commit message ]

Enabling zswap support in the kernel configuration costs about 1.5MB
of RAM, even when zswap is not enabled at runtime. This cost can be
reduced significantly by deferring initialisation (including pool
creation) until the "enabled" parameter is set to true. There is a
small cost to this in that some __init functions have to remain in
memory after the init phase, just in case they are needed later,
but the total size is negligible.

@Syonyk
Copy link

Syonyk commented May 6, 2020

My first thought about maintaining a separate difference from upstream is that if this works as advertised, it should be pushed upstream. It's enabled in most stock x86 kernels, so late initialization if not used to save 1.5MB of memory seems entirely sane to me.

@pelwell
Copy link
Contributor Author

pelwell commented May 6, 2020

Yes, of course - I think that goes without saying - but this won't go anywhere without some feedback.

@popcornmix
Copy link
Collaborator

If some zswap users can confirm this behaves as expected then it looks good to me.
I'd hope upstream will be receptive as 1.5MB for a configured but unused feature seems extremely wasteful.

@Syonyk
Copy link

Syonyk commented May 6, 2020

I'll see if I can get this tested out in the next day or two - just depends on weather, lots going on outside if it's tolerable out there.

@swetoast
Copy link

swetoast commented May 9, 2020

compiled gonna run test until tomorrow

@swetoast
Copy link

swetoast commented May 10, 2020

Works as intended i would say it ok to merge

@pelwell
Copy link
Contributor Author

pelwell commented May 11, 2020

Thanks. Did anybody else try it?

mripard added 21 commits May 11, 2020 18:32
The pllb_arm clk_hw pointer in the raspberry_clk structure isn't used
anywhere but in the raspberrypi_register_pllb_arm.

Let's remove it, this will make our lives easier in future patches.

Cc: Michael Turquette <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The pllb_arm clock was created at probe time, but was never removed if
something went wrong later in probe, or if the driver was ever removed from
the system.

Now that we are using clk_hw_register, we can just use its managed variant
to take care of that for us.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The pllb_arm_lookup pointer in the struct raspberrypi_clk is not used for
anything but to store the returned pointer to clkdev_hw_create, and is not
used anywhere else in the driver.

Let's remove that global pointer from the structure.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Since we don't care about retrieving the clk_lookup structure pointer
returned by clkdev_hw_create, we can just use the clk_hw_register_clkdev
function.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The clkdev lookup created for the cpufreq device is never removed if
there's an issue later in probe or at module removal time.

Let's convert to the managed variant of the clk_hw_register_clkdev function
to make sure it happens.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
So far the driver has really only been providing a single clock, and stored
both the data associated to that clock in particular with the data
associated to the "controller".

Since we will change that in the future, let's decouple the clock data from
the provider data.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The driver has really only supported one clock so far and has hardcoded the
ID used in communications with the firmware in all the functions
implementing the clock framework hooks. Let's store that in the clock data
structure so that we can support more clocks later on.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The raspberry_clock_property only takes the clock ID as an argument, but
now that we have a clock data structure it makes more sense to just pass
that structure instead.

Cc: Michael Turquette <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The raspberrypi_fw_pll_is_on function doesn't only apply to PLL
registered in the driver, but any clock exposed by the firmware.

Since we also implement the is_prepared hook, make the function
consistent with the other function names.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Acked-by: Nicolas Saenz Julienne <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The driver only supports the pllb for now and all the clock framework hooks
are a mix of the generic firmware interface and the specifics of the pllb.
Since we will support more clocks in the future let's split the generic and
specific hooks

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The raspberrypi_register_pllb has been returning an integer so far to
notify whether the functions has exited successfully or not.

However, the OF provider functions in the clock framework require access to
the clk_hw structure so that we can expose those clocks to device tree
consumers.

Since we'll want that for the future clocks, let's return a clk_hw pointer
instead of the return code.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
For the upcoming registration of the clocks provided by the firmware, make
sure it's exposed to the device tree providers.

Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The RaspberryPi4 firmware actually exposes more clocks than are currently
handled by the driver and we will need to change some of them directly
based on the pixel rate for the display related clocks, or the load for the
GPU.

This rate change can have a number of side-effects, including adjusting the
various PLL voltages or the PLL parents. The firmware will also update
those clocks by itself for example if the SoC runs too hot.

In order to make Linux play as nice as possible with those constraints, it
makes sense to rely on the firmware clocks as much as possible.

Fortunately,t he firmware has an interface to discover the clocks it
exposes.

Let's use it to discover, register the clocks in the clocks framework and
then expose them through the device tree for consumers to use them.

Cc: Michael Turquette <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: [email protected]
Signed-off-by: Maxime Ripard <[email protected]>
Now that we have a clock driver for the clocks exposed by the firmware,
let's add the device tree nodes for it.

Signed-off-by: Maxime Ripard <[email protected]>
The reset-simple code can be useful for drivers outside of drivers/reset
that have a few reset controls as part of their features. Let's move it to
include/linux/reset.

Cc: Philipp Zabel <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The reset-simple code lacks a reset callback that is still pretty easy to
implement. The only real thing to consider is the delay needed for a device
to be reset, so let's expose that as part of the reset-simple driver data.

Cc: Philipp Zabel <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The BCM2711 has a unit controlling the HDMI0 and HDMI1 clock and reset
signals. Let's add a binding for it.

Cc: Philipp Zabel <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: [email protected]
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
The HDMI block has a block that controls clocks and reset signals to the
HDMI0 and HDMI1 controllers.

Let's expose that through a clock driver implementing a clock and reset
provider.

Cc: Michael Turquette <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Maxime Ripard <[email protected]>
Now that we have a driver for the DVP, let's add its DT node.

Signed-off-by: Maxime Ripard <[email protected]>
The BCM283x SoCs have a display pipeline composed of several controllers
with device tree bindings that are supported by Linux.

Now that we have the DT validation in place, let's split into separate
files and convert the device tree bindings for those controllers to
schemas.

This is just a 1:1 conversion though, and some bindings were incomplete so
it results in example validation warnings that are going to be addressed in
the following patches.

Cc: Rob Herring <[email protected]>
Cc: [email protected]
Signed-off-by: Maxime Ripard <[email protected]>
While the device tree and the driver expected a clock-names property, it
wasn't explicitly documented in the previous binding. Make sure it is now.

Cc: [email protected]
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Colin Ian King and others added 7 commits May 13, 2020 10:27
Commit f9d3b2c upstream.

The -ENOTTY error return path does not free the allocated
kdata as it returns directly. Fix this by returning via the
error handling label err.

Addresses-Coverity: ("Resource leak")
Fixes: c02a81f ("dma-buf: Add dma-buf heaps framework")
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: John Stultz <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Commit 7d411af upstream.

Fix the following sparse warning.

drivers/dma-buf/dma-heap.c:109:14: warning: symbol 'dma_heap_ioctl_cmds'
was not declared. Should it be static?

Acked-by: Andrew F. Davis <[email protected]>
Acked-by: John Stultz <[email protected]>
Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
 [sumits: rebased over IOCTL rename patches]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
The CMA dma-heap allocator can be used in place of vcsm-cma
doing the allocation side, thereby simplifying that driver.

Signed-off-by: Dave Stevenson <[email protected]>
Switch to the upstream cpufreq driver on non-BCM2835 Pis.

Signed-off-by: Phil Elwell <[email protected]>
From when bringing up the driver, there was a check in the isr
to ignore interrupts (claiming them handled) should the driver
not be streaming.

The VPU now will not register a camera driver if it finds a
CSI2 node enabled in device tree, therefore this flawed check is
redundant.

raspberrypi#3602

Signed-off-by: Dave Stevenson <[email protected]>
@pelwell
Copy link
Contributor Author

pelwell commented May 13, 2020

@Syonyk Did you find time since the chill hit to try this patch?

The SC16IS7XX hardware flow control is mishandled by the driver in
a number of ways:

  1. The set_baud method accidentally clears it when setting EFR bit.
  2. Even though hardware flow control is enabled, it isn't indicated
     back to the serial framework.
  3. Applying the flow control clears the EFR bit.
  4. The CTS support is not indicated in the return from
     sc16is7xx_get_mctrl.

Address all of those issues using a mixture of patches found on the
linked pages.

See: raspberrypi#2542
See: https://www.spinics.net/lists/linux-serial/msg21794.html

Signed-off-by: Phil Elwell <[email protected]>
@Syonyk
Copy link

Syonyk commented May 13, 2020

Sorry, took a bit to get time on a bad weather day. We've had people doing a bunch of property work around here.

It works, in that you can boot without zswap enabled, enable it once the system is up, and have it work, but I've managed to trip it up such that I can't enable it by using the kernel command line flags.

Adding zswap.enabled=1 to /boot/cmdline.txt will lead to a failure in initialization on boot, which is fine, but you can't then enable it later.

In early boot:

[    0.792727] zswap: default zpool zbud not available
[    0.792900] zswap: pool creation failed

Ok, no module available, that's fine.

But if you try to enable it later, in /sys/module/zswap/parameters, you get the following failure:

root@raspberrypi:/sys/module/zswap/parameters# echo Y > enabled
bash: echo: write error: No such device

And the corresponding dmesg output:

[  401.984101] zswap: can't enable, initialization failed

If you want to take a stab at fixing that, I've got a version of your tree up to date with kernel objects in it. I don't mind that the command line initialization fails, just that it then fails into a state where it can't be later loaded.

EDIT: To be clear, I don't know if this is an issue with your patches, or if the upstream kernel behaves like this as well. I'll get the Pi set up for remote access and mess with it a bit.

zbud.o isn't even 10kb... may be worth just building that in.

@Syonyk
Copy link

Syonyk commented May 14, 2020

Stock kernel behaves exactly the same with zbud as a module, so no change there. I suppose I ought to fix that...

In any case, I'm fine with this patch as submitted. I think it would be worth the space to include zbud in the kernel image, for the massive increase in desktop usability this sort of thing offers, but as long as it's somehow usable in the stock Raspbian kernel, I'm happy!

@pelwell
Copy link
Contributor Author

pelwell commented May 14, 2020

Failure to load zbud at init time shouldn't preclude one from trying again later. I think it can work better, I just didn't get round to it yet.

@Syonyk
Copy link

Syonyk commented May 14, 2020

I agree - I was mostly just verifying that your changes didn't impact that case by testing against the reference 4.19 code. It's broken upstream.

I would still mildly prefer to see zbud built as part of the kernel, such that zswap can initialize from the command line. It's a 10kb difference in the kernel, give or take, and you save that on module space.

6by9 and others added 12 commits May 15, 2020 15:20
Adding the Broadcast RGB range selection broke the VIC
field of the AVI infoframes on HDMI, zeroing them for all
modes on an HDMI monitor.

Correct this so that it is only zeroed if the range is
contrary to the standard range of the mode.

Signed-off-by: Dave Stevenson <[email protected]>
Adds in a couple of new MMAL parameter defines.

Signed-off-by: Dave Stevenson <[email protected]>
V4L2 wishes to have the codec header bytes in the same buffer as the
first encoded frame, so it does become 1-in 1-out for encoding.
The firmware now has an option to do this, so enable it.

Signed-off-by: Dave Stevenson <[email protected]>
The firmware by default is quite happy to fragment encoded
frames as the original MMAL and IL APIs support this.
V4L2 doesn't, so we need to enable the firmware option to avoid this.

Signed-off-by: Dave Stevenson <[email protected]>
V4L2 wishes to have the codec header bytes in the same buffer as the
first encoded frame, so it does become 1-in 1-out for encoding.
The firmware now has an option to do this, so enable it.

Signed-off-by: Dave Stevenson <[email protected]>
The arm bcm2711_defconfig and the arm64 bcmrpi3_defconfig have been
missing their NF_TABLES settings. Restore them.

See: raspberrypi#3615

Signed-off-by: Phil Elwell <[email protected]>
The CMA handling change broke the audio parameter - the fragment
numbering has changed - so fix it.

See: raspberrypi#2489

Signed-off-by: Phil Elwell <[email protected]>
Replaces obsolete function snd_soc_dai_set_tdm_slot

Signed-off-by: Joerg Schambacher <[email protected]>
The change to retrieve the pixel format always on g_fmt didn't
check whether the native or unpacked version of the format
had been requested, and always returned the packed one.
Correct this so that the packing setting is retained whereever
possible.

Fixes "9d59e89 media: bcm2835-unicam: Re-fetch mbus code from subdev
on a g_fmt call"

Signed-off-by: Dave Stevenson <[email protected]>
The "compressor" and "zpool" parameters of the zswap module each has
a custom setter function that calls __zswap_param_set with specific
parameters, but the "zpool" setter uses parameters that are correct for
the "compressor" parameter, and vice-versa.

Fix this by swapping the function bodies over.

Fixes: 90b0fc2 ("zswap: change zpool/compressor at runtime")

Signed-off-by: Phil Elwell <[email protected]>
Enabling zswap support in the kernel configuration costs about 1.5MB
of RAM, even when zswap is not enabled at runtime. This cost can be
reduced significantly by deferring initialisation (including pool
creation) until the "enabled" parameter is set to true. There is a
small cost to this in that some initialisation code has to remain in
memory after the init phase, just in case they are needed later,
but the total size increase is negligible.

See: raspberrypi#3432

Signed-off-by: Phil Elwell <[email protected]>
@pelwell pelwell force-pushed the zswaptest branch 2 times, most recently from 3dd0a1a to 99c315e Compare May 20, 2020 09:05
@pelwell pelwell closed this May 20, 2020
@pelwell pelwell deleted the zswaptest branch September 15, 2021 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.