forked from openzfs/zfs
Commit b1b2ff8
Squashed commit of the following:
commit 1e255365c9bf0e7858561d527c0ebdf8f90bc925
Author: Alexander Motin <[email protected]>
Date: Tue Jun 27 20:03:37 2023 -0400
ZIL: Fix another use-after-free.
lwb->lwb_issued_txg can not be accessed after lwb_state is set to
LWB_STATE_FLUSH_DONE and zl_lock is dropped, since the lwb may be
freed by zil_sync(). We must save the txg number before that.
This is similar to 55b1842f92, but as far as I can see the bug is not new.
It has existed for quite a while, but was not triggered due to a smaller
race window.
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14988
Closes #14999
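The pattern behind the ZIL fix above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct lwb` here is a simplified stand-in, `lwb_finish_sketch()` is a hypothetical helper, and the locking is shown only as comments. It is not the code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real ZIL structures. */
enum lwb_state { LWB_STATE_ISSUED, LWB_STATE_FLUSH_DONE };

struct lwb {
	enum lwb_state	lwb_state;
	uint64_t	lwb_issued_txg;
};

/*
 * Mark the lwb as done and return the txg it was issued in.
 * The txg is copied to a local variable before the state change,
 * because once the state is FLUSH_DONE and the lock is dropped,
 * another thread may free the lwb.
 */
uint64_t
lwb_finish_sketch(struct lwb *lwb)
{
	assert(lwb->lwb_state != LWB_STATE_FLUSH_DONE);

	/* Save the value while the lwb is still guaranteed to exist. */
	uint64_t txg = lwb->lwb_issued_txg;

	/* (zl_lock would be held here in the real code.) */
	lwb->lwb_state = LWB_STATE_FLUSH_DONE;
	/* (zl_lock would be dropped here; lwb must not be read after.) */

	return (txg);
}
```

Callers then work only with the returned `txg` and never touch the `lwb` pointer again.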
commit 233893e7cb7a98895061100ef8363f0ac30204b5
Author: Alexander Motin <[email protected]>
Date: Tue Jun 27 20:00:30 2023 -0400
Use big transactions for small recordsize writes.
When ZFS appends files in chunks bigger than recordsize, it borrows a
buffer from ARC and fills it before opening a transaction. This is
supposed to help in case of page faults to not hold the transaction open
indefinitely. The problem appears when recordsize is set lower than
the default 128KB. Since each block is committed in a separate transaction,
per-transaction overhead becomes significant, and what is even worse,
active use of per-dataset and per-pool locks to protect space use
accounting for each transaction badly hurts the code's SMP scalability.
The same transaction size limitation applies in case of file rewrite,
but without even excuse of buffer borrowing.
To address the issue, disable the borrowing mechanism if recordsize
is smaller than default and the write request is 4x bigger than it.
In such a case writes up to 32MB are executed in a single transaction,
which dramatically reduces overhead and lock contention. Since the
borrowing mechanism is not used for file rewrites, and it was never
used by zvols, which seem to work fine, I don't think this change
should create significant problems, partially because pre-faults are
also used in addition to the borrowing mechanism.
My tests with 4/8 threads writing several files at the same time on datasets
with 32KB recordsize in 1MB requests show reduction of CPU usage by
the user threads by 25-35%. I would measure it in GB/s, but at that
block size we are now limited by the lock contention of single write
issue taskqueue, which is a separate problem we are going to work on.
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14964
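The decision described in the recordsize commit above can be sketched as follows. The 128KB default, the 4x factor and the 32MB cap come from the message; the helper names are hypothetical, reading "4x bigger" as `>=` is an assumption, and the real write path is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	DEFAULT_RECORDSIZE	(128u * 1024)		/* 128KB, from the message */
#define	MAX_SINGLE_TX_BYTES	(32u * 1024 * 1024)	/* 32MB, from the message */

/* Hypothetical helper: should this write use ARC buffer borrowing? */
bool
write_uses_borrowing(size_t recordsize, size_t request_len)
{
	assert(recordsize > 0);
	/* Assumption: "4x bigger" means request_len >= 4 * recordsize. */
	if (recordsize < DEFAULT_RECORDSIZE && request_len >= 4 * recordsize)
		return (false);
	return (true);
}

/*
 * Hypothetical helper: how many bytes go into the next transaction.
 * Simplification: with borrowing, one record per transaction;
 * without it, up to 32MB per transaction.
 */
size_t
next_tx_bytes(size_t recordsize, size_t remaining)
{
	if (write_uses_borrowing(recordsize, remaining))
		return (remaining < recordsize ? remaining : recordsize);
	return (remaining < MAX_SINGLE_TX_BYTES ?
	    remaining : MAX_SINGLE_TX_BYTES);
}
```

Under these assumptions, a 1MB request on a dataset with 32KB recordsize goes into one transaction instead of 32.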
commit aea27422747921798a9b9e1b8e0f6230d5672ba5
Author: Laevos <[email protected]>
Date: Tue Jun 27 16:58:32 2023 -0700
Remove unnecessary commas in zpool-create.8
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Laevos <[email protected]>
Closes #15011
commit 38a821c0d8f6bb51a866354e76078abf6a6ba1fc
Author: Alexander Motin <[email protected]>
Date: Tue Jun 27 12:09:48 2023 -0400
Another set of vdev queue optimizations.
Switch FIFO queues (SYNC/TRIM) and active queue of vdev queue from
time-sorted AVL-trees to simple lists. AVL-trees are too expensive
for such a simple task. To change I/O priority without searching
through the trees, add io_queue_state field to struct zio.
To not check number of queued I/Os for each priority add vq_cqueued
bitmap to struct vdev_queue. Update it when adding/removing I/Os.
Make vq_cactive a separate array instead of struct vdev_queue_class
member. Together those allow to avoid lots of cache misses when
looking for work in vdev_queue_class_to_issue().
Introduce deadline of ~0.5s for LBA-sorted queues. Before this I
saw some I/Os waiting in a queue for up to 8 seconds and possibly
more due to starvation. With this change I no longer see it. I
had to slightly complicate the comparison function, but since
it uses all the same cache lines the difference is minimal. For
sequential I/Os the new code in vdev_queue_io_to_issue() actually
often uses the simpler avl_first(), falling back to avl_find() and
avl_nearest() only when needed.
Arrange members in struct zio to access only one cache line when
searching through vdev queues. While there, remove io_alloc_node,
reusing the io_queue_node instead. Those two are never used at the same
time.
Remove zfs_vdev_aggregate_trim parameter. It has been disabled for the 4
years since it was implemented, while time was still wasted maintaining
the offset-sorted tree of TRIM requests. Just remove the tree.
Remove locking from txg_all_lists_empty(). It is racy by design,
while 2 pairs of locks/unlocks take noticeable time under the vdev
queue lock.
With these changes in my tests with volblocksize=4KB I measure vdev
queue lock spin time reduction by 50% on read and 75% on write.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14925
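The idea of a "queued" bitmap like vq_cqueued, as described in the vdev queue commit above, can be sketched like this. The structure, the function names and the number of priority classes are hypothetical; only the idea of updating a bitmap when I/Os are added or removed comes from the message.

```c
#include <assert.h>
#include <stdint.h>

#define	NUM_PRIORITIES	8	/* hypothetical; the real count may differ */

struct queue_sketch {
	uint32_t cqueued;			/* bit p set: class p is non-empty */
	uint32_t nqueued[NUM_PRIORITIES];	/* per-class queued I/O counts */
};

/* Account one queued I/O of priority p; set the bit on 0 -> 1. */
void
queue_add(struct queue_sketch *q, unsigned p)
{
	assert(p < NUM_PRIORITIES);
	if (q->nqueued[p]++ == 0)
		q->cqueued |= (1u << p);
}

/* Account one removed I/O of priority p; clear the bit on 1 -> 0. */
void
queue_remove(struct queue_sketch *q, unsigned p)
{
	assert(p < NUM_PRIORITIES && q->nqueued[p] > 0);
	if (--q->nqueued[p] == 0)
		q->cqueued &= ~(1u << p);
}

/* Return the lowest-numbered non-empty class, or -1 if none. */
int
queue_first_nonempty(const struct queue_sketch *q)
{
	if (q->cqueued == 0)	/* one word tells us nothing is queued */
		return (-1);
	for (int p = 0; p < NUM_PRIORITIES; p++)
		if (q->cqueued & (1u << p))
			return (p);
	return (-1);
}
```

The lookup touches a single word instead of reading a counter per priority class, which is where the cache-miss saving would come from.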
commit 1737e75ab4e09a2d20e7cc64fa83dae047a302e9
Author: Rich Ercolani <[email protected]>
Date: Mon Jun 26 16:57:12 2023 -0400
Add a delay to tearing down threads.
It's been observed that in certain workloads (zvol-related being a
big one), ZFS will end up spending a large amount of time spinning
up taskqs only to tear them down again almost immediately, then
spin them up again...
I noticed this when I looked at what my mostly-idle system was doing
and wondered how on earth taskq creation/destroy was a bunch of time...
So I added a configurable delay to avoid it tearing down tasks the
first time it notices them idle, and the total number of threads at
steady state went up, but the amount of time being burned just
tearing down/turning up new ones almost vanished.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #14938
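A minimal sketch of the teardown delay described in the commit above: an idle worker exits only after it has been idle for a configurable interval, not the first time it is seen idle. All names and the time representation are hypothetical; this is not the taskq code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread bookkeeping. */
struct worker_sketch {
	bool		idle;
	uint64_t	idle_since;	/* time the worker last became idle */
};

void
worker_mark_idle(struct worker_sketch *w, uint64_t now)
{
	if (!w->idle) {
		w->idle = true;
		w->idle_since = now;
	}
}

void
worker_mark_busy(struct worker_sketch *w)
{
	w->idle = false;
}

/* Exit only after being continuously idle for at least `delay`. */
bool
worker_should_exit(const struct worker_sketch *w, uint64_t now,
    uint64_t delay)
{
	if (!w->idle)
		return (false);
	assert(now >= w->idle_since);
	return (now - w->idle_since >= delay);
}
```

With a delay of zero this degenerates to tearing a thread down as soon as it is noticed idle.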
commit 68b8e2ffab23cba6ae87f18c59b044c833934f2f
Author: Alexander Motin <[email protected]>
Date: Sat Jun 17 22:51:37 2023 -0400
Fix memory leak in zil_parse().
482da24e2 missed arc_buf_destroy() calls on log parse errors, possibly
leaking up to 128KB of memory per dataset during ZIL replay.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14987
commit ea0d03a8bd040e438bcaa43b8e449cbf717e14f3
Author: George Amanakis <[email protected]>
Date: Thu Jun 15 21:45:36 2023 +0200
Shorten arcstat_quiescence sleep time
With the latest L2ARC fixes, 2 seconds is too long to wait for
quiescence of arcstats like l2_size. Shorten this interval to avoid
having the persistent L2ARC tests in ZTS prematurely terminated.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14981
commit 3fa141285b8105b3cc11c1296b77ad6d24250f2c
Author: Alexander Motin <[email protected]>
Date: Thu Jun 15 13:49:03 2023 -0400
Remove ARC/ZIO physdone callbacks.
Those callbacks were introduced many years ago as part of a bigger
patch to smoothen the write throttling within a txg. They allow to
account completion of individual physical writes within a logical
one, improving cases when some of physical writes complete much
sooner than others, gradually opening the write throttle.
A few years after that ZFS got allocation throttling, working on a
level of logical writes and limiting number of writes queued to
vdevs at any point, and so limiting latency distribution between
the physical writes and especially writes of multiple copies.
The addition of scheduling deadline I proposed in #14925 should
further reduce the latency distribution. Grown memory sizes over
the past 10 years should also reduce importance of the smoothing.
While the use of physdone callback may still in theory provide
some smoother throttling, there are cases where we simply can not
afford it. Since dirty data accounting is protected by pool-wide
lock, in case of 6-wide RAIDZ, for example, it requires us to take
it 8 times per logical block write, creating huge lock contention.
My tests of this patch show radical reduction of the lock spinning
time on workloads when smaller blocks are written to RAIDZ pools,
when each of the disks receives 8-16KB chunks, but the total rate
reaching 100K+ blocks per second. At the same time, attempts to measure
any write time fluctuations didn't show anything noticeable.
While there, also remove the io_child_count/io_parent_count counters.
They are used only for a couple of assertions that can be avoided.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14948
commit 9efc735904d194987f06870f355e08d94e39ab81
Author: Brian Behlendorf <[email protected]>
Date: Wed Jun 14 10:04:05 2023 -0500
ZTS: Skip send_raw_ashift on FreeBSD
On FreeBSD 14 this test runs slowly in the CI environment
and is killed by the 10 minute timeout. Skip the test on
FreeBSD until the slow down is resolved.
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #14961
commit 9c54894bfc77f585806984f44c70a839543e6715
Author: Alexander Motin <[email protected]>
Date: Wed Jun 14 11:02:27 2023 -0400
Switch refcount tracking from lists to AVL-trees.
With a large number of tracked references, list searches under the lock
become too expensive, creating enormous lock contention.
In my tests with ZFS_DEBUG enabled this increases write throughput
with 32KB blocks from ~1.2GB/s to ~7.5GB/s.
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14970
commit 4e62540827a6ed15e08b2a627896d24bc661fa38
Author: George Amanakis <[email protected]>
Date: Wed Jun 14 17:01:17 2023 +0200
Store the L2ARC device ashift in the vdev label
If this is not done, and the pool has an ashift other than the default
(at the moment 9) then the following happens:
1) vdev_alloc() assigns the ashift of the pool to L2ARC device, but
upon export it is not stored anywhere
2) at the first import, vdev_open() sees a vdev_ashift() of 0 and
assigns the logical_ashift, which is 9
3) reading the contents of L2ARC, including the header fails
4) L2ARC buffers are not restored in ARC.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14313
Closes #14963
commit adaa3e64ea46f21cc5f544228c48363977b7733e
Author: George Amanakis <[email protected]>
Date: Sat Jun 10 02:05:47 2023 +0200
Fix the L2ARC write size calculating logic (2)
While commit bcd5321 adjusts the write size based on the size of the log
block, this happens after comparing the unadjusted write size to the
evicted (target) size.
In this case l2ad_hand will exceed l2ad_evict and violate an assertion
at the end of l2arc_write_buffers().
Fix this by adding the max log block size to the allocated size of the
buffer to be committed before comparing the result to the target
size.
Also reset the l2arc_trim_ahead ZFS module variable when the adjusted
write size exceeds the size of the L2ARC device.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14936
Closes #14954
commit 67118a7d6e74a6e818127096162478017610d13e
Author: Andrew Innes <[email protected]>
Date: Wed Jun 28 12:31:10 2023 +0800
Windows: Finally drop long disabled vdev cache.
Signed-off-by: Andrew Innes <[email protected]>
commit 5d80c98c28c931339138753a4e4c1156dbf951f4
Author: Alexander Motin <[email protected]>
Date: Fri Jun 9 15:40:55 2023 -0400
Finally drop long disabled vdev cache.
It was a vdev level read cache, designed to aggregate many small
reads by speculatively issuing bigger reads instead and caching
the result. But since it has almost no idea about what is going
on, with the exception of the ZIO_FLAG_DONT_CACHE flag set by higher
layers, it was found to do more harm than good, for which reason it was
disabled for the past 12 years. These days we have much better
instruments to enlarge the I/Os, such as speculative and prescient
prefetches, I/O scheduler, I/O aggregation etc.
Besides the dead code removal, this removes one extra mutex
lock/unlock per write inside vdev_cache_write(), which was not
otherwise disabled and still tried to do some work.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14953
commit 1f1ab33781b5736654b988e2e618ea79788fa1f7
Author: Brian Behlendorf <[email protected]>
Date: Fri Jun 9 11:10:01 2023 -0700
ZTS: Skip checkpoint_discard_busy
Until the ASSERT which is occasionally hit while running
checkpoint_discard_busy is resolved skip this test case.
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #12053
Closes #14952
commit b94049c2cbedbbe2af8e629bf974a6ed93f11acb
Author: Alexander Motin <[email protected]>
Date: Fri Jun 9 13:14:05 2023 -0400
Improve l2arc reporting in arc_summary.
- Do not report L2ARC as FAULTED in presence of in-flight writes.
- Report read and write I/Os, bytes and errors.
- Remove a few numbers not important to the average user.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #12304
Closes #14946
commit 31044b5cfb6f91d376034c4d6374f61baaf03232
Author: Andrew Innes <[email protected]>
Date: Wed Jun 28 12:00:39 2023 +0800
Windows: Use list_remove_head() where possible.
Signed-off-by: Andrew Innes <[email protected]>
commit 32eda54d0d75a94b6aa71dc80aa958095feb8011
Author: Alexander Motin <[email protected]>
Date: Fri Jun 9 13:12:52 2023 -0400
Use list_remove_head() where possible.
... instead of list_head() + list_remove(). On FreeBSD the list
functions are not inlined, so in addition to more compact code
this also saves another function call.
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14955
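The saving described in the list_remove_head() commit above can be illustrated with a toy list. This is a sketch only: the singly linked list and all names ending in `_sketch` are hypothetical and much simpler than the real list implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal list. */
struct node_sketch {
	struct node_sketch	*next;
	int			value;
};

struct list_sketch {
	struct node_sketch	*head;
};

void
list_push_head_sketch(struct list_sketch *l, struct node_sketch *n)
{
	n->next = l->head;
	l->head = n;
}

/* Fetch and unlink the head in one call; NULL if the list is empty. */
struct node_sketch *
list_remove_head_sketch(struct list_sketch *l)
{
	assert(l != NULL);
	struct node_sketch *n = l->head;
	if (n != NULL) {
		l->head = n->next;
		n->next = NULL;
	}
	return (n);
}

/* Typical drain loop: one call per element instead of two. */
int
list_drain_sum_sketch(struct list_sketch *l)
{
	struct node_sketch *n;
	int sum = 0;

	while ((n = list_remove_head_sketch(l)) != NULL)
		sum += n->value;
	return (sum);
}
```

When the list functions are not inlined, the drain loop makes one function call per element where a head lookup followed by a separate remove would make two.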
commit fe7693a3f87229d1ae93b5ce2bb84d8bb86a9f5c
Author: Alexander Motin <[email protected]>
Date: Fri Jun 9 13:08:05 2023 -0400
ZIL: Fix race introduced by f63811f0721.
We are not allowed to access lwb after setting LWB_STATE_FLUSH_DONE
state and dropping zl_lock, since it may be freed by zil_sync().
To free itxs and waiters after dropping the lock we need to move
lwb_itxs and lwb_waiters lists elements to local storage.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14957
Closes #14959
commit 44c5a0c92f98e8c21221bd7051729d1947a10736
Author: Rich Ercolani <[email protected]>
Date: Wed Jun 7 14:14:05 2023 -0400
Revert "systemd: Use non-absolute paths in Exec* lines"
This reverts commit 79b20949b25c8db4d379f6486b0835a6613b480c since it
doesn't work with the systemd version shipped with RHEL7-based systems.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #14943
Closes #14945
commit ba5af00257eb4eb3363f297819a21c4da811392f
Author: Brian Behlendorf <[email protected]>
Date: Wed Jun 7 10:43:43 2023 -0700
Linux: Never sleep in kmem_cache_alloc(..., KM_NOSLEEP) (#14926)
When a kmem cache is exhausted and needs to be expanded a new
slab is allocated. KM_SLEEP callers can block and wait for the
allocation, but KM_NOSLEEP callers were incorrectly allowed to
block as well.
Resolve this by attempting an emergency allocation as a best
effort. This may fail but that's fine since any KM_NOSLEEP
consumer is required to handle an allocation failure.
Signed-off-by: Brian Behlendorf <[email protected]>
Reviewed-by: Adam Moss <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
commit d4ecd4efde1692641d1d0b89851e7a15e90632f8
Author: George Amanakis <[email protected]>
Date: Tue Jun 6 21:32:37 2023 +0200
Fix the L2ARC write size calculating logic
l2arc_write_size() should return the write size after adjusting for trim
and overhead of the L2ARC log blocks. Also take into account the
allocated size of log blocks when deciding when to stop writing buffers
to L2ARC.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14939
commit 8692ab174e18faf444681d67d7ea4418600553cc
Author: Rob Norris <[email protected]>
Date: Wed Mar 15 18:18:10 2023 +1100
zdb: add -B option to generate backup stream
This is more-or-less like `zfs send`, but specifying the snapshot by its
objset id for situations where it can't be referenced any other way.
Sponsored-By: Klara, Inc.
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: WHR <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #14642
commit df84ca3f3bf9f265ebc76de17394df529fd07af6
Author: Andrew Innes <[email protected]>
Date: Wed Jun 28 11:05:55 2023 +0800
Windows: znode: expose zfs_get_zplprop to libzpool
Signed-off-by: Andrew Innes <[email protected]>
commit 944c58247a13a92c9e4ffb2c0a9e6b6293dca37e
Author: Rob Norris <[email protected]>
Date: Sun Jun 4 11:14:20 2023 +1000
znode: expose zfs_get_zplprop to libzpool
There's no particular reason this function should be kernel-only, and I
want to use it (indirectly) from zdb. I've moved it to zfs_znode.c
because libzpool does not compile in zfs_vfsops.c, and this at least
matches the header it's imported from.
Sponsored-By: Klara, Inc.
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: WHR <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #14642
commit 429f58cdbb195c8d50ed95c7309ee54d37526b70
Author: Alexander Motin <[email protected]>
Date: Mon Jun 5 14:51:44 2023 -0400
Introduce zfs_refcount_(add|remove)_few().
There are two places where we need to add/remove several references
with semantics of zfs_refcount_(add|remove). But when debug/tracing
is disabled, it is a crime to run multiple atomic_inc() in a loop,
especially under congested pool-wide allocator lock.
The newly introduced functions implement the same semantics as the loop,
but without overhead in production builds.
Reviewed-by: Rich Ercolani <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14934
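The difference between the loop and the "few" variants described in the commit above can be sketched with C11 atomics. The names ending in `_sketch` are hypothetical and the debug/tracing side is omitted entirely; this shows only the non-debug idea of one atomic operation instead of n.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical untracked reference counter. */
struct refcount_sketch {
	atomic_uint_fast64_t	count;
};

/* Loop form: n separate atomic read-modify-write operations. */
void
refcount_add_loop_sketch(struct refcount_sketch *rc, uint64_t n)
{
	for (uint64_t i = 0; i < n; i++)
		atomic_fetch_add(&rc->count, 1);
}

/* "Few" form: a single atomic operation; returns the new count. */
uint64_t
refcount_add_few_sketch(struct refcount_sketch *rc, uint64_t n)
{
	return (atomic_fetch_add(&rc->count, n) + n);
}

/* Drop n references at once; returns the new count. */
uint64_t
refcount_remove_few_sketch(struct refcount_sketch *rc, uint64_t n)
{
	uint64_t old = atomic_fetch_sub(&rc->count, n);

	assert(old >= n);	/* must not drop more than is held */
	return (old - n);
}
```

Both forms leave the counter in the same state; the second one simply touches the contended cache line once.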
commit 077c2f359feb69a13bee37ac4220d271d1c7bf27
Author: Brian Behlendorf <[email protected]>
Date: Mon Jun 5 11:08:24 2023 -0700
Linux 6.3 compat: META (#14930)
Update the META file to reflect compatibility with the 6.3 kernel.
Signed-off-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
commit c2fcd6e484107fc7435087771757e88ba84f6093
Author: Graham Perrin <[email protected]>
Date: Fri Jun 2 19:25:13 2023 +0100
zfs-create(8): ZFS for swap: caution, clarity
Make the section heading more generic (the section relates to ZFS files
as well as ZFS volumes).
Swapping to a ZFS volume is prone to deadlock. Remove the related
instruction, direct readers to OpenZFS FAQ. Related, but not linked
from within the manual page:
<https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#using-a-zvol-for-a-swap-device-on-linux>
(Using a zvol for a swap device on Linux).
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Graham Perrin <[email protected]>
Issue #7734
Closes #14756
commit 251dbe83e14085a26100aa894d79772cbb69dcda
Author: Alexander Motin <[email protected]>
Date: Fri Jun 2 14:01:58 2023 -0400
ZIL: Allow to replay blocks of any size.
There seems to be no reason for ZIL blocks to be limited to 128KB
other than that the replay code is written that way. This change does
not increase the limit yet, just removes the artificial limitation.
Avoiding the extra memcpy() may save us a second during replay.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14910
commit 76170249d538965655dbd3206cd59566b1d3944b
Author: Val Packett <[email protected]>
Date: Thu May 11 18:16:57 2023 -0300
PAM: enable testing on FreeBSD
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Felix Dörre <[email protected]>
Signed-off-by: Val Packett <[email protected]>
Closes #14834
commit d1b68a45441cae8c399a8a3ed60b29726ed031ff
Author: Val Packett <[email protected]>
Date: Fri May 5 22:17:12 2023 -0300
PAM: support password changes even when not mounted
There's usually no requirement that a user be logged in for changing
their password, so let's not be surprising here.
We need to use the fetch_lazy mechanism for the old password to avoid
a double prompt for it, so that mechanism is now generalized a bit.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Felix Dörre <[email protected]>
Signed-off-by: Val Packett <[email protected]>
Closes #14834
commit 7424feff72f1e17ea27bcfe0d36cabce7c732eea
Author: Val Packett <[email protected]>
Date: Fri May 5 22:34:58 2023 -0300
PAM: add 'uid_min' and 'uid_max' options for changing the uid range
Instead of a fixed >=1000 check, allow the configuration to override
the minimum UID and add a maximum one as well. While here, add the
uid range check to the authenticate method as well, and fix the return
in the chauthtok method (seems very wrong to report success when we've
done absolutely nothing).
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Felix Dörre <[email protected]>
Signed-off-by: Val Packett <[email protected]>
Closes #14834
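The range check described in the uid_min/uid_max commit above amounts to something like the following sketch. The structure and function names are hypothetical, and the values used with it are examples only; just the option names `uid_min` and `uid_max` come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical holder for the two configured bounds. */
struct uid_range_sketch {
	uid_t	uid_min;
	uid_t	uid_max;
};

/* True if the module should act for this uid (bounds inclusive). */
bool
uid_in_range_sketch(const struct uid_range_sketch *cfg, uid_t uid)
{
	assert(cfg->uid_min <= cfg->uid_max);
	return (uid >= cfg->uid_min && uid <= cfg->uid_max);
}
```

Treating both bounds as inclusive is an assumption of this sketch.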
commit fc9e012f5fc7e7997acee2b6d8d759622b319f0e
Author: Val Packett <[email protected]>
Date: Fri May 5 22:02:13 2023 -0300
PAM: add 'forceunmount' flag
Probably not always a good idea, but it's nice to have the option.
It is a workaround for FreeBSD calling the PAM session end earlier than
the last process is actually done touching the mount, for example.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Felix Dörre <[email protected]>
Signed-off-by: Val Packett <[email protected]>
Closes #14834
commit a39ed83bd31cc0c8c98dc3c4cc3d11b03d9af620
Author: Val Packett <[email protected]>
Date: Fri May 5 19:35:57 2023 -0300
PAM: add 'recursive_homes' flag to use with 'prop_mountpoint'
It's not always desirable to have a fixed flat homes directory.
With the 'recursive_homes' flag, 'prop_mountpoint' search would
traverse the whole tree starting at 'homes' (which can now be '*'
to mean all pools) to find a dataset with a mountpoint matching
the home directory.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Felix Dörre <[email protected]>
Signed-off-by: Val Packett <[email protected]>
Closes #14834
commit 7f8d5ef815b7559fcc671ff2add33ba9c2a74867
Author: Val Packett <[email protected]>
Date: Fri May 5 21:56:39 2023 -0300
PAM: use boolean_t for config flags
Since we already use boolean_t in the file, we can use it here.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Felix Dörre <[email protected]>
Signed-off-by: Val Packett <[email protected]>
Closes #14834
commit e2872932c85189f06a68f0ad10bd8eb6895d79c2
Author: Val Packett <[email protected]>
Date: Fri May 5 20:00:48 2023 -0300
PAM: do not fail to mount if the key's already loaded
If we're expecting a working home directory on login, it would be
rather frustrating to not have it mounted just because it e.g. failed to
unmount once on logout.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Felix Dörre <[email protected]>
Signed-off-by: Val Packett <[email protected]>
Closes #14834
commit b897137e2044c3ef6120820f753d940b7dfb58be
Author: Rich Ercolani <[email protected]>
Date: Wed May 31 19:58:41 2023 -0400
Revert "initramfs: use `mount.zfs` instead of `mount`"
This broke mounting of snapshots on / for users.
See https://github.com/openzfs/zfs/issues/9461#issuecomment-1376162949 for more context.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #14908
commit 10cde4f8f60d4d55887d7122a5742e6e4f90280c
Author: Luís Henriques <[email protected]>
Date: Tue May 30 23:15:24 2023 +0100
Fix NULL pointer dereference when doing concurrent 'send' operations
A NULL pointer dereference will occur when doing a 'zfs send -S' on a dataset that
is still being received. The problem is that the new 'send' will
rightfully fail to own the datasets (i.e. dsl_dataset_own_force() will
fail), but then dmu_send() will still do the dsl_dataset_disown().
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Luís Henriques <[email protected]>
Closes #14903
Closes #14890
commit 12452d79a3fd29af1dc0b95f3e367e3ce339702b
Author: Brian Behlendorf <[email protected]>
Date: Mon May 29 12:55:35 2023 -0700
ZTS: zvol_misc_trim disable blk mq
Disable the zvol_misc_fua.ksh and zvol_misc_trim.ksh test cases on impacted
kernels. This issue is being actively worked in #14872 and as part of that
fix this commit will be reverted.
VERIFY(zh->zh_claim_txg == 0) failed
PANIC at zil.c:904:zil_create()
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #14872
Closes #14870
commit 803c04f233e60a2d23f0463f299eba96c0968602
Author: Richard Yao <[email protected]>
Date: Fri May 26 18:47:52 2023 -0400
Use __attribute__((malloc)) on memory allocation functions
This informs the C compiler that pointers returned from these functions
do not alias other pointers, which allows it to do better code
optimization and should make the compiled code smaller.
References:
https://stackoverflow.com/a/53654773
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute
https://clang.llvm.org/docs/AttributeReference.html#malloc
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14827
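For readers unfamiliar with the attribute used in the commit above, a small sketch of how it is applied with GCC or Clang follows. `zalloc_sketch()` is a hypothetical wrapper, not one of the functions touched by the commit.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * The attribute goes on the declaration. It tells the compiler that
 * the returned pointer does not alias any other live pointer.
 */
__attribute__((malloc))
void *zalloc_sketch(size_t size);

/* Hypothetical zeroing allocator built on calloc(). */
void *
zalloc_sketch(size_t size)
{
	assert(size > 0);
	return (calloc(1, size));
}
```

The attribute is a promise made by the programmer, so it belongs only on functions that really return fresh, unaliased storage (or NULL).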
commit 64d8bbe15f77876ae9639b9971a743776a41bf9a
Author: Brian Behlendorf <[email protected]>
Date: Fri May 26 15:39:23 2023 -0700
ZTS: Add zpool_resilver_concurrent exception
The zpool_resilver_concurrent test case requires the ZED which is not used
on FreeBSD. Add this test to the known list of skipped tests for FreeBSD.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14904
commit e396d30d29ed131194605222e6ba1fec1ef8b2ca
Author: Mike Swanson <[email protected]>
Date: Fri May 26 15:37:15 2023 -0700
Add compatibility symlinks for FreeBSD 12.{3,4} and 13.{0,1,2}
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mike Swanson <[email protected]>
Closes #14902
commit f6dd0b8c1cc41707d299b7123f80912f43d03340
Author: Colm <[email protected]>
Date: Fri May 26 10:04:19 2023 -0700
Adding new read-only compatible zpool features to compatibility.d/grub2
GRUB2 is compatible with all "read-only compatible" features,
so it is safe to add new features of this type to the grub2
compatibility list. We generally want to include all compatible
features, to minimize the differences between grub2-compatible
pools and no-compatibility pools.
Adding new properties `livelist` and `zpool_checkpoint` accordingly.
Also adding them to the man page which references this file as an
example, for consistency.
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Colm Buckley <[email protected]>
Closes #14893
commit 013d3a1e0e00d83dabe70837b23dab48c1bac592
Author: Richard Yao <[email protected]>
Date: Fri May 26 13:03:12 2023 -0400
btree: Implement faster binary search algorithm
This implements a binary search algorithm for B-Trees that reduces
branching to the absolute minimum necessary for a binary search
algorithm. It also enables the compiler to inline the comparator to
ensure that the only slowdown when doing binary search is from waiting
for memory accesses. Additionally, it instructs the compiler to unroll
the loop, which gives an additional 40% improvement with Clang and 8%
improvement with GCC.
Consumers must opt into using the faster algorithm. At present, only
B-Trees used inside kernel code have been modified to use the faster
algorithm.
Micro-benchmarks suggest that this can improve binary search performance
by up to 3.5 times when compiling with Clang 16 and up to 1.9 times when
compiling with GCC 12.2.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14866
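The general shape of a branch-reduced binary search, as discussed in the btree commit above, is sketched below. It is a generic lower-bound search over a sorted array of integers, not the B-Tree code from the commit; the function name is hypothetical, and the comparator inlining and loop unrolling mentioned in the message are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the first element >= key in the sorted array
 * a[0..n-1], or n if there is none.
 */
size_t
lower_bound_sketch(const uint64_t *a, size_t n, uint64_t key)
{
	const uint64_t *base = a;
	size_t len = n;

	assert(a != NULL || n == 0);

	/*
	 * The loop runs a fixed number of times for a given n and its
	 * body has no data-dependent branch; the conditional below can
	 * be compiled to a conditional move.
	 */
	while (len > 1) {
		size_t half = len / 2;

		base += (base[half - 1] < key) ? half : 0;
		len -= half;
	}
	/* One element left: the answer is it or the slot after it. */
	if (len == 1 && *base < key)
		base++;
	return ((size_t)(base - a));
}
```

Because the trip count depends only on `n`, the compiler has a better chance of unrolling the loop and the branch predictor has nothing data-dependent to mispredict.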
commit 1854df330aa57cda39f076e8ab11e17ca3697bb8
Author: George Amanakis <[email protected]>
Date: Fri May 26 18:53:00 2023 +0200
Fix inconsistent definition of zfs_scrub_error_blocks_per_txg
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14894
commit 8735e6ac03742fcf43adde3ce127af698a32c53a
Author: Damiano Albani <[email protected]>
Date: Fri May 26 01:10:54 2023 +0200
Add missing files to Debian DKMS package
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: Umer Saleem <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Damiano Albani <[email protected]>
Closes #14887
Closes #14889
commit d439021bd05a5cc0bb271a5470abb67af2f7bcda
Author: Brian Behlendorf <[email protected]>
Date: Thu May 25 13:53:08 2023 -0700
Update compatibility.d files
Add an openzfs-2.2 compatibility file for the next release.
Edon-R support has been enabled for FreeBSD removing the need
for different FreeBSD and Linux files. Symlinks for the -linux
and -freebsd names are created for any scripts expecting that
convention.
Additionally, a symlink for ubuntu-22.04 was added.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14833
commit da54d5f3f9576b958e3eadf4f4d8f68c91b3d6e4
Author: Alexander Motin <[email protected]>
Date: Thu May 25 16:51:53 2023 -0400
zil: Add some more statistics.
In addition to the number of actual log bytes written, also account for
total written bytes including padding and total allocated bytes (bytes
<= write <= alloc). It should allow monitoring of zil traffic and space
efficiency.
Add dtrace probe for zil block size selection.
Make zilstat report more information and fit it into less width.
Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14863
commit faa4955023d089668bd6c564c195a933d1eac455
Author: Alexander Motin <[email protected]>
Date: Thu May 25 12:48:43 2023 -0400
ZIL: Reduce scope of per-dataset zl_issuer_lock.
Before this change ZIL copied all log data while holding the lock.
It caused huge lock contention on workloads with many big parallel
writes. This change splits the process into two parts: first,
zil_lwb_assign() estimates the log space needed for all transactions,
and zil_lwb_write_close() allocates blocks and zios while holding the
lock, then, after the lock is dropped, zil_lwb_commit() copies the
data, and zil_lwb_write_issue() issues the I/Os.
Also while there slightly reduce scope of zl_lock.
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14841
commit f77b9f7ae83834ade1da21cfc16b8a273df3acfc
Author: Dimitri John Ledkov <[email protected]>
Date: Wed May 24 20:31:28 2023 +0100
systemd: Use non-absolute paths in Exec* lines
Since systemd v239, Exec* binaries are resolved from PATH when they
are not-absolute. Switch to this by default for ease of downstream
maintenance. Many downstream distributions move individual binaries
to locations that existing compile-time configurations cannot
accommodate.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Dimitri John Ledkov <[email protected]>
Closes #14880
commit 4bfb9d28cffd4dfeb4b91359b497d100f668bb34
Author: Akash B <[email protected]>
Date: Thu May 25 00:58:09 2023 +0530
Fix concurrent resilvers initiated at same time
For draid vdevs it was possible to initiate both a
sequential and a healing resilver at the same time.
This fixes the following two scenarios.
1) There's a window where a sequential rebuild can
be started via ZED even if a healing resilver has been
scheduled.
- This is fixed by adding an additional check in
spa_vdev_attach() for any scheduled resilver and returning
an appropriate error code when a resilver is already in
progress.
2) It was possible for zpool clear to start a healing
resilver when it wasn't needed at all. This occurs because
during a vdev_open() the device is presumed to be healthy; it is
not until the device is validated by vdev_validate() that it is set
unavailable. However, by this point an async resilver will
have already been requested if the DTL isn't empty.
- This is fixed by cancelling the SPA_ASYNC_RESILVER
request immediately at the end of vdev_reopen() when a resilver
is unneeded.
Finally, added a test case in ZTS for verification.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Dipak Ghosh <[email protected]>
Signed-off-by: Akash B <[email protected]>
Closes #14881
Closes #14892
commit c9bb406d177a00aa1f0058d29aeb29e478223273
Author: youzhongyang <[email protected]>
Date: Wed May 24 15:23:42 2023 -0400
Linux 6.4 compat: reclaimed_slab renamed to reclaimed
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Youzhong Yang <[email protected]>
Closes #14891
commit 79e61a873b136f13fcf140beb925ceddc1f94767
Author: Brian Atkinson <[email protected]>
Date: Fri May 19 16:05:53 2023 -0400
Hold db_mtx when updating db_state
Commit 555ef90 did some general code refactoring for
dmu_buf_will_not_fill() and dmu_buf_will_fill(). However, the db_mtx was
not held when updating db->db_state in those code blocks. The rest of the
dbuf code always holds the db_mtx when updating db_state. This is
important because cv_wait() on db_changed is used to check for db_state
changes.
Update dmu_buf_will_not_fill() and dmu_buf_will_fill() to hold the
db_mtx when updating db_state.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #14875
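The reason the mutex matters for condition-variable waiters can be shown with a small pthread sketch. The demo_dbuf_t type, its fields, and both functions are hypothetical stand-ins, not the dbuf code itself: the writer changes the state and signals with the mutex held, and the waiter rechecks the state under the same mutex, so a change cannot slip in between the check and the wait.

```c
#include <pthread.h>

typedef enum { DEMO_UNCACHED, DEMO_FILL, DEMO_CACHED } demo_state_t;

/* Hypothetical buffer header: state is protected by mtx. */
typedef struct demo_dbuf {
	pthread_mutex_t mtx;
	pthread_cond_t changed;
	demo_state_t state;
} demo_dbuf_t;

/* Update the state with the mutex held, then wake any waiters. */
static void
demo_set_state(demo_dbuf_t *db, demo_state_t new_state)
{
	pthread_mutex_lock(&db->mtx);
	db->state = new_state;
	pthread_cond_broadcast(&db->changed);
	pthread_mutex_unlock(&db->mtx);
}

/* Block until the state equals wanted; the predicate is rechecked under mtx. */
static void
demo_wait_for_state(demo_dbuf_t *db, demo_state_t wanted)
{
	pthread_mutex_lock(&db->mtx);
	while (db->state != wanted)
		pthread_cond_wait(&db->changed, &db->mtx);
	pthread_mutex_unlock(&db->mtx);
}
```

If demo_set_state() wrote the field without taking mtx, a waiter could test the old value, lose the race, and then sleep on a wakeup that already happened.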
commit d7be0cdf93a568b6c9b4a4e15a88a5d88ebbb764
Author: Brian Behlendorf <[email protected]>
Date: Fri May 19 13:05:09 2023 -0700
Probe vdevs before marking removed
Before allowing the ZED to mark a vdev as REMOVED due to a
hotplug event, confirm that it is non-responsive with a probe.
Any device which can be successfully probed should be left
ONLINE to prevent a healthy pool from being incorrectly
SUSPENDED. This may occur for at least the following two
scenarios.
1) Drive expansion (zpool online -e) in VMware environments.
If, during the partition resize operation, a partition is
removed and re-created then udev will send a removed event.
2) Re-scanning the namespaces of an NVMe device (nvme ns-rescan)
may result in a udev remove and add event being delivered.
Finally, update the ZED to only kick in a spare when the
removal was successful.
Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #14859
Closes #14861
commit 054bb22686045ea1499065a4456568f0c21d939b
Author: Andrew Innes <[email protected]>
Date: Tue Jun 27 09:20:56 2023 +0800
Windows: Teach zpool scrub to scrub only blocks in error log
Signed-off-by: Andrew Innes <[email protected]>
commit b61e89a3e68ae19819493183ff3d1fe7bf4ffe2b
Author: George Amanakis <[email protected]>
Date: Fri Dec 17 21:35:28 2021 +0100
Teach zpool scrub to scrub only blocks in error log
Added a flag '-e' in zpool scrub to scrub only blocks in error log. A
user can pause, resume and cancel the error scrub by passing additional
command line arguments -p -s just like a regular scrub. This involves
adding a new flag, creating new libzfs interfaces, a new ioctl, and the
actual iteration and read-issuing logic. Error scrubbing is executed
across multiple txgs to make sure pool performance is not affected.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Co-authored-by: TulsiJain <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #8995
Closes #12355
commit 61bfb3cb5dd792ec7ca0fbfca59b165f3ddbe1f5
Author: Brian Behlendorf <[email protected]>
Date: Thu May 18 10:02:20 2023 -0700
Add the ability to uninitialize
zpool initialize functions well for touching every free byte...once.
But if we want to do it again, we're currently out of luck.
So let's add zpool initialize -u to clear it.
Co-authored-by: Rich Ercolani <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12451
Closes #14873
commit 855b62942d4ca5dab3d65b7000f9d284fd1560bb
Author: Antonio Russo <[email protected]>
Date: Mon May 15 17:11:33 2023 -0600
test-runner: pass kmemleak and kmsg to Cmd.run
test-runner.py orchestrates all of the ZTS executions. The `Cmd` object
manages these processes, and its `run` method specifically invokes these
possibly long-running processes, possibly retrying in the event of a
timeout. Since its inception, memory leak detection using the kmemleak
infrastructure [1], and kernel logging [2] have been added to this run
mechanism.
However, the callback to cull a process beyond its timeout threshold,
`kill_cmd`, has evaded modernization by both of these changes. As a
result, this function fails to properly invoke `run`, leading to an
untrapped exception and unreported test failure.
This patch extends `kill_cmd` to receive these kernel devices through
the `options` parameter, and regularizes all the `.run` calls from
`Cmd`, and its subclasses, to accept that parameter.
[1] Commit a69765ea5b563e0cd4d15fac4b1ac08c6ccf12d1
[2] Commit fc2c0256c55a2859d1988671b0896d22b75c8aba
Reviewed-by: John Wren Kennedy <[email protected]>
Signed-off-by: Antonio Russo <[email protected]>
Closes #14849
commit 537939565123fd2afa097e9a56ee3efd28779e5f
Author: Richard Yao <[email protected]>
Date: Fri May 12 17:10:14 2023 -0400
Fix undefined behavior in spa_sync_props()
8eae2d214cfa53862833eeeda9a5c1e9d5ded47d caused Coverity to begin
complaining about "Improper use of negative value" in two places in
spa_sync_props() because Coverity correctly inferred from `prop ==
ZPOOL_PROP_INVAL` that prop could be -1 while both zpool_prop_to_name()
and zpool_prop_get_type() use it as an array index, which is undefined
behavior.
Assuming that the system does not panic from an attempt to read invalid
memory, the case statement for ZPOOL_PROP_INVAL will ensure that only
user properties will reach this code when prop is ZPOOL_PROP_INVAL, such
that execution will continue safely. However, if we are unlucky enough
to read invalid memory, then the system will panic.
This issue predates the patch that caused coverity to begin complaining.
Thankfully, our userland tools do not pass nonsense to us, so this bug
should not be triggered unless a future userland tool attempts to set a
property that we do not understand.
Reported-by: Coverity (CID-1561129)
Reported-by: Coverity (CID-1561130)
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Amanakis <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14860
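The hazard described above (an "invalid" enum value of -1 reaching code that uses it as an array index) can be illustrated with a small sketch. The enum, table, and function below are hypothetical and are not the actual patch; they only show one way to guard the lookup so that an out-of-range value is rejected before indexing.

```c
#include <stddef.h>

/* Hypothetical property enum with a negative "invalid" sentinel. */
typedef enum demo_prop {
	DEMO_PROP_INVAL = -1,
	DEMO_PROP_NAME,
	DEMO_PROP_SIZE,
	DEMO_PROP_COUNT
} demo_prop_t;

static const char *const demo_prop_names[DEMO_PROP_COUNT] = {
	"name", "size"
};

/* Returns the property name, or NULL when prop is not a valid table index. */
static const char *
demo_prop_to_name(demo_prop_t prop)
{
	if (prop < 0 || prop >= DEMO_PROP_COUNT)
		return (NULL);
	return (demo_prop_names[prop]);
}
```

Callers then handle the NULL return explicitly instead of relying on whatever memory sits before the table.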
commit 02351b380f0430980bfb92e83d0800df104bd06a
Author: Richard Yao <[email protected]>
Date: Fri May 12 16:47:56 2023 -0400
Fix use after free regression in spa_remove_healed_errors()
6839ec6f1098c28ff7b772f1b31b832d05e6b567 placed code in
spa_remove_healed_errors() that uses a pointer after the kmem_free()
call that frees it.
Reported-by: Coverity (CID-1562375)
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Amanakis <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14860
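As a generic illustration of this bug class (not the code from spa_remove_healed_errors()), any field still needed after the free must be copied out first. The structure and function names here are hypothetical:

```c
#include <stdlib.h>

/* Hypothetical heap-allocated entry. */
typedef struct demo_err_entry {
	unsigned long long txg;
} demo_err_entry_t;

/* Free the entry and report its txg; the field is read before free(). */
static unsigned long long
demo_entry_retire(demo_err_entry_t *e)
{
	unsigned long long txg = e->txg;	/* save what we still need */

	free(e);
	/* e must not be dereferenced from here on. */
	return (txg);
}
```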
commit e9b315ffb79ff6419694a2713fcd5fd448317904
Author: Andrew Innes <[email protected]>
Date: Mon May 15 13:52:35 2023 +0800
Use python3 on windows
commit 3346a5b78c2db15801ce54a70a323952fdf67fa5
Author: Jorgen Lundman <[email protected]>
Date: Thu Jun 22 08:56:38 2023 +0900
zfs_write() ignores errors
If files were advanced by zfs_freesp() we ignored
any errors returned by it.
Signed-off-by: Jorgen Lundman <[email protected]>
commit cce49c08316bc6a5dff287f4fa15856e26d5b18a
Author: Jorgen Lundman <[email protected]>
Date: Thu Jun 22 08:55:55 2023 +0900
Correct Stream event path
The Stream path events used the incorrect name
"stream", now uses "file.txt:stream" as per ntfs.
Signed-off-by: Jorgen Lundman <[email protected]>
commit 0f83d31e288d789fb4e10a7e4b12e27887820498
Author: Jorgen Lundman <[email protected]>
Date: Wed Jun 21 14:30:13 2023 +0900
Add stub for file_hard_link_information()
Signed-off-by: Jorgen Lundman <[email protected]>
commit 8d6db9490364e4d281546445571d2ca9d5abda22
Author: Jorgen Lundman <[email protected]>
Date: Wed Jun 21 14:29:43 2023 +0900
Return correct FileID in dirlist
Signed-off-by: Jorgen Lundman <[email protected]>
commit 4c011397229e3c38259d6956458a4fd287dca72d
Author: Andrew Innes <[email protected]>
Date: Wed Jun 21 10:17:30 2023 +0800
Fix logic (#232)
Signed-off-by: Andrew Innes <[email protected]>
commit 467436b676ad897025b7ed90d8f033969da441cc
Author: Andrew Innes <[email protected]>
Date: Wed Jun 21 09:47:38 2023 +0800
Run winbtrfs tests by default (#231)
Signed-off-by: Andrew Innes <[email protected]>
commit 56eca2a5d116c66b10579f9cf6d5f271991c7e2e
Author: Jorgen Lundman <[email protected]>
Date: Wed Jun 21 09:54:00 2023 +0900
SetFilePositionInformation SetFileValidDataLengthInformation
Signed-off-by: Jorgen Lundman <[email protected]>
commit b4fbbda470f27aee565dfa9bc0d68217b969339c
Author: Andrew Innes <[email protected]>
Date: Tue Jun 20 16:33:12 2023 +0800
Add sleep to tests (#230)
Signed-off-by: Andrew Innes <[email protected]>
commit 94f1f52807d1f8c0c2931e9e52b91f0ce5e488f4
Author: Jorgen Lundman <[email protected]>
Date: Tue Jun 20 16:53:50 2023 +0900
CreateFile of newfile:newstream should create both
In addition, many more stream fixes, illegal chars, and names
Signed-off-by: Jorgen Lundman <[email protected]>
commit 894d512880d39ecf40e841c6d7b73157dfe397e0
Author: Jorgen Lundman <[email protected]>
Date: Tue Jun 20 08:41:37 2023 +0900
Windows streams should return parent file ID
When asked for File ID of a stream, it should return
the FileID of the parent file, which is two levels up.
Signed-off-by: Jorgen Lundman <[email protected]>
commit 0cc45d2154a2866b2f494c3790a57555c29e60c3
Author: Jorgen Lundman <[email protected]>
Date: Tue Jun 20 08:32:44 2023 +0900
Support FILE_STANDARD_INFORMATION_EX
Signed-off-by: Jorgen Lundman <[email protected]>
commit a6edd02999d581db56f4a53567f4c5db11778f64
Author: Jorgen Lundman <[email protected]>
Date: Mon Jun 19 10:36:13 2023 +0900
Add xattr compat code from upstream
and adjust calls to new API calls.
This adds xattr=sa support to Windows.
Signed-off-by: Jorgen Lundman <[email protected]>
commit 0e1476a3942990385d32c02403ebe2c815d567db
Author: Jorgen Lundman <[email protected]>
Date: Wed Jun 14 11:56:09 2023 +0900
Set EA can panic
Signed-off-by: Jorgen Lundman <[email protected]>
commit 4a1adef6b8c2851195d692a42d5718c9a1b03490
Author: Jorgen Lundman <[email protected]>
Date: Wed Jun 14 09:49:57 2023 +0900
Incorrect MAXPATH used in delete entry
Signed-off-by: Jorgen Lundman <[email protected]>
commit 2c0d119e37cb3eed1acac90efa9fe0f8c173e0f0
Author: Jorgen Lundman <[email protected]>
Date: Tue Jun 13 16:19:42 2023 +0900
Large changes fixing FS notify events
Some behavior is still incorrect: querying the name of
a stream returns the wrong result.
Signed-off-by: Jorgen Lundman <[email protected]>
commit 5b2b2b0550a493497a0b460206079fd57c639543
Author: Jorgen Lundman <[email protected]>
Date: Tue May 16 14:42:52 2023 +0900
file name and file full information buffer overrun
When a buffer is not big enough, we would still
null terminate on the full string, beyond the supplied
buffer.
Signed-off-by: Jorgen Lundman <[email protected]>
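A minimal sketch of the safe pattern, assuming a plain char buffer for simplicity (the real Windows structures carry wide-character names with explicit lengths, so this is only an analogy). demo_copy_name() is a hypothetical helper that always terminates inside the caller's buffer and reports truncation:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy name into buf (buflen bytes) and always NUL-terminate inside buf.
 * Returns 0 if the whole name fit, 1 if it was truncated.
 */
static int
demo_copy_name(char *buf, size_t buflen, const char *name)
{
	size_t namelen = strlen(name);
	size_t n;

	if (buflen == 0)
		return (1);
	/* Never copy more than buflen - 1 bytes, leaving room for the NUL. */
	n = (namelen < buflen - 1) ? namelen : buflen - 1;
	memcpy(buf, name, n);
	buf[n] = '\0';	/* terminator lands at the copied length, not at namelen */
	return (n < namelen);
}
```

The truncation result is what a caller would translate into an overflow status for the requester.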
commit 94bfb92951a5ccdef7b2a1fb818fafdafbc4fff0
Author: Jorgen Lundman <[email protected]>
Date: Tue May 16 11:48:12 2023 +0900
Correct Query EA and Query Streams
Which includes:
* NextEntryOffset is not offset from Buffer, but from one struct to
the next struct.
* Pack only complete EAs, and return Overflow if does not fit
* query file EA information would return from Information=size
* Call cleareaszie on VP when EAs have changed
Signed-off-by: Jorgen Lundman <[email protected]>
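The first point in the list above, that NextEntryOffset is relative to the current entry rather than to the start of the buffer, can be sketched with a simplified walker. The header layout and names below are hypothetical and much smaller than the real Windows EA and stream information structures:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified entry header. */
typedef struct demo_entry_hdr {
	uint32_t next_entry_offset;	/* bytes from this entry to the next; 0 = last */
	uint32_t name_len;
} demo_entry_hdr_t;

/* Count entries by hopping from entry to entry, not from the buffer start. */
static size_t
demo_count_entries(const uint8_t *buf, size_t buflen)
{
	size_t pos = 0, count = 0;
	demo_entry_hdr_t hdr;

	while (buflen - pos >= sizeof (hdr)) {
		/* memcpy avoids unaligned reads from the byte buffer. */
		memcpy(&hdr, buf + pos, sizeof (hdr));
		count++;
		if (hdr.next_entry_offset == 0)
			break;
		if (hdr.next_entry_offset > buflen - pos)
			break;	/* malformed chain: would leave the buffer */
		pos += hdr.next_entry_offset;	/* relative to the current entry */
	}
	return (count);
}
```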
commit 9c7a4071fcfc99c3308620fc1943355f9ade34b3
Author: Alexander Motin <[email protected]>
Date: Fri May 12 12:49:26 2023 -0400
zil: Free lwb_buf after write completion.
There is no reason to keep that memory allocated during the flush.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14855
commit 7e91b3222ddaadc10c92d1065529886dd3806acc
Author: Alexander Motin <[email protected]>
Date: Fri May 12 12:14:29 2023 -0400
zil: Some micro-optimizations.
Should not cause functional changes.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14854
commit 6b62c3b0e10de782c3aef0e1206aa48875519c4e
Author: Don Brady <[email protected]>
Date: Fri May 12 10:12:28 2023 -0600
Refine special_small_blocks property validation
When the special_small_blocks property is being set during a pool
create it enforces a limit of 128KiB even if the pool's record size
is larger.
If the recordsize property is being set during a pool create, then
use that value instead of the default SPA_OLD_MAXBLOCKSIZE value.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #13815
Closes #14811
commit d0ab2dddde618c394fa7fe88211276786ba8ca12
Author: Brian Behlendorf <[email protected]>
Date: Fri May 12 09:07:58 2023 -0700
ZTS: Add auto_replace_001_pos to exceptions
The auto_replace_001_pos test case does not reliably pass on
Fedora 37 and newer. Until the test case can be updated to make
it reliable add it to the list of "maybe" exceptions on Linux.
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #14851
Closes #14852
commit 1e3e7a103a5026e9a2005acec7017e4024d95115
Author: Pawel Jakub Dawidek <[email protected]>
Date: Tue May 9 22:32:30 2023 -0700
Make sure we are not trying to clone a spill block.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
commit a22891c3272d8527d4c8cb7ff52a25ef396e7add
Author: Pawel Jakub Dawidek <[email protected]>
Date: Thu May 4 16:14:19 2023 -0700
Correct comment.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
commit 9b016166dd5875db87963b5deeca8eeda094b571
Author: Pawel Jakub Dawidek <[email protected]>
Date: Wed May 3 23:25:22 2023 -0700
Remove badly placed comment.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
commit 6bcd48e213a279781ecd6df22799532cbec353d6
Author: Pawel Jakub Dawidek <[email protected]>
Date: Wed May 3 00:24:47 2023 -0700
Don't call zfs_exit_two() before zfs_enter_two().
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
commit 0919c985e294a89169adacd5ed4a240945e5fbee
Author: Pawel Jakub Dawidek <[email protected]>
Date: Tue May 2 15:46:14 2023 -0700
Don't use dmu_buf_is_dirty() for unassigned transaction.
The dmu_buf_is_dirty() call doesn't make sense here for two reasons:
1. txg is 0 for an unassigned tx, so it was a no-op.
2. It is equivalent to checking whether we have dirty records, which we
already do a few lines earlier.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
commit 7f88494ac91c61aeffad810e7d167badb875166e
Author: Pawel Jakub Dawidek <[email protected]>
Date: Tue May 2 14:24:43 2023 -0700
Deny block cloning if dbuf size doesn't match BP size.
I don't know an easy way to shrink down dbuf size, so just deny block cloning
into dbufs that don't match our BP's size.
This fixes the following situation:
1. Create a small file, eg. 1kB of random bytes. Its dbuf will be 1kB.
2. Create a larger file, eg. 2kB of random bytes. Its dbuf will be 2kB.
3. Truncate the large file to 0. Its dbuf will remain 2kB.
4. Clone the small file into the large file. Small file's BP lsize is
1kB, but the large file's dbuf is 2kB.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
commit 49657002f9cb57b9b4675100aaf58e1e93984bbf
Author: Pawel Jakub Dawidek <[email protected]>
Date: Sun Apr 30 02:47:09 2023 -0700
Additional block cloning fixes.
Reimplement some of the block cloning vs dbuf logic, mostly to fix
a situation where we clone a block and in the same transaction group
we want to partially overwrite the clone.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
commit 4d31369d3055bf0cf1d4f3e1e7d43d745f2fd05f
Author: Alexander Motin <[email protected]>
Date: Thu May 11 17:27:12 2023 -0400
zil: Don't expect zio_shrink() to succeed.
At least for RAIDZ, zio_shrink() does not reduce the zio size, but the
reduced wsz in that case likely results in writing uninitialized memory.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14853
commit 663dc5f616e6d0427207ffcf7a83dd02fe06a707
Author: Ameer Hamza <[email protected]>
Date: Wed May 10 05:56:35 2023 +0500
Prevent panic during concurrent snapshot rollback and zvol read
Protect zvol_cdev_read with zv_suspend_lock to prevent concurrent
release of the dnode, avoiding a panic when a snapshot is rolled back
in parallel with an ongoing zvol read operation.
Reviewed-by: Chunwei Chen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ameer Hamza <[email protected]>
Closes #14839
commit 7375f4f61ca587f893435184f398a767ae52fbea
Author: Tony Hutter <[email protected]>
Date: Tue May 9 17:55:19 2023 -0700
pam: Fix "buffer overflow" in pam ZTS tests on F38
The pam ZTS tests were reporting a buffer overflow on F38, possibly
due to F38 now setting _FORTIFY_SOURCE=3 by default. gdb and
valgrind narrowed this down to a snprintf() buffer overflow in
zfs_key_config_modify_session_counter(). I'm not clear why this
particular snprintf() was being flagged as an overflow, but when
I replaced it with an asprintf(), the test passed reliably.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #14802
Closes #14842
commit 9d3ed831f309e28a9cad56c8b1520292dbad0d7b
Author: Brian Behlendorf <[email protected]>
Date: Tue May 9 09:03:10 2023 -0700
Add dmu_tx_hold_append() interface
Provides an interface which callers can use to declare a write when
the exact starting offset is not yet known. Since the full range
being updated is not available, only the first L0 block at the
provided offset will be prefetched.
Reviewed-by: Olaf Faaland <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14819
commit 2b6033d71da38015c885297d1ee6577871099744
Author: Brian Behlendorf <[email protected]>
Date: Tue May 9 08:57:02 2023 -0700
Debug auto_replace_001_pos failures
Reduced the timeout to 60 seconds which should be more than
sufficient and allow the test to be marked as FAILED rather
than KILLED. Also dump the pool status on cleanup.
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14829
commit f4adc2882fb162c82e9738c5d2d30e3ba8a66367
Author: George Amanakis <[email protected]>
Date: Tue May 9 17:54:41 2023 +0200
Remove duplicate code in l2arc_evict()
l2arc_evict() performs the adjustment of the size of buffers to be
written on L2ARC unnecessarily. l2arc_write_size() is called right
before l2arc_evict() and performs those adjustments.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14828
commit 9b2c182d291bbb3ece9ceb1c72800d238d19b2e7
Author: Alexander Motin <[email protected]>
Date: Tue May 9 11:54:01 2023 -0400
Remove single parent assertion from zio_nowait().
We only need to know if ZIO has any parent there. We do not care if
it has more than one, but use of zio_unique_parent() == NULL asserts
that.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14823
commit 4def61804c052a1235179e3a7c98305d8075e0e9
Author: George Amanakis <[email protected]>
Date: Tue May 9 17:53:27 2023 +0200
Enable the head_errlog feature to remove errors
In case check_filesystem() does not error out and does not report
an error, remove that error block from error lists and logs
without requiring a scrub. This can happen when the original file and
all snapshots/clones referencing it have been removed.
Otherwise zpool status will still report that "Permanent errors have
been detected..." without actually reporting any of them.
To implement this change the functions introduced in corrective
receive were modified to take into account the head_errlog feature.
Before this change:
=============================
pool: test
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
config:
NAME STATE READ WRITE CKSUM
test ONLINE 0 0 0
/home/user/vdev_a ONLINE 0 0 2
errors: Permanent errors have been detected in the following files:
=============================
After this change:
=============================
pool: test
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are
unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
config:
NAME STATE READ WRITE CKSUM
test ONLINE 0 0 0
/home/user/vdev_a ONLINE 0 0 2
errors: No known data errors
=============================
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14813
commit 3f2f9533ca8512ef515a73ac5661598a65b896b6
Author: George Amanakis <[email protected]>
Date: Mon May 8 22:35:03 2023 +0200
Fixes in head_errlog feature with encryption
For the head_errlog feature use dsl_dataset_hold_obj_flags() instead of
dsl_dataset_hold_obj() in order to enable access to the encryption keys
(if loaded). This enables reporting of errors in encrypted filesystems
which are not mounted but have their keys loaded.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14837
commit 288ea63effae3ba24fcb6dc412a3125b9f3e1da9
Author: Matthew Ahrens <[email protected]>
Date: Mon May 8 11:20:23 2023 -0700
Verify block pointers before writing them out
If a block pointer is corrupted (but the block containing it checksums
correctly, e.g. due to a bug that overwrites random memory), we can
often detect it before the block is read, with the `zfs_blkptr_verify()`
function, which is used in `arc_read()`, `zio_free()`, etc.
However, such corruption is not typically recoverable. To recover from
it we would need to detect the memory error before the block pointer is
written to disk.
This PR verifies BP's that are contained in indirect blocks and dnodes
before they are written to disk, in `dbuf_write_ready()`. This way,
we'll get a panic before the on-disk data is corrupted. This will help
us to diagnose what's causing the corruption, as well as being much
easier to recover from.
To minimize performance impact, only checks that can be done without
holding the spa_config_lock are performed.
Additionally, when corruption is detected, the raw words of the block
pointer are logged. (Note that `dprintf_bp()` is a no-op by default,
but if enabled it is not safe to use with invalid block pointers.)
Reviewed-by: Rich Ercolani <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Paul Zuchowski <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #14817
commit 23132688b9d54ef11413925f88c02d83d607ec2b
Author: Brian Behlendorf <[email protected]>
Date: Mon May 8 11:17:41 2023 -0700
zdb: consistent xattr output
When using zdb to output the value of an xattr, only interpret it
as printable characters if the entire byte array is printable.
Additionally, if the --parseable option is set, always output the
buffer contents as octal for easy parsing.
Reviewed-by: Olaf Faaland <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14830
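The rule described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the "\ooo" escape format is an assumption rather than zdb's confirmed output format. The value is emitted as text only when every byte is printable and the parseable mode is off; otherwise every byte is emitted in octal.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Returns 1 only if every byte in buf is a printable character. */
static int
demo_all_printable(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (!isprint(buf[i]))
			return (0);
	}
	return (1);
}

/*
 * Format an xattr value into out. Text is used only when the whole value
 * is printable and parseable is 0; otherwise each byte becomes "\ooo".
 * Output is truncated to fit outlen. Returns the number of characters
 * written, excluding the NUL.
 */
static size_t
demo_format_xattr(const unsigned char *buf, size_t len, int parseable,
    char *out, size_t outlen)
{
	int as_text = !parseable && demo_all_printable(buf, len);
	size_t used = 0;

	if (outlen == 0)
		return (0);
	for (size_t i = 0; i < len; i++) {
		size_t need = as_text ? 1 : 4;

		if (used + need >= outlen)
			break;	/* keep room for the terminating NUL */
		if (as_text)
			out[used] = (char)buf[i];
		else
			(void) snprintf(out + used, outlen - used, "\\%03o",
			    (unsigned)buf[i]);
		used += need;
	}
	out[used] = '\0';
	return (used);
}
```

Deciding on the whole array, rather than byte by byte, keeps the output from mixing text and escapes within one value.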
commit 6deb342248e10af92e2d3fbb4e4b1221812188ff
Author: Brian Behlendorf <[email protected]>
Date: Mon May 8 10:09:30 2023 -0700
ZTS: add snapshot/snapshot_002_pos exception
Add snapshot_002_pos to the known list of occasional failures
for FreeBSD until it can be made entirely reliable.
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #14831
Closes #14832
commit a0a125bab291fe005d29be5375a5bb2a1c8261c7
Author: Alexander Motin <[email protected]>
Date: Fri May 5 12:17:55 2023 -0400
Fix two abd_gang_add_gang() issues.
- There is no reason to assert that added gang is not empty. It
may be weird to add an empty gang, but it is legal.
- When moving chain list from the added gang clear its size, or it
will trigger assertion in abd_verify() when that gang is freed.
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14816
commit aefb80389458dcccdcb9659914714264248b8e52
Author: Pawel Jakub Dawidek <[email protected]>
Date: Sat May 6 01:09:12 2023 +0900
Simplify and optimize random_int_between().
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14805
commit cf53b4376d902baecc04e450038d49c84c848e56
Author: Pawel Jakub Dawidek <[email protected]>
Date: Sat May 6 00:51:41 2023 +0900
Plug memory leak in zfsdev_state.
On kernel module unload, free all zfsdev state structures, except for
zfsdev_state_listhead, which is statically allocated.
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14824
commit 409f6b6fa0caba14be1995bbe28ca70e55ab7666
Author: Ameer Hamza <[email protected]>
Date: Thu May 4 03:10:32 2023 +0500
zpool import -m also removing spare and cache when log device is missing
spa_import() relies on a pool config fetched by spa_tryimport() for
spare/cache devices. Import flags are not passed to spa_tryimport(),
which makes it return early due to a missing log device and missing
retrieving the …
1 parent a9d6b06 commit b1b2ff8
526 files changed, +157897 -126 lines changed
.github/workflows/codeql-windows.yml
69 additions & 0 deletions