Skip to content

Fix handling of objects with many xattrs on FreeBSD #766

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

ptrrkssn
Copy link

The current code in lib/sysxattrs.c mishandles the case when there is a file/directory with many xattrs set - if the sum of the length of all the attribute names are more than 1024 it will fail. This causes problems with backups.

The fix here just makes sure we return ERANGE if we notice that the returned data is exactly the same length as the passed in buffer which will cause the code in xattrs.c to reallocate and retry with a bigger buffer.

rosorio pushed a commit to rosorio/rsync that referenced this pull request Jun 22, 2025
When a file have many xattrs that the sum of all
attribute names is larger than 1024, rsync will fail.
rosorio pushed a commit to rosorio/rsync that referenced this pull request Jun 22, 2025
When a file have many xattrs that the sum of all
attribute names is larger than 1024, rsync will fail.
@@ -126,17 +126,26 @@ ssize_t sys_llistxattr(const char *path, char *list, size_t size)
unsigned char keylen;
ssize_t off, len = extattr_list_link(path, EXTATTR_NAMESPACE_USER, list, size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Called this way extattr_list_link(2) behaves like read(2) and return the amount of bytes there was able to write in the list buffer which is not that informative and can misleading if list has the required size.

I suggest to perform a call of extattr_list_link(2) with a null pointer which returns us the exact amount of bytes required in the buffer without perform any write.

       ssize_t off = 0, len;

       /* Check for the required size to store the extended attibutes  */
       len = extattr_list_link(path, EXTATTR_NAMESPACE_USER, NULL, size);

Copy link
Author

@ptrrkssn ptrrkssn Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and no. Doing it that way has a potential to introduce a race condition if someone changes the extattr in the (very short) time between then call with NULL was done and when you actually do the real read. Doing it the way it's done now removes that race.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ask myself the same question when doing my tests.
But if you consider that rsync process extended attributes by

  • extracting xattrs key list
  • then in a second time for each key, recovering the associated value
    the race condition you describe can occurs at any time (eg. if you start removing xattrs when rsync is working)

If you are looking for full consistent remote copy tool, rsync only is probably not the most appropriate tool and you should consider make zfs snapshots, mount them then use rsync for the file transfer.

In fact, in worst case scenario, we can have a truncated key value which makes a particular file transfer fail because the value associated with the broken key is missing. But I don't expect any crash.

return len;

if ((size_t)len >= size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The equal test is not required anymore just do if ((size_t)len > size)

Copy link
Author

@ptrrkssn ptrrkssn Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that if extattr_list_link(..., size) returns len==size then we don't know if we got all the data or if there is more data to read. It might be exactly 'size' bytes that was available there, or it might be more, but since extattr_list_link() will return just 'size' in both cases then that will fail. Now the code after that which parses the array might return EINVAL if we just got a partial buffer - but there is a small chance that the list of extattrs is so perfectly aligned that it terminates exactly at the 'size' border - but there are still more extattrs that could have been read if the buffer was bigger...

return len;

if ((size_t)len >= size) {
/* FreeBSD extattr_list_xx() returns 'size' as 'len' in case there are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make the comment simpler : "buffer is too small, behave as linux and request a bigger buffer"

errno = ERANGE;
return -1;
}

Copy link
Contributor

@rosorio rosorio Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we can perform the real call to retrieve the buffer:

len = extattr_list_link(path, EXTATTR_NAMESPACE_USER, list, size);

if (len <= 0)
    return len;

Edit: Catch an invalid memory access error

errno = ERANGE;
return -1;
}

/* FreeBSD puts a single-byte length before each string, with no '\0'
* terminator. We need to change this into a series of null-terminted
* strings. Since the size is the same, we can simply transform the
* output in place. */
for (off = 0; off < len; off += keylen + 1) {
Copy link
Contributor

@rosorio rosorio Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This for loop is quite inefficient performing multiple memmove(2) calls, what about doing it one:

       keylen = (unsigned char)list[0];
       memmove(list, list+1, len-1);
       list[len-1] = '\0';
       for (off = keylen; off < (len - 1); off += (keylen + 1)) {
               keylen = (unsigned char)list[off];
               list[off] = '\0';
        }

        if ((size_t)len != (size_t)(off+1)) {
                errno = ERANGE;
                return -1;
        }

Edited: Added a final test to ensure second extattr_list_link didn't truncate the list due to a change of the file extended attribute list (eg. a new attribute was added). On failure we request a longer buffer.

Copy link
Author

@ptrrkssn ptrrkssn Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks like a good suggestion, I never looked at that part of the code :-).

However, isn't the list[len-1] = '\0' overwriting the next keylen before you read it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because we did the memmove before changing list[len-1]

before memmove list = [0x1,'A',0x1,'B',0x1,'C']
after memmove list = ['A',0x1,'B',0x1,'C','C']
after list[5] = '\0' list = ['A',0x1,'B',0x1,'C','\0']

@rosorio
Copy link
Contributor

rosorio commented Jul 19, 2025

Hi,

I think you was right, and doing a extattr_list_link() with a null buffer has no sens, mostly because is what
rsync does if first attempt fail to resize its buffer, and because your way has less edge cases to handle than mine.
So if we can agree in something like that, I'll be happy to push the patch on FreeBSD ports.

ssize_t sys_llistxattr(const char *path, char *list, size_t size)
{
        unsigned char keylen;
        ssize_t off, len = extattr_list_link(path, EXTATTR_NAMESPACE_USER, list, size);

        if (len <= 0 || size == 0)
                return len;

        if ((size_t)len == size) {
                /* Buffer is too small, behave like linux
                 * syscall and request a bigger buffer */
                errno = ERANGE;
                return -1;
        }

        /* FreeBSD puts a single-byte length before each string, with no '\0'
         * terminator.  We need to change this into a series of null-terminted
         * strings.  Since the size is the same, we can simply transform the
         * output in place. */
        keylen = (unsigned char)list[0];
        memmove(list, list+1, len-1);
        list[len-1] = '\0';
        for (off = keylen; off < (len - 1); off += (keylen + 1)) {
                keylen = (unsigned char)list[off];
                list[off] = '\0';
        }

        return len;
}

@ptrrkssn
Copy link
Author

Yes, I agree that this looks good. I probably would use >= instead of == for the second if statement, even though FreeBSD currently never will return len>size - defensive programming covering all possible cases :-)

Or perhaps an assert((size_t) len > size); /* Should never happen */) - just incase... :_)

@rosorio
Copy link
Contributor

rosorio commented Jul 29, 2025

Patch committed into the FreeBSD ports tree. Hope this could be merged in the rsync tree.

freebsd-git pushed a commit to freebsd/freebsd-ports that referenced this pull request Jul 29, 2025
Fix sys_llistxattr() for FreeBSD code to request a larger
buffer to store the extended attributes list if the current
one is too small.

I also improve the attribute parsing by reducing to one the memmove
calls when converting FreeBSD attribute list to the format used by
Linux and extend the rsync unit tests for long xattrs lists.

Changes where submitted to upstream as:
RsyncProject/rsync#781
RsyncProject/rsync#766

PR:		286773
Reported by:	Peter Eriksson <[email protected]>
Obtained from:	Peter Eriksson <[email protected]>
Reviewed by:	rodrigo
Tested by:	rodrigo
Event:		Berlin Hackaton 202507
Relnotes:  	yes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants