Skip to content

"attrs: [Errno 12] Cannot allocate memory" when restoring ACLs without associated user id #8750

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
jernejs opened this issue Apr 13, 2025 · 16 comments
Milestone

Comments

@jernejs
Copy link

jernejs commented Apr 13, 2025

I had to do a bare metal restore of my server today, and during restore I noticed a lot of "attrs: [Errno 12] Cannot allocate memory" errors. It appears that Borg (v1.4.0) stores ACLs as username only (no numeric id), and then fails to restore them with that error message when the user is not present on the system.

Here's a minimal test case — restore archive test3 to reproduce (the test was done by setting ACLs to user belonging to an Active Directory domain; after stopping Samba, restoring the backup produces the above error; if I make a backup while Samba is stopped, Borg stores numeric id in ACL, and restore then works regardless of whether Samba is running or not).

@ThomasWaldmann ThomasWaldmann added this to the 1.4.1 milestone Apr 13, 2025
@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 14, 2025

OS is Linux I suspect?

errno.ENOMEM == 12

Please add the borg command you used for restoring.

You say you noticed a lot of these errors. I am wondering about the "attrs: " part - is that a file name? Is that always the same in all error msgs you see?

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 14, 2025

I could reproduce using the repo linked above:

(borg-env) vagrant@vagrant:~$ tar xJf borgacl.tar.xz 
(borg-env) vagrant@vagrant:~$ borg list borgacl
test                                 Sun, 2025-04-13 17:39:13 [1a1a68e55718949e2ea0d52aee140e5935202c6fa709e4658fe2d8b17debc63a]
test2                                Sun, 2025-04-13 17:40:32 [18ad623c8d22144a23cb17b6e8a2071f065b3dcbaf7239416d477b7eb432c121]
test3                                Sun, 2025-04-13 17:44:25 [28ae33a43c08ffe6c4d453962310985747249c4b22f633d894cf1f838be0e665]
(borg-env) vagrant@vagrant:~$ borg extract borgacl::test3
acltest/bla.txt: attrs: [Errno 12] Cannot allocate memory: '/home/vagrant/acltest/bla.txt'
acltest: attrs: [Errno 12] Cannot allocate memory: '/home/vagrant/acltest'

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 14, 2025

That's the archived access ACL that blows things up:

b'''
user::rw-
user:YGGDRASIL\\\\ender:r--:YGGDRASIL\\\\ender
user:1234567:r--:1234567
user:123456789:r--:123456789
group::r--
mask::r--
other::r--
'''

This gets converted via acl_use_local_uid_gid to:

b'''
user::rw-
user:YGGDRASIL\\\\ender:r--
user:1234567:r--
user:123456789:r--
group::r--
mask::r--
other::r--
'''

And if that is fed into acl_from_text(), it blows up with:

acltest/bla.txt: attrs: [Errno 12] Cannot allocate memory: '/home/vagrant/acltest/bla.txt'

@jernejs
Copy link
Author

jernejs commented Apr 14, 2025

Right, OS is Linux, and borg extract borgacl::test3 is enough to reproduce (and using --numeric-ids doesn't help).

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 14, 2025

There are multiple issues here with acl_use_local_uid_gid() trying to convert user:YGGDRASIL\\\\ender:r--:YGGDRASIL\\\\ender:

As field0 is "user", it tries to convert the name from field1 to a local uid.

But, if samba is not running, that fails (the mapping is unknown then) and it defaults to the contents of field3 (which usually should contain the archived uid, but is a name here!?).

Using --numeric-ids at extract time does not help, because it tries to use the archived uid from field3, but that is not a uid here, but same name as in field1.

So guess the root cause of this issue is field3 not containing an archived uid.

Additionally, there is also non-borg issue with acl_from_text wanting to allocate lots of memory for inputs it can't process.

@ThomasWaldmann
Copy link
Member

borg create --numeric-ids archives ACLs using the acl_numeric_ids converter, which puts the uid into field1 and field3 IF it could map the username from field1 to a uid. Otherwise it will put the username into field1 and field3.

borg create archives ACLs using the acl_append_numeric_ids converter, which puts the uid into field3 IF it could map the username from field1 to a uid. Otherwise it will put the username into field3.

So, it looks like maybe samba was not running at borg create time, thus the uid lookups failed and the user name (not id) ended up in field3?

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 14, 2025

Reproduce the non-borg bug:

#include <stdio.h>
#include <stdlib.h>
#include <sys/acl.h>

int main() {
    const char *acl_text = "u:HOST\\\\user:rwx,g::r--,o::---";

    acl_t acl = acl_from_text(acl_text);
    if (acl == NULL) {
        perror("acl_from_text");
        return EXIT_FAILURE;
    }

    printf("ACL successfully parsed from text: %s\n", acl_text);

    // Always free the allocated ACL structure
    acl_free(acl);

    return EXIT_SUCCESS;
}

Result:

$ ./acl_test 
acl_from_text: Cannot allocate memory

Any non-existent username leads to this error, even simple ones like "foo".

Update: seems fixed since a while:

https://git.savannah.nongnu.org/cgit/acl.git/commit/?id=56abe432b65801f31277fb9a3bca0f9e31502315

Not released yet, bug is still present in libacl 2.3.2.

@ThomasWaldmann
Copy link
Member

About field3: I think there is only one way how a name can end up being there (and getting into the archived ACL):

If the lookup username -> userid (or groupname -> groupid) fails at borg create time.

The user2uid function of borg uses a caching wrapper, so if the first lookup fails, any further lookup will return the same result.

@jernejs please check. ^^^

@jernejs
Copy link
Author

jernejs commented Apr 14, 2025

How would I check for this?

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 15, 2025

@jernejs Maybe check if all services required for the lookup run all the time while the backup is running, check cron jobs and other timers, logs.

@jernejs
Copy link
Author

jernejs commented Apr 15, 2025

I've been running this since yesterday evening, and it hasn't exited yet:

import pwd
import time
import sys

def user2uid(user, default=None):
    try:
        return user and pwd.getpwnam(user).pw_uid
    except KeyError:
        return default

while str(user2uid('YGGDRASIL\\ender','YGGDRASIL\\ender')) == "1101104":
        print(".", end="", flush=True)
        time.sleep(1)

(I did test it first by stopping Samba, which caused it to exit immediately).

@ThomasWaldmann
Copy link
Member

Maybe make the sleep much shorter (like 1ms), so you could also notice "outages" shorter than a second.

@jernejs
Copy link
Author

jernejs commented Apr 15, 2025

OK, commented out the print line and changed interval to .001, created new borg archive, stopped Samba (which stopped the script), and tried extracting the archive, which still gave Cannot allocate memory.

@ThomasWaldmann
Copy link
Member

Did you use borg extract --numeric-ids? If samba is stopped, it can only work if it does not try looking up name->id, but just uses the id it has archived.

Then 2 outcomes are possible:

  • it has an archived uid in field3 (as it should be) and just works, no lookup
  • it has an archived username in field3 (lookup at borg create time didn't give a uid), would need to do a name->id lookup now and fails (with that cannot allocate error due to the bug in libacl)

@jernejs
Copy link
Author

jernejs commented Apr 16, 2025

When I was doing real archive extract, I did use --numeric-ids; while doing tests with the above archive, I tested both with and without (and it didn't make any difference).
If Samba isn't running while creating the archive, numeric id gets stored (presumably in both fields, since there's no way to look up username without winbind); such archive can be extracted whether Samba is running or not.

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 17, 2025

Hmm, maybe we can take another approach (as I miss the necessary samba/winbind environment here to reproduce):

Can you reproduce the bug without involving samba/winbind?

The bug to reproduce is borg create (without --numeric-ids) creating an archive that has archived ACLs with the username in field3 (and not the uid, as it should be).

If it's a borg bug, that should be possible and if it is not, guess the samba devs need to analyse/debug this.

Maybe you can also try if this script is enough to reproduce in your samba environment:

import os
from borg.platform import *

path = "foo"

# use a OS-level fd as borg usually does
item = {}
with open(path) as f:
    fd = f.fileno()
    acl_get(path, item, os.stat(path), fd=fd)

print(item)

# use a path (older borg worked path-based)
item = {}
acl_get(path, item, os.stat(path), fd=None)

print(item)

# example output (valid):
# {'acl_access': b'user::rw-\nuser:root:r--:0\ngroup::r--\nmask::r--\nother::r--'}
#                       field 0    1    2   3
#                                           ^ has a uid as it should be.
#                                  ^          has a username as it should be.

If that makes the bug visible in the samba environment, try to produce the same bug without samba.

@ThomasWaldmann ThomasWaldmann modified the milestones: 1.4.1, 1.4.2 Apr 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants