infix icon indicating copy to clipboard operation
infix copied to clipboard

Unable to copy running-config to non-existing startup-config

Open rical opened this issue 11 months ago • 7 comments

Current Behavior

No user can copy running to startup if startup doesn't exist or the full path name is given (copy running /cfg/startup-config.cfg)

root@ix-00-00-00:/> copy running-config startup-config 
Error: setting group owner wheel (10) on /cfg/startup-config.cfg: No such file or directory

No startup-config.cfg is created here ^

root@ix-00-00-00:~$ ls /cfg/startup-config.cfg 
ls: /cfg/startup-config.cfg: No such file or directory

Expected Behavior

root@ix-00-00-00:/> remove startup-config.cfg
Remove /cfg/startup-config.cfg, are you sure (y/N)? y
root@ix-00-00-00:/> copy running-config startup-config 

Steps To Reproduce

root@ix-00-00-00:/> remove startup-config.cfg
Remove /cfg/startup-config.cfg, are you sure (y/N)? y

root@ix-00-00-00:/> copy running-config startup-config 
Error: setting group owner wheel (10) on /cfg/startup-config.cfg: No such file or directory

Additional information

This problem manifests itself by throwing this chown error but the main problem is that no start-config is created. The command copy running startup takes a different code path than copy running /cfg/startup-config.cfg.

rical avatar Mar 10 '25 15:03 rical

I think the root cause is that following should be && instead of || -- so that we can enter the first if() statement to reach systemf("sysrepocfg -d %s -X%s -f json", srcds->sysrepocfg, fn);, but I don't understand how this could work if the file exists ... 🤔

https://github.com/kernelkit/infix/blob/dd788f5611b1cac232673c0a2655edb0368e5f9a/src/bin/copy.c#L241-L244

troglobit avatar Mar 10 '25 16:03 troglobit

I had a quick look.

startup-config is the only infix_config with a path, that causes the error printout from the permission code. But I'm not sure this is purely a problem with the copy code. If the file exists it's updated properly but I don't see any explicit writes to it in the copy code? Is it sysrepo who updates the backing file (/cfg/startup-config.cfg)?

EDIT: To clarify, the copy running /cfg/startup-config.cfg works as it takes a totally different code path as /cfg/startup-config.cfg is treated as a file and not a data store.

rical avatar Mar 11 '25 09:03 rical

CCB: Future

jovatn avatar Mar 17 '25 09:03 jovatn

Found the root cause for this now. Turns out the copy command does "nothing" when we save running to startup, instead it's a trigger in confd:core.c that runs -- but only if there have been any changes to the startup datastore.

Took me a while, but I'm now recalling that one of the reasons for this weird design is that we have users that script Infix using sysrepocfg, who expect sysrepocfg -d startup -C running to persist across boots.

troglobit avatar Mar 24 '25 07:03 troglobit

The trigger in confd, core_startup_save(), runs for each change registered. E.g., for nine changes in running config, we will call sysrepocfg -X/cfg/startup-config.cfg -d startup -f json nine times. Less than ideal, obviously.

So if /cfg/startup-config.cfg is removed manually, running-config is updated, followed by another copy running-config startup-config from the CLI, the file will be created, but with the wrong permissions.

troglobit avatar Mar 24 '25 07:03 troglobit

There are multiple ways forward here, a radical one is to remove the tools sysrepocfg and sysrepoctl -- do we really need them if we ensure the CLI and our own command line tools link with libsysrepo and use the C API instead?

Another, perhaps less invasive which would allow us to drop core_startup_save() is to patch sysrepo to support an alternative (single) backend-store file for startup, /cfg/startup-config.cfg.

Personally I vote for the section option, either way a decision needs to be made before the copy command can be simplified/refactored.

troglobit avatar Mar 24 '25 07:03 troglobit

After discussions during today's CCB we agreed we need something fundamental to fix this issue, and @wkz suggested instead patch sysrepo to add a post-hook with a datastore argument. That way we could at the same time remove our own num_changes hack in confd:core.c

troglobit avatar Apr 01 '25 08:04 troglobit

Despite the alternatives, I've found a way just tell sysrepocfg to export the source datastore in the case where the destination in the copy command is a file. This works and fixes the bug.

troglobit avatar Oct 20 '25 14:10 troglobit