Seed not supported by DataSourceNoCloud when using a http-URL
Bug report
While cloud-init properly finds a NoCloud datasource with instructions to pull the configuration from a http-source but fails to recognise it as compatible datasource seed. This leads to the cloud-init configuration to be not pulled and executed. The main error message is:
DataSourceNoCloud.py[DEBUG]: Seed from http://<domain>/api/service/v1/cloud-init/ not supported by DataSourceNoCloud [seed=None][dsmode=net]
So cloud-init properly detects it's running in a NoCloud environment, properly extracts the seed URL from the SMBIOS serial but then somehow thinks that's not a valid seed which is really confusing.
Steps to reproduce the problem
We create and run an ordinary libvirt KVM where Oracle Linux 9 is being installed on. The libvirt XML contains the following SMBIOS entry:
<sysinfo type='smbios'>
<system>
<entry name='serial'>ds=nocloud;s=http://<domain>/api/service/v1/cloud-init/</entry>
</system>
</sysinfo>
I also tried the following serials to make sure I don't do anything majorly wrong:
ds=nocloud-net;s=http://<domain>/api/service/v1/cloud-init/
ds=nocloud;s=https://<domain>/api/service/v1/cloud-init/
ds=nocloud-net;s=https://<domain>/api/service/v1/cloud-init/
ds=nocloud;s=http://<domain>/api/service/v1/cloud-init/;h=test
ds=nocloud;s=https://<domain>/api/service/v1/cloud-init/;h=test
ds=nocloud-net;s=http://<domain>/api/service/v1/cloud-init/;h=test
ds=nocloud-net;s=https://<domain>/api/service/v1/cloud-init/;h=test
Then start the machine for the first time after the installation. The expectation would be that cloud-init properly detects that it's running in a NoCloud environment, pulls the seed URL from the SMBIOS and then proceeds to request the contents of user-data and vendor-data from that URL. However, the process immediately fails and there's no request at the web server you could observe at all.
Environment details
- Cloud-init version: 23.4-7.0.1.el9_4
- Operating System Distribution: Oracle Linux 9
- Cloud provider, platform or installer type: KVM/Libvirt
cloud-init logs
2024-06-07 09:29:08,560 - util.py[DEBUG]: Cloud-init v. 23.4-7.0.1.el9_4 running 'init-local' at Fri, 07 Jun 2024 09:29:08 +0000. Up 8.70 seconds.
2024-06-07 09:29:08,560 - main.py[INFO]: PID [1] started cloud-init.
2024-06-07 09:29:08,560 - main.py[DEBUG]: No kernel command line url found.
2024-06-07 09:29:08,560 - main.py[DEBUG]: Closing stdin.
2024-06-07 09:29:08,568 - util.py[DEBUG]: Writing to /var/log/cloud-init.log - ab: [640] 0 bytes
2024-06-07 09:29:08,569 - util.py[DEBUG]: Writing to /var/lib/cloud/data/python-version - wb: [644] 3 bytes
2024-06-07 09:29:08,569 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance/boot-finished
2024-06-07 09:29:08,570 - handlers.py[DEBUG]: start: init-local/check-cache: attempting to read from cache [check]
2024-06-07 09:29:08,570 - util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
2024-06-07 09:29:08,570 - stages.py[DEBUG]: no cache found
2024-06-07 09:29:08,570 - handlers.py[DEBUG]: finish: init-local/check-cache: SUCCESS: no cache found
2024-06-07 09:29:08,570 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance
2024-06-07 09:29:08,587 - stages.py[DEBUG]: Using distro class <class 'cloudinit.distros.rhel.Distro'>
2024-06-07 09:29:08,588 - __init__.py[DEBUG]: Looking for data source in: ['NoCloud', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM']
2024-06-07 09:29:08,607 - __init__.py[DEBUG]: Searching for local data source in: ['DataSourceNoCloud']
2024-06-07 09:29:08,607 - handlers.py[DEBUG]: start: init-local/search-NoCloud: searching for local data from DataSourceNoCloud
2024-06-07 09:29:08,607 - __init__.py[DEBUG]: Seeing if we can get any data from <class 'cloudinit.sources.DataSourceNoCloud.DataSourceNoCloud'>
2024-06-07 09:29:08,608 - __init__.py[DEBUG]: Update datasource metadata and network config due to events: boot-new-instance
2024-06-07 09:29:08,609 - __init__.py[DEBUG]: Detected platform: DataSourceNoCloud [seed=None][dsmode=net]. Checking for active instance data
2024-06-07 09:29:08,610 - dmi.py[DEBUG]: querying dmi data /sys/class/dmi/id/product_serial
2024-06-07 09:29:08,611 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud/user-data (quiet=False)
2024-06-07 09:29:08,611 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud/meta-data (quiet=False)
2024-06-07 09:29:08,611 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud/vendor-data (quiet=False)
2024-06-07 09:29:08,611 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud/network-config (quiet=False)
2024-06-07 09:29:08,612 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud-net/user-data (quiet=False)
2024-06-07 09:29:08,612 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud-net/meta-data (quiet=False)
2024-06-07 09:29:08,612 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud-net/vendor-data (quiet=False)
2024-06-07 09:29:08,612 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud-net/network-config (quiet=False)
2024-06-07 09:29:08,613 - subp.py[DEBUG]: Running command ['blkid', '-tTYPE=vfat', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-06-07 09:29:08,661 - subp.py[DEBUG]: Running command ['blkid', '-tTYPE=iso9660', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-06-07 09:29:08,677 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL=CIDATA', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-06-07 09:29:08,693 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL=cidata', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-06-07 09:29:08,708 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL_FATBOOT=cidata', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-06-07 09:29:08,723 - DataSourceNoCloud.py[DEBUG]: Seed from http://<domain>/api/service/v1/cloud-init/ not supported by DataSourceNoCloud [seed=None][dsmode=net]
2024-06-07 09:29:08,723 - __init__.py[DEBUG]: Datasource DataSourceNoCloud [seed=None][dsmode=net] not updated for events: boot-new-instance
2024-06-07 09:29:08,724 - handlers.py[DEBUG]: finish: init-local/search-NoCloud: SUCCESS: no local data found from DataSourceNoCloud
2024-06-07 09:29:08,724 - main.py[DEBUG]: No local datasource found
2024-06-07 09:29:10,303 - util.py[DEBUG]: Cloud-init v. 23.4-7.0.1.el9_4 running 'init' at Fri, 07 Jun 2024 09:29:10 +0000. Up 10.45 seconds.
2024-06-07 09:29:10,304 - main.py[INFO]: PID [1] started cloud-init.
2024-06-07 09:29:10,304 - main.py[DEBUG]: No kernel command line url found.
2024-06-07 09:29:10,304 - main.py[DEBUG]: Closing stdin.
2024-06-07 09:29:10,305 - util.py[DEBUG]: Writing to /var/log/cloud-init.log - ab: [640] 0 bytes
2024-06-07 09:29:10,306 - subp.py[DEBUG]: Running command ['ip', '--json', 'addr'] with allowed return codes [0] (shell=False, capture=True)
2024-06-07 09:29:10,314 - subp.py[DEBUG]: Running command ['ip', '-o', 'route', 'list'] with allowed return codes [0] (shell=False, capture=True)
2024-06-07 09:29:10,318 - subp.py[DEBUG]: Running command ['ip', '--oneline', '-6', 'route', 'list', 'table', 'all'] with allowed return codes [0, 1] (shell=False, capture=True)
2024-06-07 09:29:10,331 - handlers.py[DEBUG]: start: init-network/check-cache: attempting to read from cache [trust]
2024-06-07 09:29:10,331 - util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
2024-06-07 09:29:10,334 - stages.py[DEBUG]: no cache found
2024-06-07 09:29:10,335 - handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: no cache found
2024-06-07 09:29:10,335 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance
2024-06-07 09:29:10,348 - stages.py[DEBUG]: Using distro class <class 'cloudinit.distros.rhel.Distro'>
2024-06-07 09:29:10,349 - __init__.py[DEBUG]: Looking for data source in: ['NoCloud', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM', 'NETWORK']
2024-06-07 09:29:10,361 - __init__.py[DEBUG]: Searching for network data source in: ['DataSourceNoCloudNet', 'DataSourceNone']
2024-06-07 09:29:10,362 - handlers.py[DEBUG]: start: init-network/search-NoCloudNet: searching for network data from DataSourceNoCloudNet
2024-06-07 09:29:10,362 - __init__.py[DEBUG]: Seeing if we can get any data from <class 'cloudinit.sources.DataSourceNoCloud.DataSourceNoCloudNet'>
2024-06-07 09:29:10,362 - __init__.py[DEBUG]: Update datasource metadata and network config due to events: boot-new-instance
2024-06-07 09:29:10,368 - __init__.py[DEBUG]: Datasource type DataSourceNoCloudNet [seed=None][dsmode=net] is not detected.
2024-06-07 09:29:10,368 - __init__.py[DEBUG]: Datasource DataSourceNoCloudNet [seed=None][dsmode=net] not updated for events: boot-new-instance
2024-06-07 09:29:10,368 - handlers.py[DEBUG]: finish: init-network/search-NoCloudNet: SUCCESS: no network data found from DataSourceNoCloudNet
2024-06-07 09:29:10,368 - handlers.py[DEBUG]: start: init-network/search-None: searching for network data from DataSourceNone
2024-06-07 09:29:10,368 - __init__.py[DEBUG]: Seeing if we can get any data from <class 'cloudinit.sources.DataSourceNone.DataSourceNone'>
2024-06-07 09:29:10,369 - __init__.py[DEBUG]: Update datasource metadata and network config due to events: boot-new-instance
2024-06-07 09:29:10,369 - __init__.py[DEBUG]: Detected platform: DataSourceNone. Checking for active instance data
Thanks, @thannaskem, for reporting. Could you please attach the results of cloud-init collect-logs? Making sure there is no sensitive data in it.
Just to add this issue also occurs in the exactly same way when running on a freshly installed CentOS 8 Stream
@thannaske I think that this was fixed already in the 24.1 release, can you please test that and report back?
I have same problem with Ubuntu 24.04 LTS (vmdk cloud image) with cloud-init 24.1.3-0ubuntu3.3
@rbm0407 could you please run cloud-init collect-logs and attach the results, making sure there is no sensitive data within them?
@rbm0407 could you please run
cloud-init collect-logsand attach the results, making sure there is no sensitive data within them?
I had the same problem with Ubuntu 20.04 too (24.1.3-0ubuntu1 20.04.5), downgrading to 23.1.2-0ubuntu0~20.04.2 resolved the problem for 20.04
Thanks, @rbm0407, for adding extra logs.
I confirm I can reproduce this issue with: QEMU and latest (serial 20240627) oracular image from https://cloud-images.ubuntu.com/oracular/current/oracular-server-cloudimg-amd64.img by:
Mounting the image and adding:
$ cat /etc/cloud/cloud.cfg.d/99-test.cfg
datasource_list: [ NoCloud, None ]
datasource:
NoCloud:
seedfrom: http://0.0.0.0:8000/
password: password
chpasswd:
expire: False
Launch with:
qemu-system-x86_64 \
-net nic \
-net user \
-machine accel=kvm:tcg \
-cpu host \
-m 512 \
-nographic \
-hda ./oracular-server-cloudimg-amd64.img \
-smbios type=1
Log:
[ 11.674851] cloud-init[861]: Cloud-init v. 24.2~5g8c396a4b-0ubuntu1 running 'modules:config' at Tue, 16 Jul 2024 11:41:51 +0000. Up 11.64 seconds.
2024-07-16 11:41:46,260 - util.py[DEBUG]: Cloud-init v. 24.2~5g8c396a4b-0ubuntu1 running 'init-local' at Tue, 16 Jul 2024 11:41:46 +0000. Up 6.23 seconds.
2024-07-16 11:41:46,260 - main.py[INFO]: PID [1] started cloud-init.
2024-07-16 11:41:46,260 - main.py[DEBUG]: No kernel command line url found.
2024-07-16 11:41:46,260 - main.py[DEBUG]: Closing stdin
2024-07-16 11:41:46,262 - util.py[DEBUG]: Writing to /var/log/cloud-init.log - ab: [640] 0 bytes
2024-07-16 11:41:46,262 - util.py[DEBUG]: Changing the ownership of /var/log/cloud-init.log to 102:4
2024-07-16 11:41:46,262 - util.py[DEBUG]: Writing to /var/lib/cloud/data/python-version - wb: [644] 4 bytes
2024-07-16 11:41:46,263 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance/boot-finished
2024-07-16 11:41:46,263 - handlers.py[DEBUG]: start: init-local/check-cache: attempting to read from cache [check]
2024-07-16 11:41:46,263 - util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
2024-07-16 11:41:46,269 - subp.py[DEBUG]: Running command ['blkid', '-tTYPE=vfat', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,330 - subp.py[DEBUG]: Running command ['blkid', '-tTYPE=iso9660', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,382 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL=CIDATA', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,439 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL=cidata', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,487 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL_FATBOOT=cidata', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,330 - subp.py[DEBUG]: Running command ['blkid', '-tTYPE=iso9660', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,382 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL=CIDATA', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,439 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL=cidata', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,487 - subp.py[DEBUG]: Running command ['blkid', '-tLABEL_FATBOOT=cidata', '-odevice'] with allowed return codes [0, 2] (shell=False, capture=True)
2024-07-16 11:41:46,538 - DataSourceNoCloud.py[DEBUG]: self.supported_seed_starts: ('/', 'file://')
2024-07-16 11:41:46,538 - DataSourceNoCloud.py[DEBUG]: Seed from http://0.0.0.0:8000/ not supported by DataSourceNoCloud [seed=None][dsmode=net]
2024-07-16 11:41:46,538 - sources[DEBUG]: Datasource DataSourceNoCloud [seed=None][dsmode=net] not updated for events: boot-new-instance
2024-07-16 11:41:46,538 - handlers.py[DEBUG]: finish: init-local/search-NoCloud: SUCCESS: no local data found from DataSourceNoCloud
I have applied this patch for introspection:
diff --git a/cloudinit/sources/DataSourceNoCloud.py b/cloudinit/sources/DataSourceNoCloud.py
index a7d3f3adf..548e68479 100644
--- a/cloudinit/sources/DataSourceNoCloud.py
+++ b/cloudinit/sources/DataSourceNoCloud.py
@@ -162,6 +162,7 @@ class DataSourceNoCloud(sources.DataSource):
if "seedfrom" in mydata["meta-data"]:
seedfrom = mydata["meta-data"]["seedfrom"]
seedfound = False
+ LOG.debug("self.supported_seed_starts %s", self.supported_seed_starts)
for proto in self.supported_seed_starts:
if seedfrom.startswith(proto):
seedfound = proto
and the log spits:
2024-07-16 11:41:46,538 - DataSourceNoCloud.py[DEBUG]: self.supported_seed_starts: ('/', 'file://')
Something is off while loading DatasourNoCloud in net mode.
I just tested this using Qemu and an image with Cloud-init 24.1.3-0ubuntu1~22.04.5 and could not reproduce the issue that you've reported - when the serial is set, cloud-init detects that it is using DataSourceNoCloudNet correctly.
So cloud-init properly detects it's running in a NoCloud environment
No, it is checking whether it is running in a NoCloud environment - and at this stage it is only checking for local datasource files and so it ignores your seedfrom url. Since it doesn't find any local seed files it moves on since it needs the network to be up (which will happen later) before it can use seed urls.
DataSourceNoCloud.py[DEBUG]: Seed from http://<domain>/api/service/v1/cloud-init/ not supported by DataSourceNoCloud [seed=None][dsmode=net]
This is just a DEBUG message which is printed before the network is up when cloud-init is checking to see if a local nocloud configuration exist. We've had multiple users file reports pointing to this as the issue, so I'm going to update the nocloud logging to try to make it more clear what is happening. Please ignore this red herring.
The logs later in your report that NoCloudNet is not detected, which is the actual issue that occurred.
2024-06-07 09:29:10,368 - __init__.py[DEBUG]: Datasource type DataSourceNoCloudNet [seed=None][dsmode=net] is not detected.
From the log you shared, I see that in ds-identify.log the product serial wasn't set:
DMI_PRODUCT_SERIAL=VMware-42 17 a5 73 8d f3 85 87-9f fd 4f cd 09 16 4b 8d
On a working instance I see:
DMI_PRODUCT_SERIAL=ds=nocloud;s=http://10.0.2.2:8000/
In your logs, indeed I see that cloud-init found a url, but since it didn't come from the DMI, I'm unsure how it was received. @thannaske did you also have a config file in /etc/cloud/cloud.cfg.d/ like @aciba90's example had? I can see in the kernel log that the url didn't come from your kernel command line.
If you actually set your DMI, this shouldn't be an issue.
@aciba90 Nice job finding a reproducer. However, it looks like the example you provided didn't actually reproduce the exact behavior despite demonstrating similar symptoms. When detecting the datasource with a config file, a single datasource in the datasource_list value is required for cloud-init to select this datasource. We can probably safely change it to positively detect nocloud if a seedfrom url is provided. Assuming that this is what @thannaske was doing, this would resolve their issue as well.