Guestinfo configuration applied in wrong order

Open theit8514 opened this issue 6 years ago • 4 comments

Issue Report

Bug

When applying a guestinfo configuration in VMware, CoreOS attempts to fetch the URL in guestinfo.coreos.config.url before configuring the interface with the address in guestinfo.interface.0.ip.0.address. If it matters, this host gets an IPv6 configuration via autoconfiguration but no IPv4 address from DHCP.

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
VERSION=2247.6.0
VERSION_ID=2247.6.0
BUILD_ID=2019-11-06-2138
PRETTY_NAME="Container Linux by CoreOS 2247.6.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

What hardware/cloud provider/hypervisor is being used to run Container Linux? VMware ESXi 6.7 U2

Expected Behavior

CoreOS should configure the network prior to fetching guestinfo.coreos.config.url or retry the fetch after the network has been configured if the previous fetch failed.

Actual Behavior

coreos-cloudinit[659]: 2019/11/08 07:10:48 Unable to fetch data: Get https://host/coreos/path/user_data: dial tcp: lookup host: no such host

Reproduction Steps

  1. Deploy VMware OVA, configuring the URL and static IP on a VM Network that has no DHCP.
  2. Boot the VM.
  3. CoreOS attempts to download the URL 15 times, then configures the static IP. The URL is not fetched or loaded.

Other Information

vApp environment xml:

<?xml version="1.0" encoding="UTF-8"?>
<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:ve="http://www.vmware.com/schema/ovfenv"
     oe:id=""
     ve:vCenterId="vm-6587">
   <PlatformSection>
      <Kind>VMware ESXi</Kind>
      <Version>6.7.0</Version>
      <Vendor>VMware, Inc.</Vendor>
      <Locale>en</Locale>
   </PlatformSection>
   <PropertySection>
         <Property oe:key="guestinfo.coreos.config.data" oe:value=""/>
         <Property oe:key="guestinfo.coreos.config.data.encoding" oe:value=""/>
         <Property oe:key="guestinfo.coreos.config.url" oe:value="https://host/coreos/path/user_data"/>
         <Property oe:key="guestinfo.dns.server.0" oe:value="192.168.0.41"/>
         <Property oe:key="guestinfo.dns.server.1" oe:value="192.168.0.42"/>
         <Property oe:key="guestinfo.hostname" oe:value="docker-node2"/>
         <Property oe:key="guestinfo.interface.0.dhcp" oe:value="no"/>
         <Property oe:key="guestinfo.interface.0.ip.0.address" oe:value="192.168.0.32/24"/>
         <Property oe:key="guestinfo.interface.0.mac" oe:value="00:50:56:96:5a:ca"/>
         <Property oe:key="guestinfo.interface.0.name" oe:value=""/>
         <Property oe:key="guestinfo.interface.0.role" oe:value="public"/>
         <Property oe:key="guestinfo.interface.0.route.0.destination" oe:value="0.0.0.0/0"/>
         <Property oe:key="guestinfo.interface.0.route.0.gateway" oe:value="192.168.0.1"/>
   </PropertySection>
   <ve:EthernetAdapterSection>
      <ve:Adapter ve:mac="00:50:56:96:5a:ca" ve:network="VM Network" ve:unitNumber="7"/>
   </ve:EthernetAdapterSection>
</Environment>
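
For anyone checking which guestinfo values actually reach the guest, the following is a minimal Python sketch that lists the non-empty properties of a vApp environment document like the one above. The function name `guestinfo_properties` and the abbreviated `SAMPLE` document are illustrative assumptions, not part of CoreOS or VMware tooling; only the namespace URI and the property keys come from the XML in this report.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the oe: prefix (and the default namespace) in the report's XML.
OVF_ENV_NS = "http://schemas.dmtf.org/ovf/environment/1"


def guestinfo_properties(ovf_env_xml: str) -> dict:
    """Return {key: value} for every Property element with a non-empty value."""
    root = ET.fromstring(ovf_env_xml)
    props = {}
    for prop in root.iter(f"{{{OVF_ENV_NS}}}Property"):
        key = prop.get(f"{{{OVF_ENV_NS}}}key")
        value = prop.get(f"{{{OVF_ENV_NS}}}value", "")
        if key and value:
            props[key] = value
    return props


# Abbreviated copy of the environment document from this report (assumption: trimmed for brevity).
SAMPLE = """<Environment
     xmlns="http://schemas.dmtf.org/ovf/environment/1"
     xmlns:oe="http://schemas.dmtf.org/ovf/environment/1">
   <PropertySection>
         <Property oe:key="guestinfo.coreos.config.data" oe:value=""/>
         <Property oe:key="guestinfo.coreos.config.url" oe:value="https://host/coreos/path/user_data"/>
         <Property oe:key="guestinfo.interface.0.dhcp" oe:value="no"/>
         <Property oe:key="guestinfo.interface.0.ip.0.address" oe:value="192.168.0.32/24"/>
   </PropertySection>
</Environment>"""

if __name__ == "__main__":
    for key, value in sorted(guestinfo_properties(SAMPLE).items()):
        print(f"{key} = {value}")
```

Empty values are skipped so that unset keys such as guestinfo.coreos.config.data do not hide the ones that are actually configured.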

theit8514 avatar Nov 08 '19 07:11 theit8514

@theit8514 thanks for the report. The configuration you are trying to apply is for cloud-init, which is known to have such race issues and has been superseded by Ignition.

The recommendation is to follow https://coreos.com/os/docs/latest/booting-on-vmware.html#defining-the-ignition-config-in-guestinfo and stick to guestinfo.coreos.config.data (plus .encoding) to pass your Ignition configuration.

lucab avatar Nov 08 '19 09:11 lucab

Thanks for the reply. The one thing I didn't mention above is that these settings were generated by the CoreOS OVA file.

I have tried switching over to guestinfo.coreos.config.data, but until just now I had no luck getting it to work. While typing out the process here, I realized that I was base64-encoding the cloud-config instead of the transpiled Ignition file. Now cloudinit recognizes an Ignition file, with this log entry: Detected an Ignition config. Exiting...
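
The mix-up described here (encoding the cloud-config rather than the transpiled Ignition JSON) can be caught before the value is pasted into guestinfo.coreos.config.data. The following is a minimal Python sketch of that check; the function name `encode_ignition_for_guestinfo` is hypothetical, and it assumes that guestinfo.coreos.config.data.encoding is set to base64 as described in the linked VMware documentation.

```python
import base64
import json


def encode_ignition_for_guestinfo(config_text: str) -> str:
    """Base64-encode an Ignition config, rejecting input that is not one."""
    try:
        parsed = json.loads(config_text)
    except json.JSONDecodeError as exc:
        # A cloud-config or Container Linux Config is YAML and lands here.
        raise ValueError("input is not JSON; was the config transpiled?") from exc
    ignition = parsed.get("ignition") if isinstance(parsed, dict) else None
    if not isinstance(ignition, dict) or "version" not in ignition:
        raise ValueError("JSON has no ignition.version field")
    # Encode the original text unchanged so the guest sees exactly what was validated.
    return base64.b64encode(config_text.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    print(encode_ignition_for_guestinfo('{"ignition": {"version": "2.2.0"}}'))
```

The check only looks for an ignition.version field; it does not validate the config against the Ignition schema.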

I am still seeing a race condition with my config: The IP address is not pingable and Ignition is attempting to fetch the config but failing now due to systemd-resolved "connection refused". It also seems to be stuck in an infinite boot loop waiting for ignition to complete.

{
   "systemd" : {},
   "passwd" : {},
   "storage" : {},
   "ignition" : {
      "security" : {
         "tls" : {}
      },
      "timeouts" : {},
      "config" : {
         "replace" : {
            "verification" : {},
            "source" : "https://host/coreos/path/user_data"
         }
      },
      "version" : "2.2.0"
   },
   "networkd" : {
      "units" : [
         {
            "name" : "00-primary.network",
            "contents" : "[Match]\nMACAddress=00:50:56:96:5a:ca\n\n[Network]\nDNS=192.168.0.41\nDNS=192.168.0.42\nAddress=192.168.0.32/24\nGateway=192.168.0.1\n"
         }
      ]
   }
}
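
The escaped "contents" string in the networkd unit above is easy to get wrong by hand. As an illustration, this Python sketch builds the same unit text from the individual values; `static_network_unit` is a hypothetical helper, and the key names it emits are just the ones used in the config above.

```python
import json


def static_network_unit(mac, address, gateway, dns_servers):
    """Return the text of a systemd-networkd unit with a static address."""
    lines = ["[Match]", f"MACAddress={mac}", "", "[Network]"]
    lines += [f"DNS={server}" for server in dns_servers]
    lines += [f"Address={address}", f"Gateway={gateway}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    unit = {
        "name": "00-primary.network",
        "contents": static_network_unit(
            mac="00:50:56:96:5a:ca",
            address="192.168.0.32/24",
            gateway="192.168.0.1",
            dns_servers=["192.168.0.41", "192.168.0.42"],
        ),
    }
    # json.dumps takes care of escaping the newlines inside "contents".
    print(json.dumps({"networkd": {"units": [unit]}}, indent=3))
```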

[Image: Screenshot from 2019-11-09 10-38-38]

I have tried both config>replace and config>append.

theit8514 avatar Nov 09 '19 15:11 theit8514

I found this document stating that my use case is not supported and to use kernel ip= parameters: https://coreos.com/ignition/docs/latest/network-configuration.html#using-static-ip-addresses-with-ignition

Attempting to do so (ip=192.168.0.31::192.168.0.1:255.255.255.0::ens192:none:192.168.0.41:192.168.0.42) still fails with the above config. It hits a similar error when trying to resolve the domain with systemd-resolved and retries the download indefinitely.
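
The ip= value quoted above follows the kernel's colon-separated layout (client IP, server IP, gateway, netmask, hostname, device, autoconf, then up to two DNS servers). The following Python sketch assembles such a string from a CIDR address; `kernel_ip_argument` is a hypothetical helper, IPv4 only, and whether the Container Linux initramfs honors every field is an assumption this sketch does not verify.

```python
import ipaddress


def kernel_ip_argument(address_cidr, gateway, device, dns_servers=(), hostname=""):
    """Build an ip= kernel argument for a static IPv4 address (autoconf off)."""
    iface = ipaddress.IPv4Interface(address_cidr)
    fields = [
        str(iface.ip),       # client IP
        "",                  # server IP (unused here)
        gateway,             # default gateway
        str(iface.netmask),  # dotted netmask derived from the prefix length
        hostname,            # optional hostname
        device,              # interface name, e.g. ens192
        "none",              # autoconf: no DHCP or other autoconfiguration
    ]
    fields += list(dns_servers)[:2]  # at most two DNS servers are appended
    return "ip=" + ":".join(fields)


if __name__ == "__main__":
    print(kernel_ip_argument(
        "192.168.0.31/24", "192.168.0.1", "ens192",
        dns_servers=["192.168.0.41", "192.168.0.42"],
    ))
```

Deriving the netmask from the prefix length avoids a mismatch between the /24 used in the guestinfo address and the dotted form the kernel argument expects.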

theit8514 avatar Nov 09 '19 20:11 theit8514

@theit8514 at a high level, you are expecting Ignition to fetch and apply a configuration to change the initramfs (i.e. the environment where Ignition is running), in order to influence how Ignition fetches its configuration. This is a fundamental chicken-and-egg problem which (by design) cannot be solved by Ignition alone.

There are a few ways out of that:

  • set up DHCP on the network for the main interface of the VM. This will result in a proper network environment in the initramfs, which can be used by Ignition to perform remote fetches.
  • pass the whole configuration via guestinfo.coreos.config.data. This will completely avoid network fetches in the initramfs.
  • set up static networking via kernel arguments (as in your last comment). This should work but is a bit unwieldy compared to the other ways, and it needs to be done at boot-loader time on first boot.

lucab avatar Nov 11 '19 16:11 lucab