failed to fetch config: not a config (empty) when using terraform

I first thought this might be a bug in terraform, as reported in https://github.com/terraform-providers/terraform-provider-digitalocean/issues/100, but after investigating a bit further it seems that the ignition file is actually supplied to the Droplet properly.

This is my ignition file

{
    "ignition": {
        "config": {},
        "security": {
            "tls": {}
        },
        "timeouts": {},
        "version": "2.2.0"
    },
    "networkd": {},
    "passwd": {},
    "storage": {
        "files": [{
            "filesystem": "root",
            "path": "/etc/ssh/sshd_config",
            "contents": {
                "source": "data:,%23%20Use%20most%20defaults%20for%20sshd%20configuration.%0AUsePrivilegeSeparation%20sandbox%0ASubsystem%20sftp%20internal-sftp%0AUseDNS%20no%0A%0APermitRootLogin%20no%0AAllowUsers%20core%0AAuthenticationMethods%20publickey%0A",
                "verification": {}
            },
            "mode":384
        },
        {
            "filesystem": "root",
            "path": "/etc/sysctl.d/vm.conf",
            "contents": {
                "source": "data:,vm.max_map_count%3D262144%0A",
                "verification": {}
            },
            "mode":420
        },
        {
            "filesystem": "root",
            "path": "/etc/coreos/update.conf",
            "contents": {
                "source": "data:,%0AREBOOT_STRATEGY%3D%22etcd-lock%22",
                "verification": {}
            },
            "mode":420
        }]
    },
    "systemd": {
        "units": [{
            "dropins": [{
                "contents": "[Unit]\nRequires=coreos-metadata.service\nAfter=coreos-metadata.service\n\n[Service]\nEnvironmentFile=/run/metadata/coreos\nExecStart=\nExecStart=/usr/lib/coreos/etcd-wrapper $ETCD_OPTS \\\n  --listen-peer-urls=\"http://${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0}:2380\" \\\n  --listen-client-urls=\"http://0.0.0.0:2379,http://0.0.0.0:4001\" \\\n  --initial-advertise-peer-urls=\"http://${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0}:2380\" \\\n  --advertise-client-urls=\"http://${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0}:2379,http://${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0}:4001\" \\\n  --discovery=\"https://discovery.etcd.io/43890d7ee9635609b84a067bd7214623\"",
                "name": "20-clct-etcd-member.conf"
            }],
            "enable":true,
            "name": "etcd-member.service"
        },
        {
            "dropins": [{
                "contents": "[Socket]\nListenStream=\nListenStream=12345\n",
                "name": "10-sshd-port.conf"
            }],
            "name": "sshd.socket"
        },
        {
            "contents": "[Unit]\nDescription=DockerPS\nAfter=docker.service\nRequires=docker.service\n\n[Service]\nType=oneshot\nExecStart=-/usr/bin/docker ps\n\n[Install]\nWantedBy=multi-user.target\n",
            "enabled": true,
            "name": "dockerps.service"
        }]
    }
}

If I supply it as user_data via the DigitalOcean UI it works.

If I supply it via terraform it doesn't.

This is the output from journalctl --identifier=ignition --all on a Droplet provisioned manually via the DigitalOcean UI (supplying user_data in the respective field):

[…]
Jun 12 14:04:04 localhost ignition[405]: GET http://169.254.169.254/metadata/v1/user-data: attempt #6
Jun 12 14:04:04 localhost ignition[405]: GET result: OK
Jun 12 14:04:04 localhost ignition[405]: parsing config: {"ignition":{"config":{},"security":{"tls":{}},"timeouts":{},"version":"2.2.0"},"networkd":{},"passwd":{},"storage":{"files":[{"filesystem":"root",
[…]

This is the output from journalctl --identifier=ignition --all on a Droplet that was provisioned using terraform (with the user_data directive):

Jun 12 13:37:24 localhost ignition[415]: GET http://169.254.169.254/metadata/v1/user-data: attempt #6
Jun 12 13:37:24 localhost ignition[415]: GET result: OK
Jun 12 13:37:24 localhost ignition[415]: parsing config:
Jun 12 13:37:24 localhost ignition[415]: failed to fetch config: not a config (empty)
Jun 12 13:37:24 localhost ignition[415]: not a config (empty): ignoring user-provided config

If I ssh into the droplet and curl http://169.254.169.254/metadata/v1/user-data myself I get back the ignition config posted above.
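The manual check described here can be scripted. This is only a minimal sketch: `has_payload` is a hypothetical helper name, and all it does is report whether its input contains anything other than whitespace. It does not validate the JSON.

```shell
#!/bin/sh
# has_payload (hypothetical helper): succeeds when stdin contains at least
# one non-whitespace character, fails when the input is empty or blank.
has_payload() {
    [ -n "$(tr -d '[:space:]')" ]
}

# Possible usage on the droplet (endpoint taken from the logs above):
#   curl -fsS http://169.254.169.254/metadata/v1/user-data | has_payload \
#       && echo "user-data is non-empty" \
#       || echo "user-data is empty"
```

A non-empty result here, combined with an empty "parsing config:" line in the journal, suggests the journal entries and the metadata response are not from the same fetch.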

So in theory this should work, but it doesn't.

Jun 12 '18 18:06 jaschaio

Hi! Thanks for the report. Unfortunately, I can't seem to reproduce this issue. I used your issue on terraform-provider-digitalocean to try and recreate a terraform config that would have the same issue. This is what I came up with -

# main.tf
provider "digitalocean" {
  token = "${chomp(file("~/.config/digital-ocean/token"))}"
}

resource "digitalocean_droplet" "test" {
  image = "coreos-stable"
  size = "s-1vcpu-2gb"
  count = 1
  name = "sdemos-test"
  region = "sfo2"
  private_networking = true
  resize_disk = false
  user_data = "${file("config/ignition")}"
  ssh_keys = ["46:9e:68:ed:21:b0:1e:76:75:16:6d:e1:3a:a8:49:6a"]
}

I left out the provisioner "file" and provisioner "remote-exec" sections since you mentioned omitting those didn't fix the issue.

# config/ignition
{
  "ignition": {
    "config": {},
    "security": {
      "tls": {}
    },
    "timeouts": {},
    "version": "2.2.0"
  },
  "networkd": {},
  "passwd": {
    "users": [
      {
        "name": "core",
        "sshAuthorizedKeys": [
          "ssh-rsa [omitted] demos@anduin"
        ]
      }
    ]
  },
  "storage": {
    "files": [
      {
        "filesystem": "root",
        "path": "/home/core/test",
        "contents": {
          "source": "data:,Hello%2C%20world!",
          "verification": {}
        },
        "mode": 420
      }
    ]
  },
  "systemd": {}
}

Then I just used terraform apply and it successfully created a droplet with the expected configuration. It shows up as a machine running the current coreos-stable image on DO (1745.6.0). Are you doing something else on top of that, either in your terraform configuration or your invocation, that might cause something to be configured differently?

Jun 12 '18 20:06 sdemos

Thanks for your quick reply.

Your config worked for me as well. So I took another look at my terraform config to see what might be causing the issue, and I think it's the following:

I first build an image based on coreos-beta with packer.

Instead of then using image = "coreos-beta" in my terraform droplet resource, I use the image ID I got back from packer.

Is it possible that I can only ever supply user_data and make use of it on the initial provision of a Droplet? So when I create the packer image without user_data (or an ignition config) and then try to create a new droplet based on that image, this time with a custom ignition config, it ignores it?

What I am basically trying to do is to make the droplet creation faster by doing some configuration in the packer image. I initially had two ignition files, one for packer and one for terraform. The packer ignition file made general changes to the system which would apply to all droplets. The terraform ignition file initiated the etcd cluster. I couldn't add the etcd part to the packer ignition file, as the IP of the droplet which packer uses to build the image shouldn't be added to etcd, and the IPs are only known after a droplet has been created through terraform. I think that the second ignition file supplied through terraform was never applied, so that's why I moved everything over to terraform.

It seems that when combining packer and terraform, only one of them can have an ignition file that will be applied upon boot.

Jun 13 '18 05:06 jaschaio

I don't have experience with packer, but from my understanding of what you are saying, it sounds like you are provisioning a CoreOS machine with packer, which then takes a snapshot of the resulting machine, and lets you boot from that snapshot?

Ignition only runs once on a particular machine, only on the very first boot. After that, if you make changes to your ignition config and want them applied to your machine, you have to reprovision it from scratch. If packer is provisioning a machine, then terraform is using that already provisioned machine, ignition will never run again, and so no ignition configuration that you provide later will ever be applied.

Most of the time, terraform is aware of this constraint, and if you run terraform apply with a different ignition config, it will reprovision all your machines for you. However, if it thinks it's starting with a fresh image, it has no way to know about that.

Jun 13 '18 20:06 sdemos

Ran into this and came across this issue as well as https://github.com/coreos/bugs/issues/2090. It's not perfect but it seems like the following workaround causes Ignition to run as expected the second time:

sudo touch /boot/coreos/first_boot
sudo rm /etc/machine-id
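
As a rough sketch, the two commands above could be wrapped in a small function and run as the last provisioning step of the image build, before the snapshot is taken. The function name `rearm_ignition` and its optional root-prefix argument are hypothetical. The prefix exists only so the steps can be tried against a scratch directory first. On a real machine the function would have to run as root.

```shell
#!/bin/sh
# rearm_ignition (hypothetical helper): repeat the workaround above.
# $1 is an optional root prefix for dry runs; leave it empty on a real machine.
rearm_ignition() {
    root="${1:-}"
    # Recreate the flag file so the next boot is treated as a first boot.
    touch "${root}/boot/coreos/first_boot" || return 1
    # Remove the machine ID so the next boot generates a fresh one.
    rm -f "${root}/etc/machine-id"
}
```

Called with no argument, it touches /boot/coreos/first_boot and removes /etc/machine-id, the same two paths as the commands above.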

When I encountered this I was building an image (for AWS) with Packer. I wasn't provisioning that image with Ignition but hit this anyway when provisioning via Terraform/user_data.

I haven't had a chance to track down why it's parsed as an empty response resulting in the confusing error.

Jun 22 '18 21:06 bendrucker

I have also hit this error. Specifically, it occurs when an image based on coreos is provisioned using packer.

The ignition logs have a datetime that matches the datetime of the packer run, but this is easy to miss.

The user data fetch is empty in the logs because no user data was provided to the packer builder vm, which is the vm ignition ran on.

I don't think this is a bug of any kind in coreos, but rather a mismatch of expectations for packer users who provision using some other method and then expect ignition provisioning to be an available option upon launching the packer provisioned image.

It seems some documentation on how to force a reprovisioning is missing, or I have missed it. I think this documentation would essentially solve the issue.

Sep 25 '19 23:09 brthor

> It seems some documentation on how to force a reprovisioning is missing, or I have missed it. I think this documentation would essentially solve the issue.

The instructions in https://github.com/coreos/bugs/issues/2456#issuecomment-399585191 are correct, but we don't generally encourage reprovisioning of already-provisioned machines.

Sep 25 '19 23:09 bgilbert

I think the use-case could be better stated as building a custom coreos image, then provisioning it later.

Except the custom coreos image part fires ignition, rather than the desired provisioning part.

Agreed though, the above instructions worked for me.

If you see value in the use-case from the coreos perspective, it may be worth a mention in the official docs. It took me some time to land on this page.

Sep 26 '19 02:09 brthor