deploykit Where does infrakit keep info about instances

I'm implementing an instance plugin for ProfitBricks.

I've been looking at two implementations: infrakit.aws and infrakit-plugin-sakuracloud. The only difference I've noticed between the two is that the sakuracloud implementation persists data about its instances in a file. The AWS implementation doesn't do that (or maybe I haven't seen it).

What I'm looking for is a way to keep track of the instances my provider has provisioned, even if infrakit has been restarted. What is the recommended practice for something like this, or does infrakit handle it?

Nov 15 '16 11:11 jasmingacic

Instance tracking is the responsibility of the instance plugin. I would not recommend storing instance state in a file, both for the reason you note and for the sake of high availability. In the case of infrakit.aws, we lean on the AWS API as the source of truth for existing instances, and we assume that other infrastructure provider APIs have similar support for tagging instances and querying for a list of instances.
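To illustrate what "the provider API is the source of truth" means, here is a minimal sketch. It is not the infrakit.aws code: the `Instance` type and the `matches` and `describeInstances` functions are hypothetical, and the provider listing is passed in as a plain slice where a real plugin would make an API call.

```go
package main

import "fmt"

// Instance is a hypothetical, simplified view of a provider-side VM.
type Instance struct {
	ID   string
	Tags map[string]string
}

// matches reports whether inst carries every tag in want.
func matches(inst Instance, want map[string]string) bool {
	for k, v := range want {
		if got, ok := inst.Tags[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// describeInstances keeps no local state: it filters whatever the
// provider currently reports, so a plugin restart loses nothing.
func describeInstances(listing []Instance, want map[string]string) []Instance {
	var out []Instance
	for _, inst := range listing {
		if matches(inst, want) {
			out = append(out, inst)
		}
	}
	return out
}

func main() {
	// Stand-in for the result of a provider "list instances" call.
	listing := []Instance{
		{ID: "i-1", Tags: map[string]string{"group": "workers"}},
		{ID: "i-2", Tags: map[string]string{"group": "db"}},
		{ID: "i-3"}, // created outside of the plugin, no tags
	}
	for _, inst := range describeInstances(listing, map[string]string{"group": "workers"}) {
		fmt.Println(inst.ID)
	}
}
```

Because the tags live with the instances themselves, there is nothing for the plugin to persist between restarts.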

Nov 15 '16 15:11 wfarner

Unfortunately the API we are using doesn't support tagging. All we can do is query the API for the list of datacenters, servers and various other objects.

Does infrakit keep track of any data related to the instances? What would you recommend that we do for the use case above?

Nov 15 '16 16:11 jasmingacic

You could configure your plugin to accept a globally-accessible store, e.g. S3. On the InfraKit side, we're starting to add persistent state, but it's unclear if/when/how that would be exposed to arbitrary plugins.

Nov 15 '16 16:11 wfarner

What would be the difference between infrakit reading the information from S3 and reading it from a local file? Both are there to provide information about the existing infrastructure; the only difference is how you access each.

Nov 15 '16 16:11 jasmingacic

I'm thinking of S3 as a benefit to provide high availability and better durability than a local file.

Nov 15 '16 17:11 wfarner

So essentially there is no difference as long as it is accessible? Is it safe to say that infrakit is ready for third party plugins like the one we are developing, or is it too early?

Nov 15 '16 17:11 jasmingacic

> So essentially there is no difference as long as it is accessible?

From the perspective of the group plugin, it's ignorant of what state is used for DescribeInstances() - so yeah, anything will do.

However, thinking this over more, I wonder if we should explore having the Group plugin store Instance records on behalf of the Instance plugin. Not necessarily by exposing a storage API, but by maintaining the records based on the results of Provision() and Destroy() calls.
@chungers any thoughts on that?

> Is it safe to say that infrakit is ready for third party plugins like we are developing or is it too early?

Good question. Now is a great time to start experimenting and help us flesh out APIs and design assumptions, as you are doing now (thank you!). Given that we're actively changing APIs and tweaking the design, I can understand how it might be moving too quickly for some appetites.

Nov 15 '16 17:11 wfarner

I agree with what you said. Since Plugin is an interface, it would be fairly simple (I'm only assuming) to use its output to persist that information, similar to what docker-machine and terraform are doing. There is no need to expose a storage API, since it is easier to have one format across all plugins than for each plugin to persist information as it deems necessary.

Nov 15 '16 18:11 jasmingacic

The manager, which exposes a Group interface but provides storage and leader detection for HA, can be a good example for how to go about solving this.

We can implement an Instance plugin that provides instance-metadata storage for instance plugins on platforms that do not support tags. Essentially, this instance plugin will implement DescribeInstances itself but call the actual instance plugin's Provision to do the provisioning. The user could configure it to use something like FUSE over S3 to achieve HA.

It's important to note that as far as the InfraKit group plugin is concerned, the data stored by the plugin is the master record. The Group plugin will call the instance plugin as usual and aggregate information from different sources to compute the true state of the infrastructure.
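A rough sketch of that wrapping plugin might look like the following. Everything here is an assumption made for illustration: the `Provisioner` and `Store` interfaces, `RecordingPlugin`, `memStore` and `fakeBackend` are hypothetical names, the signatures are much simpler than InfraKit's real instance plugin interface, and the in-memory store stands in for whatever durable backend the user would configure.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Provisioner is a hypothetical, cut-down view of the wrapped instance plugin.
type Provisioner interface {
	Provision(spec string) (id string, err error)
	Destroy(id string) error
}

// Store is a hypothetical record store; a real one would be durable and shared.
type Store interface {
	Put(id, spec string) error
	Delete(id string) error
	List() (map[string]string, error)
}

// memStore is an in-memory Store used only to keep the sketch runnable.
type memStore struct {
	mu sync.Mutex
	m  map[string]string
}

func newMemStore() *memStore { return &memStore{m: map[string]string{}} }

func (s *memStore) Put(id, spec string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[id] = spec
	return nil
}

func (s *memStore) Delete(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.m, id)
	return nil
}

func (s *memStore) List() (map[string]string, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make(map[string]string, len(s.m))
	for k, v := range s.m {
		out[k] = v
	}
	return out, nil
}

// RecordingPlugin delegates Provision/Destroy and answers DescribeInstances
// from its own records instead of from provider-side tags.
type RecordingPlugin struct {
	backend Provisioner
	store   Store
}

func (p *RecordingPlugin) Provision(spec string) (string, error) {
	id, err := p.backend.Provision(spec)
	if err != nil {
		return "", err
	}
	// Note: if Put fails the instance exists but is unrecorded;
	// a real implementation would need to handle that case.
	return id, p.store.Put(id, spec)
}

func (p *RecordingPlugin) Destroy(id string) error {
	if err := p.backend.Destroy(id); err != nil {
		return err
	}
	return p.store.Delete(id)
}

func (p *RecordingPlugin) DescribeInstances() ([]string, error) {
	recs, err := p.store.List()
	if err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(recs))
	for id := range recs {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids, nil
}

// fakeBackend stands in for a provider API without tag support.
type fakeBackend struct{ next int }

func (f *fakeBackend) Provision(spec string) (string, error) {
	f.next++
	return fmt.Sprintf("vm-%d", f.next), nil
}

func (f *fakeBackend) Destroy(id string) error { return nil }

func main() {
	p := &RecordingPlugin{backend: &fakeBackend{}, store: newMemStore()}
	first, _ := p.Provision("small")
	p.Provision("small")
	p.Destroy(first)
	ids, _ := p.DescribeInstances()
	fmt.Println(ids)
}
```

Keeping the store behind an interface is what would let the same wrapper run against a local directory, a FUSE mount over S3, or anything else that is reachable from every node.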

Nov 15 '16 21:11 chungers

@jasminSPC a little late to the discussion, but it seems that a request to the DataCenter API will provide you with the data you need. GET /datacenters/{datacenterId} Returns details about the datacenter with the state being part of the response.

Per the docs:

AVAILABLE There are no pending modification requests for this item; 
BUSY There is at least one modification request pending and all following requests will be queued; 
INACTIVE Resource has been de-provisioned.

The InfraKit plugin should be able to infer what needs to be done based on the datacenter state and the entities attached to it.

NB: I haven't done any dev work with ProfitBricks, all of the above is inferred from their API and its possible use.

Nov 28 '16 21:11 FrenchBen

@FrenchBen A virtual datacenter is a logical container for all the objects that ProfitBricks users create. So yes, GET /datacenters/{datacenterId} can indeed provide a lot of data, but in most cases users will have multiple servers in one datacenter. One of the reasons is that only servers within the same datacenter can communicate with each other over a private LAN. Also, the state of a datacenter can mean many different things, such as a load balancer being created. Secondly, if a single datacenter is the unit being monitored and it is terminated to reduce the group size, then the other resources in it that are not controlled by InfraKit would be removed as well.

So what we did in the end, inspired by Terraform, is this: each time a server instance is created, we create a file named after the instance ID plus a .pbstate extension, which contains the instance.Description. When DescribeInstances is called, we read all the .pbstate files and check whether each instance still exists in ProfitBricks; if it doesn't, we skip it and continue to the next one.

Also, DatacenterId is one of the required fields in the configuration file, so all the resources provisioned by InfraKit end up in the same datacenter.

Nov 28 '16 23:11 jasmingacic

@jasminSPC Will you open source the plugin? It'd be neat to see how this was built and possibly allow others to be inspired by it.

Nov 29 '16 00:11 FrenchBen

@FrenchBen https://github.com/profitbricks/infrakit-instance-profitbricks There you go. There are many pieces that could be cleaned up but this is the general idea.

Nov 29 '16 00:11 jasmingacic