Splitting pipeline.yml into several files via simple templating
Feature Request
What challenge are you facing?
Right now, pipeline.yml files are getting too big, and the information for a single job ends up spread across the whole file. Finding and editing a job is pretty painful, even though I use the Concourse plugin for Visual Studio Code with its outliner and search.
I'd like a way to group jobs and their resources into separate files, which are then combined into e.g. a single pipeline file.
A Modest Proposal
I propose that we allow splitting pipeline.yml into several files with a static structure, for better overview, understanding, and a better editing experience.
A typical folder layout would look like:
- mypipeline.pl.yml
- mypipeline/jobs/myjob1.job.yml
- mypipeline/jobs/myjob2.job.yml
- mypipeline/resources/resource1.resource.yml
- mypipeline/resources/resource2.resource.yml
Groups and resource-types should stay in the mypipeline.pl.yml.
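To make that concrete, here is a hypothetical sketch (all names illustrative) of what mypipeline.pl.yml and one job file might contain:

```yaml
# mypipeline.pl.yml (hypothetical): only groups and resource-types stay here
resource_types:
- name: slack-notification
  type: registry-image
  source: {repository: cfcommunity/slack-notification-resource}
groups:
- name: build
  jobs: [myjob1, myjob2]
```

```yaml
# mypipeline/jobs/myjob1.job.yml (hypothetical): exactly one job per file
name: myjob1
plan:
- get: my-repo
  trigger: true
- task: build
  file: my-repo/ci/build.yml
```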
The integration should go into fly directly: when a pl.yml is passed to fly, fly compiles it using the template engine.
Milestone 1: single-file
Every resource and every job must go in exactly one file; it is not allowed to have two resources in one *.resource.yml file, and the same applies to jobs.
Milestone 2: allow "bundles"
E.g. by setting template_engine: bundle in mypipeline.pl.yml, people can opt into the combining method. In this case, one can put several jobs and several resources into one file. Most probably such files would get an additional suffix, e.g. myjob1.bundle.job.yml.
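A hypothetical bundle file might then look like:

```yaml
# myjob1.bundle.job.yml (hypothetical): several jobs in one file
jobs:
- name: myjob1
  plan: ...
- name: myjob2
  plan: ...
```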
Open Questions:
a) Should we separate by folder name (mypipeline/jobs/...) or by file suffix (myjob1.job.yml)?
I am for the latter, while allowing people to create any subfolders they want; we would scan recursively. This also allows for better IDE support.
b
I would love to see something like this. I'm kind of tempted to suggest the Ansible route though, where you simply include the file. It makes the whole thing less opinionated.
include: file.yml
Perhaps you could use wildcards too, so your layout would work like this:
```yaml
resources:
  include: ./resources/*.yml
jobs:
  include: ./jobs/*.yml
```
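Under that scheme, each included file would presumably hold just the bare list entries for its section; a hypothetical ./resources/repo.yml:

```yaml
# Hypothetical ./resources/repo.yml: bare resource entries that get
# spliced into the pipeline's top-level resources list
- name: repo
  type: git
  source:
    uri: https://example.com/repo.git
```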
Will reusability still work then? Say I have two resources:
```yaml
- name: foo-develop
  type: git
  source: &repo-source
    branch: develop
    uri: foo-bar-url.git
    username: {{git-user}}
    password: {{git-pwd}}
- name: foo-master
  type: git
  source:
    <<: *repo-source
    branch: master
```
I like the file suffix since I already have scripts and tasks ...
I've been thinking about this for a while now too, and would love to see something that helps split files out. Not sure it would be useful without variable substitution of some description though (at least not for us).
Something like:
```yaml
include:
  file: include.yml
  params:
    env: qa
    region: region-1

include:
  file: include.yml
  params:
    env: ci
    region: region-2
```
and that would take a file like include.yml
```yaml
- task: Deploy
  file: tasks/deploy.yml
  params:
    region: {.region}
    environment: {.env}
    component: grafana
```
and produce
```yaml
- task: Deploy
  file: tasks/deploy.yml
  params:
    region: region-1
    environment: qa
    component: grafana
- task: Deploy
  file: tasks/deploy.yml
  params:
    region: region-2
    environment: ci
    component: grafana
```
Consequently, any top-level constructs (jobs, groups, resources, resource-types) should be consolidated, so that you could declare your required resource and/or group in the included file instead of having to know how it will present itself in the template file.
It would also be nice to be able to define these recursively. And perhaps even have a for-loop construct, although that's getting rather ninja.
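Purely as a sketch of that loop idea (this syntax is hypothetical and not implemented anywhere), it could collapse the two includes above into one:

```yaml
# Hypothetical syntax, not implemented anywhere
include:
  file: include.yml
  for_each:
  - {env: qa, region: region-1}
  - {env: ci, region: region-2}
```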
Not sure just how easy this is, but I'd be willing to help bring it to completion. My current pipelines are getting pretty hairy, and for all the splitting out of tasks that we do, there's still a lot of duplication and boilerplate copy-and-paste, which is very error-prone.
This is really needed. Personally I think include: should take a list of .yml files that are just regular pipeline configurations, and then merge them before running validate and set-pipeline. E.g., if we have:
pipeline.yml:
```yaml
include:
- file: ./include.yml
- file: ...

resources:
- name: git-1
  type: git
  source: ...

jobs:
- name: some-task-1
  plan: ...
```
And include.yml:
```yaml
resources:
- name: git-2
  type: git
  source: ...

jobs:
- name: some-task-2
  plan: ...
```
Setting pipeline.yml would then merge in the contents of include.yml (and any other files), appending to jobs, resources and so forth. The benefit is that pipeline definitions stay loosely coupled, so include.yml can be 100% standalone and deployed as a separate pipeline as well.
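To illustrate those merge semantics, the combined result for the two files above would presumably be:

```yaml
# Presumed merged output: top-level lists are appended
resources:
- name: git-1
  type: git
  source: ...
- name: git-2
  type: git
  source: ...
jobs:
- name: some-task-1
  plan: ...
- name: some-task-2
  plan: ...
```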
EDIT: Had a quick look at fly and atc:
- Fly parses the pipeline definition based on a struct from the ATC.
- Adding `includes` to the ATC config does not make a lot of sense, since local files are not available to the ATC. (But maybe it's best to do that anyway.)

Either way I spin the scenario above, some logic would end up in fly. I think the best case would be:
- Give the `atc.Config` struct a method to `Merge` or `Append` with another pipeline definition, which could be used in `fly`.
- `Includes` could probably also go into the `atc.Config`.
- Add logic to `fly` to parse all files under `Includes` and merge/append them before calling `Set`.
Having thought a bit more about this, and glanced over the current implementation:
Fly gets the "schema" for a pipeline definition straight from concourse/atc. Since `includes:` will never mean anything to the ATC, it would just be there to aid fly in doing its job, which does not make a lot of sense to me. I think that instead we should just add support for a glob pattern (and/or allow passing -c multiple times) to fly:
fly -t main set-pipeline -p test -c */.ci/pipeline.yml
Pros:
- Only have to add a `Merge` method to `atc.Config` to combine two pipeline definitions.
- You can write standalone pipelines and combine them using the glob pattern.

Cons:
- Obfuscates things like `passed:` if the job it should have passed is defined in another pipeline definition (we could strictly enforce that definitions be valid on their own).
- Can't reuse YAML anchors between files, but we could/should use config files instead?
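To make the `passed:` concern concrete, a hypothetical standalone file like the following would only validate once merged with whichever file defines unit-tests:

```yaml
# deploy.yml (hypothetical): passed: [unit-tests] points at a job
# defined in a different pipeline file
jobs:
- name: deploy
  plan:
  - get: repo
    passed: [unit-tests]
```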
@itsdalmo That still doesn't address the reuse problem. Splitting the pipeline into different files lowers the cognitive threshold for each file, but you still end up with the exact same config (now in multiple files) to touch during any refactor or addition.
Maybe it belongs in another tool. Certainly other people have solved the problem with things like ejs. I don't like that, since it adds another step to the workflow for deploying a pipeline, one that can easily be forgotten. Something native to fly would be much preferable.
Hi there, any update on this feature?
Bumping this thread! This would be a very useful feature.
Seems like @EugenMayer has worked on this issue: https://github.com/EugenMayer/concourse-pipeline-templateer. Perhaps that's finally a solution to this issue?
We use gomplate to autogenerate and concatenate YAML files for Concourse (as well as other YAML files). It has many features that we find useful when manipulating YAML files in these kinds of situations.
Woah, this got a lot of thumbs-ups, so this deserves some attention!
This was something that @EugenMayer and I discussed in Discord a while back. I like it as a templating-system-neutral middle ground that lets you split up your giant YAMLs and make traversing and editing individual resources and jobs work more nicely with editors, shells, git, etc.
Simply splitting one file into many files makes it a lot easier to manage, and is a cheap UX win: now you can use a fuzzy-finder to edit a job (e.g. Ctrl+P in vim), now you can use git log to see when a job or resource has changed, now GitHub's UI will show that automatically, etc. You could potentially even use symlinks to share resource definitions across pipelines. (Although that sounds a little scary. With great power comes great responsibility, I suppose.)
I'm actually pretty excited to pull something like this into fly as client-side sugar. But to be honest, I really wouldn't want this feature to be any more than that. I don't want to implement or adopt a templating system. Every time someone suggests that we adopt templating system X, the next day someone else suggests that we adopt templating system Y. :confused: I'm not exaggerating, this happens all the time. And I don't want to implement a templating system, because that's just another thing for new users to learn. It's more likely that those new users are familiar with one of the many existing templating systems out there, and if what they know isn't powerful enough, there is definitely another well-supported one they can try.
I think this matter is orthogonal to implementing an actual templating system, and we should instead make sure whatever we do here allows users to continue to template their pipelines using whatever tool they like. This better fits the UNIX philosophy and is more likely to keep everyone happy.
All it might take is allowing this:
fly -t ci set-pipeline --config-dir ./my-pipeline
...but also allowing this:
fly -t ci build-pipeline --config-dir ./my-pipeline | my-templating-tool | fly -t ci set-pipeline -c -
This build-pipeline command would just take all the jobs and resources from the given directory and spit out one big pipeline YAML (the same thing set-pipeline would do under the hood). We could change set-pipeline to explicitly support reading config from stdin. (You can already do this with process substitution shell syntax, but still.)
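For reference, the process-substitution workaround mentioned above looks something like this today (my-templating-tool is a stand-in for whatever templating tool you use):

fly -t ci set-pipeline -p my-pipeline -c <(my-templating-tool pipeline.tmpl.yml)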
If that sounds good, I think we could prioritize this and implement it pretty quickly. I'm interested in hearing from folks who template their pipelines to make sure that this would mesh well. :slightly_smiling_face:
A really simple thing I tried to do when I first started writing pipelines was using task files that lived alongside my pipeline definitions, instead of a file from an input. Being able to do something like the example below, where example.yml sits next to the pipeline file, might support some of the simpler use cases here, although it is already possible to do something similar with YAML anchors.
I think set-pipeline being able to read from STDIN would be a really good improvement regardless - I already script all the pipeline configuration I use anyway.
```yaml
plan:
- task: a
  file: pipelines/jobs/example.yml
  params: {foo: bar}
- task: b
  file: pipelines/jobs/example.yml
  params: {foo: baz}
```
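For comparison, a sketch of the YAML-anchor version of the same reuse, which already works today (the inlined task config here is illustrative):

```yaml
plan:
- task: a
  config: &example-task
    platform: linux
    image_resource:
      type: registry-image
      source: {repository: busybox}
    run: {path: echo, args: ["hello"]}
  params: {foo: bar}
- task: b
  config: *example-task
  params: {foo: baz}
```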
I agree with @vito that introducing a templating system would do more harm than good. I am perfectly fine using a combination of aviator (and sometimes spruce directly) and YAML anchors. I prefer to use one templating standard (spruce) across everything YAML (Helm charts, for example, which also have their own templating system).
The two features described above (read from stdin and --config-dir) would be helpful in my setup, removing the need to run fly in loops and possibly removing some of the YAML merge steps.
It's possible to do this somewhat with variable files. For example, this works:
jobs/my_job.yml
```yaml
my_job:
  name: unit-tests
  serial_groups: [unit-tests]
  plan:
  - get: my-repo
    trigger: true
  - task: my-task
  # etc...
```
pipeline.yml
```yaml
jobs:
- ((my_job))
```
fly -t main set-pipeline -c pipeline.yml -p main -l jobs/my_job.yml
The problem is that while you can use Vault variables in my_job.yml, you cannot use other local variables. But if you're cool with putting all of your non-secret variables in Vault (which I don't agree with, and which stopped me from implementing this, but that's your business), this works.
On the other hand, because this already works, a modified and less hacky version of it might be an easy path toward developing a build-pipeline command.
Beep boop! This issue has been idle for long enough that it's time to check in and see if it's still important.
If it is, what is blocking it? Would anyone be interested in submitting a PR or continuing the discussion to help move things forward?
If no activity is observed within the next week, this issue will be ~~exterminated~~ closed, in accordance with our stale issue process.
'Projects' (concourse/rfcs#32) accomplish part of this by allowing resources to be split out and defined project-wide.
It doesn't replace this feature, since it adds additional semantics to those 'split out' resource definitions rather than being a pure templating implementation. I'm curious to see how much we'll still want this after all is said and done. :thinking:
I'm going to add this to the 'Projects' project but put it on the backburner for now. We may want to revisit at some point since it would need to be supported by however 'Projects' loads up pipeline configs.
Pipeline configuration templating/slicing would be really nice to have, to decrease maintenance and increase reuse.
However, to keep configuration simple and lower the configuration threshold, it would be appreciated if the templating feature were supported in the Visual Studio Code Concourse Pipeline Editor as well. If you have multiple files for your configuration, you still want the editor to validate the aggregated pipeline to help you find misconfigurations.
I would love componentization of pipeline jobs. It would make things much easier to understand for our customers and allow for easier maintenance.
We are looking for a templating feature that allows multiple DevOps teams to reuse the main template with custom ops-files, something similar to the BOSH --ops-file flag, since Concourse already does interpolation nicely with CredHub and AWS Secrets Manager. This kind of configuration flag would allow applying multiple ops-files to a single main Concourse pipeline.
Waiting for this one
We are quite satisfied with aviator for templating our Concourse pipelines.
https://github.com/herrjulz/aviator
@ringods thank you man, I'm looking into aviator
@ringods not sure if you're interested in this, but feel free to PR a guide on using aviator here: https://github.com/concourse/docs/blob/master/lit/docs/guides/pipeline-guides.lit
It would render in this section of the site: https://concourse-ci.org/how-to-guides.html
I'm adding a guide right now about templating with ytt since it's what I'm somewhat familiar with: https://github.com/concourse/docs/pull/442