Error handling suggestion
The problem: avoid littering and ease cleanup when a deployment fails (e.g. due to a dry run failing).
The proposed solution: consider
networks:
  production:
    host:
      - 192.168.0.1

commands:
  fetch:
    desc: Fetch the build
    run: wget http://s3.amazonaws.com/my/src/code.tgz -O /tmp/code.tgz
    ensure: rm -f /tmp/code.tgz
  extract:
    desc: Extract the build
    run: cd /app/releases && tar -xvzpf /tmp/code.tgz
    on-error: rm -rf /app/releases/code
  dryrun:
    desc: Do a dry run
    run: /app/releases/code/dry-run.sh
  start:
    desc: Start the app
    run:

targets:
  deploy:
    - fetch
    - extract
    - dryrun
    - start
ensure scripts would always be run after all commands have run. They run in reverse order.
on-error scripts would run only if a command failed. They also run in reverse order.
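A minimal sketch of these semantics, as a hypothetical bash simulation (not sup itself; command strings are illustrative): each command may register an ensure hook, and after the whole target has run, the hooks execute in reverse order with their exit codes ignored.

```shell
#!/bin/bash
# Hypothetical simulation of the proposed semantics -- not sup itself.
# Each command may register an "ensure" hook; after the whole target has
# run, hooks execute in reverse order and their exit codes are ignored.

ENSURES=()

run_cmd() {                       # run_cmd <command> [ensure-hook]
  echo "run: $1"
  eval "$1"
  if [[ -n "${2:-}" ]]; then
    ENSURES+=("$2")
  fi
}

run_ensures() {
  # reverse order: the most recently registered hook runs first
  for (( i=${#ENSURES[@]} - 1; i >= 0; i-- )); do
    eval "${ENSURES[i]}" || true  # ignore hook failures
  done
}

# Simulate a "deploy" target of fetch -> extract -> dryrun
run_cmd "touch /tmp/code.tgz" "rm -f /tmp/code.tgz; echo 'ensure: fetch'"
run_cmd "true"                "echo 'ensure: extract'"
run_cmd "true"                # dryrun: no ensure hook
run_ensures                   # extract's hook runs first, then fetch's
```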
Objections and discussions are welcome.
Awesome idea, something like a rollback would be cool as well. Maybe we can use a target name for a rollback.
networks:
  production:
    host:
      - 192.168.0.1

commands:
  fetch:
    desc: Fetch the build
    run: wget http://s3.amazonaws.com/my/src/code.tgz -O /tmp/code.tgz
    ensure: rm -f /tmp/code.tgz
  extract:
    desc: Extract the build
    run: cd /app/releases && tar -xvzpf /tmp/code.tgz
    rollback: cleanup
  dryrun:
    desc: Do a dry run
    run: /app/releases/code/dry-run.sh
  remove_tmp:
    desc: Remove the temporary build
    run: rm -rf /tmp/build
  cleandb:
    desc: Clean my db
    run: dbcommand --remove-last-migration
  start:
    desc: Start the app
    run:

targets:
  deploy:
    - fetch
    - extract
    - dryrun
    - start
  cleanup:
    - remove_tmp
    - cleandb
Good ideas, but we need to make sure that the semantics are consistent and clear. For a "command" we now accept:
- desc
- run
- script
- local
- serial
- once
Just something to consider: we should keep this list as minimal and as intuitive as possible.
@eduardonunesp - When is rollback invoked? Only on errors? Or is it a special command, like `sup production rollback` (Capistrano has a default rollback command)? I tend to think the term is slightly overloaded and hence not very clear. Just my 2 cents. But maybe your approach has a simpler model, in that you don't mix things at the command level with things at the target level (except for pointing at it).
@pkieltyka Agree - as this list grows the complexity also grows. Any suggestions? I really do think error handling is needed somehow.
@stengaard Indeed, my idea adds some complexity; on-error looks good as well. My major concern is cleaning up the deploy when something goes wrong.
@pkieltyka The list looks good and small, but a fallback like on-error is still missing.
Maybe the command should be onerror, just to keep the pattern.
I like the concept of onerror roll-backs :+1:
I'm missing why we'd need ensure, though -- you can put the ensuring steps right into your existing commands, like
commands:
  fetch:
    run: >-
      mkdir -p /tmp/code &&
      curl http://s3.amazonaws.com/my/src/code.tgz | tar -xvzp -C /tmp/code || exit 1 &&
      test -d /tmp/code || exit 1
The point of ensure was that these scripts run after all other commands in the target list have run, regardless of return codes. (Also: the return code of an ensure command should be ignored.)
So they are very handy for cleanup tasks that should always run, without chaining a lot of commands together.
ensure: rm -rf /tmp/code
They should of course only be run if the command they are attached to has run.
You can use http://redsymbol.net/articles/bash-exit-traps/ for a clean-up on exit. I'm still not convinced we need ensure, since you can achieve the same thing easily using bash/sh. Let's keep the API clean unless we have a really strong consensus on a feature like that.
However, I like the onerror rollbacks, that's a great feature. That one makes sense, as it's not easily achievable by sequentially executed bash commands.
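For reference, the exit-trap pattern from the linked article looks roughly like this (a sketch; the path and commands are illustrative). The trap fires on success, failure, or an uncaught signal, which covers much of what the proposed ensure would do:

```shell
#!/bin/bash
# Cleanup via an EXIT trap: the trap fires whether the script succeeds or
# fails. Run in a subshell here so the trap's effect is visible immediately.

out=$(
  tmpfile=/tmp/code.tgz            # illustrative path
  cleanup() { rm -f "$tmpfile"; echo "cleaned up"; }
  trap cleanup EXIT

  touch "$tmpfile"                 # stand-in for the fetch step
  echo "fetched"
)                                  # subshell exits -> trap runs

echo "$out"                        # prints "fetched" then "cleaned up"
```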
I do agree with your views on API size, but I still feel I haven't explained the usefulness of ensure well enough. I'll give it another go. :)
The thing about ensure is that you might want to use the product of a command (e.g. /tmp/code) in a later command - but would then need a separate cleanup command.
Compare
commands:
  fetch:
    run: curl http://s3/build_script > /tmp/file
  build:
    run: /tmp/file
  cleanup:
    run: rm /tmp/file

targets:
  deploy:
    - fetch
    - build
    - cleanup
To
commands:
  fetch:
    run: curl http://s3/build_script > /tmp/file
    ensure: rm /tmp/file
  build:
    run: /tmp/file

targets:
  deploy:
    - fetch
    - build
The point I'm trying (badly) to get at is locality and readability. In the second example, the same piece of "code" that litters is charged with cleaning up after itself once it's done. Further, you can't compose the discrete commands in a way that would litter.
Also: maybe the name ensure is horrible. I suck at naming things. If I had a dog, its name would be: Dog. Or Arnie.
Haha dog :smile: ! Dog come here!