Question/Feature request: how can I prepend an environment name to configured log groups at runtime
Scenario
We have implemented a 'golden image' pipeline using AWS ImageBuilder that bakes in our base CloudWatch agent configuration (core log files and system metrics),
e.g.
...
"logs": {
  "log_stream_name": "${aws:InstanceId}",
  "force_flush_interval": 60,
  "logs_collected": {
    "files": {
      "collect_list": [
        {
          "file_path": "/var/log/syslog",
          "log_group_name": "syslog",
          "timestamp_format": "%b %d %H:%M:%S"
        },
...
This results in one massive log group containing a stream for every instance we're running in the region. We want to be able to separate those out into, say, '/production/syslog' and '/staging/syslog'...
Work Around
Our current work-around is to run sed over the base config included in the golden image early in the boot order...
e.g. sed -E 's#"log_group_name": "([^"]*)"#"log_group_name": "/ENV_NAME/TIER/\1"#g' /path/to/amazon-cloudwatch-agent.json
"logs": {
  "log_stream_name": "${aws:InstanceId}",
  "force_flush_interval": 60,
  "logs_collected": {
    "files": {
      "collect_list": [
        {
          "file_path": "/var/log/syslog",
          "log_group_name": "/ENV_NAME/TIER/syslog",
          "timestamp_format": "%b %d %H:%M:%S"
        },
        {
          "file_path": "/var/log/auth",
          "log_group_name": "/ENV_NAME/TIER/auth",
          "timestamp_format": "%b %d %H:%M:%S"
        },
        {
          "file_path": "/var/log/audit/audit.log",
          "log_group_name": "/ENV_NAME/TIER/audit"
        },
        {
          "file_path": "/var/log/aide/aide.log",
          "log_group_name": "/ENV_NAME/TIER/aide"
        }
      ]
    }
  }
}
Desired Behaviour
It would be great if we could just declare arbitrary environment variables that are picked up by the agent and substituted in at runtime, e.g. "log_group_name": "/${env.ENV_NAME}/${env.TIER}/aide"
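Until something like that exists in the agent, the idea can be approximated outside it. The following is only a sketch, not agent behaviour: render_env_placeholders is a hypothetical helper that expands ${env.NAME} placeholders in a config file from the process environment. Substituting an "unset-NAME" marker for a missing variable is an arbitrary choice made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the CloudWatch agent: expands
# ${env.NAME} placeholders in the file given as $1 and prints the result.
render_env_placeholders() {
    awk '
    {
        line = $0
        out = ""
        # Bracket expressions keep "$", "{", "." and "}" literal in any awk.
        while (match(line, /[$][{]env[.][A-Za-z_][A-Za-z0-9_]*[}]/)) {
            # Skip the 6-character "${env." prefix and the trailing "}".
            name = substr(line, RSTART + 6, RLENGTH - 7)
            # Assumption: mark unset variables instead of failing.
            value = (name in ENVIRON) ? ENVIRON[name] : "unset-" name
            out = out substr(line, 1, RSTART - 1) value
            line = substr(line, RSTART + RLENGTH)
        }
        print out line
    }' "$1"
}
```

With ENV_NAME=staging exported and TIER unset, the line from the example above would come out as "log_group_name": "/staging/unset-TIER/aide". Only exported variables are visible to awk through ENVIRON.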
Hey Dan,
thanks for submitting the feature request. We have had many requests for enhanced string interpolation in those fields, but have not yet thought through how it might be implemented. I like where you're going with the env dot notation there, but we would need to think through cases where the env var is not set and how this would be supported on Windows and Linux. I've taken note of the request and will keep you updated on how things progress.
Thanks for the response, John. My intuition led me to hope that I could define some params in the common toml file (actually, it was the env* file first, but that doesn't appear to be used by anything)...
For anyone hoping to achieve something similar, here is my take on it...
/usr/local/sbin/amazon-cloudwatch-agent.bootstrap.sh
#!/bin/sh
set -ex
# check if the vars are defined in a file
test -f /etc/default/amazon-cloudwatch-agent && . /etc/default/amazon-cloudwatch-agent
# check if they are defined using instance tags
if [ "${ENV_NAME}" = "" ]; then
    REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/.$//')
    INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    TAGS=$(/usr/local/sbin/aws ec2 describe-tags --region "$REGION" --filters "Name=resource-id,Values=$INSTANCE_ID" | jq -r '.Tags[]')
    ENV_NAME=$(echo "$TAGS" | jq -r 'select(.Key == "environment") | .Value')
    TIER=$(echo "$TAGS" | jq -r 'select(.Key == "tier") | .Value')
fi
# make it easier to figure out what went wrong with some defaults
ENV_NAME=${ENV_NAME:-'golden-image-default'}
TIER=${TIER:-'unknown'}
# remove the files that the agent manipulates on startup
rm -f /etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.toml \
/etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.d/*
# the agent manipulates the contents of amazon-cloudwatch-agent.d on startup to generate the toml file it actually uses
# copy the original unmanipulated agent config files from the special 'env.d' dir into the expected dir
cp /etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.env.d/*.json /etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.d/
# substitute the placeholders
find /etc/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.d -type f -name "*.json" | \
xargs sed -i "s/ENV_NAME/${ENV_NAME}/g;s/TIER/${TIER}/g"
/etc/systemd/system/amazon-cloudwatch-agent.service.d/systemd.override.conf
[Service]
ExecStartPre=+/usr/local/sbin/amazon-cloudwatch-agent.bootstrap.sh
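One caveat with the sed substitution in the bootstrap script above: the tag values are spliced straight into the replacement text, so a value containing "/", "&" or a backslash (for example an environment called prod/eu) would break or alter the expression. A minimal sketch of a guard follows; escape_sed_replacement is a hypothetical helper name and the values are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape the characters that are special in
# the replacement half of a sed "s/.../.../" command (backslash, "/", "&").
escape_sed_replacement() {
    printf '%s' "$1" | sed 's,[\\/&],\\&,g'
}

# Example values standing in for the tag lookups in the bootstrap script.
ENV_NAME='prod/eu'
TIER='web'

ENV_NAME_ESCAPED=$(escape_sed_replacement "$ENV_NAME")
TIER_ESCAPED=$(escape_sed_replacement "$TIER")

# Same substitution as in the bootstrap script, applied to one sample line.
printf '%s\n' '"log_group_name": "/ENV_NAME/TIER/syslog"' |
    sed "s/ENV_NAME/${ENV_NAME_ESCAPED}/g;s/TIER/${TIER_ESCAPED}/g"
```

In the bootstrap script itself, the escaped variables would take the place of ${ENV_NAME} and ${TIER} in the final xargs sed call.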
I didn't find a good solution when I hit this roadblock about a month back, and wound up sort of flipping it on its head. Thought I'd share my own work-around.
You can't grab tags, but one pseudo parameter you can reference in an agent config is the instance id. With that in mind, I created a separate agent configuration for each environment and saved them as SSM Parameters. Then I set up a separate Systems Manager State Manager association for each environment that uses the managed document AmazonCloudWatch-ManageAgent, which takes a config stored in SSM Parameter Store and uses it to configure the agent on the associated instances. SSM can, at long last, target those instances based on the value of their environment tags.
In my use case, I was grouping logs together by instance, with a separate stream for each log. A config for "Dev" that instead has a stream for each instance might, for example, define log groups like this:
{
  "file_path": "/path/to/myLog",
  "log_group_name": "Dev_myLog",
  "log_stream_name": "{instance_id}"
}
Maintaining that with a tag-based State Manager association and SSM Parameters has the added benefit that if an instance's environment tag is changed, e.g. an instance is upgraded, it will automatically be associated with that environment's State Manager association, which will update its agent to the correct config for its new environment.
That the only difference between these configs is that single reference to the environment makes all this setup seem like a real nuisance, tbh. However, I'm looking ahead to the next update of the agent, which looks like it'll bring the ability to define a retention policy in the config file. Setting a different retention policy for each environment seems like a more legitimate purpose than what is currently just a hacky workaround for namespacing.
I'm doing a staged rollout of the agent to several environments, so in the end I wound up creating a CloudFormation template that runs Fn::Sub on an agent configuration template, to quickly set up a new config SSM Param for a given environment. It's very similar to how OP is handling things with their bootstrap script. But that's hardly necessary if you're only setting up a few.
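For only a few environments, the same templating can be approximated without CloudFormation. This is a sketch under assumptions, not the setup described above: the template file uses a literal ENV_NAME placeholder, and the helper names and output file naming are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the agent config template ($2) with every
# ENV_NAME placeholder replaced by the environment name ($1).
# Assumes the environment name contains no "/", "&" or backslash characters.
render_agent_config() {
    sed "s/ENV_NAME/$1/g" "$2"
}

# Hypothetical helper: render one config file per environment.
# $1: template file, $2: output directory, remaining args: environment names
render_all_configs() {
    template=$1
    out_dir=$2
    shift 2
    for env_name in "$@"; do
        render_agent_config "$env_name" "$template" \
            > "$out_dir/agent-config.$env_name.json"
    done
}
```

Each rendered file could then be stored with aws ssm put-parameter (using --type String, --value file://<rendered-file> and --overwrite), under whatever parameter name the matching State Manager association points at.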
I would love it if there were some way to get the existing behavior of using the file path of the log file up to the last dot, but as a placeholder, so I can add a prefix (see https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html).
Although the documentation doesn't seem to indicate this, it is currently not possible to omit log_group_name when using a glob in file_path.
I would love to be able to add a prefix when I use a glob and want the automatic naming to separate the files into different log groups, but also need to separate instances.
What would be useful:
log_group_name: null (should be possible according to documentation, but is not -- will use filepath up to last dot)
log_group_name: "test_{instance_id}" (should be possible according to documentation, but is not)
log_group_name: "test_{file_path}" (where {file_path} is the same as the "null" behavior)
The last example above of "what would be useful" is really what would be most useful.
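To make that last request concrete, here is a sketch of how a {file_path} placeholder might expand, assuming it follows the default of using the file path up to the final dot. expand_log_group_name is a hypothetical helper written for illustration, not something the agent provides.

```shell
#!/bin/sh
# Hypothetical illustration of the requested behaviour, not an agent feature.
# $1: log_group_name pattern, possibly containing "{file_path}"
# $2: path of a file matched by the glob
expand_log_group_name() {
    pattern=$1
    token='{file_path}'
    # Default naming: the file path up to (not including) the final dot.
    base=${2%.*}
    result=
    # Replace each occurrence of the token, left to right.
    while :; do
        case $pattern in
            *"$token"*)
                result=$result${pattern%%"$token"*}$base
                pattern=${pattern#*"$token"}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$result$pattern"
}
```

For example, expand_log_group_name 'test_{file_path}' /tmp/TestLogFile.log.2017-07-11-14 prints test_/tmp/TestLogFile.log.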
Hello, what is the equivalent of {instance_id} for ECS? Is there a task id or something similar that can be used when running the CW agent as a sidecar?