Endava - Team EndIf: AWS-Inventory & Climatiq.io plugins
Prize category
Best Plugin
Overview
Endava's team hopes to build two plugins to enable the identification and observation of AWS resources within an application boundary, whose outputs can then be consumed by the Impact Framework. This should open up the IF to AWS cloud customers.
If time permits, we intend to build a total of three plugins, including one that uses Climatiq.io's Cloud Calculation APIs to perform automated SCI scoring of the Fraud Detection Tool we submitted to the SCI-Case-Studies repository, as an end-to-end use-case.
The aim is to demonstrate the three plugins, alongside the existing GSF and community plugins required, in a manifest file with an initial child node representing an AWS subscription/application via an 'application' tag. The plugin will then return the EC2 VMs and storage as children in the tree, for observations and further pipeline execution to provide an SCI score.
- AWS-Inventory plugin: https://github.com/Green-Software-Foundation/hack/discussions/89
- AWS-Importer plugin: https://github.com/Green-Software-Foundation/hack/discussions/90
- Climatiq plugin: https://github.com/Green-Software-Foundation/hack/discussions/92
We have prioritised the plugin development for the hackathon as follows:
- AWS-Importer (component output)
- AWS-Inventory (grouping/children, manual manifest alternative exists)
- Climatiq (component, an alternative plugin chain could be used)
Questions to be answered
- Are any other teams working on AWS-based components?
- Are all IF refactorings to output children to the graph completed (kind=children)?
- Are any known changes planned to the Boavizta or CCF models to expand embodied carbon support? (We are considering options to determine SSD embodied emissions for EBS storage, and were thinking about forking the existing plugin to add support for the component/ssd | hdd APIs.)
Have you got a project team yet?
Yes, and we aren't recruiting.
Project team
@jcendava @viktoria-mahmud en-andrei-serdulet @eblenert en-vasiliuralucaelena
Terms of Participation
- [X] I agree to the hackathon Rules & Terms and Code of Conduct
Project Submission
Summary
Endava's team wanted to see how the Impact Framework and its plugins could be used to perform automated sustainability calculations, such as the SCI score, for large or complex cloud applications. We decided to use our Fraud Detection solution as our target use-case, building IF plugins to replicate the process we followed in our SCI case-study submission.
Problems
- Our calculation of the SCI score had involved a significant amount of manual work: identifying resources, recording utilisation, and performing manual calculations. Even using the IF, defining the tree or graph for a large or complex application with various compute and storage resources across multiple regions could be time-consuming, and changes to application infrastructure would need to be reflected in the manifest. We wished to contribute plugins to the IF that would keep those manual tasks to a minimum.
- There was no native support within the IF plugins for AWS-based observations. We wanted to contribute a plugin to provide this support.
- There was no plugin for Climatiq.io, who provide cloud calculation APIs that simplify some of the process of calculating energy and emissions from cloud resources. We wanted to contribute a plugin that enabled Climatiq outputs to be included in an IF pipeline.
Application
Endava's team has built two (technically, three) Impact Framework plugins as part of the hackathon submission:
- The AWS-Importer plugin enables retrieval of observations of AWS VM and storage resources, based on time-span, region, and resource-tag parameters, using the AWS SDK to retrieve usage data from the AWS EC2 service and AWS CloudWatch.
- The Climatiq plugin allows calculation of energy consumption and of operational and embodied CO2e emissions for VM, storage, CPU, and memory components, using the service's cloud calculation APIs.
- The Boavizta-storage plugin adds support for the Boavizta component APIs, which enable calculation of embodied emissions for storage devices.
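As an illustration of the CloudWatch query shape behind the importer (the actual plugin is written against the AWS SDK for JavaScript v3; the function name and values here are hypothetical), a GetMetricData request for one instance's average CPU utilisation can be sketched as:

```python
# Hypothetical sketch of the CloudWatch GetMetricData parameters the
# AWS-Importer would issue; in Python this dict would be passed to
# boto3's cloudwatch.get_metric_data().
from datetime import datetime, timezone

def build_cpu_query(instance_id: str, start: datetime, end: datetime,
                    period_s: int = 300) -> dict:
    """Build GetMetricData params for average CPU utilisation of one EC2 instance."""
    return {
        "StartTime": start,
        "EndTime": end,
        "MetricDataQueries": [
            {
                "Id": "cpu",  # query ids must start with a lowercase letter
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/EC2",
                        "MetricName": "CPUUtilization",
                        "Dimensions": [
                            {"Name": "InstanceId", "Value": instance_id}
                        ],
                    },
                    "Period": period_s,
                    "Stat": "Average",
                },
            }
        ],
    }

params = build_cpu_query(
    "i-0123456789abcdef0",
    datetime(2024, 3, 1, tzinfo=timezone.utc),
    datetime(2024, 3, 2, tzinfo=timezone.utc),
)
```

The same Namespace/MetricName/Dimensions structure applies to other EC2 metrics the importer might observe.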
In our project, the plugins have been chained together with the SCI plugin to build a manifest that calculates the SCI score of our Fraud Detection Tool for given dates and time-spans.
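A stripped-down sketch of what such a manifest could look like (the plugin config keys and paths below are assumptions for illustration, not the published schemas; the repo READMEs document the real usage):

```yaml
# Illustrative only: chaining the importer, calculation, and SCI plugins.
name: fraud-detection-sci
initialize:
  plugins:
    aws-importer:
      method: AwsImporter
      path: https://github.com/Endava/awsimporter-impactframework-plugin
    climatiq:
      method: Climatiq
      path: https://github.com/Endava/climatiq-impactframework-plugin
    sci:
      method: Sci
      path: '@grnsft/if-plugins'
tree:
  children:
    fraud-detection:
      pipeline:
        - aws-importer   # observe tagged EC2/storage resources
        - climatiq       # energy + operational/embodied CO2e
        - sci            # final SCI score
      config:
        aws-importer:
          tag: application=fraud-detection
          region: eu-west-1
```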
Prize Category
Best plugin
Judging Criteria
The plugin opens the IF up more easily to AWS customers. AWS is currently the cloud provider with the largest market share, so the plugin may enable a larger number of software applications to use the Impact Framework. Support for further AWS services can be added to realise the plugin's full potential, although this will require the ability to interpret observations to measure the energy consumption or environmental impacts of those services. Similar plugins will be needed for other cloud and XaaS providers before the Impact Framework can cover the full spectrum of software solutions.

Both the AWS-Importer and Climatiq plugins follow the micro-model architecture, using the standard inputs and outputs where possible and introducing standard-type parameters where no existing ones could be found. The Climatiq plugin offers configuration options to return or omit emissions, summed energy, and intensity parameters, so that subsequent plugins can perform those roles.
Video
https://www.youtube.com/watch?v=EUoKxxaLD4M
Artefacts
- https://github.com/Endava/awsimporter-impactframework-plugin
- https://github.com/Endava/climatiq-impactframework-plugin
Usage
- AWS-Importer plugin: https://github.com/Endava/awsimporter-impactframework-plugin/blob/main/README.md
- Climatiq-plugin: https://github.com/Endava/climatiq-impactframework-plugin/blob/main/README.md
Process
We began by looking at the SCI calculation exercise we had performed, and broke down that process into logical steps based on the SCI specification components themselves. This helped identify the plugins that would be required to automate the same process within the Impact Framework, and an idea of the manifest that might represent our application for scoring. We created a Jira backlog with Epics and user stories for each plugin, prioritizing the AWS-Importer as we felt it delivered the most value to the Impact Framework. As work was completed on each plugin, we created a manifest connecting the plugins within a single pipeline, ensuring parameters were successfully passed from one to another. With the plugins successfully integrated, we were finally able to execute the pipeline against the full (representative of production) infrastructure we had used for our SCI case-study.
Inspiration
Endava's existing SCI case-study, and the potential to automate what had been a fairly manual exercise provided the inspiration for our hackathon solution, along with the potential to utilise the Impact Framework and such plugins with our clients, to introduce sustainability serviceability into deployment pipelines or operational monitoring.
Challenges
We had a few challenges around how to structure our plugins and manifest: whether and how to group results, and the number of plugins actually required. The AWS SDKs were useful starting points, but accessing the correct CloudWatch metrics took some time to get right, as did keeping the parameter interfaces (inputs/outputs) between plugins consistent. One of the toughest challenges was working with plugins whose parameter requirements offered no configuration to switch modes. Determining embodied emissions for storage proved the most challenging data-wise, and we ended up creating a quick plugin using Boavizta's APIs to access SSD and HDD embodied data, although we're not sure how it compares to server-grade componentry.
Additionally, the team were all working on paid client engagements, so we all had to fit our hackathon activities around our working days.
Accomplishments
The AWS-Importer simplifies the manifest for a large AWS EC2-based application, requiring only tags to identify resources within the application boundary and populating output parameters with observations. The Climatiq plugin utilises batch APIs to reduce network calls and streamline calculations over time series or multiple resource observations. And we delivered two working plugins with a team who were fitting hackathon activity in around paid client engagements :-)
Learnings
We learnt a great deal about the Impact Framework itself, how to build plugins, and how to structure manifests, and also learnt our way round the AWS v3 SDK. The hackathon also helped us understand the challenges around determining emissions for cloud software services in general, where data is not readily available. Having completed the hackathon, we feel confident we can help our clients with similar projects in the future.
What Next
We hope that contributing two plugins that simplify two aspects of the SCI calculation (resource utilisation observations, and energy and emissions calculation) will help expand the current capabilities of the Impact Framework and encourage other AWS customers to explore or adopt it. Support for additional AWS services, or for new Climatiq cloud-calculation endpoints, can be added in future to broaden the capabilities of each plugin. We could also look to fork the Boavizta plugin and add the storage component functions to the community plugin.
Hi @jcendava, great to see this submission!
- Are any other teams working on AWS-based components?
There is one other team proposing to get data from S3 buckets, but nothing that looks like an importer.
- Are all IF refactorings to output children to the graph completed (kind=children)?
They are... however, after the refactoring and usage, I would strongly recommend returning data in a flat array and allowing consumers of your plugin to group using the built-in group-by plugin, rather than returning children. I commented on this approach in another proposal, #96. There is a lot of complexity in processing a pipeline where some plugins return children, and I can't guarantee there won't be edge-case bugs we didn't surface during development.
I'll create a discussion where I'll suggest an approach for writing an importer plugin, I think more than others they need a little more guidance to ensure they work well with the rest of the pipeline of plugins.
To be clear, I believe you are proposing the AWS-Importer to be something that returns that flat array, correct? And the AWS-Inventory to be something that groups that data into useful child structures based on some other external metadata?
I would recommend instead having the AWS-Importer return lots of contextual metadata with each observation, which can be used to group how you want:
```yaml
- timestamp: xxxx
  duration: xxxx
  cpu/utilization: 34
  cloud/instance-type: EC2
  cloud/vendor: aws
  cloud/region: eastus
  custom-application-name: my-app-name
  custom-some-other-grouping-context: xxxx
  # ...as many further custom grouping-context params as you need
```
Then, in your group-by plugin, users can group however they want: sometimes you might want to group everything by region, or by application, or by department, or by cost-center. As long as those params are in the observation, you can use them in the group-by.
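For example (a hedged sketch; the exact group-by config keys may differ by IF version), the manifest-side grouping could look like:

```yaml
# Illustrative only: group the flat importer output by application, then region.
tree:
  children:
    my-app:
      pipeline:
        - aws-importer
        - group-by
      config:
        group-by:
          group:
            - custom-application-name
            - cloud/region
```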
- Are any known changes planned to the Boavizta or CCF models to expand embodied carbon support? (We are considering options to determine SSD embodied emissions for EBS storage, and were thinking about forking the existing plugin to add support for the component/ssd | hdd APIs.)
There are no planned changes to the CCF and/or Boavizta approaches. However, I would caution slightly: the CCF data is now >2 years old, and its data for chips <2 years old is somewhat inaccurate.
Boavizta also uses a very different methodology for energy calculation (primary energy) so if you are using Boavizta I wouldn't recommend combining it with any other approach, i.e. if you are using Boavizta for embodied, use Boavizta for energy also.
The safest approach right now would be to use the teads-curve plugin and, if you are up for it, create something like a teads-embodied plugin (#83). CCF was built using Teads, and Climatiq uses CCF behind the scenes.
BTW any single one of these plugins would by themselves make an amazing submission and addition to the ecosystem!
Thanks for the detailed reply. Just to clarify the idea: the AWS-Inventory would be run on a top node, intended to find tagged (initially EC2) resources and build out the children beneath it, each representing, say, a VM cloud instance. These nodes would contain identifier data, plus any instance-type and other metadata needed to run the AWS-Importer and retrieve observations for that individual resource. I believe this would be analogous to the Azure-Importer, able to return a flat array of observations, with as many of the standard params as possible to enrich the output for subsequent plugins.
We'll take your steer re. embodied & energy data - it's probably a stretch goal right now anyhow. We were specifically looking at global warming potential data (kgCO2e) returned by https://api.boavizta.org/v1/component/ssd in lieu of anything else suitable, purely for storage rather than other instance types.
https://doc.api.boavizta.org/Explanations/components/ssd/
Ahh I see what you are saying now, that should work with the way we've coded up plugins that return children, but I would still recommend merging the inventory and importer into one and using group-by on the result. I'll make a manifest file so you can see what I'm proposing and make your choice based on that!
OK, thanks, that would be helpful. This approach would reduce API calls as batches could probably be used. It might make it easier from a hackathon submission POV too!
Would group-by work if entries in the array contained different sets of fields, e.g. `[ {a,b,c,d}, {a,b,c,d}, {a,d,e,f}, {a,d,e,f} ]`, grouping on the common field `a`?
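Independent of how IF's built-in group-by handles this case, the grouping operation itself only needs the key field to be present in every record; the other fields can differ freely. A minimal sketch (hypothetical helper, not IF code):

```python
# Group heterogeneous records on one common field; other fields may vary.
from collections import defaultdict

def group_by(records: list, key: str) -> dict:
    """Return {key-value: [records]}; raises KeyError if a record lacks the key."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    return dict(groups)

records = [
    {"a": "vm", "b": 1, "c": 2, "d": 3},
    {"a": "vm", "b": 4, "c": 5, "d": 6},
    {"a": "storage", "d": 7, "e": 8, "f": 9},
    {"a": "storage", "d": 10, "e": 11, "f": 12},
]
grouped = group_by(records, "a")
# → two groups: "vm" with 2 records, "storage" with 2 records
```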
Hi @jcendava please don't forget to register your project: https://hack.greensoftware.foundation/register/
This provides you direct access to the Impact Framework team for your questions and also benefits from our community partners (Microsoft & Electricity Maps).
You must register your project before you can submit your solution for judging.
@russelltrow Updated with submission details
Updated to change single plugin repo reference to two new individual repos. Original repo has been deleted to avoid any confusion.