CloudWatch - logs from lambda/update_lease_status reporting incorrect cost

Open rafabnunes opened this issue 5 years ago • 9 comments

Version information

  • The version of DCE > v0.29.0
  • OS and version - MAC > Darwin Kernel Version 19.3.0:
  • Go version - 1.0
  • Terraform version (if using directly) Terraform v0.12.24 / + provider.archive v1.3.0 / + provider.aws v2.43.0 / + provider.template v2.1.2

Describe the bug: When I receive the budget notification, the cost it reports differs from AWS Billing. According to the CloudWatch logs below, the user has spent $9.03, but AWS Billing shows a higher value. Can DCE retrieve all cost information from AWS Billing? I've deployed services such as EC2 and VPC Endpoints in the leased account.

Another scenario: when a leased account reaches its budget limit, DCE should reset the account. According to the CloudWatch logs below, the account reached the limit, but no account reset was performed.

2020/04/15 15:39:27 Principal dcepocuser has spent $14.44 of their current principal budget

2020/04/15 15:39:27 OverBudget. Updating lease as ready to be reclaimed...
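To cross-check the spend DCE logs against AWS Billing, the amount can be pulled out of log lines like the ones above. This is a minimal sketch that assumes the log format is exactly as quoted in this issue; `ParseSpend` and `spendLine` are hypothetical names and not part of DCE.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// spendLine matches the spend message quoted in this issue.
// The exact format is an assumption based on those two log lines.
var spendLine = regexp.MustCompile(
	`Principal (\S+) has spent \$([0-9]+(?:\.[0-9]+)?) of their current principal budget`)

// ParseSpend returns the principal ID and the dollar amount from one log line.
// The boolean is false when the line is not a spend message.
func ParseSpend(line string) (string, float64, bool) {
	m := spendLine.FindStringSubmatch(line)
	if m == nil {
		return "", 0, false
	}
	amount, err := strconv.ParseFloat(m[2], 64)
	if err != nil {
		return "", 0, false
	}
	return m[1], amount, true
}

func main() {
	line := "2020/04/15 15:39:27 Principal dcepocuser has spent $14.44 of their current principal budget"
	if principal, amount, ok := ParseSpend(line); ok {
		fmt.Printf("principal=%s spent=%.2f\n", principal, amount)
	}
}
```

The parsed amount can then be compared by hand with the figure AWS Billing shows for the same account and period.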

I also checked the CodeBuild logs, but there is no record of a reset for any account on the date mentioned above.

How can I fix this issue?

rafabnunes avatar Apr 15 '20 21:04 rafabnunes

Hi @rafabnunes -- we're aware of some issues in the way cost/usage are being reported in the system today. We're actually in the middle of a big overhaul of cost reporting (see #316), that will hopefully address any issues we're seeing.

I apologize that we haven't better documented some of the known buggy behavior here. I'm still trying to get a handle on all of the moving parts myself. But I intend to document the current situation in more detail for posterity.

Note that we're probably looking at some significant breaking changes with PR #316 (e.g. it will destroy and recreate the usage DB table), so if this is a feature you're relying on, it may be worthwhile to wait for that release to land before investing too heavily in your current setup.

eschwartz avatar Apr 16 '20 03:04 eschwartz

Hey @eschwartz thanks for all information.

It looks like PR #316 will solve this issue.

My current setup is a proof of concept, so I can wait a little for the new version.

rafabnunes avatar Apr 16 '20 03:04 rafabnunes

Sure @rafabnunes , I think that's a wise path.

I will update this ticket when the PR is merged and released. Hopefully it won't be too long now.

eschwartz avatar Apr 16 '20 18:04 eschwartz

Hey @eschwartz

Just to confirm what you mentioned before: will the issue below be fixed in the new DCE version?

"When a leased account reaches its budget limit, DCE should reset the account. According to the CloudWatch logs below, the account reached the limit, but no account reset was performed.

2020/04/15 15:39:27 Principal dcepocuser has spent $14.44 of their current principal budget

2020/04/15 15:39:27 OverBudget. Updating lease as ready to be reclaimed...

I also checked the CodeBuild logs, but there is no record of a reset for any account on the date mentioned above."

rafabnunes avatar Apr 22 '20 19:04 rafabnunes

Hi @eschwartz

I ran a new test by leasing a new account. I deployed a few resources such as EC2 and VPC Endpoints. After a few hours the budget was exceeded and the account status changed to "OverBudget", but CodeBuild did not run the reset process. It seems that the Lambda "Update lease status" does not push the "OverBudget" event to the SQS "Reset queue".

To force the cleanup of the account, I manually changed the account status to "NotReady" in DynamoDB. After that, the Lambda "Process reset queue" was able to trigger the CodeBuild project "Reset AWS codebuild", and the account was cleaned and returned to the pool with status "Ready".
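The flow described above, where an over-budget lease should both change status and land on the reset queue, can be sketched roughly as follows. This is an illustration under assumptions, not DCE's actual code: `Lease`, `ResetQueue`, `memoryQueue`, and `EndLeaseIfOverBudget` are hypothetical names, the "Active" status and the $10 budget are made up for the example, and the in-memory queue only stands in for SQS.

```go
package main

import (
	"errors"
	"fmt"
)

// Lease is a hypothetical, simplified lease record for this sketch.
type Lease struct {
	AccountID    string
	PrincipalID  string
	BudgetAmount float64
	Status       string
}

// ResetQueue is a hypothetical stand-in for the SQS reset queue.
type ResetQueue interface {
	Enqueue(accountID string) error
}

// memoryQueue keeps queued account IDs in memory, for illustration only.
type memoryQueue struct {
	accountIDs []string
}

func (q *memoryQueue) Enqueue(accountID string) error {
	if accountID == "" {
		return errors.New("empty account ID")
	}
	q.accountIDs = append(q.accountIDs, accountID)
	return nil
}

// EndLeaseIfOverBudget marks the lease "OverBudget" and queues the account
// for reset once spend reaches the budget. It reports whether the lease ended.
// If the status changes but the enqueue fails, the error says so explicitly,
// because that is the state reported in this issue.
func EndLeaseIfOverBudget(lease *Lease, spent float64, q ResetQueue) (bool, error) {
	if spent < lease.BudgetAmount {
		return false, nil
	}
	lease.Status = "OverBudget"
	if err := q.Enqueue(lease.AccountID); err != nil {
		return true, fmt.Errorf("account %s marked OverBudget but reset not queued: %w",
			lease.AccountID, err)
	}
	return true, nil
}

func main() {
	q := &memoryQueue{}
	// The budget of 10 is an assumed value; the spend comes from the logs above.
	lease := &Lease{AccountID: "123456789012", PrincipalID: "dcepocuser",
		BudgetAmount: 10, Status: "Active"}
	ended, err := EndLeaseIfOverBudget(lease, 14.44, q)
	fmt.Println(ended, err, lease.Status, len(q.accountIDs))
}
```

The point of the sketch is that the status update and the enqueue are two separate steps, so a lease can end up "OverBudget" with nothing on the reset queue if the second step is skipped or fails silently.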

I haven't found any errors in the CloudWatch logs. Do you have any idea how to fix this issue?

rafabnunes avatar Apr 25 '20 12:04 rafabnunes

Hey @eschwartz, I noticed that version 0.30.1 was released. I've already deployed it, but I'm still facing the issue: when an account becomes "OverBudget", the account reset (cleanup) process is not performed. Do you have any ideas or tips for solving this?

rafabnunes avatar Apr 28 '20 17:04 rafabnunes

@rafabnunes I want to let you know that I'm moving off the DCE team at Optum this week, so I want to pass this PR off to @robologic to shepherd through.

@robologic can you take the ball on this one, please? I'm hoping this will all be resolved by #302, but it's worth a follow-up.

eschwartz avatar Apr 29 '20 20:04 eschwartz

@eschwartz thanks for everything, and good luck!

@robologic it seems that the Lambda "Update Leases", after updating the DynamoDB table, fails to send the information for accounts with status "OverBudget" to SQS.

rafabnunes avatar Apr 29 '20 23:04 rafabnunes

Hi @robologic,

I've uninstalled DCE version 0.30.1 and installed the version from Feature/usage2.0 #302. In my tests with leased accounts, the Lambda "end_over_budget_lease" ran and triggered SQS, and the account was cleaned and returned to the pool with status "Ready".

Thanks a lot.

Rafael Nunes

rafabnunes avatar May 08 '20 02:05 rafabnunes