[Tuning] AWS Access Token Used from Multiple Addresses
Rule tuning for AWS STS Temporary IAM Session Token Used from Multiple Addresses
Pull Request
Issue link(s):
- https://github.com/elastic/ia-trade-team/issues/616
Additional context from the initial ByBit threat research this rule was meant to capture: https://github.com/elastic/ia-trade-team/issues/585
Summary - What I changed
Objective: Enhance the rule's detection capabilities by refining the query to improve accuracy and reduce false positives. I initially noticed false positives when certain AWS global endpoints, such as the Health API, used IPv6, which produced multiple IP addresses for the same user identity. I also noticed the rule correlated on the user identity ARN instead of the access token, causing false positives when the same user used two different access tokens within a short time frame. Additionally, alerts did not provide enough context for analysts to triage them successfully.
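To illustrate the ARN-versus-token correlation issue, here is a minimal Python sketch (not part of the rule; the sample events, ARNs, and key IDs are hypothetical). Grouping by ARN counts one legitimate user with two tokens as "multiple IPs", while grouping by access key does not:

```python
# Illustrative sketch: why correlating by access key avoids the
# ARN-based false positive. All sample data below is hypothetical.
from collections import defaultdict

events = [
    # Same IAM user (one ARN), two different access keys, two different IPs.
    {"arn": "arn:aws:iam::111111111111:user/alice",
     "access_key_id": "AKIAEXAMPLE1", "ip": "198.51.100.10"},
    {"arn": "arn:aws:iam::111111111111:user/alice",
     "access_key_id": "AKIAEXAMPLE2", "ip": "203.0.113.20"},
]

def ips_per_group(events, key):
    """Collect the set of source IPs seen for each value of `key`."""
    groups = defaultdict(set)
    for e in events:
        groups[e[key]].add(e["ip"])
    return groups

# Correlating by ARN sees 2 IPs for one identity -> false positive.
assert max(len(v) for v in ips_per_group(events, "arn").values()) == 2
# Correlating by access key sees 1 IP per token -> no alert.
assert max(len(v) for v in ips_per_group(events, "access_key_id").values()) == 1
```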
Changes I Made:
- Expanded coverage to include both long-term and short-term credentials
- Focuses exclusively on IAMUser type, removing AssumedRole from consideration to target specific user behaviors.
- Excludes specific service APIs and known benign sources, such as AMAZON-AES, to reduce noise and focus on more indicative events.
- Filters out less indicative service providers to minimize alerts from non-malicious activities.
- Introduces detailed aggregation fields like `ip_user_agent_pair` and `ip_city_pair` to capture and provide more context around triggered access patterns.
- Implements a classification system for detected activities based on combinations of unique IPs, user agents, cities, and networks.
- Assigns fidelity scores to each classification, providing a nuanced assessment of the likelihood of malicious behavior to support prioritization efforts.
- The query now retains a detailed list of fields for further analysis, enabling more in-depth investigation and response.
- Changed the rule description and investigation guide to match the rule changes
- Added a 32-minute lookback as a small buffer to account for ingest delays, with minimal risk of duplicate alerts from overlapping alert windows
FROM logs-aws.cloudtrail* METADATA _id, _version, _index
| WHERE @timestamp > NOW() - 30 minutes
// filter on CloudTrail logs for STS temporary session tokens used by IAM users
AND event.dataset == "aws.cloudtrail"
AND aws.cloudtrail.user_identity.arn IS NOT NULL
AND aws.cloudtrail.user_identity.type == "IAMUser"
AND source.ip IS NOT NULL
// exclude known benign IaC tools and Amazon Network
AND NOT (user_agent.original LIKE "%Terraform%" OR user_agent.original LIKE "%Ansible%" OR user_agent.original LIKE "%Pulumi%")
AND `source.as.organization.name` != "AMAZON-AES"
// exclude noisy service APIs less indicative of malicious behavior
AND event.provider NOT IN ("health.amazonaws.com", "monitoring.amazonaws.com", "notifications.amazonaws.com", "ce.amazonaws.com", "cost-optimization-hub.amazonaws.com", "servicecatalog-appregistry.amazonaws.com", "securityhub.amazonaws.com")
| EVAL
// create a time window for aggregation
time_window = DATE_TRUNC(30 minutes, @timestamp),
// capture necessary fields for detection and investigation
user_id = aws.cloudtrail.user_identity.arn,
access_key_id = aws.cloudtrail.user_identity.access_key_id,
ip = source.ip,
user_agent = user_agent.original,
ip_string = TO_STRING(source.ip), // Convert IP to string
ip_user_agent_pair = CONCAT(ip_string, " - ", user_agent.original), // Combine IP and user agent
ip_city_pair = CONCAT(ip_string, " - ", source.geo.city_name), // Combine IP and city
city = source.geo.city_name,
event_time = @timestamp,
network_arn = `source.as.organization.name`
| STATS
event_actions = VALUES(event.action),
event_providers = VALUES(event.provider),
access_key_id = VALUES(access_key_id),
user_id = VALUES(user_id),
ip_list = VALUES(ip), // Collect list of IPs
user_agent_list = VALUES(user_agent), // Collect list of user agents
ip_user_agent_pairs = VALUES(ip_user_agent_pair), // Collect list of IP - user agent pairs
cities_list = VALUES(city), // Collect list of cities
ip_city_pairs = VALUES(ip_city_pair), // Collect list of IP - city pairs
networks_list = VALUES(network_arn), // Collect list of networks
unique_ips = COUNT_DISTINCT(ip),
unique_user_agents = COUNT_DISTINCT(user_agent),
unique_cities = COUNT_DISTINCT(city),
unique_networks = COUNT_DISTINCT(network_arn),
first_seen = MIN(event_time),
last_seen = MAX(event_time),
total_events = COUNT()
BY time_window, access_key_id
| EVAL
// activity type based on combinations of detection criteria
activity_type = CASE(
unique_ips >= 2 AND unique_networks >= 2 AND unique_cities >= 2 AND unique_user_agents >= 2, "multiple_ip_network_city_user_agent", // high severity
unique_ips >= 2 AND unique_networks >= 2 AND unique_cities >= 2, "multiple_ip_network_city", // high severity
unique_ips >= 2 AND unique_cities >= 2, "multiple_ip_and_city", // medium severity
unique_ips >= 2 AND unique_networks >= 2, "multiple_ip_and_network", // medium severity
unique_ips >= 2 AND unique_user_agents >= 2, "multiple_ip_and_user_agent", // low severity
"normal_activity"
),
// likelihood of malicious activity based on activity type
fidelity_score = CASE(
activity_type == "multiple_ip_network_city_user_agent", "high",
activity_type == "multiple_ip_network_city", "high",
activity_type == "multiple_ip_and_city", "medium",
activity_type == "multiple_ip_and_network", "medium",
activity_type == "multiple_ip_and_user_agent", "low"
)
| KEEP
time_window, activity_type, fidelity_score, total_events, first_seen, last_seen,
user_id, access_key_id, event_actions, event_providers, ip_list, user_agent_list, ip_user_agent_pairs, cities_list, ip_city_pairs, networks_list, unique_ips, unique_user_agents, unique_cities, unique_networks
| WHERE activity_type != "normal_activity"
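For reference, the `DATE_TRUNC(30 minutes, @timestamp)` bucketing that produces `time_window` behaves like the following Python sketch (timestamps are hypothetical):

```python
# Sketch of 30-minute time-window bucketing, mirroring
# DATE_TRUNC(30 minutes, @timestamp) in the query above.
from datetime import datetime

def trunc_30m(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its 30-minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % 30, second=0, microsecond=0)

a = trunc_30m(datetime(2024, 5, 1, 12, 14, 59))
b = trunc_30m(datetime(2024, 5, 1, 12, 29, 1))
c = trunc_30m(datetime(2024, 5, 1, 12, 31, 0))
assert a == b == datetime(2024, 5, 1, 12, 0)   # same bucket
assert c == datetime(2024, 5, 1, 12, 30)       # next bucket
```

Events for the same access key only aggregate together when they fall in the same 30-minute bucket, which is why the rule's slightly longer lookback still carries little duplicate-alert risk.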
How To Test
There is a saved timeline in our TRaDE stack that goes back 60 days and contains the ByBit emulation behavior this rule was meant to capture.
Rule: Tuning - Guidelines
These guidelines serve as a reminder of considerations when tuning an existing rule.
Documentation and Context
- [ ] Detailed description of the suggested changes.
- [ ] Provide example JSON data or screenshots.
- [ ] Provide evidence of reducing benign events mistakenly identified as threats (False Positives).
- [ ] Provide evidence of enhancing detection of true threats that were previously missed (False Negatives).
- [ ] Provide evidence of optimizing resource consumption and execution time of detection rules (Performance).
- [ ] Provide evidence of specific environment factors influencing customized rule tuning (Contextual Tuning).
- [ ] Provide evidence of improvements made by modifying sensitivity by changing alert triggering thresholds (Threshold Adjustments).
- [ ] Provide evidence of refining rules to better detect deviations from typical behavior (Behavioral Tuning).
- [ ] Provide evidence of improvements of adjusting rules based on time-based patterns (Temporal Tuning).
- [ ] Provide reasoning of adjusting priority or severity levels of alerts (Severity Tuning).
- [ ] Provide evidence of improving quality integrity of our data used by detection rules (Data Quality).
- [ ] Ensure the tuning includes necessary updates to the release documentation and versioning.
Rule Metadata Checks
- [ ] `updated_date` matches the date the tuning PR is merged.
- [ ] `min_stack_version` should support the widest stack versions.
- [ ] `name` and `description` should be descriptive and not include typos.
- [ ] `query` should be inclusive, not overly exclusive. Review to ensure the original intent of the rule is maintained.
Testing and Validation
- [ ] Validate that the tuned rule's performance is satisfactory and does not negatively impact the stack.
- [ ] Ensure that the tuned rule has a low false positive rate.