Fix/bedrock cache point messages
Description
This PR fixes the cachePoint formatting issue in BedrockModel that was causing ParamValidationError when using prompt caching with system prompts.
Problem: When using cachePoint in system prompts, Bedrock's API returned a validation error because the system content blocks were not being formatted correctly. Bedrock requires cachePoint to be a separate content block (tagged union), not merged with other fields.
Solution: Added _format_bedrock_system_blocks() method that formats system content blocks using the same logic as message content blocks, ensuring cachePoint blocks remain as separate content blocks.
Verified: Cache metrics now show cacheWriteInputTokens on first request and cacheReadInputTokens on subsequent requests, confirming prompt caching works correctly.
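For illustration, here is a minimal sketch of the formatting rule (not the exact implementation; the function name and surrounding code are assumptions, only the cachePoint block shape comes from the Bedrock Converse API): each system content block must stay a separate tagged-union entry, and `cachePoint` is never merged into a `text` block.

```python
from typing import Any


def format_system_blocks(system_blocks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Illustrative sketch: keep every Bedrock system content block as its own
    tagged-union entry (text | guardContent | cachePoint)."""
    formatted: list[dict[str, Any]] = []
    for block in system_blocks:
        if "cachePoint" in block:
            # Must remain its own block, e.g. {"cachePoint": {"type": "default"}}
            formatted.append({"cachePoint": block["cachePoint"]})
        elif "text" in block:
            formatted.append({"text": block["text"]})
        else:
            # e.g. guardContent blocks pass through unchanged
            formatted.append(block)
    return formatted


# Correctly formatted `system` field for the Converse API:
# [{"text": "<large static system prompt>"}, {"cachePoint": {"type": "default"}}]
```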
Related Issues
Fixes #1219 Fixes #1015
Documentation PR
N/A - No documentation changes needed
Type of Change
Bug fix
Testing
How have you tested the change?
- All 92 bedrock unit tests pass (`hatch test -- tests/strands/models/test_bedrock.py`)
- All 1608 unit tests pass (`hatch test`)
- All 19 bedrock integration tests pass (`hatch run test-integ -- tests_integ/models/test_model_bedrock.py`)
- Added 5 new tests specifically for cachePoint formatting
- Verified cache metrics show actual cache hits with real Bedrock API calls
- [x] I ran `hatch run prepare`
Checklist
- [x] I have read the CONTRIBUTING document
- [x] I have added any necessary tests that prove my fix is effective or my feature works
- [x] I have updated the documentation accordingly
- [x] I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
- [x] My changes generate no new warnings
- [x] Any dependent changes have been merged and published
Output
```
================================================================================
CACHE PERFORMANCE SUMMARY
================================================================================
Request 1 (First - cache CREATED):
- Cache Write Input Tokens: 1761 tokens <- system prompt written to cache
- Cache Read Input Tokens: 0 tokens
- Input Tokens: 14 tokens <- user query only
Request 2 (Second - cache HIT):
- Cache Write Input Tokens: 0 tokens
- Cache Read Input Tokens: 1761 tokens <- reused from cache!
- Input Tokens: 197 tokens <- user query + conversation
Request 3 (Third - cache HIT):
- Cache Write Input Tokens: 0 tokens
- Cache Read Input Tokens: 1761 tokens <- reused from cache!
- Input Tokens: 442 tokens <- user query + conversation
================================================================================
SUCCESS: Prompt caching is working!
Total Cache Write: 1761 tokens (charged at write rate)
Total Cache Read: 3522 tokens (charged at ~90% discount)
Total Input: 653 tokens (charged at standard rate)
Without caching, requests 2 & 3 would have cost 3522 more input tokens!
================================================================================
```
Notes
NOTE: Bedrock Prompt Caching Minimum Token Requirements (5 minute cache)

| Model              | Min Tokens per Cache Checkpoint |
|--------------------|---------------------------------|
| Claude Sonnet 4    | 1,024 tokens                    |
| Claude 3.7 Sonnet  | 1,024 tokens                    |
| Claude 3.5 Haiku   | 2,048 tokens                    |
| Claude Opus 4.5    | 4,096 tokens                    |
| Amazon Nova (all)  | 1,000 tokens                    |
Reference: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
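A checkpoint below the model's minimum simply doesn't create a cache entry (no error is raised), so it can help to sanity-check the prompt size up front. A rough, hypothetical pre-flight check, assuming a ~4 characters-per-token heuristic; the authoritative numbers are the cache metrics returned in the response:

```python
MIN_CACHE_TOKENS = 1024  # e.g. Claude Sonnet 4 / Claude 3.7 Sonnet


def estimated_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English prose.
    return len(text) // 4


def cache_point_worthwhile(system_prompt: str, minimum: int = MIN_CACHE_TOKENS) -> bool:
    # Only add a cachePoint if the static prefix is likely above the minimum.
    return estimated_tokens(system_prompt) >= minimum
```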
Here is the error you get:

```
raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in system[1]: "cachePoint", must be one of: text, guardContent
```
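For context, the Converse API models `system` as a list of tagged-union content blocks, so a cached system prompt should reach the service looking roughly like the sketch below (an illustrative raw boto3 call, not the strands internals; `cachePoint` in `system` also requires a sufficiently recent botocore):

```python
import boto3

client = boto3.client("bedrock-runtime")

response = client.converse(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # example model ID
    system=[
        {"text": "<large static system prompt>"},
        {"cachePoint": {"type": "default"}},  # its own block, not a field on the text block
    ],
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
# usage includes cacheReadInputTokens / cacheWriteInputTokens when caching engages
print(response["usage"])
```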
This is a result of attempting to use both methods for Bedrock System Prompt Caching. For example:

```python
bedrock_model = BedrockModel(
    boto_session=boto_session,
    boto_client_config=bedrock_config,
    model_id=model,
    temperature=self.parameters.temperature,
    streaming=streaming,
    cache_prompt=cache_prompt,  # True
)
```
As a result, no cacheRead/cacheWrite tokens are being used and `input_tokens` remains high. Our system prompts are static and make up the majority of the tokens we are being charged for:
```
{ # First run
    ...
    "input_tokens": 4489,
    "output_tokens": 39,
    "total_tokens": 4528,
    "latency_ms": 2764,
    "cycle_count": 1,
    "total_duration_s": 3.0020978450775146,
    "cache_read_tokens": 0,
    "cache_write_tokens": 0,
    "time_to_first_byte_ms": null,
    "tool_calls": 1
},
{ # Second run
    ...
    "input_tokens": 4494,
    "output_tokens": 40,
    "total_tokens": 4534,
    "latency_ms": 3257,
    "cycle_count": 1,
    "total_duration_s": 3.5222342014312744,
    "cache_read_tokens": 0,
    "cache_write_tokens": 0,
    "time_to_first_byte_ms": null,
    "tool_calls": 1
}
```
The file was last modified in #1112. That PR introduced the feature but may have included a formatting bug, which would make this a regression. I verified the issue impacts v1.15-v1.19; I did not test v1.14 because of the code refactoring that would be required.
Here is a temporary fix for anyone who runs into this. Instead of using `BedrockModel`, use the `CacheEnabledBedrockModel` from the gist below and override the problematic functions:
https://gist.github.com/af001/09acaeb4360cb4537d10bb6b096de765
Hi @af001, when you say

> This is a result of attempting to use both methods for Bedrock System Prompt Caching.

do you mean both `cache_prompt` and `SystemContentBlock` cache blocks?

If we regressed, I want to resolve this quickly, thanks!
Hey @dbschmigelski! No regression. This is essentially a hack for users pinned to older botocore versions :). I am going to close this. I have confirmed my issue is resolved by updating to `botocore>=1.41.5`.
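For anyone else hitting the same `ParamValidationError`, upgrading the pinned SDK is the simpler remedy than the subclass workaround, e.g.:

```
pip install --upgrade boto3 "botocore>=1.41.5"
```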