Current Context Window Size
The `get_messages` response includes a `this_time_tokens` field that returns the total token count of the returned messages.
This is particularly useful when:
- Figuring out the current context window size
- Determining whether you should apply edit strategies to reduce the context window size
- Determining whether you should reset the prompt cache
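For illustration, here is a minimal sketch of reading the field; the import path, client constructor, and call shape below are assumptions, not the documented API:

```python
from acontext import AcontextClient  # assumed import path

client = AcontextClient(api_key="...")  # assumed constructor

resp = client.get_messages(session_id="sess_123")  # assumed call shape
print(f"Current context window size: {resp.this_time_tokens} tokens")

# Example policy: act once the returned context grows past a budget.
TOKEN_BUDGET = 100_000  # hypothetical threshold
if resp.this_time_tokens > TOKEN_BUDGET:
    print("Context is large; consider edit strategies or a cache reset.")
```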
Context Editing On-the-fly
Acontext supports editing the session context on the fly when you fetch the current messages. The basic usage is to pass `edit_strategies` to the `get_messages` method to get the edited session messages without modifying the original session storage.
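A minimal sketch of the basic usage, reusing the assumed `client` from above (the strategy spec shape is an illustrative assumption; see the strategy sections below for the documented parameters):

```python
# Fetch an edited view of the session; the stored messages are untouched.
edited = client.get_messages(
    session_id="sess_123",
    edit_strategies=[
        {"type": "token_limit", "max_tokens": 80_000},  # assumed spec shape
    ],
)
print(f"Edited context size: {edited.this_time_tokens} tokens")
```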
Prompt Cache Stability: Edit strategies can break LLM prompt caching.
Learn how to use `pin_editing_strategies_at_message` to maintain cache hits in the Prompt Cache Stability guide.

Token Limit
This strategy truncates messages based on token count, removing the oldest messages until the total token count is within the specified limit. It is useful for managing context window limits and ensuring your session stays within model constraints. It will:
- Remove messages from oldest to newest
- Maintain tool-call/tool-result pairing (when a tool-call is removed, its corresponding tool-result is also removed)
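A hedged sketch of applying this strategy, again with the assumed `client` (the `"token_limit"` name and `max_tokens` parameter are illustrative assumptions, not documented identifiers):

```python
# Compare the context size before and after applying the token-limit strategy.
before = client.get_messages(session_id="sess_123")
after = client.get_messages(
    session_id="sess_123",
    edit_strategies=[
        {"type": "token_limit", "max_tokens": 50_000},  # assumed spec shape
    ],
)
print(f"Before: {before.this_time_tokens} tokens")
print(f"After:  {after.this_time_tokens} tokens")
# If a truncated message was a tool-call, its paired tool-result is
# removed as well, so the remaining transcript stays well-formed.
```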
Remove Tool Result
This strategy replaces the oldest tool results' content with a placeholder text to reduce the session context, while keeping the most recent N tool results intact.

Parameters:
- `keep_recent_n_tool_results` (optional, default: 3): Number of most recent tool results to keep with original content
- `tool_result_placeholder` (optional, default: "Done"): Custom text to replace old tool results with
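A sketch using the parameters above (the `"remove_tool_result"` type name is an illustrative assumption):

```python
# Keep the five newest tool results; older ones get a placeholder.
edited = client.get_messages(
    session_id="sess_123",
    edit_strategies=[
        {
            "type": "remove_tool_result",            # assumed strategy name
            "keep_recent_n_tool_results": 5,         # default is 3
            "tool_result_placeholder": "[omitted]",  # default is "Done"
        },
    ],
)
```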
Remove Tool Call Params
This strategy removes parameters from old tool-call parts to reduce the session context, while keeping the most recent N tool calls with their full parameters intact. This is particularly useful when you have many tool calls in your session history and want to reduce token usage by stripping the detailed arguments from older tool calls, while still maintaining the tool-call structure (ID and name) so that tool-results can still reference them.

Parameters:
- `keep_recent_n_tool_calls` (optional, default: 3): Number of most recent tool calls to keep with full parameters
It will:
- Keep the most recent N tool calls with their original parameters
- Replace older tool-call arguments with empty JSON `{}`
- Keep tool-call IDs and names intact so tool-results can still reference them
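A sketch of this strategy (the `"remove_tool_call_params"` type name is an illustrative assumption):

```python
# Strip arguments from all but the three newest tool calls.
edited = client.get_messages(
    session_id="sess_123",
    edit_strategies=[
        {
            "type": "remove_tool_call_params",  # assumed strategy name
            "keep_recent_n_tool_calls": 3,      # the documented default
        },
    ],
)
# Older tool calls keep their ID and name but carry empty arguments {},
# so their paired tool-results still resolve.
```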
Count the Session's Raw Message Tokens
Performing context editing does not affect the original session messages. If you want to know the total token count of the raw messages, the `get_token_counts()` method returns the total token count for all text and tool-call parts in a session.
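A sketch of the call, again using the hypothetical client from earlier (whether `get_token_counts()` takes a session ID or hangs off a session object is an assumption):

```python
# Count the raw, unedited session tokens (text + tool-call parts).
total = client.get_token_counts(session_id="sess_123")  # assumed call shape
print(f"Raw session tokens: {total}")

# Treat this as a relative signal for when to start editing,
# not as an input to LLM cost accounting.
if total > 120_000:  # hypothetical threshold
    print("Session is long; consider passing edit_strategies to get_messages.")
```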
The token count reported by Acontext is relative and proportional to the length of your session; you can use it to determine whether the current session is too long and needs to be edited.

Please do not use the token count to calculate LLM cost, as the actual token consumption of each LLM can vary subtly.