Choosing the Right Token TTL for Your AI Agents

15 minutes vs. 4 hours vs. 24 hours. The right TTL depends on your agent's task duration, failure recovery model, and tolerance for extra latency on token refresh. There is no universal answer — but there is a framework for arriving at the right answer for your workload.

Why TTL matters more for agents than for humans

For human users, OAuth access tokens typically have a short TTL (15-60 minutes) backed by a long-lived refresh token that re-mints the access token transparently. The user never experiences token expiry as a disruption because the refresh happens in the background.

AI agents cannot rely on the same mechanism as gracefully. A mid-task token expiry means the agent gets a 401 from the API, which it must detect and recover from, either by refreshing the token and retrying or by failing the task. The refresh adds latency. If the agent is partway through a complex multi-step task, a 401 can leave the task partially executed with uncertain state, which is harder to recover from than a clean restart.

This is why TTL selection for agents is a deliberate decision rather than an infrastructure default. The right TTL depends on the specific characteristics of the workload, not on general security guidelines.

The three variables that determine your TTL

Task duration. How long does your agent's longest typical task run? If your agent runs 5-minute PR reviews, a 15-minute TTL works: the token covers one task with comfortable headroom. If your agent runs 3-hour data processing jobs, a 15-minute TTL means the task spans 12 token lifetimes, so the agent has to refresh roughly a dozen times per task, each refresh adding latency and creating a potential failure point. For long-running tasks, either choose a TTL that covers the task or build explicit refresh logic into the agent; do not leave mid-task expiry to chance.

Failure recovery model. If a token expires and the agent gets a 401, what happens? If your agent framework handles 401s by refreshing the token and retrying the last API call transparently, short TTLs are workable. If your agent framework treats a 401 as a fatal error and aborts the task, short TTLs will cause frequent task failures for long-running agents. Check your framework's default 401 handling before setting a short TTL.

Blast radius tolerance. A longer TTL means a compromised token is valid for longer. If your agent handles sensitive data — financial records, personal information, production database access — the blast radius of a compromised token with a 24-hour TTL is much larger than one with a 15-minute TTL. For sensitive workloads, accept the operational complexity of shorter TTLs and better 401 handling in exchange for smaller exposure windows.
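The "refresh and retry" recovery model described above can be sketched in a few lines. This is a minimal illustration, not any framework's actual behavior: the Unauthorized exception, the call_with_refresh helper, and the operation and mint_token callables are all hypothetical names standing in for whatever your HTTP wrapper and token client provide.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class Unauthorized(Exception):
    """Hypothetical error raised by an API wrapper when the server returns 401."""


def call_with_refresh(
    operation: Callable[[str], T],
    mint_token: Callable[[], str],
    token: str,
    max_refreshes: int = 1,
) -> T:
    """Run operation(token); on a 401, mint a new token and retry."""
    for attempt in range(max_refreshes + 1):
        try:
            return operation(token)
        except Unauthorized:
            if attempt == max_refreshes:
                # Already refreshed and still rejected: surface the failure
                # instead of looping forever.
                raise
            token = mint_token()
    raise AssertionError("unreachable when max_refreshes >= 0")
```

Retrying like this is only safe when the operation is idempotent or when the 401 is returned before any side effect happens. If neither holds, a retry can duplicate work, which is the partial-execution risk that makes short TTLs harder for agents.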

TTL recommendations by workload type

Short-lived discrete tasks (under 15 minutes). Recommended TTL: 900 seconds (15 minutes). Examples: PR review agents, issue triage agents, Slack notification agents. These agents complete each task well within the TTL, so token expiry during a task is rare. The refresh overhead is low because tokens are re-minted once per task, not mid-task. The small blast radius is appropriate for low-sensitivity workloads.

Medium-duration tasks (15 minutes to 2 hours). Recommended TTL: 7200 seconds (2 hours). Examples: code analysis agents, documentation generation agents, data transformation pipelines. The TTL covers the longest expected task duration. Build retry logic for 401s as a safety net, but design for it to rarely trigger. A 2-hour TTL is Alter's default because it covers most agent workloads without being unreasonably long.

Long-running autonomous agents (over 2 hours). Recommended TTL: 3600 seconds (1 hour) with explicit mid-task refresh logic. Do not use 8-hour or 24-hour TTLs as a shortcut for avoiding refresh logic. Instead, refresh proactively: before each major operation, check whether the token will expire within the next 10 minutes, and re-mint if it will. This pattern gives you the security properties of short TTLs without the surprise of mid-task expiry.

Batch processing agents with restart capability. Recommended TTL: 3600 seconds with checkpoint-based recovery. If your agent processes 10,000 records and can checkpoint progress, a token expiry should trigger a clean restart from the last checkpoint rather than failing mid-batch. Design the TTL to be longer than a single checkpoint cycle so expiry always happens between checkpoints, not within one.
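The four tiers above can be condensed into a small lookup, which is sometimes handy as a starting point in configuration code. The sketch below only restates the recommendations in this section; recommend_ttl and TtlAdvice are hypothetical names, and the output is a default to review, not a substitute for weighing blast radius.

```python
from dataclasses import dataclass

FIFTEEN_MINUTES = 15 * 60
TWO_HOURS = 2 * 60 * 60


@dataclass(frozen=True)
class TtlAdvice:
    ttl_seconds: int
    note: str


def recommend_ttl(longest_task_seconds: int, can_checkpoint: bool = False) -> TtlAdvice:
    """Map a workload to the TTL tiers described in this section."""
    if can_checkpoint:
        # Batch agents: expiry should land between checkpoints.
        return TtlAdvice(3600, "restart from the last checkpoint on expiry")
    if longest_task_seconds < FIFTEEN_MINUTES:
        return TtlAdvice(900, "mint once per task")
    if longest_task_seconds <= TWO_HOURS:
        return TtlAdvice(7200, "keep 401 retry as a safety net")
    return TtlAdvice(3600, "refresh proactively before major operations")


# A 5-minute PR review lands in the 900-second tier.
print(recommend_ttl(5 * 60))
```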

The proactive refresh pattern

The most reliable approach for agents with unpredictable task durations is proactive refresh: before each operation that requires a credential, check whether the token will expire before the operation is expected to complete. If yes, re-mint the token before starting. This eliminates mid-operation expiry as a failure mode.

def get_fresh_token(alter_client, provider, scope, task_context,
                    buffer_seconds=300):
    """Get a token, re-minting if it expires within buffer_seconds."""
    # Look for a token already minted for this provider and scope.
    token = alter_client.get_cached_token(provider=provider, scope=scope)

    # Re-mint when nothing is cached or the remaining lifetime is inside the buffer.
    if token is None or token.expires_in() < buffer_seconds:
        token = alter_client.get_token(
            provider=provider,
            scope=scope,
            task_context=task_context,
            force_refresh=True,
        )

    return token.value

Setting buffer_seconds=300 means you re-mint any token that will expire within 5 minutes. The Alter SDK's get_cached_token returns the cached token with its remaining TTL, and expires_in() returns the number of seconds until expiry. This pattern adds one lightweight check before each operation (typically under 1ms when the token is fresh) and eliminates the need to handle 401 recovery in most cases. Choose a buffer longer than your slowest single operation, otherwise a token can still expire while that operation is in flight.

Provider-imposed TTL limits

Your intended TTL may be constrained by what the OAuth provider allows. GitHub App installation tokens expire after 3600 seconds (1 hour); you cannot request a longer TTL regardless of your policy. Google's access tokens default to 3600 seconds and do not accept longer TTLs for most use cases. Slack bot tokens do not have a TTL at all by default: they persist until revoked, unless the app has opted in to Slack's token rotation.

Alter enforces a TTL ceiling per provider based on the provider's documented maximum. If you configure a policy with TTL 7200 for a GitHub integration, Alter issues tokens with TTL 3600 (the GitHub maximum) and re-mints automatically when the agent requests a token after expiry. The policy TTL is the maximum you want; the actual TTL may be shorter due to provider constraints.
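The ceiling rule amounts to taking the smaller of the policy TTL and the provider maximum. The sketch below illustrates that arithmetic only; it is not Alter's implementation, PROVIDER_MAX_TTL and effective_ttl are hypothetical names, and treating an unlisted provider as having no ceiling is an assumption made for the example.

```python
from typing import Dict, Optional

# Provider ceilings in seconds, using the figures quoted in this section.
PROVIDER_MAX_TTL: Dict[str, int] = {
    "github": 3600,
    "google": 3600,
}


def effective_ttl(policy_ttl: int, provider: str) -> int:
    """Return the TTL a token would actually get under a provider ceiling."""
    ceiling: Optional[int] = PROVIDER_MAX_TTL.get(provider)
    if ceiling is None:
        # Assumption: no known ceiling, so the policy TTL applies unchanged.
        return policy_ttl
    return min(policy_ttl, ceiling)


# A 7200-second policy on a GitHub integration is capped at 3600.
print(effective_ttl(7200, "github"))  # 3600
```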

This means you need to know the maximum TTL for each provider your agents connect to. Alter's documentation includes a provider capability table that lists the maximum TTL, whether the provider supports explicit revocation (not all do), and whether the provider supports downscoping.

TTL and the audit log density tradeoff

Shorter TTLs generate more token mint events in the audit log. An agent using 15-minute TTLs over an 8-hour shift generates 32 mint events per provider. The same agent with 4-hour TTLs generates 2. For a system with 20 agents across 5 providers, each working one 8-hour shift a day, the difference is 3,200 events per day vs. 200 events per day.
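To rerun this estimate with your own numbers, the arithmetic is a ceiling division multiplied out across agents and providers. The helper names below are hypothetical, and the model assumes each agent holds a valid token continuously for the whole window and mints a new one only when the previous one expires.

```python
import math


def mints_per_window(window_seconds: int, ttl_seconds: int) -> int:
    """Tokens minted to stay continuously covered for the whole window."""
    return math.ceil(window_seconds / ttl_seconds)


def daily_mint_events(agents: int, providers: int,
                      window_seconds: int, ttl_seconds: int) -> int:
    """Mint events per day, assuming one active window per agent per day."""
    return agents * providers * mints_per_window(window_seconds, ttl_seconds)


EIGHT_HOURS = 8 * 60 * 60

print(daily_mint_events(20, 5, EIGHT_HOURS, 15 * 60))      # 3200
print(daily_mint_events(20, 5, EIGHT_HOURS, 4 * 60 * 60))  # 200
```

The same mints_per_window function reproduces the earlier task-duration figure: a 3-hour job on a 15-minute TTL spans 12 token lifetimes.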

More events are not inherently bad: denser event coverage means better forensics if an incident occurs. But it does mean higher SIEM storage costs and potentially more noise in alert rules that trigger on mint frequency. Factor this in when sizing your SIEM retention and tuning alert thresholds. Alter's default TTL of 4 hours is partly a cost-conscious default that balances security coverage against audit log volume for typical workloads.