Retry Strategies

Request Forge provides configurable retry strategies for handling transient failures such as timeouts, dropped connections, and retryable HTTP status codes.

Overview

Retry strategies determine:

When to retry (which errors/status codes)
How many times to retry
How long to wait between retries
When to give up (max retries or circuit breaker)
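
The four decisions above can be sketched as a plain retry loop. This is a minimal illustration in standard-library Python, not Request Forge's implementation; the function name `call_with_retries` and its parameters are hypothetical.

```python
import time

def call_with_retries(operation, *, max_retries=3, delay=0.0,
                      retryable=(TimeoutError, ConnectionError),
                      sleep=time.sleep):
    """Run operation(), retrying on the given exception types.

    Hypothetical helper for illustration only.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except retryable:
            # When to retry: only exception types listed in `retryable`
            # are caught; anything else propagates immediately.
            if attempt >= max_retries:
                # When to give up: the retry budget is spent.
                raise
            attempt += 1
            # How long to wait before the next attempt.
            sleep(delay)
```

A call such as `call_with_retries(fetch, max_retries=3, delay=2.0)` makes at most four attempts: the initial one plus three retries.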

Available Strategies

Strategy	Description	Use Case
`NoRetryStrategy`	No retries, fail immediately	Testing, non-critical requests
`SimpleRetryStrategy`	Fixed delay between retries	Simple retry logic
`ExponentialBackoffRetryStrategy`	Exponential delay with jitter	Production use (recommended)
`CircuitBreakerRetryStrategy`	Circuit breaker pattern	Prevent cascade failures

No Retry Strategy

Fail immediately on any error:

from requestforge import NoRetryStrategy, HttpClientConfigBuilder

config = (
    HttpClientConfigBuilder()
    .with_base_url('https://api.example.com')
    .with_retry_strategy(NoRetryStrategy())
    .build()
)

# No retries - fails on first error

When to Use

Unit tests where you want immediate failures
Non-critical requests where retry overhead isn’t worth it
Requests that should never be retried (e.g., POST that creates resources)

Simple Retry Strategy

Fixed delay between retry attempts:

from requestforge import SimpleRetryStrategy

strategy = SimpleRetryStrategy(
    max_retries=3,      # Retry up to 3 times
    delay=2.0           # Wait 2 seconds between retries
)

config = builder.with_retry_strategy(strategy).build()

Timeline Example

Attempt 1 → Fail
Wait 2s
Attempt 2 → Fail
Wait 2s
Attempt 3 → Fail
Wait 2s
Attempt 4 → Fail
→ Raise MaxRetryException

Custom Retryable Exceptions

Only retry specific exception types:

from requestforge import TimeoutException, ConnectionException

strategy = SimpleRetryStrategy(
    max_retries=3,
    delay=1.0,
    retryable_exceptions=frozenset({
        TimeoutException,
        ConnectionException
    })
)

# Only retries on TimeoutException or ConnectionException
# Other exceptions fail immediately

Exponential Backoff Strategy

Recommended for production use.

The delay grows exponentially with each attempt, with optional jitter to prevent a thundering herd of synchronized retries:

from requestforge import ExponentialBackoffRetryStrategy

strategy = ExponentialBackoffRetryStrategy(
    max_retries=5,        # Maximum 5 retry attempts
    base_delay=1.0,       # Start with 1 second
    max_delay=60.0,       # Cap at 60 seconds
    multiplier=2.0,       # Double delay each time
    jitter=True           # Add randomization
)

config = builder.with_retry_strategy(strategy).build()

Delay Calculation

Without jitter:

delay = min(base_delay * (multiplier ** attempt), max_delay)

With jitter (±25% randomization):

delay = min(base_delay * (multiplier ** attempt), max_delay)
jitter_range = delay * 0.25
delay += random.uniform(-jitter_range, jitter_range)
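
The calculation above can be written as a small standalone function. This is a sketch of the formula as documented here, not the library's source; `backoff_delay` and the `rng` parameter are hypothetical names, and `attempt` is assumed to be 0 for the first retry.

```python
import random

def backoff_delay(attempt, base_delay=1.0, max_delay=60.0,
                  multiplier=2.0, jitter=True, rng=random):
    """Delay in seconds before retry number `attempt` (0 for the first retry)."""
    # Exponential growth, capped at max_delay.
    delay = min(base_delay * (multiplier ** attempt), max_delay)
    if jitter:
        # Shift the delay by up to 25% in either direction.
        jitter_range = delay * 0.25
        delay += rng.uniform(-jitter_range, jitter_range)
    return delay

print([backoff_delay(n, jitter=False) for n in range(8)])
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Because the jitter is applied after the cap in this formula, a jittered delay can exceed `max_delay` by up to 25%.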

Timeline Example

Attempt 1 → Fail
Wait ~1s   (1.0 * 2^0 = 1.0s ± jitter)

Attempt 2 → Fail
Wait ~2s   (1.0 * 2^1 = 2.0s ± jitter)

Attempt 3 → Fail
Wait ~4s   (1.0 * 2^2 = 4.0s ± jitter)

Attempt 4 → Fail
Wait ~8s   (1.0 * 2^3 = 8.0s ± jitter)

Attempt 5 → Fail
Wait ~16s  (1.0 * 2^4 = 16.0s ± jitter)

Attempt 6 → Fail
→ Raise MaxRetryException

Retryable Status Codes

Configure which HTTP status codes trigger retries:

strategy = ExponentialBackoffRetryStrategy(
    max_retries=3,
    base_delay=1.0,
    retryable_status_codes=frozenset({
        408,  # Request Timeout
        429,  # Too Many Requests
        500,  # Internal Server Error
        502,  # Bad Gateway
        503,  # Service Unavailable
        504,  # Gateway Timeout
    })
)

config = builder.with_retry_strategy(strategy).build()
client = HttpClient(config)

# Automatically retries on 503, 502, etc.
response = client.get('/unstable-endpoint')

Default retryable status codes:

408 - Request Timeout
429 - Too Many Requests
500 - Internal Server Error
502 - Bad Gateway
503 - Service Unavailable
504 - Gateway Timeout

Retryable Exceptions

Specify which exceptions to retry:

from requestforge import TimeoutException, ConnectionException, HttpStatusException

strategy = ExponentialBackoffRetryStrategy(
    max_retries=3,
    retryable_exceptions=frozenset({
        TimeoutException,
        ConnectionException,
        HttpStatusException  # Retry on HTTP errors
    })
)

By default, retries on:

TimeoutException
ConnectionException
HttpStatusException (when status code is retryable)

Why Use Jitter?

Without jitter, synchronized clients retry at the same time:

100 clients fail at t=0
All retry at t=1s → Server overload
All retry at t=3s → Server overload
All retry at t=7s → Server overload

With jitter, retries are spread out:

100 clients fail at t=0
Retry between t=0.75s-1.25s → Distributed load
Retry between t=2.25s-3.75s → Distributed load
Retry between t=5.25s-8.75s → Distributed load
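
To see the spreading effect, the following sketch simulates 100 clients that all fail at t=0 and prints the earliest and latest cumulative time of each retry. It assumes the ±25% jitter described earlier and ignores the time each request takes; `retry_times` is a hypothetical helper, not part of Request Forge.

```python
import random

def retry_times(n_retries, base_delay=1.0, multiplier=2.0,
                jitter=True, rng=random):
    """Cumulative times (seconds after the first failure) of each retry."""
    t, times = 0.0, []
    for attempt in range(n_retries):
        delay = base_delay * multiplier ** attempt
        if jitter:
            delay += rng.uniform(-0.25 * delay, 0.25 * delay)
        t += delay
        times.append(t)
    return times

rng = random.Random(42)  # seeded so runs are repeatable
for jitter in (False, True):
    clients = [retry_times(3, jitter=jitter, rng=rng) for _ in range(100)]
    for i in range(3):
        column = [c[i] for c in clients]
        print(f"jitter={jitter} retry {i + 1}: "
              f"{min(column):.2f}s to {max(column):.2f}s")
```

Without jitter every client lands on exactly t=1s, 3s, and 7s; with jitter the retries fall across a window that widens with each attempt.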

Configuration Examples

Conservative (slow to retry):

strategy = ExponentialBackoffRetryStrategy(
    max_retries=3,
    base_delay=5.0,    # Start with 5 seconds
    max_delay=300.0,   # Cap at 5 minutes
    multiplier=3.0,    # Triple delay each time
    jitter=True
)

Aggressive (fast retries):

strategy = ExponentialBackoffRetryStrategy(
    max_retries=5,
    base_delay=0.5,    # Start with 500ms
    max_delay=30.0,    # Cap at 30 seconds
    multiplier=2.0,
    jitter=True
)

Balanced (recommended):

strategy = ExponentialBackoffRetryStrategy(
    max_retries=3,
    base_delay=1.0,
    max_delay=60.0,
    multiplier=2.0,
    jitter=True
)

Circuit Breaker Strategy

Implements the circuit breaker pattern to prevent cascade failures.

States

┌──────────┐
│  CLOSED  │ ◄──┐ Normal operation
└────┬─────┘    │
     │          │ Success threshold met
     │ Failures │
     │ exceed   │
     │ threshold│
     ▼          │
┌──────────┐    │
│   OPEN   │    │ Fail fast (no requests sent)
└────┬─────┘    │
     │          │
     │ Recovery │
     │ timeout  │
     ▼          │
┌──────────┐    │
│HALF-OPEN │ ───┘ Testing recovery
└──────────┘

CLOSED: Normal operation, requests pass through

OPEN: Too many failures, reject requests immediately (fail fast)

HALF-OPEN: Testing if service recovered

Configuration

from requestforge import CircuitBreakerRetryStrategy

strategy = CircuitBreakerRetryStrategy(
    max_retries=3,           # Retries per request
    failure_threshold=5,     # Open after 5 failures
    recovery_timeout=30.0,   # Try again after 30 seconds
    half_open_max_calls=3,   # Test with 3 calls before closing
    base_delay=1.0           # Delay between retries
)

config = builder.with_retry_strategy(strategy).build()

How It Works

client = HttpClient(config)

# Normal operation (CLOSED)
client.get('/api')  # Success
client.get('/api')  # Success

# Service starts failing
client.get('/api')  # Fail (1/5)
client.get('/api')  # Fail (2/5)
client.get('/api')  # Fail (3/5)
client.get('/api')  # Fail (4/5)
client.get('/api')  # Fail (5/5) → Circuit OPENS

# Circuit is OPEN - fail fast
client.get('/api')  # Immediate failure (no request sent)
client.get('/api')  # Immediate failure (no request sent)

# After 30 seconds → HALF-OPEN
client.get('/api')  # Try request (1/3)
client.get('/api')  # Try request (2/3)
client.get('/api')  # Try request (3/3)

# If all 3 succeed → CLOSED (back to normal)
# If any fail → OPEN again
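
The state transitions described above can be sketched as a small standalone state machine. This is an illustration under stated assumptions, not the implementation behind `CircuitBreakerRetryStrategy`: the class `CircuitBreaker`, its method names, and the choice to count consecutive failures are all hypothetical.

```python
import time

class CircuitBreaker:
    """Minimal three-state breaker (closed, open, half_open)."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0,
                 half_open_max_calls=3, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.half_open_max_calls = half_open_max_calls
        self._clock = clock
        self.reset()

    def reset(self):
        self.state = "closed"
        self._failures = 0
        self._half_open_successes = 0
        self._opened_at = None

    def allow_request(self):
        """Return True if a request may be sent right now."""
        if self.state == "open":
            if self._clock() - self._opened_at < self.recovery_timeout:
                return False  # Fail fast while the circuit is open
            # Recovery timeout elapsed: start probing the service.
            self.state = "half_open"
            self._half_open_successes = 0
        return True

    def record_success(self):
        if self.state == "half_open":
            self._half_open_successes += 1
            if self._half_open_successes >= self.half_open_max_calls:
                self.reset()  # Enough probes succeeded: close the circuit
        else:
            self._failures = 0  # Assumption: only consecutive failures count

    def record_failure(self):
        if self.state == "half_open":
            self._open()  # Any failed probe reopens the circuit
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._open()

    def _open(self):
        self.state = "open"
        self._opened_at = self._clock()
```

A caller would check `allow_request()` before sending and report the outcome with `record_success()` or `record_failure()`. Injecting `clock` makes the recovery timeout testable without sleeping.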

When to Use

Protecting downstream services from overload
Preventing cascade failures in microservices
When failures are likely to persist (server down, not transient network issues)

Checking Circuit State

strategy = CircuitBreakerRetryStrategy(...)

print(strategy.state)  # 'closed', 'open', or 'half_open'

# Reset circuit manually
strategy.reset()

Custom Retry Strategy

Implement RetryStrategyInterface for custom logic:

from requestforge import TimeoutException
from requestforge.interfaces import RetryStrategyInterface
from requestforge.models import RequestContext

class CustomRetryStrategy(RetryStrategyInterface):
    def __init__(self, max_retries=3):
        self._max_retries = max_retries

    @property
    def max_retries(self) -> int:
        return self._max_retries

    def should_retry(self, context: RequestContext, exception: Exception) -> bool:
        # Custom logic: only retry on weekdays
        import datetime
        if datetime.datetime.now().weekday() >= 5:  # Weekend
            return False

        if context.attempt >= self._max_retries:
            return False

        # Only retry on specific error
        return isinstance(exception, TimeoutException)

    def get_delay(self, context: RequestContext) -> float:
        # Custom delay: based on time of day
        import datetime
        hour = datetime.datetime.now().hour

        if 9 <= hour <= 17:  # Business hours
            return 1.0  # Short delay
        else:
            return 5.0  # Longer delay

# Use custom strategy
config = builder.with_retry_strategy(CustomRetryStrategy()).build()

Combining Strategies

Request-Level vs. Auth-Level Retries

Different retry strategies for general requests vs. authentication:

from requestforge import (
    ExponentialBackoffRetryStrategy,
    HttpClientConfigBuilder,
    SimpleAuthRetryStrategy
)

# General retry (network errors, 5xx)
request_retry = ExponentialBackoffRetryStrategy(
    max_retries=5,
    base_delay=1.0,
    max_delay=60.0
)

# Auth retry (401 errors)
auth_retry = SimpleAuthRetryStrategy(
    max_retries=1,  # Only retry auth once
    delay=0.5
)

config = (
    HttpClientConfigBuilder()
    .with_base_url('https://api.example.com')
    .with_retry_strategy(request_retry)
    .with_token_auth(
        token_manager=token_manager,
        auth_retry_strategy=auth_retry
    )
    .build()
)

Retry Flow

Request → Execute
          ↓
          Fail (401 Unauthorized)
          ↓
          Auth Retry Strategy
          ├─ Should retry? Yes
          ├─ Refresh token
          └─ Retry request → Fail (503 Service Unavailable)
                             ↓
                             Request Retry Strategy
                             ├─ Should retry? Yes
                             ├─ Wait (exponential backoff)
                             └─ Retry request → Success
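
The flow above can be approximated with two independent counters, one per strategy. The sketch below works on bare status codes and is not how Request Forge wires the strategies together internally; `execute`, `send`, `refresh_token`, and `wait` are hypothetical names.

```python
def execute(send, refresh_token, *, max_retries=5, max_auth_retries=1,
            retryable_status=frozenset({408, 429, 500, 502, 503, 504}),
            wait=lambda attempt: None):
    """Call send() until it returns a status that should not be retried."""
    retries = 0
    auth_retries = 0
    while True:
        status = send()
        # Auth-level retry: refresh the token and try again.
        if status == 401 and auth_retries < max_auth_retries:
            auth_retries += 1
            refresh_token()
            continue
        # Request-level retry: wait, then try again.
        if status in retryable_status and retries < max_retries:
            wait(retries)  # e.g. an exponential backoff sleep
            retries += 1
            continue
        return status

# Replays the flow above: 401, then 503, then success.
responses = iter([401, 503, 200])
status = execute(lambda: next(responses), lambda: None)
print(status)  # 200
```

Keeping the counters separate means a token refresh does not consume the request-level retry budget.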

Best Practices

Use Exponential Backoff in Production

# Good ✅
strategy = ExponentialBackoffRetryStrategy(
    max_retries=3,
    base_delay=1.0,
    jitter=True
)

Enable Jitter

# Good ✅ - Prevents thundering herd
strategy = ExponentialBackoffRetryStrategy(jitter=True)

# Avoid ❌ - All clients retry simultaneously
strategy = ExponentialBackoffRetryStrategy(jitter=False)

Set Reasonable Max Delay

# Good ✅ - Caps at 1 minute
strategy = ExponentialBackoffRetryStrategy(max_delay=60.0)

# Avoid ❌ - Could wait up to an hour between retries
strategy = ExponentialBackoffRetryStrategy(max_delay=3600.0)

Don’t Retry Non-Idempotent Operations by Default

# Careful with POST requests that create resources
client = HttpClient(config_with_retry)

# This could create duplicate users on retry
response = client.post('/users', json_data={'name': 'John'})

# Solution: send an idempotency key (if the API supports one) so that a
# retried request is not applied twice
response = client.post('/users',
    json_data={'name': 'John'},
    headers={'Idempotency-Key': 'unique-key-123'}
)
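
In practice the key should be unique per logical operation rather than a hard-coded string. A minimal sketch using the standard library follows; the helper name `new_idempotency_headers` is hypothetical, and whether the server honors an `Idempotency-Key` header depends on the API being called.

```python
import uuid

def new_idempotency_headers():
    """Build headers carrying a fresh key for one logical operation."""
    return {"Idempotency-Key": str(uuid.uuid4())}

# Create the headers once per logical operation and send the same dict
# on every attempt, so the server can recognize a retry as a duplicate.
headers = new_idempotency_headers()
```

Generating a new key for each retry would defeat the purpose, since the server would treat every attempt as a distinct request.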

Log Retry Attempts

# Enable logging to track retries
config = (
    HttpClientConfigBuilder()
    .with_retry(max_retries=3)
    .with_logging()  # Logs retry attempts
    .build()
)

Set Max Retries Based on SLA

# If the SLA allows 30s in total (counting backoff waits only, not request time):
# 3 retries with exponential backoff (1s, 2s, 4s) = ~7s
# 5 retries with exponential backoff (1s, 2s, 4s, 8s, 16s) = ~31s, over budget

strategy = ExponentialBackoffRetryStrategy(
    max_retries=3,
    base_delay=1.0
)
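
The arithmetic in the comments above can be checked with a short helper that sums the un-jittered delays. This is a sketch; `total_backoff` and `max_retries_within` are hypothetical names, and the result excludes both the time spent on the requests themselves and any jitter.

```python
def total_backoff(max_retries, base_delay=1.0, multiplier=2.0, max_delay=60.0):
    """Sum of un-jittered delays across all retries, in seconds."""
    return sum(min(base_delay * multiplier ** n, max_delay)
               for n in range(max_retries))

def max_retries_within(budget, limit=50, **kwargs):
    """Largest retry count (up to `limit`) whose total backoff fits in `budget`."""
    retries = 0
    while retries < limit and total_backoff(retries + 1, **kwargs) <= budget:
        retries += 1
    return retries

print(total_backoff(3))          # 7.0
print(total_backoff(5))          # 31.0
print(max_retries_within(30.0))  # 4
```

With a 30-second budget and the default settings, four retries (15s of waiting) fit, while a fifth pushes the total to 31s.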

Monitoring Retries

Track retry metrics for observability:

from requestforge.interfaces import ErrorHookInterface

class RetryMetricsHook(ErrorHookInterface):
    def on_error(self, exception, context):
        attempt = context.attempt
        max_retries = context.max_retries

        # Send to metrics system (`metrics` stands in for your metrics client)
        metrics.increment('http.retry.attempt', tags={
            'attempt': attempt,
            'url': context.request.url,
            'exception_type': type(exception).__name__
        })

        if attempt >= max_retries:
            metrics.increment('http.retry.exhausted')

config = builder.with_error_hook(RetryMetricsHook()).build()

Testing Retry Logic

Test with mock failures:

import pytest
from unittest.mock import Mock, patch
from requestforge import (
    ExponentialBackoffRetryStrategy,
    HttpClient,
    HttpClientConfigBuilder,
    TimeoutException,
)

def test_retry_on_timeout():
    config = (
        HttpClientConfigBuilder()
        .with_base_url('https://api.example.com')
        .with_retry_strategy(
            ExponentialBackoffRetryStrategy(
                max_retries=2,
                base_delay=0.01  # Fast retry for testing
            )
        )
        .build()
    )
    client = HttpClient(config)

    call_count = 0
    def mock_request(*args, **kwargs):
        nonlocal call_count
        call_count += 1
        if call_count < 3:
            raise TimeoutException('Timeout')
        return Mock(status_code=200)

    with patch.object(client.session, 'request', side_effect=mock_request):
        response = client.get('/test')
        assert response.status_code == 200
        assert call_count == 3  # Initial + 2 retries

Common Patterns

Retry with Rate Limiting

Respect rate limit headers:

from requestforge.interfaces import ResponseHookInterface

class RateLimitRetryHook(ResponseHookInterface):
    def after_response(self, response, context):
        if response.status_code == 429:  # Too Many Requests
            retry_after = response.headers.get('Retry-After')
            if retry_after:
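                # Store in context for retry strategy. Retry-After may
                # also be an HTTP-date; only the seconds form is stored here.
                if retry_after.isdigit():
                    context.metadata['retry_after'] = int(retry_after)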
        return response
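
The `Retry-After` header can carry either a number of seconds or an HTTP-date. A standard-library sketch that handles both forms is shown below; `parse_retry_after` is a hypothetical helper, not part of Request Forge.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value, now=None):
    """Return the Retry-After wait in seconds, or None if it is unusable."""
    value = value.strip()
    # Form 1: delay in seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further waiting is needed.
    return max(0.0, (when - now).total_seconds())
```

The returned value could be stored in `context.metadata['retry_after']` in place of the bare integer, with `None` signalling that the strategy should fall back to its normal delay.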

Conditional Retry

Retry based on response content:

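from requestforge import HttpStatusException
from requestforge.interfaces import RetryStrategyInterface

class ConditionalRetryStrategy(RetryStrategyInterface):
    # max_retries and get_delay omitted for brevity
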
    def should_retry(self, context, exception):
        if isinstance(exception, HttpStatusException):
            # Check if error is retryable
            if exception.response_body:
                return 'temporary' in exception.response_body.lower()
        return False

Next Steps

Learn about Hooks for custom retry logic
Explore Error Handling for exception management
Check Custom Retry Examples for advanced examples