Adaptive Checkpointer: Sublinear State Saving for Simulations

Practical implementation of √T-based checkpointing for distributed simulations

Key Features

  • Sublinear Memory: O(√T) storage complexity
  • Adaptive Checkpointing: Dynamic interval adjustment
  • Tiered Storage: RAM → Disk → S3
  • Rollback Optimization: Average O(√T) recovery time
  • Distributed Ready: Thread-safe and cloud-enabled
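The O(√T) figures above can be illustrated with a minimal sketch. It assumes a fixed checkpoint spacing of about √T events, which is a simplification of the adaptive interval the library advertises. The helper names `sqrt_interval`, `checkpoint_ids`, and `replay_cost` are hypothetical and are not part of the `adaptive_checkpointer` API.

```python
import math


def sqrt_interval(total_events: int) -> int:
    """Fixed checkpoint spacing of about sqrt(T) events (hypothetical helper)."""
    return max(1, math.isqrt(total_events))


def checkpoint_ids(total_events: int) -> list[int]:
    """Event ids at which a fixed sqrt(T) schedule would save state."""
    step = sqrt_interval(total_events)
    return list(range(step, total_events + 1, step))


def replay_cost(target: int, total_events: int) -> int:
    """Events to re-execute after restoring the latest checkpoint <= target.

    Assumes the initial state (event 0) is always available, so a target
    before the first checkpoint replays from the start.
    """
    step = sqrt_interval(total_events)
    return target % step


T = 100_000
step = sqrt_interval(T)        # spacing between checkpoints
saved = checkpoint_ids(T)      # checkpoints kept under this schedule
worst = max(replay_cost(t, T) for t in range(1, T + 1))

print(f"interval={step}, checkpoints={len(saved)}, worst replay={worst}")
```

With T = 100,000 this schedule keeps 316 checkpoints and never replays more than 315 events after a rollback, so both storage and recovery work grow with √T rather than T.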

Quick Start

```python
import random

from adaptive_checkpointer import AdaptiveCheckpointer, TieredBackend

# Configure storage hierarchy
storage = TieredBackend()
storage.add_ram_layer(10_000)              # First 10k events in RAM
storage.add_disk_layer(100_000, "ckpt/")   # Next 90k on disk
storage.add_s3_layer("simulation-bucket")  # Rest on cloud

# Initialize checkpointer
checkpointer = AdaptiveCheckpointer(
    base_interval=500,
    storage=storage,
)

# Simulation loop
state = {"counter": 0}
for event_id in range(1, 100_001):
    # Update state
    state["counter"] += event_id

    # Checkpoint decision
    if checkpointer.should_checkpoint(event_id):
        checkpointer.save_checkpoint(event_id, state)

    # Handle rollbacks (1% probability)
    if random.random() < 0.01:
        target = max(0, event_id - random.randint(10, 100))
        _, state = checkpointer.get_last_checkpoint(target)