#
Collect-Replay Investigation Tool
#
Overview
The collect-replay.py script is a forensic investigation tool shipped within the janitor's Docker image. It enables extraction of engine state data for debugging and analysis purposes. The tool automatically locates the appropriate snapshot and command files needed to reconstruct engine state for a specific command index or range.
#
Purpose
The collect-replay tool extracts the data needed to investigate system issues or to carry out post-mortem analysis. It collects only the minimum data necessary to reproduce the system conditions at the referenced command indices.
#
How It Works
The script follows a systematic process to gather all necessary files:
- Snapshot Discovery: Locates the most recent snapshot preceding the target command index
- Multi-Location Search: Searches across all storage locations:
  - Active persistence directory (/app/persistence)
  - Archive directory (/app/persistence/archived)
  - Deep storage directory (/app/deep_storage)
- Command File Collection: Gathers all command log files from the snapshot through the target range
- Archive Creation: Packages everything into a compressed ZIP file
- Safe Storage: Places the archive in the investigations volume or fallback location
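The snapshot discovery step can be illustrated with a minimal sketch. It is not the script's actual implementation: the function name find_snapshot is hypothetical, the snapshot.<index> directory naming is inferred from the example archive structure in this document, and "preceding" is assumed to mean a snapshot index strictly below the target index.

```python
import re
from pathlib import Path
from typing import Iterable, Optional

# Assumed naming pattern, inferred from the example "snapshot.4280000000".
SNAPSHOT_RE = re.compile(r"^snapshot\.(\d+)$")

# Storage locations listed in this document.
SEARCH_DIRS = [
    Path("/app/persistence"),
    Path("/app/persistence/archived"),
    Path("/app/deep_storage"),
]


def find_snapshot(
    target_index: int, search_dirs: Iterable[Path] = SEARCH_DIRS
) -> Optional[Path]:
    """Return the snapshot directory with the highest index below target_index.

    Hypothetical helper: returns None if no location holds a suitable snapshot.
    """
    best_index, best_path = -1, None
    for base in search_dirs:
        if not base.is_dir():
            continue  # a storage tier may not be mounted
        for entry in base.iterdir():
            match = SNAPSHOT_RE.match(entry.name)
            if match and entry.is_dir():
                index = int(match.group(1))
                # Keep the newest snapshot that still precedes the target.
                if best_index < index < target_index:
                    best_index, best_path = index, entry
    return best_path
```

Scanning every location before deciding matters here, because the most recent preceding snapshot may already have been moved to the archive or to deep storage.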
#
Usage
#
Basic Syntax
docker exec <janitor_container> /app/collect-replay.py <start_index> [end_index]
#
Single Command Index
Extract data for investigation of a specific command:
docker exec janitor /app/collect-replay.py 4280361412
This creates: investigate.4280361412.zip
#
Command Range
Extract data covering a range of commands:
docker exec janitor /app/collect-replay.py 4280361412 4281361512
This creates: investigate.4280361412-4281361512.zip
#
Parameters
If end_index is omitted, only the single start_index command is targeted.
#
Output Location
The script attempts to write output files in the following priority order:
- Primary: /app/investigations/ (mounted volume)
- Fallback: /tmp/ (if investigations volume unavailable)
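The priority order above could be sketched as follows. This is an illustration only: choose_output_dir is a hypothetical name, and the check used here (directory exists and is writable) is an assumption about what "unavailable" means in practice.

```python
import os
from pathlib import Path

# Locations named in this document.
PRIMARY_DIR = Path("/app/investigations")
FALLBACK_DIR = Path("/tmp")


def choose_output_dir(
    primary: Path = PRIMARY_DIR, fallback: Path = FALLBACK_DIR
) -> Path:
    """Return primary if it is an existing, writable directory, else fallback."""
    # W_OK | X_OK: creating a file requires write and search permission.
    if primary.is_dir() and os.access(primary, os.W_OK | os.X_OK):
        return primary
    return fallback
```

Archives written to /tmp/ live inside the container filesystem, so they would need to be copied out (for example with docker cp) before the container is removed.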
#
File Naming Convention
investigate.<start_index>[-<end_index>].zip
Examples:
- Single index: investigate.4280361412.zip
- Range: investigate.4280361412-4281361512.zip
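As a small illustration of the naming convention, the helper below builds the file name from the indices. The function name archive_name is hypothetical and not taken from the script.

```python
from typing import Optional


def archive_name(start_index: int, end_index: Optional[int] = None) -> str:
    """Build investigate.<start_index>[-<end_index>].zip (hypothetical helper)."""
    if end_index is None:
        return f"investigate.{start_index}.zip"
    return f"investigate.{start_index}-{end_index}.zip"


print(archive_name(4280361412))
print(archive_name(4280361412, 4281361512))
```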
#
Archive Contents
Each generated ZIP file contains:
- Complete snapshot directory for the most recent snapshot before the target range
- All command log files from the snapshot index through the target range
- Sequential command files ensuring complete state reconstruction
#
Example Archive Structure
investigate.4280361412-4281361512.zip
├── snapshot.4280000000/
│ ├── complete.meta
│ ├── balances.meta
│ ├── balances
│ ├── orders.meta
│ ├── orders
│ └── ...
├── commandLog.4280000001
├── commandLog.4280300000
├── commandLog.4280361000
└── commandLog.4281000000
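The selection of command log files in the example above can be reproduced with a short sketch. It rests on assumptions not confirmed by this document: that the numeric suffix of commandLog.<index> is the index of the first command in that file, and that a new log file starts right after each snapshot. The name select_command_logs is hypothetical.

```python
import re
from typing import Iterable, List

# Assumed naming pattern, inferred from the example "commandLog.4280000001".
COMMAND_LOG_RE = re.compile(r"^commandLog\.(\d+)$")


def select_command_logs(
    names: Iterable[str], snapshot_index: int, end_index: int
) -> List[str]:
    """Return log file names that start after the snapshot and not past end_index."""
    indexed = []
    for name in names:
        match = COMMAND_LOG_RE.match(name)
        if match:
            indexed.append((int(match.group(1)), name))
    indexed.sort()  # order by starting command index
    return [
        name for start, name in indexed if snapshot_index < start <= end_index
    ]
```

With the file names from the example structure, a snapshot index of 4280000000, and an end index of 4281361512, this returns the four commandLog files shown in the archive.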
#
Validation
The script validates:
- Command index format and range
- File accessibility across storage tiers
- Output directory write permissions
- Archive creation success
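The first check, command index format and range, might look like the sketch below. The exact rules the script applies are not documented here, so the ones used (non-negative decimal integers, end_index not below start_index) are assumptions, and parse_indices is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# Assumption: indices are plain non-negative decimal integers.
INDEX_RE = re.compile(r"[0-9]+")


def parse_indices(start: str, end: Optional[str] = None) -> Tuple[int, int]:
    """Validate index arguments and return (start, end); end defaults to start."""
    for label, value in (("start_index", start), ("end_index", end)):
        if value is not None and not INDEX_RE.fullmatch(value):
            raise ValueError(f"{label} must be a non-negative integer: {value!r}")
    start_index = int(start)
    end_index = start_index if end is None else int(end)
    if end_index < start_index:
        raise ValueError("end_index must not be smaller than start_index")
    return start_index, end_index
```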
#
Security Considerations
#
Data Sensitivity
- Investigation archives contain complete trading state
- Implement appropriate access controls on the investigations volume
- Consider encrypting archives that contain sensitive production data
#
Container Security
- Script runs within janitor container context
- Inherits container's access permissions
- Limited to mounted volumes and configured storage locations