# Prometheus Metrics

## Reference

### Prometheus Configuration

The Nekuti Matching Engine exposes metrics via a Prometheus endpoint at `/metrics`. Configure Prometheus to scrape these metrics by adding the control gateway endpoint to your Prometheus configuration:

```yaml
scrape_configs:
  - job_name: 'nekuti'
    static_configs:
      - targets: ['control:8181']
```
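Before pointing Prometheus at the engine, it can help to check what the `/metrics` endpoint returns. The sketch below parses Prometheus text exposition output in plain Python. The sample payload is invented for illustration: the `thread` label name and the values are assumptions, and the labels the engine actually exposes may differ. The parser is deliberately minimal and does not handle escaped quotes inside label values.

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition format into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload: the `thread` label and the numbers are assumptions.
EXAMPLE = """\
# TYPE nekuti_processed_commands_total counter
nekuti_processed_commands_total{thread="main_loop"} 1200
nekuti_cpu_usage_nanos_total{thread="main_loop"} 4.5e8
"""

parsed = parse_metrics(EXAMPLE)
```

In practice the text would come from an HTTP GET against the control gateway's `/metrics` path rather than from an inline string.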

### Thread Performance Metrics

The engine exposes two key metrics for each thread:

- `nekuti_processed_commands_total`: Total commands processed
- `nekuti_cpu_usage_nanos_total`: CPU time used (nanoseconds)

These metrics are crucial for monitoring thread performance and system capacity. The engine operates with five CPU-pegging threads:

| Thread | Function |
|--------|----------|
| tcp_reader | Reads commands from gateway TCP connections |
| command_persistence | Writes commands to the command log (write-ahead) |
| network_buffer_recycling | Recycles network read buffers |
| main_loop | Primary command processing |
| tcp_writer | Writes responses back to gateways |

### Load Analysis

#### CPU Usage Calculation
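To view thread load as a percentage in Grafana, take the per-second rate of the CPU time counter and divide by 10,000,000 (1,000,000,000 nanoseconds per second, divided by 100):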
```
rate(nekuti_cpu_usage_nanos_total[1m]) / 10000000
```
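The arithmetic behind this query can be sketched in Python. The function name and sample values below are illustrative only. Unlike PromQL's `rate()`, this sketch does not account for counter resets.

```python
NANOS_PER_SECOND = 1_000_000_000


def thread_load_percent(nanos_start, nanos_end, interval_seconds):
    """Percentage of one core used between two samples of the CPU-time counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # CPU nanoseconds consumed per wall-clock second, which is what rate() yields.
    nanos_per_second = (nanos_end - nanos_start) / interval_seconds
    # Dividing by 1e9 gives a fraction of one core; multiplying by 100 gives percent.
    return nanos_per_second / NANOS_PER_SECOND * 100


# A thread that accumulated 30 s of CPU time over a 60 s window is 50% loaded.
load = thread_load_percent(0, 30 * NANOS_PER_SECOND, 60)
```

Dividing by 1,000,000,000 and multiplying by 100 is the same as dividing by 10,000,000, which is the constant used in the query.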

#### Performance Characteristics
- Under light load: Higher CPU usage relative to command volume, because the engine optimizes for latency
- Under heavy load: More efficient CPU usage, because the engine optimizes for throughput
- Warning threshold: A thread approaching 100% utilization is nearing its capacity limit
- Critical threshold: A thread at 100% utilization is a bottleneck for the whole system
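
As a rough illustration of how these thresholds might feed an alerting rule, the sketch below classifies a thread load percentage. The 90% warning level and the sample readings are assumptions chosen for the example, not values defined by the engine.

```python
def classify_thread_load(load_percent, warning_percent=90.0):
    """Map a thread load percentage to 'ok', 'warning', or 'critical'."""
    if load_percent >= 100.0:
        return "critical"  # thread is saturated
    if load_percent >= warning_percent:
        return "warning"  # approaching the capacity limit
    return "ok"


# Hypothetical per-thread readings, in percent.
readings = {"tcp_reader": 35.0, "main_loop": 93.5, "tcp_writer": 100.0}
statuses = {thread: classify_thread_load(load) for thread, load in readings.items()}
```

The warning level is worth tuning per deployment, since readings taken under light load are not directly comparable with readings taken under heavy load.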

## Explanation

The Nekuti Matching Engine uses CPU-pegging threads that run in tight loops. This design means:

1. Host machine load is not a reliable indicator of system load
2. Thread CPU usage metrics are essential for capacity monitoring
3. Each thread must process every command sequentially
4. Performance characteristics vary based on load conditions

Monitor these metrics closely to ensure optimal system performance and identify potential bottlenecks before they impact service.

### Queue Metrics

The `nekuti_queue_depth_commands` metric tracks commands awaiting processing across different processors:

| Processor | Description |
|-----------|-------------|
| command_persistence | Commands waiting for write-ahead logging |
| network_buffer_recycling | Commands awaiting buffer recycling |
| command_queue | Total commands pending processing |

#### Queue Depth Analysis

The command_queue depth provides the most comprehensive view of system backlog:
- Low depths indicate healthy processing
- High depths suggest processing bottlenecks
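- A high `command_queue` depth combined with low `command_persistence` and `network_buffer_recycling` depths indicates a `main_loop` bottleneck
- A high `command_persistence` or `network_buffer_recycling` depth points to a bottleneck in that specific processor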
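
The reasoning above can be sketched as a small decision function. The `high_depth` threshold of 1000 is an arbitrary assumption for illustration; what counts as a high depth depends on the deployment.

```python
def diagnose_backlog(depths, high_depth=1000):
    """Suggest the likely bottleneck from nekuti_queue_depth_commands readings.

    `depths` maps processor name to queue depth. `high_depth` is an assumed
    threshold for "high" and should be tuned per deployment.
    """
    preprocessors = ("command_persistence", "network_buffer_recycling")
    backed_up = [p for p in preprocessors if depths.get(p, 0) >= high_depth]
    if backed_up:
        return backed_up  # a specific preprocessor queue is backing up
    if depths.get("command_queue", 0) >= high_depth:
        return ["main_loop"]  # overall backlog while preprocessor queues are low
    return []  # no backlog detected


suspects = diagnose_backlog(
    {"command_queue": 5000, "command_persistence": 10, "network_buffer_recycling": 0}
)
```

Here `suspects` names `main_loop`, because the overall backlog is high while both preprocessor queues are nearly empty.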

### Instrument Metrics

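All instrument metrics are cumulative since instrument inception. The timestamp metrics are the exception in that they report the time of the latest event rather than a count.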

#### Order Status Metrics
- `nekuti_accepted_orders_total`: Accepted new orders
- `nekuti_filled_orders_total`: Fully filled orders
- `nekuti_cancelled_orders_total`: Cancelled orders
- `nekuti_expired_orders_total`: Expired orders
- `nekuti_rejected_orders_total`: Rejected orders

Open orders calculation:
```
nekuti_accepted_orders_total - (nekuti_filled_orders_total + nekuti_cancelled_orders_total + nekuti_expired_orders_total)
```
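
The same calculation as a Python sketch, using made-up counter values:

```python
def open_orders(accepted, filled, cancelled, expired):
    """Orders still open: accepted orders minus those in a terminal state."""
    return accepted - (filled + cancelled + expired)


# Illustrative counter values only.
open_now = open_orders(accepted=1000, filled=600, cancelled=250, expired=50)
```

With these values `open_now` is 100. Rejected orders do not appear in the formula.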

#### Amendment Metrics
- `nekuti_accepted_amendments_total`: Accepted amendments
- `nekuti_rejected_amendments_total`: Rejected amendments

Note: Amendments are counted separately from orders.

#### Trade Metrics
- `nekuti_trades_total`: Regular trades
- `nekuti_self_trades_total`: Self-trades
- `nekuti_liquidations_total`: Liquidation events

#### Timestamp Metrics
- `nekuti_last_funding_charge_millis_total`: Time of the latest funding charge, in milliseconds
- `nekuti_last_mark_update_millis_total`: Time of the latest mark update, in milliseconds
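
One possible use of these metrics is a staleness check. The sketch below assumes the values are Unix epoch timestamps in milliseconds, which this section does not state explicitly, so verify that against your deployment before relying on it. The function name and sample values are illustrative.

```python
import time


def staleness_seconds(last_event_millis, now_millis=None):
    """Seconds elapsed since a timestamp metric last changed.

    Assumes `last_event_millis` is a Unix epoch timestamp in milliseconds.
    """
    if now_millis is None:
        now_millis = time.time() * 1000  # current Unix time in milliseconds
    return (now_millis - last_event_millis) / 1000.0


# Fixed inputs so the result is reproducible: the mark price is 30 s old.
age = staleness_seconds(1_700_000_000_000, now_millis=1_700_000_030_000)
```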

### System Account Balance Metrics

The Prometheus endpoint publishes balances for system accounts in both major and minor units. Each metric includes labels for both denominations.

| Account | Major Unit Metric | Minor Unit Metric |
|---------|------------------|-------------------|
| Insurance Fund | `nekuti_insurance_fund_balance_major` | `nekuti_insurance_fund_balance_minor` |
| Penny Jar | `nekuti_penny_jar_balance_major` | `nekuti_penny_jar_balance_minor` |
| Fees | `nekuti_fees_account_balance_major` | `nekuti_fees_account_balance_minor` |

### Message Store Metrics

`nekuti_store_size_messages` tracks unpurged messages in the store. Monitor this metric to:
- Detect infrequent message purging
- Anticipate potential memory issues
- Identify message volume spikes

### Memory Metrics

`nekuti_free_memory_bytes` reports JVM free memory. Because the JVM reclaims memory only when garbage collection runs, the memory actually available may exceed the reported value.
