Strike Falcon System Guide

Architecture, strategies, ML pipeline, risk management & operations
Platform Architecture
Strike Falcon is an autonomous options and equity trading system. It scans 338 liquid tickers, filters them through a multi-gate pipeline, selects from 18 defined-risk options strategies based on volatility regime, scores candidates with machine learning, and executes trades through Interactive Brokers — all without manual intervention.

Core Modules

Module | Purpose | Mode | Output
Peregrine | Options engine: 18 strategies, gate filtering, autonomous execution | Fully autonomous | IBKR orders
Kestrel | Stock swing trade scanner: ML-scored, shadow-tracked | Scan & alert | Discord alerts + IBKR orders
Merlin | ML intelligence layer: XGBoost training, regime detection | Scheduled (weekly) | Model files + regime state
Shared | Backtest engine, data cache, ThetaData client, utilities | Library | Parquet data + backtest results
Dashboard | Web UI hub: landing page, Merlin console, system guide | Always-on | Browser interface

Data Sources

Source | Provides
yfinance | All stock price data (historical + live batch), VIX, ticker details
ThetaData | Options chains (real-time snapshots + historical EOD), IV/Greeks raw data
IBKR | Order execution + account tracking only; NOT a data source
LunarCrush | Social sentiment scores for catalyst detection
Critical rule: IBKR is used ONLY for execution. It was removed from all data-fetching paths due to ib_insync deadlocks in the threaded model. Stock data comes from yfinance. Options data comes from ThetaData.

Infrastructure

  • Server: Linux (Ubuntu), RTX 4080 GPU
  • Python 3.12 with virtual environments per module
  • ML: XGBoost (GPU-accelerated on RTX 4080)
  • Broker: Interactive Brokers via ib_insync
  • Options data: ThetaData Terminal (Java, port 25503)
  • Storage: SQLite + Parquet files
  • Web: Flask + nginx + Cloudflare Tunnel
  • Alerts: Discord webhooks

Web Endpoints

URL | Service | Port | systemd unit
strikefalcon.com | Landing hub | 5080 | strikefalcon.service
strikefalcon.com/guide | This guide | 5080 | strikefalcon.service
strikefalcon.com/merlin | ML intelligence dashboard | 5080 | strikefalcon.service
peregrine.strikefalcon.com | Options engine dashboard | 5055 | peregrine-web.service
kestrel.strikefalcon.com | Stock scanner dashboard | 5001 | kestrel.service

Data Flow

# Live scanning (every 15 seconds)
yfinance → stock bars → Kestrel gates → ML score → Discord alert + IBKR order
ThetaData → options chain → Peregrine gates → strategy select → IBKR order

# ML training cycle (weekly)
Kestrel scans → shadow_log.db (one 44-feature vector per candidate) → forward return labeling (1/3/5/10/20 day) → Merlin auto_retrain (Friday 4:05 PM ET) → XGBoost model → Kestrel picks up new model on next scan

# Historical backtest pipeline (cron)
Parquet cache (stocks/ + options/) → Greeks computation → feature engineering → 18-strategy simulation → trade labeling → training data export → Merlin retrain with historical data

# Regime detection
VIX + SPY returns + market breadth → regime_state.json → consumed by both Kestrel and Peregrine for strategy selection

Ticker Universe

The system scans 338 liquid tickers dynamically built from seed lists and filtered by options liquidity:

  • Large-cap (~170-180): Mega tech (AAPL, MSFT, GOOGL, NVDA, META), banks (JPM, BAC), healthcare (JNJ, UNH), consumer (WMT, AMZN), energy (XOM, CVX), industrials (BA, CAT), ETFs (SPY, QQQ, IWM, DIA)
  • Small-cap (~160-170): Biotech, EV (RIVN, LCID), fintech (SOFI, UPST), gaming (RBLX), crypto-adjacent (MARA, COIN), meme stocks (GME, AMC), space (RKLB, ASTS)
  • Selection criteria: Minimum 500 options contracts/day, minimum $5 share price, categorized by market cap ($10B threshold)
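The selection criteria above can be sketched as a small filter. This is a minimal illustration, not the system's actual universe builder: the `Candidate` fields and the `classify` function are hypothetical names, and only the three thresholds come from the guide.

```python
from dataclasses import dataclass
from typing import Optional

MIN_OPTION_VOLUME = 500        # contracts/day (from the selection criteria)
MIN_SHARE_PRICE = 5.0          # dollars
LARGE_CAP_THRESHOLD = 10e9     # $10B market cap split

@dataclass
class Candidate:               # hypothetical container for one seed ticker
    ticker: str
    avg_option_volume: float   # average options contracts traded per day
    price: float
    market_cap: float

def classify(c: Candidate) -> Optional[str]:
    """Return 'large' or 'small' for eligible tickers, None if rejected."""
    if c.avg_option_volume < MIN_OPTION_VOLUME or c.price < MIN_SHARE_PRICE:
        return None
    return "large" if c.market_cap >= LARGE_CAP_THRESHOLD else "small"
```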
Peregrine Autonomous
The options engine. Scans 338 tickers through a sequential gate pipeline, selects the optimal strategy based on volatility regime and directional bias, computes Black-Scholes Greeks, and autonomously executes defined-risk trades through Interactive Brokers.

How It Works

  1. Scans 338 tickers on a scheduled cycle (configurable interval)
  2. Each ticker passes through 7 pre-chain gates before any options data is fetched (cheap, fast filtering)
  3. Survivors get a live options chain from ThetaData, then pass through 4 post-chain gates
  4. The strategy selector evaluates all 18 strategies against current regime and scores each with score_strategy_heuristic()
  5. IV is solved numerically from Black-Scholes prices, and Greeks are derived from the solved IV (no external Greeks API needed)
  6. Top candidates are executed via IBKR if position limits, portfolio risk, and correlation checks pass
  7. Open positions are monitored for profit targets (50%/100%), stop-loss (-30%), and time stops (14 DTE)
Pre-Chain Gates

Executed before fetching any options data. Fast, cheap checks using stock price data from yfinance.

G1

RSI Technical Sentinel

Relative Strength Index filter. Hard gate at extremes (<5 or >95), soft gate in moderate range. Prevents entries on extremely overbought/oversold tickers. Default range: 25-75
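One way to read the hard/soft distinction is sketched below. The interpretation is an assumption: readings beyond 5/95 are rejected outright, readings outside the default 25-75 band are flagged but not fatal, and everything else passes. The function name and return labels are hypothetical.

```python
def rsi_gate(rsi: float, lo: float = 25.0, hi: float = 75.0) -> str:
    """Classify an RSI reading as 'reject', 'soft_fail', or 'pass'."""
    if rsi < 5.0 or rsi > 95.0:
        return "reject"        # hard gate at the extremes
    if rsi < lo or rsi > hi:
        return "soft_fail"     # outside the default range, assumed non-fatal
    return "pass"
```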

G2

Pullback (Mean Reversion)

Measures drawdown from recent highs over a 20-60 bar lookback. Rejects falling knives (<-15%) and chasing-ATH setups (>-1%). Default range: -15% to -1%
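A minimal sketch of the pullback check, assuming the drawdown is measured from the highest close in the lookback window to the latest close. The function names are illustrative, not Peregrine's own.

```python
def pullback_pct(closes: list, lookback: int = 20) -> float:
    """Drawdown of the latest close from the window's highest close, in percent."""
    window = closes[-lookback:]
    return (window[-1] / max(window) - 1.0) * 100.0

def pullback_gate(closes: list, lookback: int = 20,
                  floor: float = -15.0, ceiling: float = -1.0) -> bool:
    """Pass only when the drawdown sits inside the default -15% to -1% band."""
    return floor <= pullback_pct(closes, lookback) <= ceiling
```

A ticker sitting exactly at its recent high has a pullback of 0% and is rejected as chasing; one that is 20% off the high is rejected as a falling knife.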

G3

SMA Trend (Momentum Filter)

Price cannot be more than 20% below the 50-day SMA. Hard gate for standard tickers, soft gate (skipped) for designated momentum tickers like TSLA, NVDA, AMD that regularly trade extended.

G4

HV Rank (Volatility Regime)

Historical volatility percentile rank over trailing 252 days. Rejects low-vol environments where premium is too thin and extreme-vol where risk is unquantifiable. Default range: 20th-60th percentile

G5

IV Rank / Percentile

Implied volatility rank measures where current IV sits relative to its own 1-year range. Min 20%, max 50%. Uses HV as proxy when real-time IV unavailable.
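IV rank is conventionally the position of current IV inside the min-max range of its own history. The sketch below uses that standard definition together with the 20%-50% band from the gate; the function names and the handling of a flat history are assumptions.

```python
def iv_rank(current_iv: float, iv_history: list) -> float:
    """IV rank in percent: where current IV sits in the min-max range of its history."""
    lo, hi = min(iv_history), max(iv_history)
    if hi == lo:
        return 0.0             # flat history: rank undefined, treated as lowest
    return (current_iv - lo) / (hi - lo) * 100.0

def iv_rank_gate(rank: float, min_rank: float = 20.0, max_rank: float = 50.0) -> bool:
    """Pass when the rank falls inside the gate's default band."""
    return min_rank <= rank <= max_rank
```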

G6

Sentiment & Earnings Veto

LunarCrush social sentiment score + earnings date buffer. Vetoes any ticker within 14 days of an earnings announcement. Rejects tickers with extreme negative sentiment.

G6c

Catalyst Risk (Optional)

Detects unusual news/filing activity. Headline velocity veto: >6 headlines in 6 hours. 8-K filing risk floor: 0.85 probability. Prevents entries ahead of material news events.

Post-Chain Gates

Executed after fetching live options chain data from ThetaData. These are more expensive checks requiring real option prices.

G7

Options Liquidity

Minimum option volume 10 contracts, maximum bid-ask spread 15%. Ensures the trade can be entered and exited without excessive slippage.

G8

IV/HV Ratio

Compares implied to historical volatility. Warns when IV/HV > 1.5x (options expensive, good for selling). Rejects when < 0.8x (no premium to capture).

G9

Real IV Rank (Post-Chain)

Recomputes IV rank from the actual options chain data (not the proxy estimate from pre-chain). More accurate but requires the chain fetch.

G10

Correlation Guard

Checks correlation between new candidate and existing portfolio positions. Max average correlation 0.70. Prevents concentration in correlated names (ported from Kestrel).
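The correlation guard can be sketched with a plain Pearson correlation over aligned return series. This is an illustration under assumptions: the guide does not say which return window or correlation estimator is used, and both function names are hypothetical.

```python
from math import sqrt

def pearson(x: list, y: list) -> float:
    """Pearson correlation of two equal-length series (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def passes_correlation_guard(candidate: list, held: dict, max_avg: float = 0.70) -> bool:
    """held maps ticker -> return series aligned with the candidate's series."""
    if not held:
        return True            # empty portfolio: nothing to correlate against
    corrs = [pearson(candidate, series) for series in held.values()]
    return sum(corrs) / len(corrs) <= max_avg
```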

VIX Crisis Hard Stop: When VIX exceeds the crisis threshold (configurable, default 25-45), all scanning is suspended. The system waits for volatility to normalize before resuming. This prevents entries during market dislocations when bid-ask spreads blow out and liquidity evaporates.
Dashboard Walkthrough
  1. Open Positions
     Live options positions: ticker, strategy, expiry, strikes, current P&L, DTE, Greek exposure. Red rows are approaching stop-loss.

  2. Scan Log
     Real-time log of every ticker scanned, which gates it passed or failed, and why. See the funnel in action.

  3. Ticker Inspector
     Type any ticker and hit Run. Peregrine runs all gates live and shows exact pass/fail with actual values. Essential for debugging why a specific ticker was rejected.

  4. Metrics Panel
     KPI cards: realized P&L, win rate, average DTE, position count. Updates live from IBKR portfolio.

Debugging tip: If Peregrine isn't finding trades, use the Ticker Inspector on a known high-IV ticker (e.g., NVDA after earnings). The most common blocker is IV rank below threshold during low-volatility regimes.
Kestrel Scan & Execute
Stock swing trade scanner. Scans 338 tickers every 15 seconds during market hours, applies gate filters, scores candidates with an ML model (70% XGBoost + 30% heuristic), fires Discord alerts, and executes stock trades through IBKR.

How It Works

  1. Fetches live stock data for 338 tickers via yfinance batch API every 15 seconds
  2. Each ticker passes through scanner gates: RSI, HV Rank, Pullback, SMA Trend, VIX, Sentiment, Catalyst/Earnings
  3. Surviving candidates are scored: 70% XGBoost ML model (trained by Merlin) + 30% heuristic blend
  4. Top scores above the alert threshold fire Discord alerts with full breakdowns (gate values, ML score, Kelly position size)
  5. Every candidate (pass or fail) is written to shadow_log.db with its 44-feature vector for Merlin to label and learn from
  6. Position sizing via Kelly Criterion (fractional 25%) with fixed risk fallback (2% per trade, 10% stop-loss)
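The scoring blend and the fixed-risk fallback reduce to a few lines of arithmetic. The sketch assumes both scores are on a 0-1 scale and that the 25% single-position cap from the risk section applies; function names are illustrative.

```python
def blended_score(ml_prob: float, heuristic: float, ml_weight: float = 0.70) -> float:
    """70% ML score plus 30% heuristic score, both assumed to be on a 0-1 scale."""
    return ml_weight * ml_prob + (1.0 - ml_weight) * heuristic

def fixed_risk_shares(account: float, price: float, risk_pct: float = 0.02,
                      stop_pct: float = 0.10, max_position_pct: float = 0.25) -> int:
    """Shares such that hitting the stop loses about risk_pct of the account."""
    position_value = account * risk_pct / stop_pct          # 2% / 10% = 20% of account
    position_value = min(position_value, account * max_position_pct)
    return int(position_value // price)
```

With a $100,000 account, risking 2% ($2,000) at a 10% stop implies a position of about $20,000, which is under the 25% cap.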

Scanner Gates

Gate | What It Checks | Threshold
RSI | Relative Strength Index (overbought/oversold) | 15 – 80
HV Rank | Historical volatility percentile | 10th – 95th
Pullback | Drawdown from recent high | -40% to 0%
SMA Trend | Price vs 50-day SMA | Not >20% below
VIX | Market-wide fear index | Max 45
Sentiment | LunarCrush social score | Min -0.5
Catalyst/Earnings | Earnings proximity | 3-day buffer
Note: Kestrel's gates are intentionally wider than Peregrine's. The goal is to capture maximum shadow data for ML training while still filtering obvious noise. The ML model does the heavy lifting on scoring.

Shadow Tracking

Every scan candidate — whether it passed or failed gates — gets logged to shadow_log.db with its complete 44-feature vector. This is the training data source for Merlin. After each trading day, Merlin's labeler checks back and records what actually happened:

  • 1-day forward return (labeled next day)
  • 3-day forward return (labeled 3 days later)
  • 5/10/20-day forward returns (labeled progressively)
  • Per-strategy win labels via classify_win() for all 18 strategies

This creates a continuously growing dataset of real market observations linked to outcomes. The ML model retrains weekly on this expanding dataset.
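Progressive labeling can be sketched as follows: a horizon is labeled only once its future bar exists, so a fresh signal starts with no labels and fills in over 20 trading days. The function name and the list-of-closes input are assumptions for illustration.

```python
HORIZONS = (1, 3, 5, 10, 20)   # trading days, as listed above

def forward_returns(closes: list, signal_idx: int) -> dict:
    """Return {horizon: fractional return} for horizons whose future bar already exists."""
    entry = closes[signal_idx]
    labels = {}
    for h in HORIZONS:
        j = signal_idx + h
        if j < len(closes):    # horizons that have not matured stay unlabeled
            labels[h] = closes[j] / entry - 1.0
    return labels
```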

Dashboard Walkthrough
1

Account Bar

Portfolio value, day P&L, buying power, position count — live from IBKR.

2

Scan Results

Every ticker from the latest scan sorted by ML score. Green rows passed all gates. Click any row to see individual gate breakdowns.

3

Active Positions

Open equity positions with unrealized P&L, entry price, current price. Live IBKR updates.

4

Reports

Walk-forward backtest performance, win rate by regime, closed-trade history.

Strategy System 18 Strategies
Peregrine selects from 18 defined-risk options strategies based on volatility regime, directional bias, and IV level. Every strategy has a heuristic scoring function, explicit win/loss conditions for ML labeling, and risk-defined max loss.

Strategy Selection Logic

# Decision tree for strategy selection
1. Determine volatility regime from VIX
2. Determine directional bias from technical indicators (trend, momentum)
3. Check IV level: high IV favors selling, low IV favors buying
4. Score all eligible strategies via score_strategy_heuristic()
5. Filter strategies scoring below 0.20 threshold
6. Select top strategy and execute if portfolio risk checks pass

Volatility Regimes

VIX level determines which strategies are viable. The system classifies the current environment into 5 regimes:

Low Vol (VIX < 15) — Premium is thin. Favor directional buying (long calls/puts, debit spreads, diagonals, PMCC).
Normal (VIX 15-20) — Balanced. All strategies viable. Slight edge to defined-risk spreads.
Elevated (VIX 20-30) — Premium is rich. Favor selling strategies (iron condors, bull put spreads, jade lizards).
Crisis (VIX 30-50) — High premium but high risk. Only high-probability credit spreads with wide wings. Reduce position sizing.
Panic (VIX > 50) — Scanning suspended. Wait for normalization.
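The five regimes map directly onto a threshold function. The regime identifiers below are hypothetical, and the treatment of boundary values is an assumption: the listed ranges share their endpoints, so this sketch assigns 15, 20, and 30 to the higher regime and keeps exactly 50 in crisis, since panic is stated as VIX > 50.

```python
def vix_regime(vix: float) -> str:
    """Map a VIX level to one of the five volatility regimes."""
    if vix < 15:
        return "low_vol"
    if vix < 20:
        return "normal"
    if vix < 30:
        return "elevated"
    if vix <= 50:
        return "crisis"
    return "panic"             # scanning suspended
```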
Batch 0 — Core Strategies (5)

Long Call Buy Bullish

Buy 1 ATM or OTM call. Simple directional bet on upside. Best in low-vol regimes when options are cheap.

Single leg | Win: stock rises above breakeven | Max loss: premium paid

Long Put Buy Bearish

Buy 1 ATM or OTM put. Directional bet on downside or portfolio hedge.

Single leg | Win: stock falls below breakeven | Max loss: premium paid

Bull Put Spread Sell Bullish

Sell OTM put + buy further OTM put (same expiration). Collect credit, profit if stock stays above short strike.

Same-expiry spread | Win: stock doesn't drop >3% | Max loss: width - credit

Bear Call Spread Sell Bearish

Sell OTM call + buy further OTM call (same expiration). Collect credit, profit if stock stays below short strike.

Same-expiry spread | Win: stock doesn't rally >3% | Max loss: width - credit

Iron Condor Sell Neutral

Bull put spread + bear call spread (same expiration). Profit from range-bound movement and time decay.

4 legs, same expiry | Win: stock stays within ±5% | Max loss: wider wing width - credit
Batch 1 — Same-Expiry Spreads (3)

Iron Butterfly Sell Neutral

Sell ATM call + put, buy OTM wings. Higher credit than condor but narrower profit zone. Best when expecting pin to ATM.

4 legs, same expiry | Win: stock stays within ±3% | Max loss: wing width - credit

Debit Call Spread Buy Bullish

Buy ITM/ATM call + sell OTM call (same expiration). Defined risk bullish play, cheaper than naked long call.

Same-expiry spread | Win: stock rises >2% | Max loss: debit paid

Debit Put Spread Buy Bearish

Buy ITM/ATM put + sell OTM put (same expiration). Defined risk bearish play.

Same-expiry spread | Win: stock falls >2% | Max loss: debit paid
Batch 2 — Multi-Expiration (3)

Calendar Spread Buy Neutral

Sell front-month ATM option + buy back-month same strike. Profits from time decay differential and IV expansion in the back month.

Multi-expiry | Win: stock stays within ±3% | Max loss: debit paid

Diagonal Spread Buy Bullish

Sell front-month OTM call + buy back-month ITM/ATM call. Calendar with a directional tilt.

Multi-expiry | Win: stock rises >1% | Max loss: debit paid

PMCC (Poor Man's Covered Call) Mixed Bullish

Buy deep ITM LEAPS call (delta > 0.70) + sell short-term OTM call. Synthetic covered call without owning 100 shares. Capital efficient.

Multi-expiry | Win: stock stays flat or rises moderately | Max loss: LEAPS debit - short credits
Batch 3 — Complex Multi-Leg (4)

Jade Lizard Sell Bullish

Sell OTM put + sell OTM call + buy further OTM call (same expiry). No upside risk if the total credit collected is at least the call spread width. Downside risk on the naked put.

3 legs, same expiry | Win: stock doesn't crash >4% | Max loss: put assignment minus credits

Broken Wing Butterfly Sell Neutral

Sell 2 ATM options + buy 1 OTM + 1 further OTM (skip-strike). Asymmetric butterfly with reduced risk on one side.

4 legs, same expiry | Win: stock stays within ±4% | Max loss: skipped wing width

Call Backspread Buy Bullish

Sell 1 ATM call + buy 2+ OTM calls (same expiry). Speculative long-vol play. Profits from large upside moves. Uses 0.7x position sizing.

Ratio spread | Win: stock rallies >5% | Max loss: at the long strike at expiration

Put Backspread Buy Bearish

Sell 1 ATM put + buy 2+ OTM puts (same expiry). Speculative downside play. Uses 0.7x position sizing.

Ratio spread | Win: stock drops >5% | Max loss: at the long strike at expiration
Batch 4 — Stock-Holding Strategies (3)

Covered Call Sell Neutral

Hold 100 shares + sell OTM call. Generate income from existing stock holdings. Requires owning the underlying.

Stock + 1 short call | Win: stock doesn't rip past strike | Max loss: stock goes to zero

Cash-Secured Put Sell Bullish

Sell OTM put with cash collateral. Willing to own the stock at a lower price. Requires full cash backing.

Single leg | Win: stock doesn't crash >5% | Max loss: assignment at strike minus credit

Collar Mixed Neutral

Hold 100 shares + buy OTM put (protection) + sell OTM call (pays for put). Defined risk wrapper around existing stock.

Stock + put + short call | Win: stock stays within ±8% | Max loss: stock to put strike

Regime × Strategy Matrix

Regime | Best Strategies | Avoid
Low Vol (<15) | Long call/put, debit spreads, diagonal, PMCC, calendar | Credit spreads (premium too thin)
Normal (15-20) | All strategies viable; slight edge to spreads | None
Elevated (20-30) | Iron condor, bull put, bear call, jade lizard, CSP, covered call | Naked long options (expensive)
Crisis (30-50) | Wide-wing credit spreads only, reduced sizing | Most strategies (risk too high)
Panic (>50) | None (scanning suspended) | Everything
Merlin ML Intelligence
Global ML processor that mines patterns from trade signals to update Kestrel's brain and feed Peregrine. Shadow-tracks every scan candidate, measures forward returns across multiple horizons, trains XGBoost models on the RTX 4080 GPU, detects market regimes, and feeds predictions back in real time.

Dashboard KPIs (strikefalcon.com/merlin)

Metric | Value | What It Means
Total Historical Data | 543,225 | Signals processed into models over the lifetime of the system
Raw Signals (20D) | 191,161 | Signals currently in observation cycle (awaiting forward return measurement)
Backlog to Process | 0 | Mature signals (24h+) ready to label; 0 means labeling is caught up
Holdout AUC-ROC | 0.9800 | Model's ability to distinguish safe vs danger signals on held-out data
Market Regime | Range Bound | Current regime classification (BT:0.19 BV:0.33 BR:0.18 MR:0.54)
Next Unit Update | 2026-02-27 16:05 | Next scheduled auto-retrain (Friday 4:05 PM ET)
Last Train | 2026-02-20 15:39:31 | Most recent completed training run
Regime probabilities: BT = Bull Trending (0.19), BV = Bull Volatile (0.33), BR = Bear Trending (0.18), MR = Mean Reversion (0.54). Highest probability wins. Current state: Range Bound (mean reversion dominant).
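The "highest value wins" rule is a plain argmax over the four regime scores. The sketch uses the values shown on the dashboard; note that they sum to 1.24, so it treats them as independent scores rather than a normalized distribution. The function name is hypothetical.

```python
def dominant_regime(scores: dict) -> str:
    """Return the regime key with the highest score."""
    return max(scores, key=scores.get)

# Values copied from the dashboard snapshot above.
snapshot = {
    "bull_trending": 0.19,
    "bull_volatile": 0.33,
    "bear_trending": 0.18,
    "mean_reversion": 0.54,
}
current = dominant_regime(snapshot)
```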

Features tracked: Momentum, rsi, roc_5, roc_10, roc_20, macd, macd_signal, macd_hist, +148 more feature columns fed into the model.

Training Cycle

  1. Data Acquisition (Continuous)
     Every Kestrel scan writes a 44-feature vector to shadow_log.db. Every candidate is logged, passes and failures alike, so the training data is not limited to gate survivors.

  2. Forward Return Labeling (Daily)
     Each signal gets labeled with actual outcomes: 1-day, 3-day, 5-day, 10-day, and 20-day forward returns, plus 18 per-strategy win/loss labels via classify_win().

  3. Model Training (Weekly, Friday 4:05 PM ET)
     XGBoost trains on the full labeled dataset using GPU acceleration (RTX 4080). Walk-forward validation helps guard against overfitting. Primary label: fwd_return_3d >= -3% (safety label).

  4. Model Deployment (Automatic)
     New model saved to signal_model_latest.pkl with version logged to the model_versions table. Kestrel picks up the new model on its next scan cycle automatically.

44 Feature Columns

Every scan candidate is represented by these 44 engineered features. Each captures a different dimension of the trading setup.

Momentum (9)

rsi roc_5 roc_10 roc_20 macd macd_signal macd_hist stoch_k stoch_d

Trend (3)

sma_pct_diff adx trend_score

Volatility (5)

hv_rank atr_pct bb_pct_b bb_bandwidth zscore

Mean Reversion (2)

pullback dist_from_low

Volume (2)

relative_volume obv_trend

Price Action (2)

gap_pct daily_range_pct

Market Regime (3)

vix spy_change_pct market_breadth

Options Greeks (7)

delta theta vega gamma skew_score term_slope iv

Options Context (5)

diff_pct iv_rank iv_percentile iv_hv_ratio premium_ratio

Sentiment (4)

catalyst_score catalyst_risk headline_velocity_6h sentiment_score

Time & Context (3)

day_of_week hour_of_day is_momentum

Option Params (1)

dte

Label System

Label | Definition | Used By
fwd_return_3d | 3-day forward stock return >= -3% = safe | Primary Kestrel model
bs_pnl_pct_5d | 5-day simulated option P&L via Black-Scholes repricing | Options models
good_for_{strategy}_5d | Per-strategy win label via classify_win() with IV-normalized thresholds | Strategy-specific models (18)

The classify_win() function uses different thresholds per strategy archetype: directional strategies need the stock to move in their direction, range-bound strategies need the stock to stay within bounds, and credit strategies need to avoid assignment.
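The archetype logic can be illustrated with a simplified stand-in. This is not the real classify_win(): its signature is not documented here, and it uses IV-normalized thresholds, whereas this sketch takes a fixed threshold such as the percentages on the strategy cards. The archetype names are hypothetical.

```python
def classify_win_sketch(archetype: str, fwd_return: float, threshold: float) -> bool:
    """Simplified win label. fwd_return and threshold are fractions (0.03 = 3%)."""
    if archetype == "bullish_directional":   # e.g. debit call spread: needs a rise
        return fwd_return > threshold
    if archetype == "bearish_directional":   # e.g. debit put spread: needs a fall
        return fwd_return < -threshold
    if archetype == "range_bound":           # e.g. iron condor: must stay in range
        return abs(fwd_return) <= threshold
    if archetype == "bullish_credit":        # e.g. bull put spread: must not drop far
        return fwd_return >= -threshold
    if archetype == "bearish_credit":        # e.g. bear call spread: must not rally far
        return fwd_return <= threshold
    raise ValueError(f"unknown archetype: {archetype}")
```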

Regime Detection

Merlin writes the current market regime to regime_state.json, consumed by both Kestrel and Peregrine. Regime is determined by:

  • VIX level — primary regime indicator
  • SPY 20-day return — trend direction
  • Market breadth — advance/decline ratio

Regime states: bull_trending, bull_volatile, bear_trending, mean_reversion

Dashboard (strikefalcon.com/merlin)
  1. KPI Cards
     Total labeled signals, queue size, ready for ML, AUC-ROC, current regime, VIX, and next train time.

  2. Model History
     Every training run: version, timestamp, row count, AUC-ROC, F1, top features by importance. Target AUC: 0.60-0.75 (higher is suspicious).

  3. Merlin Console
     Live training logs. Click Execute Intelligence Cycle for manual retrain. Shows options outcome tracker + stock return labeling + XGBoost training in sequence.

Risk Management
Multi-layered risk controls spanning position sizing, portfolio limits, Greek exposure, exit management, and automated circuit breakers. Every trade has defined max loss before entry. The system never risks more than it can quantify.

Position Sizing

  • Fixed Risk: Risk 2% of account per trade with 10% stop-loss. Max single position 25% of account.
  • Kelly Criterion: Fractional Kelly at 25% of full Kelly. Derived from backtest win rate (56%) and avg win/loss ratio ($745/$475).
  • Speculative sizing: Backspreads and other speculative strategies use 0.7x normal sizing.
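# Kelly position sizing
account = 100_000   # example account value in dollars (illustrative)
win_rate = 0.56
avg_win = 745
avg_loss = 475
kelly_f = win_rate - (1 - win_rate) / (avg_win / avg_loss)   # full Kelly, about 0.279
position = account * kelly_f * 0.25   # fractional Kelly, about 7% of the account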

Portfolio Limits

  • Max per position: 3% of account value
  • Max total exposure: 15% (max 5 concurrent positions)
  • Sector caps: Max 2 positions per sector
  • Correlation guard: Max 0.70 average correlation across portfolio
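The first three limits can be checked before any order is built. The sketch below is an assumption about how they combine (all must hold); the correlation guard is left out because it needs return series. Function and parameter names are illustrative.

```python
def can_open(position_value: float, account_value: float,
             open_positions: list, sector: str) -> bool:
    """open_positions is a list of (value, sector) tuples for current holdings."""
    if position_value > 0.03 * account_value:         # max 3% per position
        return False
    if len(open_positions) >= 5:                      # max 5 concurrent positions
        return False
    total = sum(value for value, _ in open_positions) + position_value
    if total > 0.15 * account_value:                  # max 15% total exposure
        return False
    same_sector = sum(1 for _, s in open_positions if s == sector)
    return same_sector < 2                            # max 2 positions per sector
```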

Greek Limits

Greek | Limit | Why
Delta | 100 per $100K | Directional risk cap
Vega | 500 total | Volatility exposure cap

Greeks are computed via Black-Scholes (closed-form from solved IV). IV is solved using Brent's method for speed (8-12 iterations vs 50 for bisection).

Exit Management

Trigger | Action
P&L >= +50% | Close half position
P&L >= +100% | Close full position
P&L <= -30% | Hard stop: close full
DTE <= 14 | Time stop: close to avoid gamma risk
DTE <= 7 | Forced close (backtest)
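The live exit triggers can be written as one decision function. The evaluation order (stop first, then full target, then time stop, then half target) is an assumption, as is the `half_closed` flag used to avoid closing half twice; the backtest-only 7 DTE rule is omitted.

```python
def exit_action(pnl_pct: float, dte: int, half_closed: bool = False) -> str:
    """Return 'close_full', 'close_half', or 'hold' for an open position."""
    if pnl_pct <= -30.0:
        return "close_full"    # hard stop
    if pnl_pct >= 100.0:
        return "close_full"    # full profit target
    if dte <= 14:
        return "close_full"    # time stop to avoid gamma risk
    if pnl_pct >= 50.0 and not half_closed:
        return "close_half"    # first profit target
    return "hold"
```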

Circuit Breakers

Automated safety switches that halt trading when conditions deteriorate:

Trigger | Action | Resume
Daily P&L <= -10% | Pause all new entries | Next trading day
VIX spike above crisis threshold | Suspend scanning | VIX normalizes
IBKR connectivity loss | Safe mode: no new orders | Connection restored
Anomaly detection | Alert + pause | Manual review

Slippage Model

Realistic transaction cost modeling based on ticker liquidity and market conditions:

Tier | Daily Volume | Fill Assumption
Liquid | > 5M shares/day | 80% of mid (tight spreads)
Mid-liquidity | 1M – 5M shares/day | 65% of mid
Illiquid | < 1M shares/day | 50% of mid (wide spreads)

VIX regime scaling: VIX > 30 multiplies the fill assumption by 0.85 (fills worsen as spreads widen in crisis). VIX < 15 multiplies it by 1.05 (fills improve as spreads tighten in calm). Tier is determined by trailing 20-day average volume.
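A minimal sketch of the tier lookup with VIX scaling. It assumes the VIX multipliers scale the fill fraction from the table (so a crisis lowers the assumed fill), and the function name is hypothetical.

```python
def fill_fraction(avg_volume_20d: float, vix: float) -> float:
    """Assumed fill as a fraction of mid, by liquidity tier and VIX regime."""
    if avg_volume_20d > 5_000_000:
        base = 0.80            # liquid
    elif avg_volume_20d >= 1_000_000:
        base = 0.65            # mid-liquidity
    else:
        base = 0.50            # illiquid
    if vix > 30:
        base *= 0.85           # crisis: worse fills
    elif vix < 15:
        base *= 1.05           # calm: better fills
    return base
```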

Core principle: Every trade has a defined maximum loss before entry. The system never enters undefined-risk positions. If max loss cannot be calculated, the trade is rejected.
Backtest & Training Pipeline
Automated pipeline that transforms 10 years of historical options and stock data (2016-2026) into labeled training data for Merlin. Simulates all 18 strategies across 338 tickers, labels outcomes, and exports training-ready datasets. Zero manual steps. Cron-ready.

Data Foundation

Options data: ThetaData EOD history

  • 338 tickers × 10 years (2016-2026)
  • Parquet files per ticker (~10 GB total)
  • Bid, ask, close, volume, open interest, strike, expiration

Stock data: yfinance daily bars

  • 338 tickers × 10+ years
  • Parquet files per ticker (~35 MB total)
  • OHLCV + adjusted close

Storage: /opt/strikefalcon/shared/data/cache/ with stocks/ and options/ subdirectories.

Pipeline Architecture
  1. greeks_computer.py: B-S IV solver (Brent's method) + delta/gamma/theta/vega
  2. feature_computer.py: 44 technical indicators per ticker per date
  3. multi_strategy_simulator.py: Simulate 18 strategies with tiered slippage
  4. trade_labeler.py: Win/loss labels via classify_win() + validation
  5. training_data_exporter.py: JSON + CSV exports for auto_retrain
  6. run_pipeline.py: Parallel orchestrator (4 workers) with checkpoint/resume

Greeks Computation

  • IV Solver: Brent's method via scipy.optimize.brentq (8-12 iterations, much faster than bisection)
  • IV bounds: [0.05, 5.0] (5% to 500%), wide enough for meme stocks (GME, AMC regularly hit 300%+ implied volatility)
  • Greeks: Closed-form from solved IV — delta, gamma, theta, vega
  • Quality gate: If >30% of chain rows fail IV computation, ticker is flagged as data_quality: degraded
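The solve-then-differentiate flow can be sketched without dependencies. To stay self-contained, this example swaps in plain bisection for the pipeline's scipy.optimize.brentq; the bracket [0.05, 5.0] matches the bounds above. It covers European calls only, and all function names are illustrative rather than those in greeks_computer.py.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def d1(S: float, K: float, T: float, r: float, sigma: float) -> float:
    return (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))

def bs_call(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call (T in years)."""
    a = d1(S, K, T, r, sigma)
    return S * norm_cdf(a) - K * exp(-r * T) * norm_cdf(a - sigma * sqrt(T))

def implied_vol(price, S, K, T, r, lo=0.05, hi=5.0, tol=1e-8):
    """Bisection IV solve; returns None when no root lies inside [lo, hi]."""
    f_lo = bs_call(S, K, T, r, lo) - price
    f_hi = bs_call(S, K, T, r, hi) - price
    if f_lo * f_hi > 0:
        return None            # counts as a failed row for the quality gate
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) > price:
            hi = mid           # call price rises with sigma, so the root is below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def call_delta(S, K, T, r, sigma) -> float:
    """Closed-form call delta from the solved IV."""
    return norm_cdf(d1(S, K, T, r, sigma))
```

Rows where `implied_vol` returns None are the ones that would count toward the 30% degraded-data threshold.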

Strategy Simulation

  • Contract selection: ATM or 0.30-0.40 delta, DTE 30-45
  • Multi-leg: Spreads use ~1 std dev width; condors/butterflies use 4 legs
  • Slippage: Tiered by trailing 20-day avg volume with VIX scaling
  • Forward simulation: Look 1-20 trading days ahead for exit triggers
  • Missing data: Interpolate if ≤2 days missing and |delta| > 0.20; otherwise flag as incomplete

Execution & CLI

# Full run, all tickers, all available dates
python run_pipeline.py

# Test with limited scope
python run_pipeline.py --tickers AAPL MSFT --start 2024-01-01 --end 2024-12-31

# Resume from crash
python run_pipeline.py --resume

# Process new data since last run + trigger ML retrain
python run_pipeline.py --retrain --since auto

# Re-export from existing checkpoint (no re-simulation)
python run_pipeline.py --export-only

# Dry run for validation
python run_pipeline.py --dry-run --max-tickers 5

Checkpoint System

  • SQLite DB with WAL mode for concurrent writes
  • Tracks per-ticker/date progress for crash recovery
  • Stores simulated trades for re-export without re-simulation
  • Data quality flags per ticker

Validation Gates

  • Win rate bounds per strategy archetype: credit 55-85%, debit 25-55%, directional 20-50%
  • No NaN in required feature columns
  • No future dates in forward return computations
  • Regime distribution across all 4 VIX buckets
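The win-rate bound check is the simplest of these gates to sketch. The bounds come from the list above; the function name and the decision to fail an archetype with zero trades are assumptions.

```python
WIN_RATE_BOUNDS = {
    "credit": (0.55, 0.85),
    "debit": (0.25, 0.55),
    "directional": (0.20, 0.50),
}

def win_rate_ok(archetype: str, wins: int, total: int) -> bool:
    """True when the simulated win rate falls inside the archetype's plausible band."""
    if total == 0:
        return False           # nothing simulated: treated as a validation failure
    lo, hi = WIN_RATE_BOUNDS[archetype]
    return lo <= wins / total <= hi
```

A credit strategy winning 95% of simulated trades fails the gate, which usually points to a labeling or look-ahead bug rather than a real edge.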

Cron Schedule

Schedule | Command | Purpose
Sunday 6 PM ET | run_pipeline.py --retrain --since auto | Weekly incremental: process new data + retrain
1st Saturday 2 AM ET | run_pipeline.py --retrain --force | Monthly full rebuild: reprocess everything

Output

/opt/strikefalcon/shared/data/backtest/
├── pipeline_checkpoint.db               # Resumable checkpoint + trades
├── results/
│   ├── pipeline_trades_YYYYMMDD.csv     # All trades (human-readable)
│   ├── training_data_YYYYMMDD.json      # auto_retrain-compatible
│   └── strategy_data/
│       ├── long_call_training.csv
│       └── ... (18 per-strategy files)
├── reports/
│   └── pipeline_metrics_YYYYMMDD.json   # Sharpe, Sortino, Brier, AUC
└── logs/
    └── pipeline_YYYYMMDD.log

Scale

338 tickers × ~50 weekly scans/year × 10 years × 18 strategies = ~3M+ simulated trades. With 4 parallel workers at ~500 MB each (~2 GB total), estimated runtime is under 1 hour for 2022-2026 and 2-3 hours for the full 2016-2026 dataset.

Walk-forward split: 70% training / 15% calibration / 15% test, split by date ordering (not random). The test set is always the most recent data, preventing any look-ahead bias contamination.
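A date-ordered 70/15/15 split can be sketched as below. It assumes each row carries an ISO-format `date` string (so string order equals chronological order) and uses integer percentages to keep the cut points exact; the function name is hypothetical.

```python
def chronological_split(rows: list, train_pct: int = 70, calib_pct: int = 15):
    """Split rows into (train, calibration, test) by date order, never at random."""
    ordered = sorted(rows, key=lambda r: r["date"])    # ISO dates sort chronologically
    n = len(ordered)
    i = n * train_pct // 100
    j = n * (train_pct + calib_pct) // 100
    return ordered[:i], ordered[i:j], ordered[j:]      # test holds the most recent rows
```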
Operations
Service management, manual procedures, troubleshooting, and server safety rules.

Service Management

# Check status of all Strike Falcon services
sudo systemctl status peregrine-web kestrel strikefalcon --no-pager

# Restart individual services
sudo systemctl restart peregrine-web.service   # Options engine
sudo systemctl restart kestrel.service         # Stock scanner
sudo systemctl restart strikefalcon.service    # Landing + Merlin hub

# If port 5080 is stuck (old process survived)
sudo fuser -k 5080/tcp && sudo systemctl restart strikefalcon.service

# View live logs
sudo journalctl -u peregrine-web.service -f --no-pager
sudo journalctl -u kestrel.service -f --no-pager
sudo journalctl -u strikefalcon.service -n 50 --no-pager

Manual Kestrel Scan

# Via web UI (preferred)
kestrel.strikefalcon.com → Manual Scan

# Via terminal
cd /opt/strikefalcon/kestrel
python3 scanner.py --scan-once

# Check results
sqlite3 shadow_log.db \
  "SELECT ticker, score, timestamp FROM shadow_signals ORDER BY timestamp DESC LIMIT 20;"

Manual Merlin Retrain

# Via web UI (preferred)
strikefalcon.com/merlin → Console tab → Execute Intelligence Cycle

# Via terminal
cd /opt/strikefalcon/kestrel/ml
python3 auto_retrain.py

# Check model
ls -lh models/signal_model_latest.pkl

Peregrine Ticker Inspector

# Via web UI
peregrine.strikefalcon.com → Ticker Inspector → type NVDA → Run

# Via API
curl "http://localhost:5055/api/inspect_ticker?ticker=NVDA" \
  | python3 -m json.tool

Market Regime Check

# Read live regime state
cat /opt/strikefalcon/shared/regime_state.json \
  | python3 -m json.tool

# Force recalculation
cd /opt/strikefalcon/shared
python3 regime_detector.py --verbose

# Regime values:
#   bull_trending | bull_volatile
#   bear_trending | mean_reversion

Options Data Download

# Verify ThetaData Terminal is running
curl http://localhost:25503/v2/health

# Download recent options history (full universe)
cd /opt/strikefalcon
python3 download_options.py 2>&1 | tee download_log.txt

# Download stock bars
python3 download_stocks.py

# Check cache size
du -sh /opt/strikefalcon/shared/data/cache/options/
du -sh /opt/strikefalcon/shared/data/cache/stocks/

Git Workflow

cd /opt/strikefalcon
git add -A
git commit -m "feat: describe your change here"
git push origin main
git log --oneline -5

# If git fails with "index.lock"
rm -f .git/index.lock

# If git fails with "loose object" errors
git fsck --full && git gc
Server Safety Rules
Protected Processes — NEVER kill:
  • tmux sessions: theta (ThetaTerminal), download (data downloads), notify (alerts)
  • ThetaTerminal (Java process on port 25503)
  • Cloudflare tunnel (cloudflared)
  • nginx, MySQL/MariaDB
  • Kestrel and Peregrine trading processes
Forbidden commands:
  • kill / pkill / killall with broad patterns
  • kill -9 on any PID without verifying it's not a tmux child
  • xargs kill piped from ps or grep
  • tmux kill-server
  • systemctl stop on any service without explicit approval
  • rm -rf on /opt/strikefalcon/shared/data/ (cached market data)

Stale Terminal Cleanup

A cron job runs every 4 hours: /opt/strikefalcon/shared/cleanup_stale_terminals.sh. This handles orphaned VS Code Remote shells automatically. Do not manually kill terminals — the cron job handles it.

Common Issues
Problem | Likely Cause | Fix
Port 5080 in use | Old strikefalcon process survived | sudo fuser -k 5080/tcp then restart
No Kestrel alerts | Score below threshold or bear regime filter | Lower MIN_ALERT_SCORE or check regime_state.json
Peregrine finds no trades | IV rank below threshold or VIX too low | Use Ticker Inspector to diagnose gate failures
Merlin retrain fails | Not enough labeled signals (< 500) | Wait for shadow data to accumulate
IBKR connection drops | TWS/Gateway session expired | Reconnect TWS, restart affected service
ThetaData timeout | Terminal not running or overloaded | curl localhost:25503/v2/health + restart terminal
Git push fails with "index.lock" | Stale lock file | rm -f .git/index.lock
Pipeline crash mid-run | Memory or data issue | run_pipeline.py --resume picks up from checkpoint

tmux Session Management

# List all sessions
tmux list-sessions

# Attach read-only (safe)
tmux attach -t theta -r

# Send a command to a session
tmux send-keys -t download "ls -la" Enter

# Check what's running in each session
tmux list-panes -a -F '#{session_name} #{pane_pid}'

# NEVER detach or kill existing tmux sessions