| Module | Purpose | Mode | Output |
|---|---|---|---|
| Peregrine | Options engine — 18 strategies, gate filtering, autonomous execution | Fully autonomous | IBKR orders |
| Kestrel | Stock swing trade scanner — ML-scored, shadow-tracked | Scan & alert | Discord alerts + IBKR orders |
| Merlin | ML intelligence layer — XGBoost training, regime detection | Scheduled (weekly) | Model files + regime state |
| Shared | Backtest engine, data cache, ThetaData client, utilities | Library | Parquet data + backtest results |
| Dashboard | Web UI hub — landing page, Merlin console, system guide | Always-on | Browser interface |

| Source | Provides |
|---|---|
| yfinance | All stock price data (historical + live batch), VIX, ticker details |
| ThetaData | Options chains (real-time snapshots + historical EOD), IV/Greeks raw data |
| IBKR | Order execution + account tracking only — NOT a data source |
| LunarCrush | Social sentiment scores for catalyst detection |

| URL | Service | Port | systemd unit |
|---|---|---|---|
| strikefalcon.com | Landing hub | 5080 | strikefalcon.service |
| strikefalcon.com/guide | This guide | 5080 | strikefalcon.service |
| strikefalcon.com/merlin | ML intelligence dashboard | 5080 | strikefalcon.service |
| peregrine.strikefalcon.com | Options engine dashboard | 5055 | peregrine-web.service |
| kestrel.strikefalcon.com | Stock scanner dashboard | 5001 | kestrel.service |

# Live scanning (every 15 seconds)
yfinance → stock bars → Kestrel gates → ML score → Discord alert + IBKR order
ThetaData → options chain → Peregrine gates → strategy select → IBKR order
# ML training cycle (weekly)
Kestrel scans → shadow_log.db (44 feature vectors per candidate)
→ forward return labeling (1/3/5/10/20 day)
→ Merlin auto_retrain (Friday 4:05 PM ET)
→ XGBoost model → Kestrel picks up new model on next scan
# Historical backtest pipeline (cron)
Parquet cache (stocks/ + options/) → Greeks computation
→ feature engineering → 18-strategy simulation
→ trade labeling → training data export
→ Merlin retrain with historical data
# Regime detection
VIX + SPY returns + market breadth → regime_state.json
→ consumed by both Kestrel and Peregrine for strategy selection

The system scans 338 liquid tickers dynamically built from seed lists and filtered by options liquidity:
score_strategy_heuristic()

Executed before fetching any options data. Fast, cheap checks using stock price data from yfinance.
Relative Strength Index filter. Hard gate at extremes (<5 or >95), soft gate in moderate range. Prevents entries on extremely overbought/oversold tickers. Default range: 25-75
Measures drawdown from recent highs over a 20-60 bar lookback. Rejects falling knives (<-15%) and chasing-ATH setups (>-1%). Default range: -15% to -1%
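
The drawdown check described above can be sketched in a few lines. This is a minimal illustration, not Peregrine's actual code: the function names `pullback_pct` and `pullback_gate`, the 20-bar default lookback, and the inclusive boundaries are assumptions; only the -15% and -1% thresholds come from the text.

```python
def pullback_pct(closes, lookback=20):
    """Percent drawdown of the latest close from the highest close in the window."""
    window = closes[-lookback:]
    recent_high = max(window)
    return (closes[-1] / recent_high - 1.0) * 100.0


def pullback_gate(closes, lookback=20, min_pct=-15.0, max_pct=-1.0):
    """Pass only when the drawdown lies inside [min_pct, max_pct] (hypothetical helper)."""
    dd = pullback_pct(closes, lookback)
    # Below min_pct: falling knife. Above max_pct: chasing the recent high.
    return (min_pct <= dd <= max_pct), dd


passed, dd = pullback_gate([100.0, 105.0, 110.0, 104.5])
print(passed, round(dd, 2))
```

A close 5% below the 20-bar high passes; a close at the high (0%) or 18% below it does not.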
Price cannot be more than 20% below the 50-day SMA. Hard gate for standard tickers, soft gate (skipped) for designated momentum tickers like TSLA, NVDA, AMD that regularly trade extended.
Historical volatility percentile rank over trailing 252 days. Rejects low-vol environments where premium is too thin and extreme-vol where risk is unquantifiable. Default range: 20th-60th percentile
Implied volatility rank measures where current IV sits relative to its own 1-year range. Min 20%, max 50%. Uses HV as a proxy when real-time IV is unavailable.
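
As a rough sketch of the IV rank calculation, the snippet below uses the common definition (current IV's position between the 1-year low and high, scaled to 0-100). The function names and the handling of a flat history are assumptions for illustration; the 20 and 50 bounds are taken from the text.

```python
def iv_rank(current_iv, iv_history):
    """Position of current IV within the min/max of its history, on a 0-100 scale."""
    lo, hi = min(iv_history), max(iv_history)
    if hi == lo:
        return None  # assumption: rank is undefined when the history is flat
    return (current_iv - lo) / (hi - lo) * 100.0


def iv_rank_gate(rank, min_rank=20.0, max_rank=50.0):
    """Pass when the rank is defined and inside [min_rank, max_rank] (hypothetical helper)."""
    return rank is not None and min_rank <= rank <= max_rank


rank = iv_rank(0.30, [0.20, 0.25, 0.40, 0.60])
print(round(rank, 1), iv_rank_gate(rank))
```

When no live IV is available, the same function could be fed historical volatility values as the proxy the text mentions.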
LunarCrush social sentiment score + earnings date buffer. Vetoes any ticker within 14 days of an earnings announcement. Rejects tickers with extreme negative sentiment.
Detects unusual news/filing activity. Headline velocity veto: >6 headlines in 6 hours. 8-K filing risk floor: 0.85 probability. Prevents entries ahead of material news events.
Executed after fetching live options chain data from ThetaData. These are more expensive checks requiring real option prices.
Minimum option volume 10 contracts, maximum bid-ask spread 15%. Ensures the trade can be entered and exited without excessive slippage.
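
A minimal sketch of the liquidity check follows. It assumes the bid-ask spread is measured as a percentage of the mid price, which the text does not specify; `liquidity_gate` is a hypothetical name, and the 10-contract and 15% limits come from the description above.

```python
def liquidity_gate(bid, ask, volume, min_volume=10, max_spread_pct=15.0):
    """Return (passed, spread_pct) for a single option quote (illustrative helper)."""
    if bid <= 0 or ask < bid:
        return False, None  # unusable quote
    mid = (bid + ask) / 2.0
    spread_pct = (ask - bid) / mid * 100.0  # assumption: spread relative to mid
    passed = volume >= min_volume and spread_pct <= max_spread_pct
    return passed, spread_pct


print(liquidity_gate(bid=1.90, ask=2.10, volume=25))
```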
Compares implied to historical volatility. Warns when IV/HV > 1.5x (options expensive, good for selling). Rejects when < 0.8x (no premium to capture).
Recomputes IV rank from the actual options chain data (not the proxy estimate from pre-chain). More accurate but requires the chain fetch.
Checks correlation between new candidate and existing portfolio positions. Max average correlation 0.70. Prevents concentration in correlated names (ported from Kestrel).
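
The correlation check can be illustrated with plain Pearson correlation on aligned return series. This is a sketch under assumptions: the helper names, the use of simple return lists, and passing automatically on an empty portfolio are not from the source; only the 0.70 ceiling is.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_gate(candidate_returns, portfolio_returns, max_avg_corr=0.70):
    """portfolio_returns maps ticker -> return series aligned with the candidate."""
    if not portfolio_returns:
        return True, 0.0  # assumption: nothing held, nothing to correlate with
    corrs = [pearson(candidate_returns, r) for r in portfolio_returns.values()]
    avg = sum(corrs) / len(corrs)
    return avg <= max_avg_corr, avg


cand = [0.01, -0.02, 0.015, 0.0, -0.01]
held = {"AAA": [2 * r for r in cand]}  # hypothetical, perfectly correlated holding
print(correlation_gate(cand, held))
```

A candidate that moves in lockstep with the only existing position is rejected, while one offset by an inversely correlated holding passes.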
Live options positions: ticker, strategy, expiry, strikes, current P&L, DTE, Greek exposure. Red rows are approaching stop-loss.
Real-time log of every ticker scanned, which gates it passed or failed, and why. See the funnel in action.
Type any ticker and hit Run. Peregrine runs all gates live and shows exact pass/fail with actual values. Essential for debugging why a specific ticker was rejected.
KPI cards: realized P&L, win rate, average DTE, position count. Updates live from IBKR portfolio.
shadow_log.db with its 44-feature vector for Merlin to label and learn from

| Gate | What It Checks | Threshold |
|---|---|---|
| RSI | Relative Strength Index (overbought/oversold) | 15 – 80 |
| HV Rank | Historical volatility percentile | 10th – 95th |
| Pullback | Drawdown from recent high | -40% to 0% |
| SMA Trend | Price vs 50-day SMA | Not >20% below |
| VIX | Market-wide fear index | Max 45 |
| Sentiment | LunarCrush social score | Min -0.5 |
| Catalyst/Earnings | Earnings proximity | 3-day buffer |
Every scan candidate — whether it passed or failed gates — gets logged to shadow_log.db with its complete 44-feature vector. This is the training data source for Merlin. After each trading day, Merlin's labeler checks back and records what actually happened:
classify_win() for all 18 strategies

This creates a continuously growing dataset of real market observations linked to outcomes. The ML model retrains weekly on this expanding dataset.
Portfolio value, day P&L, buying power, position count — live from IBKR.
Every ticker from the latest scan sorted by ML score. Green rows passed all gates. Click any row to see individual gate breakdowns.
Open equity positions with unrealized P&L, entry price, current price. Live IBKR updates.
Walk-forward backtest performance, win rate by regime, closed-trade history.
# Decision tree for strategy selection
1. Determine volatility regime from VIX
2. Determine directional bias from technical indicators (trend, momentum)
3. Check IV level — high IV favors selling, low IV favors buying
4. Score all eligible strategies via score_strategy_heuristic()
5. Filter strategies scoring below 0.20 threshold
6. Select top strategy — execute if portfolio risk checks pass

VIX level determines which strategies are viable. The system classifies the current environment into 5 regimes:
Buy 1 ATM or OTM call. Simple directional bet on upside. Best in low-vol regimes when options are cheap.
Buy 1 ATM or OTM put. Directional bet on downside or portfolio hedge.
Sell OTM put + buy further OTM put (same expiration). Collect credit, profit if stock stays above short strike.
Sell OTM call + buy further OTM call (same expiration). Collect credit, profit if stock stays below short strike.
Bull put spread + bear call spread (same expiration). Profit from range-bound movement and time decay.
Sell ATM call + put, buy OTM wings. Higher credit than condor but narrower profit zone. Best when expecting pin to ATM.
Buy ITM/ATM call + sell OTM call (same expiration). Defined risk bullish play, cheaper than naked long call.
Buy ITM/ATM put + sell OTM put (same expiration). Defined risk bearish play.
Sell front-month ATM option + buy back-month same strike. Profits from time decay differential and IV expansion in the back month.
Sell front-month OTM call + buy back-month ITM/ATM call. Calendar with a directional tilt.
Buy deep ITM LEAPS call (delta > 0.70) + sell short-term OTM call. Synthetic covered call without owning 100 shares. Capital efficient.
Sell OTM put + sell OTM call + buy further OTM call (same expiry). No upside risk if call credit covers spread width. Downside risk on the naked put.
Sell 2 ATM options + buy 1 OTM + 1 further OTM (skip-strike). Asymmetric butterfly with reduced risk on one side.
Sell 1 ATM call + buy 2+ OTM calls (same expiry). Speculative long-vol play. Profits from large upside moves. Uses 0.7x position sizing.
Sell 1 ATM put + buy 2+ OTM puts (same expiry). Speculative downside play. Uses 0.7x position sizing.
Hold 100 shares + sell OTM call. Generate income from existing stock holdings. Requires owning the underlying.
Sell OTM put with cash collateral. Willing to own the stock at a lower price. Requires full cash backing.
Hold 100 shares + buy OTM put (protection) + sell OTM call (pays for put). Defined risk wrapper around existing stock.
| Regime | Best Strategies | Avoid |
|---|---|---|
| Low Vol (<15) | Long call/put, debit spreads, diagonal, PMCC, calendar | Credit spreads (premium too thin) |
| Normal (15-20) | All strategies viable; slight edge to spreads | None |
| Elevated (20-30) | Iron condor, bull put, bear call, jade lizard, CSP, covered call | Naked long options (expensive) |
| Crisis (30-50) | Wide-wing credit spreads only, reduced sizing | Most strategies — risk too high |
| Panic (>50) | None — scanning suspended | Everything |
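
The VIX bands in the table above map naturally onto a small classifier. The sketch below is illustrative: the label strings and the treatment of exact boundary values (15, 20, 30, 50) are assumptions, since the table does not say which side a boundary falls on.

```python
def classify_vix_regime(vix):
    """Map a VIX level to one of the five volatility regimes from the table."""
    if vix < 15:
        return "low_vol"
    if vix < 20:
        return "normal"
    if vix < 30:
        return "elevated"
    if vix <= 50:
        return "crisis"
    return "panic"  # scanning suspended in this regime


for level in (12.0, 17.5, 25.0, 45.0, 55.0):
    print(level, classify_vix_regime(level))
```

These volatility bands are separate from the Merlin regime states (bull_trending, bull_volatile, bear_trending, mean_reversion) written to regime_state.json.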

| Metric | Value | What It Means |
|---|---|---|
| Total Historical Data | 543,225 | Signals processed into models over the lifetime of the system |
| Raw Signals (20D) | 191,161 | Signals currently in observation cycle (awaiting forward return measurement) |
| Backlog to Process | 0 | Mature signals (24h+) ready to label — 0 means labeling is caught up |
| Holdout AUC-ROC | 0.9800 | Model's ability to distinguish safe vs danger signals on held-out data |
| Market Regime | Range Bound | Current regime classification (BT:0.19 BV:0.33 BR:0.18 MR:0.54) |
| Next Unit Update | 2026-02-27 16:05 | Next scheduled auto-retrain (Friday 4:05 PM ET) |
| Last Train | 2026-02-20 15:39:31 | Most recent completed training run |
Features tracked: Momentum, rsi, roc_5, roc_10, roc_20, macd, macd_signal, macd_hist, +148 more feature columns fed into the model.
Every Kestrel scan writes a feature vector (44 features) to shadow_log.db. Every candidate is logged — both passes and failures — creating an unbiased training dataset.
Each signal gets labeled with actual outcomes: 1-day, 3-day, 5-day, 10-day, and 20-day forward returns. Plus 18 per-strategy win/loss labels via classify_win().
XGBoost trains on the full labeled dataset using GPU acceleration (RTX 4080). Walk-forward validation prevents overfitting. Primary label: fwd_return_3d >= -3% (safety label).
New model saved to signal_model_latest.pkl with version logged to model_versions table. Kestrel picks up the new model on its next scan cycle automatically.
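
The labeling step in this cycle can be sketched as follows. This is a simplified illustration, not Merlin's labeler: it assumes close-to-close returns measured in trading bars, and the helper names are hypothetical. The horizons and the -3% safety floor come from the text.

```python
HORIZONS = (1, 3, 5, 10, 20)  # forward-return horizons, in trading bars


def forward_returns(closes, i, horizons=HORIZONS):
    """Fractional forward returns for a signal at index i; None if the bar is not yet available."""
    out = {}
    for h in horizons:
        j = i + h
        out[h] = closes[j] / closes[i] - 1.0 if j < len(closes) else None
    return out


def safety_label(fwd_return_3d, floor=-0.03):
    """1 = safe (3-day return >= -3%), 0 = danger, None = not yet labelable."""
    if fwd_return_3d is None:
        return None
    return 1 if fwd_return_3d >= floor else 0


closes = [100.0, 101.0, 99.0, 96.0, 98.0, 102.0]
fwd = forward_returns(closes, 0)
print(fwd[3], safety_label(fwd[3]))
```

Signals whose horizon has not elapsed stay unlabeled, which matches the idea of an observation cycle that waits for forward returns to mature.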
Every scan candidate is represented by these 44 engineered features. Each captures a different dimension of the trading setup.
| Label | Definition | Used By |
|---|---|---|
| fwd_return_3d | 3-day forward stock return >= -3% = safe | Primary Kestrel model |
| bs_pnl_pct_5d | 5-day simulated option P&L via Black-Scholes repricing | Options models |
| good_for_{strategy}_5d | Per-strategy win label via classify_win() with IV-normalized thresholds | Strategy-specific models (18) |
The classify_win() function uses different thresholds per strategy archetype: directional strategies need the stock to move in their direction, range-bound strategies need the stock to stay within bounds, and credit strategies need to avoid assignment.
Merlin writes the current market regime to regime_state.json, consumed by both Kestrel and Peregrine. Regime is determined by:
Regime states: bull_trending, bull_volatile, bear_trending, mean_reversion
Total labeled signals, queue size, ready for ML, AUC-ROC, current regime, VIX, and next train time.
Every training run: version, timestamp, row count, AUC-ROC, F1, top features by importance. Target AUC: 0.60-0.75 (higher is suspicious).
Live training logs. Click Execute Intelligence Cycle for manual retrain. Shows options outcome tracker + stock return labeling + XGBoost training in sequence.
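# Kelly position sizing
account = 100_000  # example account value (illustrative, not from the live system)
win_rate = 0.56    # probability of a winning trade
avg_win = 745      # average winning trade P&L
avg_loss = 475     # average losing trade P&L
# Kelly fraction f = p - (1 - p) / b, where b = avg_win / avg_loss
kelly_f = win_rate - (1 - win_rate) / (avg_win / avg_loss)  # ≈ 0.279
position = account * kelly_f * 0.25 # fractional Kelly (quarter Kelly, ≈ 7% of account)

| Greek | Limit | Why |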
|---|---|---|
| Delta | 100 per $100K | Directional risk cap |
| Vega | 500 total | Volatility exposure cap |
Greeks are computed via Black-Scholes (closed-form from solved IV). IV is solved using Brent's method for speed (8-12 iterations vs 50 for bisection).
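
To make the IV solve concrete, here is a minimal sketch using `scipy.optimize.brentq` on a Black-Scholes call price. It assumes a European call with no dividends; the function names, the volatility bracket of 0.0001 to 5.0, and the tolerance are illustrative choices rather than the system's actual settings.

```python
import math

from scipy.optimize import brentq


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def implied_vol_call(price, S, K, T, r, lo=1e-4, hi=5.0):
    """Solve for sigma with Brent's method; returns (iv, iterations used)."""
    def objective(sigma):
        return bs_call_price(S, K, T, r, sigma) - price

    # brentq raises ValueError if the price is not bracketed by [lo, hi].
    iv, info = brentq(objective, lo, hi, xtol=1e-8, full_output=True)
    return iv, info.iterations


def call_delta(S, K, T, r, sigma):
    """Closed-form call delta evaluated at the solved IV."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return norm_cdf(d1)


price = bs_call_price(100.0, 100.0, 0.25, 0.03, 0.25)
iv, iterations = implied_vol_call(price, 100.0, 100.0, 0.25, 0.03)
print(round(iv, 4), iterations, round(call_delta(100.0, 100.0, 0.25, 0.03, iv), 3))
```

Pricing with a known volatility and solving back is a quick round-trip check that the solver and the pricing function agree.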
| Trigger | Action |
|---|---|
| P&L >= +50% | Close half position |
| P&L >= +100% | Close full position |
| P&L <= -30% | Hard stop — close full |
| DTE <= 14 | Time stop — close to avoid gamma risk |
| DTE <= 7 | Forced close (backtest) |
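
The exit table above can be read as a small rule function. The sketch below is one possible interpretation: the order in which triggers are checked, the action names, and the `half_closed` flag are assumptions, and the backtest-only forced close at DTE <= 7 is left out.

```python
def exit_action(pnl_pct, dte, half_closed=False):
    """Return the exit action for a position given P&L (%) and days to expiry."""
    if pnl_pct <= -30:
        return "close_full_stop"    # hard stop
    if pnl_pct >= 100:
        return "close_full_profit"  # full profit target
    if dte <= 14:
        return "close_time_stop"    # avoid gamma risk near expiry
    if pnl_pct >= 50 and not half_closed:
        return "close_half"         # first profit target, taken once
    return "hold"


print(exit_action(60, 30))   # close_half
print(exit_action(10, 10))   # close_time_stop
```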
Automated safety switches that halt trading when conditions deteriorate:
| Trigger | Action | Resume |
|---|---|---|
| Daily P&L <= -10% | Pause all new entries | Next trading day |
| VIX spike above crisis threshold | Suspend scanning | VIX normalizes |
| IBKR connectivity loss | Safe mode — no new orders | Connection restored |
| Anomaly detection | Alert + pause | Manual review |
Realistic transaction cost modeling based on ticker liquidity and market conditions:
| Tier | Daily Volume | Fill Assumption |
|---|---|---|
| Liquid | > 5M shares/day | 80% of mid (tight spreads) |
| Mid-liquidity | 1M – 5M shares/day | 65% of mid |
| Illiquid | < 1M shares/day | 50% of mid (wide spreads) |
VIX regime scaling: when VIX > 30, the fill assumption is multiplied by 0.85 (spreads widen in crisis); when VIX < 15, it is multiplied by 1.05 (spreads tighten in calm). The tier is determined by trailing 20-day average volume.
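
A compact sketch of the tiered fill model follows. It assumes the VIX multipliers scale the fill fraction from the table (0.85 in crisis, 1.05 in calm), which is one reading of the text; the function names and the treatment of volumes exactly at 1M or 5M shares are also assumptions.

```python
BASE_FILL = {"liquid": 0.80, "mid": 0.65, "illiquid": 0.50}  # fraction of mid, from the table


def liquidity_tier(avg_volume_20d):
    """Tier by trailing 20-day average share volume."""
    if avg_volume_20d > 5_000_000:
        return "liquid"
    if avg_volume_20d >= 1_000_000:
        return "mid"
    return "illiquid"


def fill_fraction(avg_volume_20d, vix):
    """Base fill fraction for the tier, scaled by the VIX regime multiplier."""
    fraction = BASE_FILL[liquidity_tier(avg_volume_20d)]
    if vix > 30:
        fraction *= 0.85  # worse fills when spreads widen
    elif vix < 15:
        fraction *= 1.05  # better fills when spreads tighten
    return fraction


print(round(fill_fraction(10_000_000, 35.0), 4))  # 0.68
```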
Options data: ThetaData EOD history
Stock data: yfinance daily bars
Storage: /opt/strikefalcon/shared/data/cache/ with stocks/ and options/ subdirectories.
scipy.optimize.brentq (8-12 iterations, much faster than bisection)
data_quality: degraded

# Full run, all tickers, all available dates
python run_pipeline.py
# Test with limited scope
python run_pipeline.py --tickers AAPL MSFT --start 2024-01-01 --end 2024-12-31
# Resume from crash
python run_pipeline.py --resume
# Process new data since last run + trigger ML retrain
python run_pipeline.py --retrain --since auto
# Re-export from existing checkpoint (no re-simulation)
python run_pipeline.py --export-only
# Dry run for validation
python run_pipeline.py --dry-run --max-tickers 5

| Schedule | Command | Purpose |
|---|---|---|
| Sunday 6 PM ET | run_pipeline.py --retrain --since auto | Weekly incremental — process new data + retrain |
| 1st Saturday 2 AM ET | run_pipeline.py --retrain --force | Monthly full rebuild — reprocess everything |

/opt/strikefalcon/shared/data/backtest/
├── pipeline_checkpoint.db              # Resumable checkpoint + trades
├── results/
│   ├── pipeline_trades_YYYYMMDD.csv    # All trades (human-readable)
│   ├── training_data_YYYYMMDD.json     # auto_retrain-compatible
│   └── strategy_data/
│       ├── long_call_training.csv
│       └── ... (18 per-strategy files)
├── reports/
│   └── pipeline_metrics_YYYYMMDD.json  # Sharpe, Sortino, Brier, AUC
└── logs/
    └── pipeline_YYYYMMDD.log

338 tickers × ~50 weekly scans/year × 10 years × 18 strategies = ~3M+ simulated trades. With 4 parallel workers at ~500 MB each (~2 GB total), estimated runtime is under 1 hour for 2022-2026 and 2-3 hours for the full 2016-2026 dataset.
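
The workload estimate above is simple arithmetic and can be reproduced directly. The variable names are illustrative; the inputs are the approximate figures quoted in the text.

```python
tickers = 338
scans_per_year = 50       # approximate weekly scans per year
years = 10
strategies = 18

simulated_trades = tickers * scans_per_year * years * strategies
print(f"{simulated_trades:,}")  # 3,042,000

workers = 4
mem_per_worker_mb = 500   # approximate
print(workers * mem_per_worker_mb, "MB total")  # 2000 MB total
```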
# Check status of all Strike Falcon services
sudo systemctl status peregrine-web kestrel strikefalcon --no-pager
# Restart individual services
sudo systemctl restart peregrine-web.service # Options engine
sudo systemctl restart kestrel.service # Stock scanner
sudo systemctl restart strikefalcon.service # Landing + Merlin hub
# If port 5080 is stuck (old process survived)
sudo fuser -k 5080/tcp && sudo systemctl restart strikefalcon.service
# View live logs
sudo journalctl -u peregrine-web.service -f --no-pager
sudo journalctl -u kestrel.service -f --no-pager
sudo journalctl -u strikefalcon.service -n 50 --no-pager

# Via web UI (preferred)
kestrel.strikefalcon.com → Manual Scan
# Via terminal
cd /opt/strikefalcon/kestrel
python3 scanner.py --scan-once
# Check results
sqlite3 shadow_log.db \
"SELECT ticker, score, timestamp
FROM shadow_signals
ORDER BY timestamp DESC
LIMIT 20;"

# Via web UI (preferred)
strikefalcon.com/merlin
→ Console tab
→ Execute Intelligence Cycle
# Via terminal
cd /opt/strikefalcon/kestrel/ml
python3 auto_retrain.py
# Check model
ls -lh models/signal_model_latest.pkl

# Via web UI
peregrine.strikefalcon.com
→ Ticker Inspector → type NVDA → Run
# Via API
curl "http://localhost:5055/api/inspect_ticker?ticker=NVDA" \
| python3 -m json.tool

# Read live regime state
cat /opt/strikefalcon/shared/regime_state.json \
| python3 -m json.tool
# Force recalculation
cd /opt/strikefalcon/shared
python3 regime_detector.py --verbose
# Regime values:
# bull_trending | bull_volatile
# bear_trending | mean_reversion

# Verify ThetaData Terminal is running
curl http://localhost:25503/v2/health
# Download recent options history (full universe)
cd /opt/strikefalcon
python3 download_options.py 2>&1 | tee download_log.txt
# Download stock bars
python3 download_stocks.py
# Check cache size
du -sh /opt/strikefalcon/shared/data/cache/options/
du -sh /opt/strikefalcon/shared/data/cache/stocks/

cd /opt/strikefalcon
git add -A
git commit -m "feat: describe your change here"
git push origin main
git log --oneline -5
# If git fails with "index.lock"
rm -f .git/index.lock
# If git fails with "loose object" errors
git fsck --full && git gc

theta (ThetaTerminal), download (data downloads), notify (alerts)
cloudflared)

kill / pkill / killall with broad patterns
kill -9 on any PID without verifying it's not a tmux child
xargs kill piped from ps or grep
tmux kill-server
systemctl stop on any service without explicit approval
rm -rf on /opt/strikefalcon/shared/data/ (cached market data)

A cron job runs every 4 hours: /opt/strikefalcon/shared/cleanup_stale_terminals.sh. This handles orphaned VS Code Remote shells automatically. Do not manually kill terminals; the cron job handles it.

| Problem | Likely Cause | Fix |
|---|---|---|
| Port 5080 in use | Old strikefalcon process survived | sudo fuser -k 5080/tcp then restart |
| No Kestrel alerts | Score below threshold or bear regime filter | Lower MIN_ALERT_SCORE or check regime_state.json |
| Peregrine finds no trades | IV rank below threshold or VIX too low | Use Ticker Inspector to diagnose gate failures |
| Merlin retrain fails | Not enough labeled signals (< 500) | Wait for shadow data to accumulate |
| IBKR connection drops | TWS/Gateway session expired | Reconnect TWS, restart affected service |
| ThetaData timeout | Terminal not running or overloaded | curl localhost:25503/v2/health + restart terminal |
| Git push fails with "index.lock" | Stale lock file | rm -f .git/index.lock |
| Pipeline crash mid-run | Memory or data issue | run_pipeline.py --resume picks up from checkpoint |

# List all sessions
tmux list-sessions
# Attach read-only (safe)
tmux attach -t theta -r
# Send a command to a session
tmux send-keys -t download "ls -la" Enter
# Check what's running in each session
tmux list-panes -a -F '#{session_name} #{pane_pid}'
# NEVER detach or kill existing tmux sessions