
August 14, 2025

Detecting Market Manipulation

Flag unstable and clean market zones by combining adaptive volatility, spectral, and prediction error signals

Detecting unstable markets is not straightforward. However, here we’ll aim to do just that!

We’ll build a “Composite Manipulation Index” to offer a read on how price, volume, and structural changes interact in real time.

We move beyond standard volatility and blend rolling quantiles, median filters, autocorrelation adjustment, and a dedicated volume signal.

Earnings events are automatically masked out to avoid false positives. New market regime shifts get flagged using structural break detection.

The goal isn’t to predict moves, but to identify when the underlying structure departs from the norm.

Especially when unusual volume and persistent patterns line up.

The complete Python notebook for the analysis is provided below.


1. The Composite Manipulation Index

Market manipulation can appear as erratic price swings, sudden volume spikes, or brief regime changes.

The CMI addresses this by flagging periods when market structure departs meaningfully from what has been typical for that asset.

We use price structure, volume, and event-aware filtering. Specifically, the following aspects constitute “market normalcy” as per our CMI.

  • Price Structure: Price moves within recent norms, without erratic shifts.
  • Spectral Balance: Trends and noise are balanced; high-frequency “choppiness” doesn’t dominate.
  • Predictive Consistency: Recent prices stay close to what short-term patterns suggest.
  • Volume Activity: Volume stays near its typical range, without outlier spikes or droughts.
  • Persistence: True instability lasts; signals don’t trigger on brief noise.
  • Autocorrelation: No abnormal trendiness or mean-reversion versus recent history.
  • Event Sensitivity: Only assess structure outside scheduled events like earnings.
  • Regime Stability: No major breaks; broad trends and volatility stay intact.

1.1 Price Structure

Normal Condition: Price moves without abrupt or outlier shifts.

CMI Methodology: The CMI measures how much price deviates from recent patterns using an adaptive window that expands during high volatility and contracts during calm periods.

[Equation image: adaptive rolling-window definition]
  • BASE_WIN: The minimum window size for all rolling calculations. Larger values smooth out the index; smaller values make it responsive.
  • ATR_MULT: Multiplier that adjusts how sensitive the adaptive window is to changes in volatility.
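
For reference, the adaptive window can be computed directly from ATR relative to price, as done later in Section 2.3. A minimal sketch (column names follow the yfinance download used below):

import pandas as pd

def adaptive_window(df: pd.DataFrame, base_win: int = 50, atr_mult: float = 1.0) -> pd.Series:
    # Window grows when ATR is large relative to price (high volatility) and shrinks when calm
    win = (base_win * (df["ATR"] / df["Close"]) * atr_mult).round()
    return win.clip(lower=10).ffill()  # never shorter than 10 bars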

1.2 Spectral Balance

Normal Condition: The market maintains a stable blend of slow trends and natural noise, without turning “jittery”.

CMI Methodology: The spectral instability metric checks whether high-frequency noise is overtaking smoother market rhythms.

A rise in this ratio suggests more erratic, less organized trading activity.

[Equation image: spectral instability ratio (highBand / lowBand)]
  • highBand: The deviation of price from a fast EMA, capturing high-frequency (choppy) movement.
  • lowBand: The difference between two slow EMAs, representing longer-term trends.
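
A minimal sketch of the spectral instability ratio, assuming illustrative EMA spans (the exact spans are a modeling choice and are not fixed by the index):

import pandas as pd

def spectral_instability(close: pd.Series, fast: int = 5, slow1: int = 20, slow2: int = 50) -> pd.Series:
    # highBand: deviation of price from a fast EMA (high-frequency "chop")
    high_band = (close - close.ewm(span=fast, adjust=False).mean()).abs()
    # lowBand: spread between two slow EMAs (longer-term trend component)
    low_band = (close.ewm(span=slow1, adjust=False).mean()
                - close.ewm(span=slow2, adjust=False).mean()).abs()
    # The ratio rises when noise dominates trend; epsilon guards against division by zero
    return high_band / (low_band + 1e-9)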

1.3 Predictive Consistency

Normal Condition: Recent price action does not diverge excessively from its own short-term patterns.

CMI Methodology: The CMI fits a two-term autoregressive model to the closing price, then calculates the error between model and reality.

Elevated error means price is behaving less like itself.

[Equation image: two-lag autoregressive prediction error]

a_t, b_t: Adaptive coefficients, recalculated for each window, that best fit a two-lag model to recent price action.

The adaptive EMA keeps this error measurement up-to-date with changing market regimes.
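
A rough sketch of the rolling two-lag fit, assuming an ordinary least-squares fit per window and a fixed smoothing span for illustration (the article itself smooths the error with the adaptive EMA described above):

import numpy as np
import pandas as pd

def ar2_prediction_error(close: pd.Series, win: int = 50, smooth: int = 10) -> pd.Series:
    err = pd.Series(np.nan, index=close.index)
    for i in range(win, len(close)):
        seg = close.iloc[i - win:i + 1].values
        y = seg[2:]                                  # targets
        X = np.column_stack([seg[1:-1], seg[:-2]])   # lag-1 and lag-2 regressors
        (a_t, b_t), *_ = np.linalg.lstsq(X, y, rcond=None)
        pred = a_t * seg[-2] + b_t * seg[-3]         # model's value for the latest bar
        err.iloc[i] = abs(seg[-1] - pred)            # |actual - predicted|
    # Smooth the error so the signal reflects regimes rather than single bars
    return err.ewm(span=smooth, adjust=False).mean()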

1.4 Volume Activity

Normal Condition: Trading volume tracks its historical mean, with occasional natural surges around news or trend changes.

CMI Methodology: The CMI uses a rolling z-score to flag when today’s volume is unusually high or low compared to the recent window.

Spikes here, especially alongside price instability, amplify the manipulation signal.

[Equation image: rolling volume z-score]

μ_Volume, σ_Volume: The rolling mean and standard deviation of volume.
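
In code this is a plain rolling z-score over VOL_WIN bars; a minimal sketch matching the definitions above:

import pandas as pd

def volume_zscore(volume: pd.Series, win: int = 50) -> pd.Series:
    mu = volume.rolling(win, min_periods=win).mean()     # μ_Volume
    sigma = volume.rolling(win, min_periods=win).std()   # σ_Volume
    return (volume - mu) / sigma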

1.5 Persistence

Normal Condition: Real structural changes last more than one bar; random noise does not.

CMI Methodology: The index only flags high-risk (or clean) zones when instability (or stability) persists for a specified number of consecutive days.

[Equation image: persistence condition over P consecutive bars]

P is the number of consecutive bars required for a signal to be confirmed. Higher persistence reduces false positives from random noise.
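
A minimal sketch of the persistence check: a boolean condition only counts once it has held for P consecutive bars.

import pandas as pd

def persistent(flag: pd.Series, p: int = 2) -> pd.Series:
    # True only when `flag` has been True on p consecutive bars
    return flag.astype(int).rolling(p, min_periods=p).sum().eq(p)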

1.6 Autocorrelation

Normal Condition: Markets do not show extreme, abnormal trendiness or mean-reversion beyond what’s usual for the regime.

CMI Methodology: Lag-1 autocorrelation of returns is monitored. When autocorrelation is high (either positive or negative), instability readings are adjusted downward to avoid overreacting to benign trends.

[Equation image: autocorrelation-based adjustment]

Where ρ1,t is the rolling lag-1 autocorrelation of returns.
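
The article does not spell out the damping function, so the sketch below assumes a simple linear down-weighting of the index as |ρ1,t| grows:

import pandas as pd

def autocorr_adjust(cmi: pd.Series, returns: pd.Series, win: int = 50) -> pd.Series:
    # Rolling lag-1 autocorrelation of returns
    rho1 = returns.rolling(win, min_periods=win).apply(lambda x: x.autocorr(lag=1), raw=False)
    # Damp the index when trendiness or mean-reversion is unusually strong (assumed linear damping)
    return cmi * (1.0 - rho1.abs().clip(upper=1.0))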

1.7 Event Sensitivity

Normal Condition: Scheduled events (like earnings releases) are ignored in structural analysis, since they create artificial breaks in price and volume.

CMI Methodology: The CMI masks out readings for dates near earnings or other flagged events.

1.8 Regime Stability

Normal Condition: Broad market structure (e.g. moving average spreads) stays within its usual range.

CMI Methodology: The CMI checks for structural breaks using the rolling standard deviation of the spread between long-term moving averages.

[Equation image: structural-break test on the moving-average spread]

BREAK_Z: The number of standard deviations required to flag a structural break. Larger values mean only extreme shifts trigger a break.
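
A minimal sketch of the break test, assuming a 50/200-day moving-average spread (the article specifies only long-term moving averages and a rolling standard deviation over BREAK_WIN):

import pandas as pd

def structural_break(close: pd.Series, win: int = 250, break_z: float = 3.0) -> pd.Series:
    spread = close.rolling(50).mean() - close.rolling(200).mean()  # assumed MA lengths
    mu = spread.rolling(win, min_periods=win).mean()
    sigma = spread.rolling(win, min_periods=win).std()
    # Flag a break when the spread departs from its own history by more than BREAK_Z std devs
    return (spread - mu).abs() > break_z * sigma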

1.9 Composite Scoring

All individual instability measures are blended for a final score:

[Equation image: composite blend of the instability measures]

This raw score is then smoothed with a median filter and an exponential moving average for stability:

[Equation image: median filter and EMA smoothing of the raw score]

And finally, the autocorrelation adjustment is applied:

[Equation image: autocorrelation adjustment of the composite score]
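
Putting the pieces together, a hedged sketch of the scoring pipeline. The blend weights and the rank-based normalisation are our assumptions; the article only states that the measures are blended, median-filtered, EMA-smoothed, and then autocorrelation-adjusted.

import pandas as pd

def composite_score(price_instab: pd.Series, spectral: pd.Series,
                    pred_error: pd.Series, vol_z: pd.Series,
                    smooth: int = 5, med: int = 3) -> pd.Series:
    # Rank-transform each component to a common 0-1 scale before blending (assumption)
    parts = [s.rank(pct=True) for s in (price_instab, spectral, pred_error, vol_z.abs())]
    raw = sum(parts) / len(parts)                                     # equal-weight blend (assumption)
    filtered = raw.rolling(med, center=True, min_periods=1).median()  # median filter
    return filtered.ewm(span=smooth, adjust=False).mean()             # EMA smoothing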

1.10 Signal Generation

Instead of using fixed thresholds, the CMI adapts to recent history via rolling quantiles:

  • High-risk zones: When the adjusted CMI exceeds the recent high quantile and volume is abnormally high, confirmed by persistence.
  • Clean zones: When the adjusted CMI falls below the recent low quantile and volume is quiet, confirmed by persistence.
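
A sketch of the zone labelling, wiring together the rolling quantiles, the volume z-score, and the persistence check (the helper names are ours; parameter defaults mirror Section 2.1):

import pandas as pd

def label_zones(cmi_adj: pd.Series, vol_z: pd.Series,
                q_win: int = 20, q_low: float = 0.25, q_high: float = 0.80,
                vol_confirm: float = 1.0, persist: int = 2):
    hi_thr = cmi_adj.rolling(q_win, min_periods=q_win).quantile(q_high)
    lo_thr = cmi_adj.rolling(q_win, min_periods=q_win).quantile(q_low)
    risky = (cmi_adj > hi_thr) & (vol_z > vol_confirm)        # elevated index plus abnormal volume
    clean = (cmi_adj < lo_thr) & (vol_z.abs() < vol_confirm)  # subdued index plus quiet volume
    # Require each condition to hold for `persist` consecutive bars before confirming
    confirm = lambda f: f.astype(int).rolling(persist, min_periods=persist).sum().eq(persist)
    return confirm(risky), confirm(clean)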

2. Implementing CMI in Python

2.1. Parameter Setup

We set all the main parameters up front. These control how sensitive or stable the CMI will be.

You can adjust everything: lookback windows, smoothing, volume thresholds, event padding, and more.

Each parameter is explained in the comments in the code snippet below.

import math
import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# ── parameters ──
TICKER         = "TSLA"
START_DATE     = "2023-01-01"
END_DATE       = "2025-12-31"
INTERVAL       = "1d"

BASE_WIN       = 50             # var window & min EMA window; ↑ smooths variance/EMA more but delays signal, ↓ makes CMI more reactive
ATR_LEN        = 14             # ATR lookback; ↑ captures longer volatility trends, ↓ focuses on recent volatility
ATR_MULT       = 1.0            # ATR scaling factor; ↑ increases adaptive window size (smoothing), ↓ tightens window (more sensitivity)
SINE_LEN       = 20             # sine period; ↑ models longer oscillations, ↓ captures shorter cycles
SMOOTH         = 5              # CMI smoothing span; ↑ further smooths final CMI (fewer spikes), ↓ preserves sharp moves

VOL_WIN        = 50             # window for volume variance; ↑ uses longer history (stable), ↓ uses short history (sensitive)
VOL_CONFIRM    = 1.0            # volume z‐score threshold; ↑ requires stronger volume confirmation, ↓ allows weaker volume support
VOL_SCALE      = 0.3            # fraction of price‐axis height to use for max volume bar

MED_FILT       = 3              # median filter length; ↑ removes longer noise clusters, ↓ only kills single-bar spikes
PERSIST_DAYS   = 2              # days in a row to confirm; ↑ demands longer persistence (fewer signals), ↓ allows quicker signals

QUANT_WINDOW   = 20             # days for empirical quantiles; ↑ uses broader regime context, ↓ adapts quicker to new regimes
QUANT_LOW      = 0.25           # lower quantile; ↑ relaxes clean threshold (more green zones), ↓ tightens it (fewer green zones)
QUANT_HIGH     = 0.80           # upper quantile; ↑ restricts high-risk zone to most extreme (fewer red zones), ↓ broadens risk area

EARN_PAD       = 1              # days before/after earnings to mask; ↑ masks larger event window, ↓ only skips exact date

BREAK_WIN      = 250            # window for spread std; ↑ smooths break detection (fewer breaks), ↓ reacts faster to regime shifts
BREAK_Z        = 3.0            # std‐dev threshold for break; ↑ needs larger deviation to flag break, ↓ flags smaller shifts as breaks

2.2. Helper Functions

Next, define a few helper functions to keep the main code simple:

  • download market price data
  • fetch earnings dates
  • calculate the ‘Average True Range’
  • run adaptive EMAs
  • estimate rolling autocorrelation
# ── helpers ──
def download_data(tkr):
    df = yf.download(
        tkr, start=START_DATE, end=END_DATE,
        interval=INTERVAL, auto_adjust=True,
        progress=False, threads=False
    )
    if isinstance(df.columns, pd.MultiIndex):
        df.columns = df.columns.get_level_values(0)
    return df.sort_index().drop_duplicates()

def earnings_days(tkr):
    tk = yf.Ticker(tkr)
    try:
        edf = tk.get_earnings_dates()
        if isinstance(edf, pd.DataFrame) and not edf.empty:
            return pd.DatetimeIndex(edf.index).normalize()
    except Exception:
        pass
    cal = getattr(tk, "calendar", None)
    if isinstance(cal, pd.DataFrame) and "Earnings Date" in cal.index:
        return pd.DatetimeIndex([pd.to_datetime(cal.loc["Earnings Date"].iloc[0])]).normalize()
    if isinstance(cal, dict):
        val = cal.get("Earnings Date") or cal.get("earningsDate")
        if val:
            return pd.DatetimeIndex([pd.to_datetime(val)]).normalize()
    return pd.DatetimeIndex([])

def compute_atr(df, n):
    tr = pd.concat([
        df["High"] - df["Low"],
        (df["High"] - df["Close"].shift()).abs(),
        (df["Low"]  - df["Close"].shift()).abs(),
    ], axis=1).max(axis=1)
    return tr.rolling(n, min_periods=n).mean()

def adaptive_ema(src, win_s):
    out = np.full(len(src), np.nan)
    for i in range(len(src)):
        w = int(round(win_s.iat[i])) if not math.isnan(win_s.iat[i]) else BASE_WIN
        w = max(1, w)
        α = 2.0/(w+1.0)
        prev = out[i-1] if i>0 else np.nan
        out[i] = src.iat[i] if i==0 or math.isnan(prev) else α*src.iat[i] + (1-α)*prev
    return pd.Series(out, index=src.index)

def lag1_autocorr(x):
    return x.autocorr(lag=1) if x.count()>1 else np.nan

2.3. Data Loading and Preparation

Next, pull in the historical price and volume data for the chosen ticker.

Set the main rolling volatility measure, and set up adaptive window lengths for everything else.

# ── load & compute ──
df = download_data(TICKER)
df["ATR"] = compute_atr(df, ATR_LEN)
df["adaptive_win"] = (BASE_WIN*(df["ATR"]/df["Close"])*ATR_MULT).round().clip(lower=10).ffill()

2.4. Earnings/Event Calendar

Pull earnings event dates.

These are needed for masking out periods where volatility is driven by scheduled information releases rather than by structural change.
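
The masking step itself is not shown in this excerpt; a minimal sketch, continuing the script above and using the earnings_days helper with the EARN_PAD parameter, might look like this:

# Mark trading days within EARN_PAD days of an earnings date
earn_dates = earnings_days(TICKER)
if getattr(earn_dates, "tz", None) is not None:
    earn_dates = earn_dates.tz_localize(None)   # align with the tz-naive daily index

event_mask = pd.Series(False, index=df.index)
days = df.index.normalize()
pad = pd.Timedelta(days=EARN_PAD)
for d in earn_dates:
    event_mask |= (days >= d - pad) & (days <= d + pad)

df["event_mask"] = event_mask   # later used to null out CMI readings around earnings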
