Complexity Insights
VOL. I  ·  APR 2026
Network Science · Quantitative Finance № 002 · CASE

Mapping the structure of the S&P 500.

Five years of daily prices, folded into a weighted correlation graph. The question: where does systemic risk live, and what happens to the graph when the market breaks?

Window: Jan 2020 – Oct 2025
Universe: 503 constituents
Observations: 1.2 M daily bars
Communities found: 11 (Louvain)
Constituents: AAPL, MSFT, NVDA, AMZN, META, JPM, BAC, XOM, CVX, JNJ, +493
Mean pairwise correlation: ρ̄ = 0.41
Density in crisis window: +37%
Average shortest path: d = 3.2
Modularity (Louvain): Q = 0.62

Executive summary.

Financial markets are interconnected systems where one asset's move cascades through the network. Understanding that structure is the whole game for risk management, portfolio construction, and spotting systemic vulnerabilities before they matter.

This project analyzes correlation networks among S&P 500 stocks to reveal hidden market structure. By applying network science to five years of daily price data, I identified the most systemically important nodes, the community structure the graph settles into, and — most usefully — how that topology degrades during periods of market stress.

503 assets: current S&P 500 constituents
1.2M daily bars: 2020 – 2025 OHLCV
18.4k edges at ρ > 0.5: thresholded correlation graph
0.62 modularity: 11 communities, Louvain
Key finding — correlation breakdown

During the 2022 drawdown, network density rose 37%. Diversification fails precisely when investors need it most.

When the market sells off, pairwise correlations converge toward one. The graph densifies, community structure blurs, and what looked like an 11-sector market starts behaving like a single factor.

Methodology & pipeline.

The analysis pipeline was built in Python for reproducibility and scalability. Data was sourced via the Yahoo Finance API, cleaned with Pandas, and modeled as a weighted graph where nodes represent stocks and edges represent pairwise Pearson correlations above a threshold.

1 · Data ingestion

Collected daily adjusted closing prices for all current S&P 500 constituents (Jan 2020 – Oct 2025). Computed daily log returns to stabilize variance and strip out price-level effects before any correlation is taken.
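A minimal sketch of the return and correlation step, assuming prices arrive as a Pandas DataFrame with one column per ticker. The helper names `log_returns` and `correlation_matrix` and the three toy tickers are illustrative, not taken from the project code.

```python
import numpy as np
import pandas as pd


def log_returns(prices: pd.DataFrame) -> pd.DataFrame:
    """Daily log returns r_t = ln(P_t / P_{t-1}), one column per ticker."""
    # The first row has no previous price, so it is all-NaN and gets dropped.
    return np.log(prices / prices.shift(1)).dropna(how="all")


def correlation_matrix(returns: pd.DataFrame) -> pd.DataFrame:
    """Pairwise Pearson correlation of the return columns."""
    return returns.corr(method="pearson")


# Toy prices for three hypothetical tickers (not real market data).
prices = pd.DataFrame(
    {
        "AAA": [100.0, 110.0, 99.0, 108.9],
        "BBB": [50.0, 55.0, 49.5, 54.45],   # moves with AAA
        "CCC": [20.0, 18.0, 19.8, 17.82],   # moves against AAA
    },
    index=pd.date_range("2020-01-02", periods=4, freq="B"),
)

returns = log_returns(prices)
corr = correlation_matrix(returns)
print(corr.round(3))
```

Because log returns are scale-free, a $50 stock and a $500 stock with the same percentage moves produce identical return series, which is what the correlation step needs.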

2 · Graph construction

Built the graph using NetworkX. To control noise, I applied a thresholding step: only pairs with |ρ| > 0.5 become edges. Weights carry the sign, so the layout respects both co-movement and counter-movement.

# Constructing the correlation graph
import networkx as nx
import numpy as np

def build_network(corr, threshold=0.5):
    G = nx.Graph()
    tickers = corr.columns.tolist()
    for i in range(len(tickers)):
        for j in range(i+1, len(tickers)):
            rho = corr.iat[i, j]
            if abs(rho) > threshold:
                G.add_edge(tickers[i], tickers[j], weight=rho)
    return G
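The density of the thresholded graph is the share of possible pairs whose |ρ| clears the cut. The sketch below computes it directly from a correlation matrix and compares two synthetic regimes. The one-factor simulator `simulate_returns`, its `beta` loading, and the seed are assumptions made for illustration; none of the numbers it produces are project results.

```python
import numpy as np


def graph_density(corr: np.ndarray, threshold: float = 0.5) -> float:
    """Fraction of distinct pairs with |rho| above the threshold."""
    upper = np.triu_indices(corr.shape[0], k=1)  # each pair once, no diagonal
    return float((np.abs(corr[upper]) > threshold).mean())


def simulate_returns(n_assets, n_days, beta, rng):
    """Synthetic one-factor returns: r = beta * market + idiosyncratic noise."""
    market = rng.standard_normal(n_days)
    noise = rng.standard_normal((n_days, n_assets))
    return beta * market[:, None] + noise


rng = np.random.default_rng(42)

# Low factor loading: implied pairwise rho is about 0.26, below the cut.
calm = simulate_returns(n_assets=30, n_days=500, beta=0.6, rng=rng)
# High factor loading: implied pairwise rho is about 0.69, above the cut.
stress = simulate_returns(n_assets=30, n_days=500, beta=1.5, rng=rng)

calm_density = graph_density(np.corrcoef(calm, rowvar=False))
stress_density = graph_density(np.corrcoef(stress, rowvar=False))
print(f"calm density: {calm_density:.2f}, stress density: {stress_density:.2f}")
```

In this toy setup the implied pairwise correlation is beta² / (beta² + 1), so raising the common-factor loading alone is enough to push most pairs over a fixed threshold.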

The graph, interactive.

A force-directed layout of 60 representative constituents across the 11 Louvain communities. Drag a node to pull on the whole web. Raise the threshold slider and watch low-signal edges drop away; what's left is the backbone.

FIG. 01 · Force-directed S&P 500 correlation graph (interactive)
Legend: Financials, Tech, Energy, Consumer, Healthcare, Industrials, Materials, Utilities
Figure 1. Force-directed layout of 60 constituents across 11 Louvain communities. Node colour = community; node size = weighted degree. Edge opacity = correlation strength. Drag the threshold slider to prune edges.

Correlation breakdown.

The headline finding of this project lives in a single visual: the sector-level ρ-matrix, compared across three regimes. Toggle between calm, stressed, and crisis windows. In calm markets, sectors are nicely separated (dark diagonal, light off-diagonal). In a crash, the whole matrix floods red — every sector moves together.

FIG. 02 · Regime comparison: sector ρ-matrix (interactive)
Regimes: Calm (2021) · Stress (2020) · Crash (2022 bear)
Window shown: Jan 2021 – Dec 2021 · Mean ρ: 0.38 · VIX ref: 18.4 · ρ scale: 0.0 to 1.0
Figure 2. 11 × 11 mean sector correlations across three market regimes. Under stress, diagonal dominance collapses; the block structure that diversification strategies rely on dissolves into a single hot band.
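One way to collapse a stock-level correlation matrix into a sector-level one is to average ρ over every stock pair in each sector block. The sketch below does that, and excludes self-correlations from the within-sector means; that exclusion is my assumption, not something the case study states. The function `sector_corr_matrix`, the `sector_of` mapping, and the four tickers are hypothetical.

```python
import numpy as np
import pandas as pd


def sector_corr_matrix(corr: pd.DataFrame, sector_of: dict) -> pd.DataFrame:
    """Mean pairwise correlation for every (sector, sector) block."""
    tickers = list(corr.index)
    sectors = sorted({sector_of[t] for t in tickers})

    # One-hot membership: rows are tickers, columns are sectors.
    member = np.array(
        [[1.0 if sector_of[t] == s else 0.0 for s in sectors] for t in tickers]
    )

    values = corr.to_numpy(dtype=float, copy=True)
    np.fill_diagonal(values, 0.0)      # drop self-correlations from the sums
    pairs = np.ones_like(values)
    np.fill_diagonal(pairs, 0.0)       # ...and from the pair counts

    sums = member.T @ values @ member
    counts = member.T @ pairs @ member
    # A one-stock sector has no within-sector pairs; leave that cell as NaN.
    means = np.divide(sums, counts, out=np.full_like(sums, np.nan), where=counts > 0)
    return pd.DataFrame(means, index=sectors, columns=sectors)


# Toy 4 x 4 correlation matrix (made-up values).
names = ["T1", "T2", "E1", "E2"]
corr = pd.DataFrame(
    [
        [1.0, 0.8, 0.2, 0.1],
        [0.8, 1.0, 0.3, 0.2],
        [0.2, 0.3, 1.0, 0.6],
        [0.1, 0.2, 0.6, 1.0],
    ],
    index=names,
    columns=names,
)
sector_of = {"T1": "Tech", "T2": "Tech", "E1": "Energy", "E2": "Energy"}

sector_corr = sector_corr_matrix(corr, sector_of)
print(sector_corr)
```

Running the same function on correlation matrices estimated over different date windows would give one matrix per regime, which is the comparison the figure describes.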

Results & findings.

Sectoral clustering (community detection)

Using the Louvain algorithm for modularity optimization, the network partitioned into 11 distinct communities. These largely aligned with GICS sectors, but the algorithm surfaced a handful of cross-sector dependencies worth naming:
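For the community-detection step itself, a hedged sketch using NetworkX's built-in Louvain implementation follows. Modularity is defined for non-negative weights, so the sketch takes |ρ| as the edge weight; that choice, the `detect_communities` helper, and the fixed seed are my assumptions, and the project may have handled signed weights differently. The toy graph uses hypothetical node names.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity


def detect_communities(G: nx.Graph, seed: int = 0):
    """Louvain partition and its modularity, computed on |weight|."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    for u, v, data in G.edges(data=True):
        # Strip the sign: Louvain expects non-negative edge weights.
        H.add_edge(u, v, weight=abs(data.get("weight", 1.0)))
    communities = louvain_communities(H, weight="weight", seed=seed)
    return communities, modularity(H, communities, weight="weight")


# Toy graph: two tightly correlated groups joined by one weaker, negative edge.
group_a = ["A1", "A2", "A3", "A4"]
group_b = ["B1", "B2", "B3", "B4"]
toy = nx.Graph()
for group in (group_a, group_b):
    for i, u in enumerate(group):
        for v in group[i + 1:]:
            toy.add_edge(u, v, weight=0.8)
toy.add_edge("A1", "B1", weight=-0.55)

communities, q = detect_communities(toy)
print(sorted(sorted(c) for c in communities), round(q, 3))
```

Louvain is randomized, so fixing `seed` keeps the partition reproducible between runs; on real data it is worth checking that the community count is stable across several seeds.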

Centrality & systemic risk

Computed eigenvector centrality to identify the most influential nodes. Unlike market cap, centrality measures how connected a stock is to other highly connected stocks — closer to the regulators' notion of systemic importance.
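A small sketch of how eigenvector centrality and a per-sector average might be computed with NetworkX. It uses |ρ| as the edge weight so the scores stay non-negative, which is an assumption on my part. The helpers `eigenvector_scores` and `mean_by_sector`, the tickers, and the sector labels are hypothetical, and the toy graph illustrates the mechanics only, not the project's finding.

```python
import networkx as nx


def eigenvector_scores(G: nx.Graph) -> dict:
    """Eigenvector centrality on |weight|, via NetworkX power iteration."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    H.add_weighted_edges_from(
        (u, v, abs(w)) for u, v, w in G.edges(data="weight", default=1.0)
    )
    return nx.eigenvector_centrality(H, max_iter=1000, weight="weight")


def mean_by_sector(scores: dict, sector_of: dict) -> dict:
    """Average centrality of the nodes in each sector."""
    totals, counts = {}, {}
    for ticker, score in scores.items():
        sector = sector_of[ticker]
        totals[sector] = totals.get(sector, 0.0) + score
        counts[sector] = counts.get(sector, 0) + 1
    return {sector: totals[sector] / counts[sector] for sector in totals}


# Toy graph: one hub linked to two pairs that are also linked internally.
toy = nx.Graph()
for other in ["IND1", "IND2", "TECH1", "TECH2"]:
    toy.add_edge("BANK", other, weight=0.7)
toy.add_edge("IND1", "IND2", weight=0.6)
toy.add_edge("TECH1", "TECH2", weight=0.6)

sector_of = {
    "BANK": "Financials",
    "IND1": "Industrials",
    "IND2": "Industrials",
    "TECH1": "Tech",
    "TECH2": "Tech",
}

scores = eigenvector_scores(toy)
sector_means = mean_by_sector(scores, sector_of)
print(max(scores, key=scores.get), sector_means)
```

A node scores highly here when its neighbours score highly, so the hub outranks the others without any reference to size or market cap.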

Finding. Financials and Industrials showed higher average centrality than Tech, despite smaller market caps. The practical read: a shock originating in a major bank propagates through the system faster than a same-magnitude shock to a tech giant, because the bank sits closer to the weighted core of the graph.

Technical stack.

The project leverages:

Future directions.

The current analysis uses static correlation windows. To sharpen predictive power for trading applications, the next phase will implement:

§ END OF CASE § N. T. · WASHINGTON, D.C.
© 2026 Complexity Insights · Nicholas Thomas · Washington, D.C.