Essay · Organizational Network Analysis · Spectral Graph Theory

The Mathematics of Organizational Network Analysis


From relationships to structure

Every ONA project starts with the same act: you map who talks to whom, who collaborates with whom, who trusts whom. What you end up with is a graph: a collection of nodes (people, teams, roles) and edges (interactions, ties, flows). For small networks you can draw this on a whiteboard and eyeball it. But organizations are complex, ties are numerous, and your intuition quickly hits its limits. That is where the mathematical machinery enters, and it enters through a single elegant object: the matrix.

The adjacency matrix: encoding the graph as algebra

Take a network of $n$ people. The adjacency matrix $A$ is an $n \times n$ table where each cell $A_{ij}$ equals $1$ if person $i$ and person $j$ have a connection, and $0$ if they do not. In a weighted network the cell holds the tie strength instead of a binary $1$.

This translation from drawing to table is not just convenient notation. It converts the network into a mathematical object you can compute with: multiply it, decompose it, exponentiate it. Everything that follows flows from this.

For undirected networks (where "Alice talks to Bob" implies "Bob talks to Alice") the matrix is symmetric: $A_{ij} = A_{ji}$ always. The sum of row $i$ gives node $i$'s degree, that is, their number of direct connections, in a single arithmetic operation.

A note on directed networks

Reporting hierarchies, approval chains, and information flows are naturally directional: the edge from $A$ to $B$ is not the same as the edge from $B$ to $A$. An analogous Laplacian $L = D_{\text{out}} - A$ can be constructed using out-degrees on the diagonal (where $D_{\text{out}}$ is the diagonal matrix of out-degrees), but it is no longer symmetric, and many of the spectral properties that hold for undirected graphs require modification or fail entirely.

Eigenvector centrality splits into two measures: in-centrality (the HITS authority score, that is, the leading eigenvector of $A^T A$, which captures who receives attention from high-status others) and out-centrality (the HITS hub score, the leading eigenvector of $A A^T$, which captures who sends attention toward high-status others). PageRank exists precisely because directed-graph centrality needed a principled solution. A further complication: because the directed Laplacian is not symmetric, its eigenvalues can be complex numbers, which breaks the real-valued spectral decomposition that all subsequent analysis relies on. This is why directed analysis uses HITS scores and PageRank rather than Laplacian eigenvectors. For most of the machinery below, assume undirected networks unless stated otherwise.

The degree matrix and the Laplacian

The degree matrix $D$ is the diagonal matrix where $D_{ii}$ equals the degree of node $i$ and all off-diagonal entries are zero. On its own it tells you little. But when you subtract the adjacency matrix from it, something remarkable happens.

The Laplacian matrix $L = D - A$ is the most analytically important object in graph theory. Its construction is simple but its properties are deep: $L$ is symmetric and positive semi-definite, every row sums to zero, and its zero eigenvalues count the network's connected components.

In an ONA context, $L$ tells you: how many isolated subgroups exist? Where are the structural fractures? If you removed a few people, would the organization fragment into islands? The full object is rich enough to support three useful variants (combinatorial, weighted, and normalised), which are developed in detail in the Laplacian interlude later. For now, all analysis uses the combinatorial version unless noted.

A concrete example: five nodes, one Laplacian

Consider a five-person network: Ana, Bo, Cy, Dev, and Eve, with connections Ana–Bo, Ana–Cy, Bo–Cy, Bo–Dev, and Cy–Eve. No other edges exist.

The five-node example network. Ana, Bo, and Cy form a dense core triangle; Dev and Eve are peripheral, each connected by a single bridge.

The degrees are: Ana = 2, Bo = 3, Cy = 3, Dev = 1, Eve = 1. The adjacency matrix $A$ and Laplacian $L = D - A$ are:

$$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix} \qquad L = \begin{pmatrix} \mathbf{2} & -1 & -1 & 0 & 0 \\ -1 & \mathbf{3} & -1 & -1 & 0 \\ -1 & -1 & \mathbf{3} & 0 & -1 \\ 0 & -1 & 0 & \mathbf{1} & 0 \\ 0 & 0 & -1 & 0 & \mathbf{1} \end{pmatrix}$$
Reading the Laplacian: The bold diagonal entries are the node degrees. Off-diagonal $-1$ entries mark edges. Every row sums to zero, the algebraic signature of a difference operator: $L$ measures how a node's value differs from its neighbours', not absolute levels.
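These matrices can be built and sanity-checked in a few lines of NumPy. A minimal sketch, assuming node order Ana, Bo, Cy, Dev, Eve:

```python
import numpy as np

# Five-node example: Ana-Bo, Ana-Cy, Bo-Cy, Bo-Dev, Cy-Eve.
nodes = ["Ana", "Bo", "Cy", "Dev", "Eve"]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4)]

n = len(nodes)
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1      # undirected: symmetric adjacency

degrees = A.sum(axis=1)        # row sums give each node's degree
D = np.diag(degrees)
L = D - A                      # combinatorial Laplacian

print(degrees)                 # [2. 3. 3. 1. 1.]
print(L.sum(axis=1))           # every row sums to zero
```

The row-sum check confirms the difference-operator property noted in the caption above.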

What is an eigenvalue? The defining equation

Before reading the spectrum, it is worth pausing to ask: what mathematical question are these numbers the answer to?

Given a square matrix $M$ and a non-zero vector $\mathbf{v}$, the eigenvalue equation asks: for which scalars $\lambda$ does multiplying $\mathbf{v}$ by $M$ produce the same result as simply scaling $\mathbf{v}$ by $\lambda$?

$$\boxed{L\,\mathbf{v} = \lambda\,\mathbf{v}}$$

A scalar $\lambda$ satisfying this for some non-zero $\mathbf{v}$ is an eigenvalue of $L$; the corresponding vector $\mathbf{v}$ is its eigenvector. Geometrically: applying $L$ to $\mathbf{v}$ does not rotate it — it only stretches or shrinks it by the factor $\lambda$. The eigenvectors are the directions the matrix "points along"; the eigenvalues tell you how strongly. The complete set of eigenvalues is the spectrum of $L$.

How eigenvalues are found

Rearranging the eigenvalue equation gives $(L - \lambda I)\,\mathbf{v} = \mathbf{0}$. For this to have a non-zero solution $\mathbf{v}$, the matrix $(L - \lambda I)$ must be singular — meaning its determinant must be zero:

$$\det(L - \lambda I) = 0$$

Expanding this determinant yields the characteristic polynomial, a degree-$n$ polynomial in $\lambda$. Its $n$ roots are the eigenvalues. For our $5 \times 5$ Laplacian this polynomial has five roots; finding them by hand would be laborious, but the computation is routine for any linear-algebra library.

Once each eigenvalue $\lambda_k$ is known, its eigenvector is found by solving $(L - \lambda_k I)\,\mathbf{v} = \mathbf{0}$ for $\mathbf{v}$. This is an underdetermined system — any scalar multiple of a solution is also a solution — so a unique answer requires an extra constraint. The standard convention is unit normalisation: we require $\|\mathbf{v}\| = 1$, meaning the sum of squared entries equals one. This pins the scale while leaving a sign ambiguity (both $\mathbf{v}$ and $-\mathbf{v}$ satisfy the unit-norm constraint), which is addressed further in the sign ambiguity section below.
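In practice nobody expands the characteristic polynomial by hand; a symmetric eigensolver does both steps at once. A sketch for the five-node Laplacian using NumPy's `eigh`, which returns eigenvalues in ascending order with unit-norm eigenvectors as columns:

```python
import numpy as np

# Laplacian of the five-node example (node order: Ana, Bo, Cy, Dev, Eve).
L = np.array([[ 2, -1, -1,  0,  0],
              [-1,  3, -1, -1,  0],
              [-1, -1,  3,  0, -1],
              [ 0, -1,  0,  1,  0],
              [ 0,  0, -1,  0,  1]], dtype=float)

# eigh is the solver for symmetric matrices: real eigenvalues, ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(L)

print(np.round(eigenvalues, 3))            # ~ [0, 0.697, 1.382, 3.618, 4.303]
print(np.linalg.norm(eigenvectors[:, 1]))  # ~ 1.0, the unit-norm convention
```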

Reading the equation on the five-node graph

To make this concrete: the Fiedler vector $\mathbf{v}_2$ with eigenvalue $\lambda_2 \approx 0.697$ satisfies $L\,\mathbf{v}_2 = 0.697\,\mathbf{v}_2$ exactly. That means if you write $\mathbf{v}_2$ as a column of five numbers (one per node) and multiply by $L$, you get back exactly $0.697$ times those same numbers — the matrix does not mix the entries in any complex way; it just scales the whole vector. We can verify this for Ana's row of $L$:

$$\underbrace{\begin{pmatrix}2 & -1 & -1 & 0 & 0\end{pmatrix}}_{\text{row for Ana}} \underbrace{\begin{pmatrix}0.000\\-0.205\\+0.205\\-0.677\\+0.677\end{pmatrix}}_{\mathbf{v}_2} = 2(0.000) + (-1)(-0.205) + (-1)(0.205) + 0 + 0 = 0.000$$

And $0.697 \times 0.000 = 0.000$. Both sides match. The same check holds for every other row. This is what it means for $\mathbf{v}_2$ to be an eigenvector of $L$ with eigenvalue $\lambda_2$.

Why five eigenvalues, and why ordered? A $5 \times 5$ matrix always has exactly five eigenvalues (counted with multiplicity). Because $L$ is symmetric and positive semi-definite, all five are real and non-negative, so we can sort them: $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_5$. $\lambda_1 = 0$ always, because the constant vector $\mathbf{1}$ is always an eigenvector with eigenvalue zero (every row of $L$ sums to zero, so $L\mathbf{1} = \mathbf{0} = 0 \cdot \mathbf{1}$). The rest encode progressively finer structural detail.

The spectrum of $L$ is approximately $\{0,\; 0.697,\; 1.382,\; 3.618,\; 4.303\}$. The Fiedler value $\lambda_2 \approx 0.697$ indicates moderate connectivity. Its eigenvector, the Fiedler vector, assigns: Ana $\approx 0.00$, Bo $-0.20$, Cy $+0.20$, Dev $-0.68$, Eve $+0.68$.

The Fiedler vector colour-coded on the graph. Blue nodes (Bo, Dev) have negative values; red nodes (Cy, Eve) have positive values; Ana sits exactly at zero, the geometric midpoint of the two arms, straddling the structural divide. The dashed pendant edges (Bo–Dev and Cy–Eve) highlight the two peripheral arms; the vertical dashed line marks the Fiedler threshold cut, which separates {Bo, Dev} from {Ana, Cy, Eve}. Note that the true articulation points (nodes whose removal disconnects the graph) are Bo and Cy, not Ana.

Fiedler vector vs eigenvector centrality: geometric position vs influence

It is instructive to compare two quantities on the same five-node graph. The Fiedler vector ($v_2$) reveals geometric position relative to the primary structural cut, specifically where a node sits in the partition. The eigenvector centrality (leading eigenvector of $A$, corresponding to $A$'s largest eigenvalue) reveals influence, that is, who is connected to well-connected others. They answer fundamentally different questions and can point in opposite directions.

| Node | Degree | Fiedler $v_2$ (location) | Eigenvector centrality (influence) | Interpretation |
|------|--------|--------------------------|------------------------------------|----------------|
| Ana  | 2 | $\approx 0.00$ (cut boundary) | 0.49 | Geometric midpoint of the two arms; sits on the Fiedler cut boundary. Not an articulation point. |
| Bo   | 3 | $-0.20$ (Dev arm) | 0.57 | High influence; inner node of the Bo–Dev arm |
| Cy   | 3 | $+0.20$ (Eve arm) | 0.57 | High influence; inner node of the Cy–Eve arm |
| Dev  | 1 | $-0.68$ (leaf, Dev arm) | 0.25 | Peripheral leaf on the Bo arm; low influence |
| Eve  | 1 | $+0.68$ (leaf, Eve arm) | 0.25 | Peripheral leaf on the Cy arm; low influence |

Several things are worth noting. Bo and Cy score identically on eigenvector centrality; this is forced by the graph's symmetry: each connects to Ana, to the other hub, and to one leaf, so they are structurally mirror images. Dev and Eve are likewise mirror images and score equally. The graph does not have a symmetry swapping Dev and Eve with each other (Dev attaches to Bo, Eve to Cy), and the Fiedler vector reveals this: they land on opposite sides of the cut. Ana, by contrast, lands exactly at zero: she is the geometric midpoint of the two arms in spectral space, straddling the Fiedler cut boundary. This does not make Ana the most structurally vulnerable node — that distinction belongs to Bo and Cy, which are the actual articulation points (cut vertices): removing Bo disconnects Dev from the rest, and removing Cy disconnects Eve. Removing Ana leaves the remaining four nodes connected. The Fiedler value is not a prestige or vulnerability measure; it is a geometric one that identifies structural position relative to the primary partition.

The natural split revealed is the Bo–Dev arm (negative values) from the Cy–Eve arm (positive values), with Ana straddling the boundary.

A note on notation: the entry pattern used here (diagonal equals degree, off-diagonal equals $-1$ for edges, zero otherwise) is specific to the unweighted combinatorial Laplacian. Weighted and normalised variants modify it in principled ways covered in the Laplacian interlude below.

Eigenvalues and eigenvectors: the spectrum of the organisation

The eigenvalue equation $L\mathbf{v} = \lambda\mathbf{v}$ and the unit-normalisation convention were introduced in the previous section using the five-node graph. Here we interpret what the resulting spectrum tells us about the organisation as a whole. Each eigenvalue measures the strength of one structural mode; its paired eigenvector assigns every node a coordinate along that mode, representing each person's position within that structural dimension.

The Fiedler value $\lambda_2$ and algebraic connectivity

The Laplacian always has at least one zero eigenvalue; the number of zero eigenvalues equals exactly the number of connected components. For a connected network there is exactly one zero eigenvalue ($\lambda_1 = 0$). The Fiedler value $\lambda_2$, the second smallest eigenvalue, is perhaps the single most important number in ONA: it is the algebraic connectivity of the network. It equals zero exactly when the network is disconnected, and the larger it is, the more edges must be severed to split the organisation apart.

Sign ambiguity, symmetry, and eigenvalue degeneracy

The Fiedler vector is defined only up to a sign: if $v$ solves $Lv = \lambda_2 v$, so does $-v$. A different numerical solver could return the flipped vector, with Bo and Dev positive and Cy and Eve negative. What is not arbitrary is the relative sign between nodes (same side of the cut means same sign), the magnitude of each entry, and $\lambda_2$ itself. In practice, always apply a sign convention before displaying or comparing Fiedler vectors, and never interpret the raw sign of a single entry in isolation.
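A minimal sign-convention helper, sketched in NumPy. The convention chosen here (flip so the largest-magnitude entry is positive) is one common choice, not the only one:

```python
import numpy as np

def fix_sign(v):
    """Canonical orientation: flip v so its largest-magnitude entry is positive.

    Both v and -v are valid unit eigenvectors; pinning the sign makes Fiedler
    vectors comparable across solver runs and across network snapshots.
    """
    k = np.argmax(np.abs(v))
    return v if v[k] > 0 else -v

v = np.array([0.0, -0.205, 0.205, -0.677, 0.677])  # Fiedler vector, one sign choice
w = fix_sign(v)
assert np.array_equal(fix_sign(-v), w)  # both orientations map to the same vector
```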

What happens when the network is three-fold symmetric?

In the five-node example, Dev hangs off Bo and Eve hangs off Cy, but Ana has no pendant. Suppose you add Ana2 as a leaf off Ana, giving every triangle node exactly one pendant arm. The network is now three-fold symmetric: the three arms (Ana–Ana2, Bo–Dev, Cy–Eve) are structurally interchangeable.

The spectrum reveals the problem immediately. With three equivalent communities, no single straight cut separates the network into two balanced halves better than any other. Mathematically, $\lambda_2$ is now a degenerate eigenvalue shared by two linearly independent eigenvectors: $\lambda_2 = \lambda_3 \approx 0.697$. Any linear combination of these two eigenvectors is also a valid Fiedler vector. The algorithm picks one based on numerical noise, and a different run could return a completely different-looking result that is equally correct.

Left: the five-node spectrum has a clear eigengap between λ₂ and λ₃ (gap ≈ 0.68), indicating one clean two-way split. Right: adding Ana2 makes the network three-fold symmetric. Now λ₂ = λ₃ exactly: a degenerate eigenvalue shared by two eigenvectors. The zero gap signals three natural communities, not two.

The fix follows directly from the eigengap heuristic: a zero (or near-zero) gap between $\lambda_2$ and $\lambda_3$ signals that you need both eigenvectors, and that the network has three natural communities rather than two. Using $v_2$ and $v_3$ jointly as spectral coordinates and applying $k$-means with $k=3$ recovers the three arms (Ana–Ana2, Bo–Dev, Cy–Eve) perfectly.
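The degeneracy can be confirmed numerically. A sketch, assuming node order Ana, Bo, Cy, Ana2, Dev, Eve for the six-node graph:

```python
import numpy as np

# Six-node, three-fold-symmetric graph: triangle Ana-Bo-Cy plus one pendant
# per hub (Ana2 off Ana, Dev off Bo, Eve off Cy).
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 4), (2, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

vals = np.linalg.eigh(L)[0]
print(np.round(vals, 3))   # ~ [0, 0.697, 0.697, 2, 4.303, 4.303]

# lambda_2 = lambda_3: a degenerate pair, so no single Fiedler cut is canonical.
assert abs(vals[1] - vals[2]) < 1e-9
```

With the eigengap at zero, the practical move is exactly what the text prescribes: take $v_2$ and $v_3$ jointly as coordinates and cluster with $k=3$.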

ONA implication: Before interpreting the Fiedler vector as a definitive two-way split, check the eigengap. If $\lambda_2 \approx \lambda_3$, the network has (at least) three natural communities and the Fiedler vector alone is unreliable. Use the gap to choose $k$, then run spectral clustering with $k$ eigenvectors.

The Cheeger inequality: a formal bound on silos

The connection between $\lambda_2$ and organisational fragmentation is not merely intuitive; it is a theorem. Define the conductance (or Cheeger constant) of a graph as:

$$h(G) = \min_{S \subset V,\; \text{vol}(S) \leq \text{vol}(V)/2} \frac{|\partial S|}{\text{vol}(S)}$$

where $|\partial S|$ is the number of edges leaving the set $S$ and $\text{vol}(S) = \sum_{i \in S} d_i$ is the total degree inside $S$. Conductance measures how bottlenecked the worst cut is: the ratio of boundary edges to internal volume. A small conductance means there exists a large subgroup very poorly connected to the rest of the network.

The Cheeger inequality relates $\lambda_2$ to $h(G)$ with two-sided bounds. Stated for the normalised Laplacian $L_n = D^{-1/2}LD^{-1/2}$, whose Fiedler value we denote $\lambda_2(L_n)$:

$$\frac{\lambda_2(L_n)}{2} \leq h(G) \leq \sqrt{2\,\lambda_2(L_n)}$$

This is a profound result. The left inequality says: if $\lambda_2(L_n)$ is small, conductance must be small, meaning there provably exists a sparse cut somewhere in the network. A small Fiedler value of the normalised Laplacian is not just a signal of silos; it is a proof of their existence. The right inequality says the Fiedler vector of $L_n$ finds that cut efficiently: the threshold cut on $v_2$ achieves conductance at most $\sqrt{2\,\lambda_2(L_n)}$, within a bounded factor of optimal. (An analogous two-sided bound exists for the combinatorial Laplacian but with different constants involving maximum degree.)

ONA implication of the Cheeger inequality: When you report that $\lambda_2(L_n)$ is small, you are not speculating that silos might exist; you are invoking a mathematical guarantee that a sparse cut must exist. The Fiedler vector of $L_n$ then identifies exactly where that cut lies. This transforms silo identification from organisational intuition into a formal proof.
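The two-sided bound can be checked on the five-node example. A sketch: compute $\lambda_2(L_n)$, then compare the conductance of the Fiedler threshold cut {Bo, Dev} against both bounds:

```python
import numpy as np

# Five-node example: check the Cheeger bounds against the cut {Bo, Dev}.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)
d = A.sum(axis=1)
L = np.diag(d) - A
D_inv_sqrt = np.diag(1 / np.sqrt(d))
L_n = D_inv_sqrt @ L @ D_inv_sqrt          # normalised Laplacian

lam2 = np.linalg.eigh(L_n)[0][1]           # Fiedler value of L_n

S = [1, 3]                                 # {Bo, Dev}, the smaller-volume side
boundary = sum(A[i, j] for i in S for j in range(5) if j not in S)
h_S = boundary / d[S].sum()                # conductance of this cut: 2/4 = 0.5

# lam2/2 <= h(G) <= h_S, and the threshold cut satisfies h_S <= sqrt(2*lam2).
assert lam2 / 2 <= h_S <= np.sqrt(2 * lam2)
print(round(lam2, 3), h_S)
```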

Higher eigenvalues, the eigengap heuristic, and multi-way partitioning

The Fiedler vector gives you the best two-way split. The third, fourth, and fifth eigenvectors give finer partitions, specifically spectral coordinates that, when clustered with $k$-means, reveal community structure at any desired resolution. This is spectral clustering, finding informal teams and silos invisible on any org chart.

But how do you know how many communities to look for? The answer lies in the eigengap heuristic: the number of natural communities $k$ in the network is indicated by the largest gap in the spectrum of $L$ between consecutive eigenvalues $\lambda_k$ and $\lambda_{k+1}$. When $k$ near-zero eigenvalues are followed by a sudden jump to $\lambda_{k+1}$, the network has $k$ communities that are internally dense and mutually sparse; the spectral structure is telling you exactly how many clusters to request.

The eigengap heuristic: three near-zero eigenvalues followed by a sharp jump to $\lambda_4$ indicates $k = 3$ natural communities. The gap size measures how well-separated the community structure is.

Spectral graph drawing: eigenvectors as geometry

The eigenvectors of the Laplacian can serve as spatial coordinates, producing drawings of the network that reveal its geometry in a rigorous way. This is spectral graph drawing (Hall 1970), the foundation of most network visualisations in ONA reports.

Recall that $v_1$ is the constant vector; it carries no structural information. $v_2$ (the Fiedler vector) assigns every node a real number reflecting its position across the primary structural division. $v_3$ captures the next structural dimension, orthogonal to the first. Using these as $x$- and $y$-coordinates, then drawing edges as straight lines, gives the spectral layout.

Left: the same graph drawn with an arbitrary layout where structure is hidden. Right: the spectral layout places nodes as $(v_2, v_3)$ coordinates. Community structure becomes visually self-evident; nodes within each community cluster tightly because intra-community springs are strong.
Why is this optimal? By the Rayleigh-Ritz characterisation (a consequence of the Courant-Fischer theorem), $v_2$ is the unit vector orthogonal to the constant vector $\mathbf{1}$ that minimises the Laplacian quadratic form $\sum_{(i,j)\in E}(x_i - x_j)^2$. The orthogonality constraint $x \perp \mathbf{1}$ removes the trivial constant solution; what remains is the direction of minimum non-trivial variation. This sum equals the total squared edge length in the layout. Minimising it means connected nodes are pulled as close together as possible, globally and simultaneously.
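A spectral layout is just two columns of the eigenvector matrix. A sketch for the five-node example, which also checks the Rayleigh-Ritz identity $v_2^T L\, v_2 = \lambda_2$:

```python
import numpy as np

# Spectral layout of the five-node example: x = v2 (Fiedler), y = v3.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
vals, vecs = np.linalg.eigh(L)             # columns are v1..v5, ascending

coords = vecs[:, 1:3]                      # (v2, v3) coordinates, one row per node
for name, (x, y) in zip(["Ana", "Bo", "Cy", "Dev", "Eve"], coords):
    print(f"{name}: ({x:+.3f}, {y:+.3f})")

# The quadratic form x^T L x equals the total squared edge length of the
# layout in direction x; for x = v2 it attains the minimum value lambda_2.
quad = lambda x: x @ L @ x
assert abs(quad(vecs[:, 1]) - vals[1]) < 1e-9
```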

Physical interpretation: springs

Model every edge as an ideal spring with spring constant equal to its weight. Pin a few nodes to fixed positions; let the rest settle into mechanical equilibrium. Physics tells us equilibrium minimises total spring potential energy, $\frac{1}{2}\sum_{(i,j)} w_{ij}(x_i - x_j)^2$, which is exactly the Laplacian quadratic form. The eigenvectors are the natural vibration modes of this spring network. When community structure exists, intra-community springs are dense and strong, pulling community members close; the sparser inter-community springs pull clusters toward each other but cannot merge them.

Extension to higher dimensions

A 3D layout uses $v_2, v_3, v_4$ as spatial coordinates. More generally, a $k$-dimensional spectral embedding uses $v_2$ through $v_{k+1}$. For regular geometric objects this is exact: the vertices of a dodecahedron, embedded using its three-dimensional symmetry eigenspace, reproduce the dodecahedron precisely. The spectrum knows the geometry.

When does spectral drawing work, and when does it not?

Left: a network with three communities (low $\lambda_2$) draws cleanly: clusters are spatially separated and edges are short. Right: a random 4-regular graph (high $\lambda_2$) produces a tangle regardless of the algorithm used. The tangle is not a visualisation failure; it is a structural fact.

The usefulness of a spectral layout depends directly on $\lambda_2$. A good 2D drawing requires a partition of nodes into two large sets with a small boundary, and that is exactly what a small Fiedler value certifies. When $\lambda_2$ is large (an expander graph), every large subset has many boundary edges, so no spatial separation is possible without making many edges long.

Practical diagnostic: before investing time refining a network visualisation, compute $\lambda_2$. If $\lambda_2 \gtrsim 10/\sqrt{n}$, no drawing algorithm will help. (This is a practical rule of thumb, not a hard theorem; treat it as a prompt to check rather than a verdict.) Report the high connectivity as the primary finding; it is itself evidence that the organisation lacks silo structure.

Centrality through the eigenvector lens

Degree centrality is a matrix operation but not a spectral one. Eigenvector centrality is. The idea is that not all connections are equally valuable: a connection to a well-connected person matters more than a connection to an isolated one. Eigenvector centrality formalises this recursively: your score is proportional to the sum of your neighbours' scores.

Mathematically, find the leading eigenvector of $A$, specifically the eigenvector corresponding to the largest eigenvalue $\lambda_{\max}(A)$ (distinct from the Laplacian eigenvalues $\lambda_1 \leq \lambda_2 \leq \cdots$). Each node's entry is its eigenvector centrality score. In ONA this identifies not just the most connected people but the most influentially positioned, meaning those who can reach the broadest and most well-connected parts of the organisation quickly.
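A sketch for the five-node example; the absolute value handles the eigenvector's sign ambiguity (by Perron-Frobenius, the leading eigenvector of a connected graph can be chosen entrywise positive):

```python
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)

vals, vecs = np.linalg.eigh(A)      # ascending order, so the last column
centrality = np.abs(vecs[:, -1])    # pairs with A's largest eigenvalue

print(np.round(centrality, 2))      # ~ [0.49, 0.57, 0.57, 0.25, 0.25]
```

These are the influence scores in the comparison table earlier: Bo and Cy highest, Ana in between, the two leaves lowest.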

Katz centrality and PageRank extend this idea with a decay parameter, preventing extremely central nodes from overwhelming everyone else. PageRank, originally Google's link-scoring algorithm for directed networks, translates directly into "whose attention and endorsement matters most?", and it handles two failure modes that make simple eigenvector centrality problematic on directed graphs.

First, the dead-end problem: a node with no outgoing edges (a "dangling node") absorbs centrality from its predecessors but returns nothing, causing scores to drain out of the network. Second, the spider-trap problem: a group of nodes that link only among themselves can accumulate all the centrality in the graph, starving every other node. PageRank fixes both with a damping factor $\alpha$ (typically $0.85$):

$$\text{PR}(i) = \frac{1-\alpha}{n} + \alpha \sum_{j \to i} \frac{\text{PR}(j)}{\text{out-degree}(j)}$$

With probability $\alpha$ the random walker follows an actual edge; with probability $1-\alpha$ it teleports to a uniformly random node. This teleportation ensures every node receives a minimum baseline score and prevents probability mass from pooling in traps. In organisational terms, the teleportation models the possibility that information or attention can reach any part of the organisation through informal channels regardless of formal reporting lines, a realistic assumption.
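The formula above translates into a short power iteration. A minimal sketch on a hypothetical four-person approval chain; the graph, node labels, and dangling-node convention are illustrative assumptions, not from the text:

```python
import numpy as np

def pagerank(out_edges, n, alpha=0.85, iters=100):
    """Power iteration for PageRank with uniform teleportation.

    out_edges: dict node -> list of successor nodes. Dangling nodes
    (no successors) spread their mass uniformly, a common convention.
    """
    pr = np.full(n, 1.0 / n)
    for _ in range(iters):
        nxt = np.full(n, (1 - alpha) / n)      # teleportation baseline
        for j in range(n):
            targets = out_edges.get(j, [])
            if targets:
                for i in targets:
                    nxt[i] += alpha * pr[j] / len(targets)
            else:
                nxt += alpha * pr[j] / n       # dangling node: teleport everywhere
        pr = nxt
    return pr

# Hypothetical chain: 0->1, 1->2, 2->1 (a two-node loop), 3->1.
pr = pagerank({0: [1], 1: [2], 2: [1], 3: [1]}, n=4)
print(np.round(pr, 3))          # nodes 1 and 2 accumulate most of the score
assert abs(pr.sum() - 1) < 1e-9
```

Without the teleportation term, the 1-2 loop would absorb all the score: the spider-trap problem the text describes.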

The Graph Fourier Transform: reading organisational signals

In the Laplacian interlude that follows this section, the eigenvectors are described as "natural vibration modes" of the network. This analogy has a precise mathematical extension: just as the classical Fourier transform decomposes a time signal into frequency components, the Graph Fourier Transform (GFT) decomposes an organisational signal (any scalar quantity defined on nodes) into contributions from each structural mode.

Define an organisational signal $f \in \mathbb{R}^n$ as any vector assigning one value per node: employee engagement scores, performance ratings, absenteeism counts, or even binary indicators of "has left the company." The GFT of $f$ with respect to the Laplacian eigenvectors $\{v_1, v_2, \ldots, v_n\}$ is simply the projection onto each eigenvector:

$$\hat{f}(k) = v_k^T f = \sum_{i=1}^n f(i) \cdot v_k(i)$$

The coefficient $\hat{f}(k)$ tells you how strongly the signal $f$ "resonates" with the $k$-th structural mode. The reconstruction is exact: $f = \sum_k \hat{f}(k) \cdot v_k$.

Low-frequency vs high-frequency organisational signals

The analogy with audio frequencies is precise. Eigenvectors with small eigenvalues correspond to low frequencies, that is, smoothly varying patterns where neighbouring nodes in the network have similar values. Eigenvectors with large eigenvalues correspond to high frequencies, specifically rapidly oscillating patterns where adjacent nodes differ sharply.

Low-frequency signals (left) vary smoothly; adjacent nodes share similar values. This is the signature of a structural pattern, such as a silo with uniformly low engagement. High-frequency signals (right) oscillate rapidly between neighbours, the signature of random noise with no structural cause.

In ONA practice, this decomposition is immediately useful. If a signal like employee engagement scores has most of its energy in the low-frequency eigenvectors (small $\lambda_k$), the pattern is structurally caused: it follows community boundaries, and the relevant intervention is structural (bridging the silo, rotating people across teams). If most energy is in high-frequency components, the variation is locally random, reflecting idiosyncratic individual differences that do not follow any network pattern, and the relevant intervention is individual rather than structural.

The GFT also enables network smoothing: multiply each Fourier coefficient by a filter function $g(\lambda_k)$ before reconstructing. Setting $g(\lambda_k) = e^{-t\lambda_k}$ gives the heat diffusion filter, which is exactly the heat equation on the graph, smoothing away high-frequency noise while preserving low-frequency structural signals. This is a principled method for denoising ONA data before community detection or centrality computation.
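The whole pipeline (GFT, exact reconstruction, heat filter) fits in a few lines. A sketch on the five-node graph; the engagement signal is a made-up illustration:

```python
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
lam, V = np.linalg.eigh(L)                 # columns of V are v1..v5

f = np.array([0.8, 0.7, 0.4, 0.9, 0.3])   # hypothetical per-node engagement scores

f_hat = V.T @ f                            # GFT: projection onto each mode
assert np.allclose(V @ f_hat, f)           # reconstruction is exact

t = 0.5                                    # diffusion time, a tunable choice
f_smooth = V @ (np.exp(-t * lam) * f_hat)  # heat filter g(lambda) = exp(-t*lambda)

# Smoothing strictly reduces the signal's roughness x^T L x.
rough = lambda x: x @ L @ x
assert rough(f_smooth) < rough(f)
```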

Powers of the adjacency matrix: walks and reachability

When you multiply $A$ by itself, entry $(i,j)$ of $A^2$ gives the number of walks of length exactly 2 between $i$ and $j$, which for $i \neq j$ is the number of mutual connections they share. $A^3$ gives walks of length 3, and so on.

The diagonal of $A^2$ recovers each node's degree. The off-diagonal entries of $A^2$ are the genuinely new information: how many people do two employees share as common contacts? High values signal strong structural equivalence.

Crucially, $(A^3)_{ii} / 2$ gives the triangle count for node $i$ directly. The reason: a closed walk of length 3 starting and ending at $i$ must traverse one triangle (since there are no self-loops in a simple graph), and each triangle can be traversed in two directions (clockwise and counterclockwise), so each triangle contributes exactly 2 to $(A^3)_{ii}$. High triangle counts mean a dense clique; low triangle counts with high degree identify brokers: people connecting otherwise-separate groups. This is the computational root of Ronald Burt's structural holes concept.
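Both claims are easy to verify on the five-node example, which contains exactly one triangle (Ana–Bo–Cy):

```python
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)

A2 = A @ A
A3 = A2 @ A

assert np.allclose(np.diag(A2), [2, 3, 3, 1, 1])  # diagonal of A^2 = degrees

# (A^3)_ii / 2 counts triangles through node i; only Ana-Bo-Cy exists here.
triangles = np.diag(A3) / 2
print(triangles)                                  # [1. 1. 1. 0. 0.]
```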

Structural holes, betweenness, and the constraint matrix

Burt's structural hole theory argues that the most strategically valuable position is between cliques, controlling information and resource flows across gaps. The constraint index that measures this is derived from a series of matrix operations on $A$: if all your contacts already know each other well, you are highly constrained. If your contacts belong to disconnected worlds, you are an unconstrained broker.

Betweenness centrality measures what fraction of all shortest paths pass through you. High betweenness is the clearest computational signal of a critical bottleneck: information, decisions, or resources must flow through that person, and their absence would dramatically disrupt coordination.

On node-removal simulation

Simulating node removals and tracking $\lambda_2$ is a sound resilience diagnostic. Note that removing a node reduces the matrix dimension; you recompute the Laplacian of the reduced graph, not perturb the original. For large networks, an efficient shortcut is to identify edges bridging nodes with very different Fiedler vector values: those edges, when severed, produce the largest drops in algebraic connectivity.
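A sketch of the removal simulation on the five-node example; note that the Laplacian is recomputed from the reduced adjacency matrix, as the text prescribes:

```python
import numpy as np

def fiedler_value(A):
    """lambda_2 of the combinatorial Laplacian of adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.eigh(L)[0][1]

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)
names = ["Ana", "Bo", "Cy", "Dev", "Eve"]

# Remove one node at a time and recompute lambda_2 of the reduced graph.
for k, name in enumerate(names):
    keep = [i for i in range(5) if i != k]
    A_red = A[np.ix_(keep, keep)]          # delete row and column k
    print(f"remove {name}: lambda_2 = {fiedler_value(A_red):.3f}")

# Removing Bo (or Cy) disconnects the graph, so lambda_2 drops to zero;
# removing Ana leaves a connected path and lambda_2 stays positive.
```

The output makes the articulation-point distinction from the earlier table concrete: the graph survives Ana's removal but not Bo's or Cy's.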

The normalised Laplacian and random walks

The normalised Laplacian $L_\text{norm} = D^{-1/2} L D^{-1/2}$ adjusts for degree differences. (The interlude explains precisely why: the diagonal becomes uniformly $1$ and off-diagonals become $-1/\sqrt{d_i d_j}$, correcting for the dominance of high-degree nodes.) Its eigenvalues are always in $[0, 2]$, making them comparable across networks of different sizes and densities.

The spectral gap, defined as $\lambda_2(L_n)$ of the normalised Laplacian (equivalently, the separation between the largest and second-largest eigenvalue of $D^{-1/2} A D^{-1/2}$), controls how fast a random walk mixes. A large spectral gap means the walk quickly reaches every part of the network regardless of starting position: the organisation diffuses information rapidly, with no isolated echo chambers. A small spectral gap means the walk gets trapped in local neighbourhoods: fragmented groups where information diffuses locally but barely crosses community boundaries.

The determinant and network signatures

The characteristic polynomial of $A$ (whose roots are the eigenvalues) is a compact algebraic fingerprint of the entire network structure. A note of caution: two networks with the same characteristic polynomial are called cospectral, sharing all spectral properties while potentially differing in structure. While theoretically significant, cospectral pairs are combinatorially rare and require very specific structural symmetries to construct. In real-world ONA data (noisy, continuous-weighted, and measured on networks of dozens to thousands of nodes) the probability of two genuinely different organisational networks having identical spectra is mathematically negligible. For practical ONA purposes the spectrum is a reliable structural signature for longitudinal tracking within the same organisation. The cospectral caveat is important for theoretical completeness but should not discourage practitioners from using spectral signatures to monitor structural change over time. For cross-unit comparison, treat shared spectral properties as strong evidence of structural similarity rather than proof of identical structure.

A team whose spectrum is shifting toward smaller Fiedler values, shrinking spectral gap, and increasingly unequal eigenvector centrality is becoming more siloed, more bottlenecked, and more hierarchical, often visible in the spectrum before it shows up in engagement surveys or performance data.

A note on data quality, sparsity, and thresholds

ONA matrices are almost always extremely sparse. The choice of threshold for binarising weighted data has a substantial effect on spectral results. Eigenvalue stability under threshold variation is a useful diagnostic: results that change dramatically when the threshold moves by a small amount should be treated with caution. For large, sparse networks, use sparse eigensolvers such as Lanczos iteration and similar algorithms that exploit sparsity rather than densifying the matrix, enabling efficient decomposition of networks with tens of thousands of nodes.
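As a sketch of the sparse-solver workflow, the example below uses SciPy's Lanczos-based `eigsh` in shift-invert mode on a synthetic sparse network (a ring plus random chords, standing in for real ONA data, which is assumed to be sparse):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 2000
rng = np.random.default_rng(0)

A = sp.lil_matrix((n, n))
for i in range(n):                      # ring: guarantees connectivity
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
for _ in range(3 * n):                  # random chords: informal cross-ties
    i, j = rng.integers(0, n, size=2)
    if i != j:
        A[i, j] = A[j, i] = 1
A = A.tocsr()

d = np.asarray(A.sum(axis=1)).ravel()
L = sp.diags(d) - A                     # combinatorial Laplacian, still sparse

# Lanczos with shift-invert near zero returns the k smallest eigenvalues.
# A tiny negative shift keeps the factorised matrix non-singular
# (L itself has a zero eigenvalue).
vals = np.sort(eigsh(L, k=4, sigma=-1e-3, which="LM",
                     return_eigenvectors=False))
print(vals)  # vals[0] is ~0 (connected); vals[1] is the Fiedler value
```

The matrix is never densified: a 2,000-node network here, but the same code scales to tens of thousands of nodes.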

Thresholding strategies for spectral comparisons

When comparing networks across departments or time periods, the choice of thresholding strategy affects not just individual results but the comparability of spectral quantities across contexts. Three common strategies differ in what they hold constant: a fixed weight cutoff holds the threshold itself constant (so edge counts vary freely across networks); fixed density keeps the strongest ties until every network reaches the same edge density; and per-node (degree-based) thresholds keep each node's strongest $k$ ties, holding degree roughly constant.

For longitudinal tracking within a single organisation, fixed density thresholding is usually the right choice, as it ensures that changes in $\lambda_2$ over time reflect genuine structural change rather than shifts in the overall volume of measured interactions.
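A minimal sketch of fixed-density thresholding, assuming a dense symmetric weight matrix from, say, aggregated communication counts (the function name and data are illustrative, and tie-breaking among exactly equal weights is ignored):

```python
import numpy as np

def threshold_fixed_density(W, density):
    """Binarise a symmetric weighted matrix W by keeping the strongest
    ties until the target edge density is reached. Sketch only: assumes
    no self-loops and distinct weights (no tie-breaking)."""
    n = W.shape[0]
    iu = np.triu_indices(n, k=1)
    weights = W[iu]
    k = int(round(density * len(weights)))   # number of edges to keep
    if k == 0:
        return np.zeros_like(W)
    cutoff = np.sort(weights)[-k]            # k-th largest weight
    A = (W >= cutoff).astype(float)
    np.fill_diagonal(A, 0)
    return np.maximum(A, A.T)

rng = np.random.default_rng(1)
W = rng.random((10, 10))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)

A = threshold_fixed_density(W, density=0.2)
print(A.sum() / 2)  # number of edges kept (targeting 20% of 45 pairs)
```

Because every network ends up at the same density, a shift in $\lambda_2$ between two thresholded snapshots reflects a change in how the edges are arranged, not in how many there are.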

Temporal spectral tracking: the organisation over time

Organisations are dynamic. Mergers integrate previously separate networks; restructurings sever established communities; new hires gradually accumulate ties; project endings dissolve clusters. A single spectral snapshot captures structure at one moment; tracking the spectral signature over time reveals how organisational health evolves.

At each time point $t$, compute the adjacency matrix $A(t)$, its Laplacian $L(t)$, and the full spectrum $\{\lambda_1(t), \lambda_2(t), \ldots, \lambda_n(t)\}$. The trajectories of these eigenvalues over time constitute the dynamic spectral signature of the organisation.
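A minimal sketch of the tracking loop, with random snapshots standing in for real quarterly data (the helper name is illustrative):

```python
import numpy as np

def spectral_signature(A):
    """Laplacian spectrum of one snapshot: sorted eigenvalues of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))

# Assumed input: a list of adjacency snapshots A(t), one per quarter.
rng = np.random.default_rng(2)
snapshots = []
for t in range(4):
    A = (rng.random((12, 12)) < 0.3).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                      # symmetric, zero diagonal
    snapshots.append(A)

# Track lambda_2 (the Fiedler value) across time points.
fiedler_trajectory = [spectral_signature(A)[1] for A in snapshots]
print(fiedler_trajectory)
```

In practice the same loop would also record the spectral gap and the count of near-zero eigenvalues, giving the full dynamic signature described below.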

[Figure: $\lambda_2$ and the spectral gap plotted over Q1–Q4, annotated "healthy", "integration", "restructure", "re-integration".]
Schematic spectral signature over time. Both the Fiedler value $\lambda_2$ (solid blue) and the spectral gap (dashed green) dip sharply during a restructuring event, then recover as the organisation re-integrates. The spectral signature often shows these effects before they appear in engagement surveys or productivity metrics.

What to watch for

A falling $\lambda_2$ is the earliest quantitative warning sign of organisational fragmentation. Communities are becoming more isolated; bridge connections are being lost. If this trend precedes a restructuring by one or two quarters, it suggests the restructuring was partly precipitated by pre-existing structural weakening rather than just a top-down decision.

A falling spectral gap signals slowing information diffusion. Even if the organisation remains formally connected ($\lambda_2 > 0$), a shrinking gap means the random walk on the network is mixing more slowly: ideas and decisions take longer to percolate. This often precedes the formation of information silos that later calcify into structural ones.

A merger or acquisition initially shows as a sudden increase in $n$ (network size) with a sharp drop in $\lambda_2$: the two separate networks are now nominally joined but have few bridge edges. A successful integration is visible in the gradual rise of $\lambda_2$ and spectral gap over subsequent quarters as cross-organisational ties form. A failed integration stays at low $\lambda_2$ indefinitely, with the two original communities remaining spectrally distinct.

Tracking the distribution of the full spectrum, not just $\lambda_2$, over time gives a richer picture. The number of near-zero eigenvalues tells you how many effectively disconnected communities exist at each time point. Rising eigenvalue variance signals increasing heterogeneity, with some parts of the network becoming more connected while others fragment.

Hierarchy vs network: the spectrum of a tree

Most organisations exist somewhere between two structural extremes: the pure hierarchy (a tree graph, where the org chart is the network) and the random graph (where every pair has an equal chance of connecting, producing no structure at all). The spectrum of each extreme is instructive.

A tree on $n$ nodes has exactly $n-1$ edges, the minimum needed to keep the network connected. Its Laplacian spectrum has one zero eigenvalue (it is connected), but the Fiedler value $\lambda_2$ is extremely small: $O(1/n^2)$ for a path graph and $O(1/n)$ for a balanced binary tree. This means a pure hierarchy has mathematically negligible algebraic connectivity: removing a single edge (or a single person) immediately disconnects the network. Every manager is a single point of failure. Information must travel up and across the tree, a path of length $O(\log n)$ or worse.

[Figure: left, a tree (pure hierarchy) with small λ₂ and a SPOF highlighted; right, the same network with informal cross-links and larger λ₂.]
Left: a pure tree hierarchy, formally connected, but every node is a potential single point of failure (SPOF) and $\lambda_2$ is tiny. Right: the same hierarchy enriched with informal cross-links (dashed), which dramatically increase $\lambda_2$, reduce path lengths, and create redundant routes. ONA makes these informal bridges visible and quantifies their structural value.

A random graph sits at the other extreme. With even moderate edge density, $\lambda_2$ grows with $n$, path lengths are $O(\log n)$, and the network is extremely robust to node removal. But it is also uninterpretable and undrawable: no layout reveals meaningful structure because no structure exists.

Real organisations lie between these poles. The Laplacian spectrum positions them precisely: a spectrum that looks "tree-like" (very small $\lambda_2$, widely spread eigenvalues) indicates a rigid hierarchy where formal reporting lines dominate. A spectrum with moderate $\lambda_2$ and a clear eigengap indicates a healthy hybrid: formal structure overlaid with rich informal collaboration. A spectrum that looks "random-like" (large $\lambda_2$, eigenvalues bunched near the average degree) indicates a highly integrated organisation that may lack the structural differentiation needed for specialisation.

The ONA value proposition, made precise: a pure hierarchy has $\lambda_2$ of order $1/n$ or smaller. Adding even a small number of informal cross-links (the kind that ONA identifies and strengthens) can increase $\lambda_2$ by an order of magnitude. The mathematical cost of rigid hierarchy is measurable, and so is the benefit of the informal network that compensates for it.
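This claim is easy to check numerically. The sketch below compares $\lambda_2$ for a balanced binary tree against the same tree enriched with a handful of assumed informal leaf-to-leaf ties:

```python
import numpy as np

def fiedler_value(A):
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))[1]

# Pure hierarchy: balanced binary tree on 63 nodes (children of i: 2i+1, 2i+2).
n = 63
tree = np.zeros((n, n))
for i in range(n):
    for c in (2 * i + 1, 2 * i + 2):
        if c < n:
            tree[i, c] = tree[c, i] = 1

# The same hierarchy with a few informal cross-links between leaves.
rng = np.random.default_rng(3)
enriched = tree.copy()
leaves = [i for i in range(n) if 2 * i + 1 >= n]
for _ in range(8):
    i, j = rng.choice(leaves, size=2, replace=False)
    enriched[i, j] = enriched[j, i] = 1

print(f"tree lambda_2:     {fiedler_value(tree):.4f}")
print(f"enriched lambda_2: {fiedler_value(enriched):.4f}")
```

Adding edges can never decrease $\lambda_2$ (the Laplacian gains only positive-semidefinite terms), and a few well-placed cross-links shortcut the long tree paths, so the enriched value jumps well above the tree's.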

Putting it together: what spectral ONA actually tells you

Interlude

On the Laplacian: intuition, structure, and three variants

Start with something simple: a row of people, each holding a number (say, how much each person knows about a particular project). Ask, for each person: how different am I from my neighbours? The Laplacian is the operator that computes exactly that quantity for every node in the network.

In continuous mathematics, the Laplacian appears in the heat equation, describing how temperature spreads through a material. A hot spot surrounded by cool material has a large Laplacian value, so heat flows away from it rapidly. A region of uniform temperature has a Laplacian of zero: there is no local difference to drive any flow. The graph Laplacian is the same idea, discretised: instead of a continuous material, you have a network with values at nodes, and the Laplacian measures the mismatch between each node's value and its neighbours'.

The three variants side by side

The entry pattern changes depending on which variant of the Laplacian you are using:

| Variant | Diagonal $L_{ii}$ | Off-diagonal $L_{ij}$ (edge) | Off-diagonal $L_{ij}$ (no edge) | Eigenvalue range |
|---|---|---|---|---|
| Combinatorial $L = D - A$ | Degree $d_i$ | $-1$ | $0$ | $[0, \infty)$, scales with degree |
| Weighted $L_w = D_w - W$ | Weighted degree $\sum_j w_{ij}$ | $-w_{ij}$ | $0$ | $[0, \infty)$, scales with weights |
| Normalised $L_n = D^{-1/2} L D^{-1/2}$ | $1$ (always) | $-1/\sqrt{d_i d_j}$ | $0$ | $[0, 2]$, bounded |
[Figure: the combinatorial (left) and normalised (right) Laplacian matrices of the five-node network, entries colour-coded as diagonal, edge, and no-edge.]
Combinatorial Laplacian (left) vs normalised Laplacian (right) for the five-node network. The normalised version has a uniform diagonal of 1, but three distinct off-diagonal values: −.41 for edges between the degree-2 hub (Ana) and the degree-3 hubs (Bo, Cy), since −1/√(2·3) ≈ −0.41; −.33 for the Bo–Cy edge, since −1/√(3·3) ≈ −0.33; and −.58 for edges to degree-1 leaves, since −1/√(3·1) ≈ −0.58. Edges between higher-degree nodes receive smaller-magnitude entries: the normalisation discounts well-connected nodes.

The weighted Laplacian

When edges carry weights $w_{ij}$ (representing tie strength, interaction frequency, or spring constants), the off-diagonal entry becomes $-w_{ij}$ instead of $-1$, and the diagonal becomes the weighted degree: $\sum_j w_{ij}$. The unweighted version is the special case where every weight equals $1$. Every property of the Laplacian carries over unchanged: rows still sum to zero, eigenvalues are still non-negative, $\lambda_2$ is still algebraic connectivity. In ONA, the weighted Laplacian is almost always the right object when you have survey data with ordinal tie strengths or communication data with frequency counts.
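A minimal sketch with an assumed matrix of interaction frequencies; note that the construction is identical to the unweighted case:

```python
import numpy as np

# Assumed toy data: interaction frequencies among four people.
W = np.array([
    [0, 5, 1, 0],
    [5, 0, 2, 0],
    [1, 2, 0, 7],
    [0, 0, 7, 0],
], dtype=float)

L_w = np.diag(W.sum(axis=1)) - W   # weighted degree on the diagonal, -w_ij off it

eigs = np.sort(np.linalg.eigvalsh(L_w))
print(L_w.sum(axis=1))             # rows still sum to zero
print(eigs)                        # still non-negative; lambda_2 > 0 (connected)
```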

The normalised Laplacian

The normalised Laplacian $L_n = D^{-1/2} L D^{-1/2}$ is a deliberate rescaling. Working through the arithmetic: diagonal entries become $1$ uniformly; off-diagonal entries become $-1/\sqrt{d_i \cdot d_j}$ for connected pairs. The motivation is to correct for degree heterogeneity: in the raw Laplacian, a node with degree 100 dominates the quadratic form simply because it has more edges. The normalisation divides out degree so every node contributes comparably. One consequence: eigenvalues of $L_n$ are always in $[0, 2]$ for any simple undirected graph; they do not scale with degree, making them comparable across networks of very different sizes and densities.

What all three share

All three variants preserve the essential character of the Laplacian: the quadratic form still measures how much a vector varies across edges, rows still sum to zero, and the number of zero eigenvalues still counts disconnected components. What changes is the metric in which variation is being measured: raw edge counts, weighted strength, or degree-normalised units.

Edge case: When a node has degree zero (an isolate), $D^{-1/2}$ is undefined. The convention is to set $L_n(i,i) = 0$ for such nodes, treating them as contributing no signal rather than causing a division-by-zero error. In practice, verify that isolates in your data are genuine before applying the normalised Laplacian.
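A sketch implementing the normalised Laplacian with this isolate convention, on assumed toy data (a three-person chain plus one isolate):

```python
import numpy as np

def normalised_laplacian(A):
    """L_n = D^{-1/2} (D - A) D^{-1/2}, with the isolate convention:
    degree-zero nodes get zero rows and columns instead of a division error."""
    d = A.sum(axis=1)
    inv_sqrt = np.zeros_like(d)
    nz = d > 0
    inv_sqrt[nz] = 1.0 / np.sqrt(d[nz])
    L = np.diag(d) - A
    return inv_sqrt[:, None] * L * inv_sqrt[None, :]

# Chain A-B-C plus an isolate D.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

L_n = normalised_laplacian(A)
print(np.diag(L_n))                      # [1, 1, 1, 0]: isolate contributes nothing
print(np.sort(np.linalg.eigvalsh(L_n)))  # all eigenvalues within [0, 2]
```

The isolate contributes a second zero eigenvalue, correctly counted as an extra disconnected component.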

Multiplying $L$ by a vector of values (one per node) gives at node $i$: the degree of $i$ times its own value, minus the sum of its neighbours' values. That is the local mismatch quantity. A uniform signal has no mismatch anywhere (correct: $Lv = 0$ for constant $v$). Values that vary sharply across a network cut give large Laplacian values at the cut. The Fiedler value corresponds to the smoothest possible non-trivial variation across the network, specifically the mode that varies as gently as it can while still varying at all, and whose shape traces out the deepest structural division in the graph.

Bipartite networks in ONA: when people connect through things

In classical ONA you ask "who talks to whom?" and observe a direct person-to-person graph. But a large and analytically rich class of organisational data does not work this way. Instead, you observe people connected to things such as documents, projects, Slack threads, and meetings, and the person-to-person network is implicit. Two people are related not because they interacted directly, but because they share membership in some context. This is a two-mode or bipartite network.

People Projects / Artifacts Ana Bo Cy Proj-A Proj-B Proj-C P = B·Bᵀ Ana+Bo
A bipartite network: people (left) connect to projects (right), never directly to each other. The projection $P = B \cdot B^T$ (dashed arrow) creates an implied person-to-person network where edge weights count shared project co-memberships.

The two projections

From the incidence matrix $B$ (rows = people, columns = artifacts) you derive two one-mode networks:

$$P = B \cdot B^T \quad \text{(person × person: co-membership counts)}$$ $$Q = B^T \cdot B \quad \text{(artifact × artifact: shared participant counts)}$$

$P_{ij}$ equals the number of artifacts that person $i$ and person $j$ both appear in. After zeroing the diagonal and optionally normalising by artifact size (to discount mass-participation events), $P$ is a weighted adjacency matrix and all the spectral machinery from earlier sections applies directly.
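A minimal sketch of the projection, with an assumed incidence matrix for three people and three projects (the memberships are invented for illustration):

```python
import numpy as np

# Incidence matrix B: rows = people (Ana, Bo, Cy), columns = projects.
B = np.array([
    [1, 1, 0],   # Ana: Proj-A, Proj-B
    [1, 0, 1],   # Bo:  Proj-A, Proj-C
    [0, 1, 1],   # Cy:  Proj-B, Proj-C
], dtype=float)

P = B @ B.T                 # person x person: co-membership counts
np.fill_diagonal(P, 0)      # zero the diagonal (self co-membership)

Q = B.T @ B                 # artifact x artifact: shared-participant counts
print(P)
```

Here every pair of people shares exactly one project, so $P$ is the complete graph with unit weights; from this point all the earlier spectral machinery applies to $P$ directly.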

Singular value decomposition of B

Rather than projecting $B$ into $P$ and then analysing $P$, you can analyse $B$ directly. SVD decomposes $B = U \Sigma V^T$ where $U$ is the left singular vectors (one per person), $V$ is the right singular vectors (one per artifact), and $\Sigma$ is the diagonal matrix of singular values. The left singular vectors of $B$ are the eigenvectors of $P$; the right singular vectors are the eigenvectors of $Q$. SVD places both people and artifacts in the same low-dimensional space simultaneously, answering questions that require reasoning across both node types at once.
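The identity between singular vectors of $B$ and eigenvectors of $P$ can be verified directly; the incidence matrix below is assumed toy data:

```python
import numpy as np

# Assumed incidence matrix: 4 people x 3 projects.
B = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(B, full_matrices=False)

# Stated identity: left singular vectors are eigenvectors of P = B B^T,
# with eigenvalues equal to the squared singular values.
P = B @ B.T
for k in range(len(s)):
    assert np.allclose(P @ U[:, k], (s[k] ** 2) * U[:, k])

# Joint two-dimensional embedding of people and projects.
people_coords = U[:, :2] * s[:2]
project_coords = Vt[:2, :].T * s[:2]
print(people_coords.round(2))
```

Because both embeddings come from the same decomposition, distances between a person's coordinates and a project's coordinates are meaningful, which is what makes mixed people-and-artifact questions answerable.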

Hyperedges and multi-layer networks

The adjacency matrix rests on an assumption easy to miss: that every relationship worth modelling is between exactly two entities. But organizations are full of irreducibly group-level interactions. A standing committee of seven people is not a collection of twenty-one bilateral relationships. A brainstorming session, a cross-functional task force, a real-time Slack troubleshooting thread: these happen among participants simultaneously, not between pairs.

The hypergraph Laplacian

A hypergraph allows edges (now called hyperedges) to connect any number of vertices. The incidence matrix $H$ is $n \times m$ where entry $H_{ik} = 1$ if node $i$ belongs to hyperedge $k$. With vertex degree matrix $D_v$, hyperedge size matrix $D_e$, and hyperedge weight matrix $W$, the hypergraph Laplacian is:

$$\Delta = D_v - H W D_e^{-1} H^T$$

When every hyperedge has exactly two members, $D_e^{-1}$ divides each edge's contribution by its size (2), so the hypergraph Laplacian reduces to $\frac{1}{2}(D - A)$, a scalar multiple of the ordinary graph Laplacian. This means the eigenvectors are identical and the eigenvalues are halved; all qualitative properties (connectivity, community structure, the Fiedler vector) are preserved. The $D_e^{-1}$ term is what makes a meeting of twenty contribute less connectivity per pair than a meeting of four, which is exactly the right behaviour, at the cost of this constant factor when comparing to the standard pairwise Laplacian.
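A sketch of the construction, including a numerical check of the 2-uniform reduction (all membership data assumed):

```python
import numpy as np

# Hypergraph Laplacian: Delta = D_v - H W De^{-1} H^T.
H = np.array([          # 5 people x 3 hyperedges (e.g. committees)
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
], dtype=float)
W = np.eye(3)                         # unit hyperedge weights
De = np.diag(H.sum(axis=0))           # hyperedge sizes on the diagonal
Dv = np.diag(H.sum(axis=1))           # vertex degrees (hyperedges per person)

Delta = Dv - H @ W @ np.linalg.inv(De) @ H.T
print(np.round(Delta, 2))             # rows sum to zero, as for any Laplacian

# Reduction check: a hypergraph whose edges all have exactly two members
# gives (1/2)(D - A). Here the chain a-b-c as two 2-edges.
H2 = np.array([[1, 0], [1, 1], [0, 1]], dtype=float)
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
D2 = np.diag(A2.sum(axis=1))
Delta2 = (np.diag(H2.sum(axis=1))
          - H2 @ np.linalg.inv(np.diag(H2.sum(axis=0))) @ H2.T)
assert np.allclose(Delta2, 0.5 * (D2 - A2))
```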

Multi-layer networks and the supra-Laplacian

A multi-layer network preserves multiple relationship types rather than flattening them. It consists of $\ell$ layers, each an adjacency matrix $A^{(\ell)}$ over the same node set, where $\ell$ indexes the relationship type (reporting, collaboration, trust, communication…).

The full object is encoded in the supra-adjacency matrix $\mathcal{A}$ of size $n\ell \times n\ell$. Each diagonal block of size $n \times n$ is one layer's adjacency matrix. The off-diagonal blocks encode interlayer connections: for the common case of uniform coupling between copies of the same node across layers, the block linking layer $p$ to layer $q$ is $c_{pq} I_n$, where $c_{pq}$ is the interlayer coupling strength and $I_n$ is the identity. The supra-Laplacian $\mathcal{L} = \mathcal{D} - \mathcal{A}$ generalises everything from the single-layer case. A practically useful quantity is the algebraic connectivity of $\mathcal{L}$ relative to that of the individual layer Laplacians. When the interlayer coupling is weak relative to the within-layer structure, the layers remain spectrally distinct, and the people central in the reporting network need not be the people central in the trust network. That divergence is one of the most common and most actionable findings in ONA practice.
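A sketch of the supra-Laplacian for two assumed toy layers (a reporting chain and a trust star) over the same four people, with uniform interlayer coupling $c$:

```python
import numpy as np

n = 4
A_report = np.array([            # reporting layer: chain 0-1-2-3
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
A_trust = np.array([             # trust layer: star centred on person 0
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
], dtype=float)
c = 0.1                          # interlayer coupling strength

supra_A = np.block([
    [A_report, c * np.eye(n)],
    [c * np.eye(n), A_trust],
])
supra_L = np.diag(supra_A.sum(axis=1)) - supra_A

eigs = np.sort(np.linalg.eigvalsh(supra_L))
print(eigs[:3])
```

With weak coupling, the algebraic connectivity of the supra-Laplacian is bounded above by $2c$ (the cheapest "cut" separates the two layers themselves), far below either layer's own $\lambda_2$: the layers are only loosely stitched together, which is the spectral signature of the divergence described above.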

A closing note on the elegance

What is remarkable about the spectral approach is that all of this organisational insight lives, from the beginning, inside the adjacency matrix. The matrix does not represent the network; it is the network, in a form that can be analysed with the full power of linear algebra. Eigenvalues are not statistics computed on top of the graph; they are invariants of the graph's structure itself.

This is why spectral methods scale: a 50,000-person network is still just a matrix, and modern sparse eigenvalue solvers can decompose it efficiently using algorithms like Lanczos iteration that exploit sparsity rather than treating the matrix as dense.

For ONA practitioners, the conceptual payoff is substantial. Instead of computing a list of separate metrics and hoping they tell a coherent story, spectral analysis gives you a single mathematical object, the spectrum, that encodes the entire structural story at once. Learning to read that spectrum is learning to read the hidden architecture of your organisation.