Network analysis — Glossary Aria Research

Extended definition

Network analysis (social network analysis in the social sciences, graph analysis in computer science) is the family of formal methods for studying relations among entities represented as nodes (vertices) and edges (ties). The underlying mathematical structure is a graph $G = (V, E)$ with vertex set $V$ and edge set $E$. Networks are classified as one-mode (people-people) or two-mode (people-events), directed (a Twitter follow) or undirected (a Facebook friendship), weighted or binary, and static or temporal. Local metrics include degree centrality:

$$C_D(v) = \frac{\deg(v)}{n - 1}$$

where $\deg(v)$ is the number of neighbors of $v$ and $n$ is the network size. Other centralities are betweenness (how often a node lies on the shortest paths between pairs of other nodes), eigenvector (being connected to vertices that are themselves central), and closeness (the inverse of the average shortest-path distance to all other nodes). Global metrics include density (the proportion of existing to possible edges), diameter (the longest shortest path), and the clustering coefficient (transitivity: how often a node's neighbors are themselves connected). Wasserman and Faust (1994, Social Network Analysis, Cambridge) is the classical social science reference; Borgatti et al. (2009, Science) offered a modern synthetic view. Community detection relies on methods such as Girvan-Newman and Louvain, the latter based on optimizing modularity. Generative models include ERGMs and blockmodels; dynamic processes on networks include SIR epidemics and contagion. Common tools for analysis and visualization are Gephi, Cytoscape, igraph (R/Python), and NetworkX.
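As an illustration of the metrics above, here is a minimal sketch in plain Python, assuming a small undirected network stored as an adjacency dictionary. The graph, node labels, and function names are hypothetical and chosen only for this example; in practice a library such as igraph or NetworkX would normally be used.

```python
from collections import deque

# Hypothetical toy network: undirected, stored as node -> set of neighbors.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}


def degree_centrality(g):
    """C_D(v) = deg(v) / (n - 1) for every node v."""
    n = len(g)
    return {v: len(nbrs) / (n - 1) for v, nbrs in g.items()}


def density(g):
    """Existing edges divided by the n(n-1)/2 possible undirected edges."""
    n = len(g)
    m = sum(len(nbrs) for nbrs in g.values()) / 2  # each edge is counted twice
    return m / (n * (n - 1) / 2)


def bfs_distances(g, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(g):
    """Longest shortest path; assumes the graph is connected."""
    return max(max(bfs_distances(g, v).values()) for v in g)


def local_clustering(g, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = sorted(g[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in g[nbrs[i]]
    )
    return links / (k * (k - 1) / 2)


centrality = degree_centrality(graph)
print(centrality["B"], density(graph), diameter(graph), local_clustering(graph, "B"))
```

In this toy network, B has three of the four possible neighbors, so its degree centrality is 0.75; only one of the three pairs among its neighbors (A and C) is connected, so its local clustering is one third.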

When it applies

Network analysis applies to any phenomenon with substantive relational structure: scientific coauthorship networks, citation networks (paper cites paper), organizational collaboration networks, social networks (friendship, follow), biological networks (protein-protein, gene regulation), brain neural networks (connectome), ecological networks (food webs), transportation and infrastructure networks, financial networks (transactions, exposure). It applies in advanced bibliometric and scientometric research (scientific field mapping via coauthorship/citation networks). It applies in epidemiological studies of disease propagation and in innovation diffusion studies. It applies in discourse analysis (words as nodes, co-occurrence as edges) and in digital humanities (characters in narratives, letters in historical correspondence).
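For the coauthorship case, the relational structure usually has to be built from two-mode data (papers and their authors). The sketch below shows one possible projection onto a weighted author-author network, where the weight is the number of shared papers. The paper identifiers, author names, and the function name `coauthorship_edges` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical two-mode data: each paper with its list of authors.
papers = {
    "paper1": ["Ana", "Ben", "Chen"],
    "paper2": ["Ana", "Ben"],
    "paper3": ["Chen", "Dee"],
}


def coauthorship_edges(papers):
    """Project papers-by-authors data onto a weighted author-author edge list.

    The weight of an edge is the number of papers the two authors share.
    """
    weights = Counter()
    for authors in papers.values():
        # sorted() gives each undirected pair one canonical orientation;
        # set() guards against an author listed twice on the same paper.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return dict(weights)


edges = coauthorship_edges(papers)
for (a, b), w in sorted(edges.items()):
    print(a, b, w)
```

The same pattern applies to the other two-mode examples (words co-occurring in documents, characters co-occurring in scenes), with the caveat that the choice of weighting is itself an operational decision.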

When it does not apply

It does not apply to data without substantive relational structure: isolated tabular variables do not constitute a network in the analytical sense. It does not apply as loose metaphor: describing a system as a “network” without operationalizing nodes and edges with clear criteria produces empty analysis. It adds little when the data are a single one-mode network with no replications or contextual information: multivariate regression may capture more. It does not replace causal inference: structural correlation in a network does not imply causation, and diffusion analysis requires a specific design. In very large networks (millions of nodes), exact path-based metrics such as betweenness, closeness, and diameter become computationally prohibitive; sampling approximations and scalable frameworks (for example Apache Spark with GraphX) are needed.
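The sampling idea for large networks can be sketched as follows: instead of running a breadth-first search from every node, run it from a random sample of source nodes and average the resulting distances. This is a minimal illustration in plain Python under stated assumptions (a connected, unweighted, undirected graph held in memory); the function names and the path-shaped test network are hypothetical.

```python
import random
from collections import deque


def bfs_distances(g, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mean_distance(g, sources):
    """Mean shortest-path length from the given sources to all other reachable nodes."""
    total = 0
    count = 0
    for s in sources:
        for target, d in bfs_distances(g, s).items():
            if target != s:
                total += d
                count += 1
    return total / count


def sampled_mean_distance(g, k, seed=0):
    """Estimate the mean distance from k source nodes drawn without replacement."""
    rng = random.Random(seed)
    sources = rng.sample(sorted(g), k)
    return mean_distance(g, sources)


# Hypothetical test network: a path 0 - 1 - ... - 199.
n = 200
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}

exact = mean_distance(path, list(path))     # n breadth-first searches
approx = sampled_mean_distance(path, k=20)  # 20 breadth-first searches
print(exact, approx)
```

The estimate costs k searches instead of n, at the price of sampling error that depends on how heterogeneous the nodes are; repeating the estimate with different seeds gives a rough sense of that error.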

Applications by field

- Bibliometrics and scientometrics: field mapping via coauthorship, co-citation, and bibliographic coupling networks; tools include VOSviewer, CiteSpace, and bibliometrix.
- Epidemiology: SIR models on networks to understand propagation; importance of hubs and bridges.
- Social sciences: organizational power analysis, social capital, behavior propagation.
- Digital humanities: literary character networks, historical correspondence networks, cultural cartography analysis.
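To make the epidemiological application concrete, the following is a minimal sketch of a discrete-time SIR process on a network, assuming a per-contact infection probability `beta` and a per-step recovery probability `gamma`. The update rule, parameter names, and the two toy networks are assumptions for illustration, not a reference epidemiological model.

```python
import random


def counts(state):
    """Return (S, I, R) tallies for a node -> state mapping."""
    values = list(state.values())
    return values.count("S"), values.count("I"), values.count("R")


def simulate_sir(g, seeds, beta, gamma, steps, seed=0):
    """Discrete-time SIR on an adjacency dict.

    At each step every infected node infects each susceptible neighbor with
    probability beta, then recovers with probability gamma. Returns the list
    of (S, I, R) counts, starting with the initial state.
    """
    rng = random.Random(seed)
    state = {v: "S" for v in g}
    for v in seeds:
        state[v] = "I"
    history = [counts(state)]
    for _ in range(steps):
        infected = [v for v in sorted(g) if state[v] == "I"]
        newly_infected = set()
        for v in infected:
            for w in sorted(g[v]):
                if state[w] == "S" and rng.random() < beta:
                    newly_infected.add(w)
        for v in infected:
            if rng.random() < gamma:
                state[v] = "R"
        for w in newly_infected:
            state[w] = "I"
        history.append(counts(state))
    return history


# Hypothetical networks: a star (one hub, five leaves) and a five-node path.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

star_history = simulate_sir(star, seeds=[0], beta=1.0, gamma=1.0, steps=2)
path_history = simulate_sir(path, seeds=[0], beta=1.0, gamma=1.0, steps=5)
print(star_history)
print(path_history)
```

With `beta` and `gamma` both set to 1 the process is deterministic, which isolates the effect of structure: an infection seeded at the hub of the star reaches every node in one step, whereas on the path it needs four steps to reach the far end.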

Common pitfalls

The first pitfall is failing to define nodes and edges with explicit criteria before analysis: defining them after data collection invites arbitrary choices. The second is confusing centrality with causal importance: a node with high betweenness can be structurally important without driving the phenomenon. The third is interpreting density out of context: a coauthorship network with 1,000 authors has naturally low density, so comparing it with the density of a 50-collaborator lab network is meaningless. The fourth is failing to address statistical uncertainty: many studies report network metrics without confidence intervals; network bootstrap (Snijders 2018) is an emerging practice. The fifth is confusing the observed network with the underlying network: incomplete sampling, detection bias, and unobserved latent relations can distort structure, so a robustness analysis to missing edges is prudent.
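One simple form of robustness analysis to missing edges is to delete a random fraction of the observed edges many times and check whether a conclusion survives. The sketch below does this for the identity of the highest-degree node. It is an illustrative perturbation check under stated assumptions (edges missing uniformly at random), not the network bootstrap cited above; the function names and the toy network are hypothetical.

```python
import random


def edge_list(g):
    """Sorted list of undirected edges, each stored once as (u, v) with u < v."""
    return sorted({tuple(sorted((u, v))) for u in g for v in g[u]})


def from_edges(nodes, edges):
    """Rebuild an adjacency dict over the same node set from an edge list."""
    g = {v: set() for v in nodes}
    for u, v in edges:
        g[u].add(v)
        g[v].add(u)
    return g


def top_degree_node(g):
    """Highest-degree node; ties go to the first node in sorted order."""
    return max(sorted(g), key=lambda v: len(g[v]))


def top_node_stability(g, drop_fraction, trials, seed=0):
    """Share of trials in which the top-degree node is unchanged after
    deleting drop_fraction of the edges uniformly at random."""
    rng = random.Random(seed)
    edges = edge_list(g)
    keep = len(edges) - round(drop_fraction * len(edges))
    baseline = top_degree_node(g)
    hits = 0
    for _ in range(trials):
        kept = rng.sample(edges, keep)
        if top_degree_node(from_edges(g, kept)) == baseline:
            hits += 1
    return hits / trials


# Hypothetical observed network: one hub tied to six others, plus one side tie.
observed = {
    "hub": {"a", "b", "c", "d", "e", "f"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"hub"},
    "e": {"hub"},
    "f": {"hub"},
}

stability = top_node_stability(observed, drop_fraction=0.3, trials=200)
print(stability)
```

A stability close to 1 suggests the finding does not hinge on a few specific edges; if real missingness is not uniform (for example, ties of peripheral actors are recorded less often), the deletion scheme should be adapted accordingly.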