# Small-World Network

(Redirected from Small world)
Post-publication activity

Curator: Mason A. Porter

A small-world network refers to an ensemble of networks in which the mean geodesic (i.e., shortest-path) distance between nodes increases sufficiently slowly as a function of the number of nodes in the network. The term is often applied to a single network in such a family, and the term "small-world network" is also used frequently to refer specifically to a Watts–Strogatz toy network.

## Definition

A network (or graph) consists of nodes (i.e., vertices) connected by edges. A path (or walk) in a network is a sequence of alternating nodes and edges that starts with a node and ends with a node such that adjacent nodes and edges in the sequence are incident to each other (Bollobás, 2001; Newman, 2018). Nodes or edges can appear multiple times in the same path, and the number of edges in a path is the length of the path. If a graph is connected, then any node can be reached via a finite-length path starting from any other node. The shortest path between a pair of nodes is called a geodesic path and there can be more than one such path.

Between any pair of nodes in an unweighted network, one can calculate the geodesic distance, which is given by the minimum number of edges that must be traversed to travel from the starting node to the destination node. (One can consider directed and weighted generalizations as well.) The number of edges in a path is the length of the path. The distance from me to one of my friends is 1 (as I can reach them via one "hop" along the network), the distance from me to a friend of my friend who is not also my friend is 2, and so on.

A graph's diameter is the maximum of the geodesic distances between node pairs, and the world encapsulated by a graph is "small" if the expected number of hops between two randomly chosen people is small in some sense. In particular, a network is said to be a small-world network (or to satisfy the small-world property) if the mean geodesic distance between pairs of nodes is small relative to the total number of nodes in the network—usually, one wants this length $$\ell$$ to grow no faster than logarithmically as the number of nodes tends to infinity. That is, $$\ell = O(\log N)$$ as $$N \rightarrow \infty$$. The base of the logarithm doesn't matter.

Importantly, one needs an ensemble of graphs to define the small-world property rigorously. Many people who focus on empirical data find this unsatisfactory, as intuitively it can be desirable to consider whether or not an individual network is a "small world", and the definition above does not allow one to do so. (On the bright side, one can still compute all of the usual diagnostics for an empirical network and thereby examine its properties directly. One can also randomize a given network in some way and compare its properties, such as mean geodesic distance, to those in the ensemble that is created by the randomization process.)

## Background

In the late 1920s, Hungarian writer Frigyes Karinthy wrote a short story, called Láncszemek (which translates to Chains or Chain-Links in English), in which two of the characters believed that any two individuals on Earth could be connected to each other through a chain of no more than five acquaintances (Karinthy, 1929). In the 1950s, Ithiel de Sola Pool and Manfred Kochen wrote a manuscript called "Contacts and Influence" (de Sola Pool and Kochen, 1978–1979) that articulated important ideas in social networks and included a discussion of quantifying the distance between people through chains of connections. (This article was circulated widely among academics for many years and finally published as the inaugural article—aside from a short editorial—in the journal Social Networks in the late 1970s.) To test for the existence of short paths of social connections between people, experimental psychologist Stanley Milgram conducted landmark studies in the 1960s on the small-world phenomenon in human social networks (Milgram, 1967; Travers and Milgram, 1969). Milgram sought to quantify the typical distance between actors (i.e., nodes) in a social network and to show that one should expect it to be small. His series of experiments attempted to test the idea that the world had become increasingly interconnected amidst increasing globalization.

In one of his experiments, Milgram sent 96 packages to people living in Omaha, NE, USA who he chose "randomly" from a telephone directory. Each package contained an official-looking booklet that included the crest of Harvard University (where Milgram was on the faculty). Each package also included instructions that recipients should attempt to get this booklet to a specific target individual (a friend of Milgram's who lived in Boston, MA, USA). The only information supplied about Milgram's friend was his name (and thus, indirectly, his gender), his address, and the fact that he was a stockbroker. Each recipient was instructed to send the package to somebody that they knew on a first-name basis who they felt would be socially "closer" to the target individual (e.g., by having a more similar occupation, living closer to the target, etc). Each person who subsequently received one of the packages was then supposed to follow the same instructions to try to get the package to the target.

The target received 18 of the 96 packages. This success rate was higher than expected, and a modern version of Milgram's experiment that used e-mail communication had a significantly smaller success rate (Dodds, 2003). Milgram asked the participants to record in the package each step of the path, and the mean number of hops of completed paths was about 5.9. This led to the popularization of the idea that there are no more than about 6 steps between each pair of people in the world, which is encapsulated by the phrase "6 degrees of separation." Even more fascinating than the small number of hops that Milgram found between people is that the participants in his experiment was the ease in which the social network could be navigated using very little (and mostly local) information.

The actor Kevin Bacon became an unwitting part of this story a couple of decades ago when some college students, scheming to get on Jon Stewart's show on MTV (Barabási, 2003; Durrett, 2007), seemingly decided that "6 degrees of Kevin Bacon" sounded enough like "6 degrees of separation" that it must imply that Kevin Bacon was the "center" of the acting universe. (There are, in fact, numerous notions of centrality in networks.) An actor has a "Bacon number" of 1 if he appeared in a movie with Kevin Bacon, a Bacon number of 2 if he appeared in a movie with someone with a Bacon number of 1 and doesn't himself have a Bacon number of 1, etc.

Mathematicians were already doing something like this many years earlier, as it is quite popular to try to find the shortest path from oneself to Paul Erdős, who is one of the pioneers of graph theory (Hoffman, 1998). One considers a network whose nodes are defined by people and whose unweighted edges indicate that two nodes coauthored at least one paper together. (There are much more sophisticated, and appropriate, ways to study coauthorship networks if one wants to investigate them scientifically (Leydesdorff, 2001; Newman, 2001a; Newman, 2001b; Redner, 2005).) "Erdős numbers" are then defined analogously to Bacon numbers. For example, I know that my Erdős number is at most 4 because I have used MathSciNet) to find a path of length 4 between Paul Erdős and me. One such path is the following: I have coauthored a paper with Shui-Nee Chow (who has an Erdős number of 3), who has coauthored a paper with David Green Jr. (Erdős number 2), who has coauthored a paper with Jiuqiang Liu, who has coauthored a paper with Erdős. The reason I state explicitly that my Erdős number is bounded above by 4 is that the number of 4 reported by MathSciNet uses only publication venues that are indexed by MathSciNet, so there might be a shorter path via a publication that it doesn't include. This, along with Milgram's experiment, lead to the idea of network navigation. In particular, it is one thing to state that geodesic paths are short on average, but it is typically much harder to actually find a short path—and especially difficult to guarantee having found the shortest one—without full knowledge of the network structure.

One can also calculate an Erdős–Bacon number, which is equal to the sum of one's Erdős number and one's Bacon number. Some people, such as Natalie Portman, have low Erdős–Bacon numbers. (As described on the Wikipedia entry for Erdős–Bacon numbers, Portman has a collaboration path that includes Joseph Gillis, whose Erdős number is 1.)

It is also worth noting that there are some issues with traditional Erdős numbers, and some generalizations have been developed (Morrison, 2010).

## Watts–Strogatz networks

The best known family of small-world networks was formulated by Duncan Watts and Steve Strogatz in a seminal 1998 paper (Watts and Strogatz, 1998) that has helped network science become a popular medium of expression for numerous physicists, mathematicians, computer scientists, and many others. There is also a wonderful interactive reimagining of their paper (Victor, 2011). The term "small-world networks" (or the "small-world model") is often used to mean Watts–Strogatz (WS) networks or its variants, though many consider it preferable to define small-worldness in a more general fashion (as in the present article) (Newman, 2000). However, there is disagreement in the literature, as others (including the original authors) prefer to reserve the term to describe networks with both small mean geodesic path lengths and significant local clustering (Watts and Strogatz, 1998). Note additionally that examination of graphs that satisfy the small-world property predates the Watts–Strogatz model. For example, (Bollobás and Chung, 1988) contains a proof that the diameter of a network that consists of an $$N$$-cycle plus a random matching scales logarithmically with $$N$$ with probability $$1$$ as $$N \rightarrow \infty$$. (A closed path, which starts and ends at the same node, is called a cycle.) In the matching, one partitions the set of nodes into $$N/2$$ node pairs if $$N$$ is even or into $$(N-1)/2$$ node pairs and one singleton if $$N$$ is odd; one then adds a new edge between the nodes in each of these pairs. Figure 1: (a) A ring network in which each node is connected to the same number $$l = 3$$ nearest neighbors on each side. (This network resembles a one-dimensional lattice with periodic boundary conditions.) (b) A Watts–Strogatz network is created by removing each edge with uniform, independent probability $$p$$ and rewiring it to yield an edge between a pair of nodes that are chosen uniformly at random. (c) The Newman–Watts variant of a Watts–Strogatz network, in which one adds "shortcut" edges between pairs of nodes in the same way as in a WS network but without removing edges from the underlying lattice. This figure, which appeared in (Newman, 2003), is used with permission from Mark Newman and SIAM. Copyright © 2003 Society for Industrial and Applied Mathematics. Reused with permission. All rights reserved.

To discuss the Watts–Strogatz family of small-world networks, we need a notion of clustering. A global clustering coefficient is (Newman, 2003; Newman, 2018; Barrat and Weigt, 2000)$\begin{equation}\tag{1} C = \frac{3 \,\,\, \times \mbox{ number of triangles}}{\mbox{number of connected triples}}\,, \end{equation}$ Figure 2: Clustering coefficient $$C$$ and mean geodesic distance $$\ell$$ between nodes in the Newman–Watts variant of the Watts–Strogatz small-world model as a function of rewiring probability $$p$$. Observe that there is a regime with high clustering but low mean geodesic distance. (For each value of $$p$$ in this plot, one computes the values of $$C$$ and $$\ell$$ as means over an ensemble of NW networks.) This figure, which appeared in (Newman, 2003), is used with permission from Mark Newman and SIAM. Copyright © 2003 Society for Industrial and Applied Mathematics. Reused with permission. All rights reserved.

where a triangle consists of 3 nodes that are completely connected to each other (i.e., a 3-clique) and a connected triple consists of three nodes $$\{i,j,k\}$$ such that node $$i$$ is connected to node $$j$$ and node $$j$$ is connected to node $$k$$. The factor of 3 arises because each triangle gets counted 3 times in a connected triple. The clustering coefficient $$C$$ indicates how many triples are in fact triangles. A complete graph, in which every pair of nodes is connected by an edge, with $$N \geq 3$$ nodes yields the maximum possible value of $$C = 1$$, as all triples are also triangles. The minimum value of the clustering coefficient is $$C = 0$$. It can also be useful define a local clustering coefficient (Watts and Strogatz, 1998; Newman, 2018), but this won't be necessary for our discussion.

The Watts–Strogatz model of small-world networks demonstrates how to construct a tractable family of toy networks that can simultaneously have significant clustering (i.e., what sociologists might call high transitivity) and small geodesic distances (Watts and Strogatz, 1998; Newman, 2003; Newman, 2018). This model generates a family of unweighted, undirected networks (with no self-edges or multi-edges) that interpolates between two limiting situations. One extreme consists of an unweighted, undirected network with $$N$$ nodes, which you should imagine are arranged in a circle. As depicted in panel (a) of Figure 1, each node is connected to the $$l$$ nearest neighbors on each side. This ring graph, which I will call the "substrate" of the WS model, is a large world, as one can only take slow "local" steps to travel between a pair of nodes. The ring graph has a nonzero clustering coefficient $$C$$ provided $$l \geq 2$$. The other extreme of the WS model is an Erdős–Rényi (ER) random graph, in which each pair of nodes has a uniform and independent probability of being connected to each other. This yields a small world, as one can typically travel between a pair of nodes using a short path. However, the clustering coefficient $$C \rightarrow 0$$ as the number of nodes $$N \rightarrow \infty.$$

The WS model, which is parametrized by $$p \in [0,1]$$, includes a regime of networks that simultaneously exhibit both significant clustering and the small-world property. When $$p = 0$$, we obtain a ring graph in which each node is coupled to its $$c = 2l$$ nearest neighbors; when $$p = 1$$, we obtain an ER random graph. As illustrated in the middle panel of Figure 1, $$p$$ gives the probability of rewiring an edge: one considers each edge in the graph; with independent and uniform probability $$p$$, one removes that edge and replaces it with a "shortcut" edge between two nodes that are chosen uniformly at random from the $$N$$ nodes. (As discussed in (Newman, 2018), this is technically a variant of the original WS model.) In the rewiring, one specifically avoids self-edges and multi-edges, and one preserves the total number of edges. Because the WS model is defined probabilistically for $$p \neq 0$$, studying it entails examining an ensemble of graphs for each $$p$$ (for a fixed value of $$N$$). When one states a property of the WS model for a fixed set of parameter values (and with $$p \neq 0$$), it is to be understood as a mean over an ensemble of graphs (aside from the properties that are fixed a priori for each graph from the definition of the ensemble).

One network property that we need to calculate is the clustering coefficient. For $$p = 0$$, the clustering coefficient is (Newman, 2018) $$\begin{equation} \tag{2} C = \frac{3(c-2)}{4(c-1)}\,, \end{equation}$$ which is independent of the number of nodes (as long as the network is large enough to avoid a finite-size saturation effect) and ranges from $$C = 0$$ for $$c = 2$$ (i.e., nearest-neighbor coupling) to $$C \rightarrow 3/4$$ for $$c \rightarrow \infty$$. (More precisely, the $$C \rightarrow 3/4$$ result holds until finite-size effects come into play because $$c$$ has become too large relative to $$N$$.) For $$c > 2$$, we get a family of graphs that have both small mean geodesic distances $$\ell$$ (i.e., they are small worlds) and significant clustering for a large range of rewiring probabilities $$p$$. Barrat and Weigt (Barrat and Weigt, 2000) showed for the WS model that $$\begin{equation}\tag{3} C \sim \frac{3(c - 2)}{4(c - 1)}(1-p)^3 \quad \text{as} \quad N \rightarrow \infty\,. \end{equation}$$ The value of $$C$$ is large even for large values of $$p$$, and the value of $$\ell$$ is small even for small values of $$p$$.

The WS model's famous feature—namely, its possession of large range of $$p$$ that produces small-world graphs with significant clustering—has inspired numerous subsequent research. For example, in the late 1990s and early 2000s, paper after paper appeared that investigated variants of the Watts–Strogatz model (Newman, 2000; Newman, 2018). In particular, Newman and Watts (Newman and Watts, 1999a) introduced a variant family of WS networks (see the right panel of Figure 1) in which one introduces new "shortcut" edges just as in the WS model, but one no longer removes edges from the ring substrate. That is, for each edge in the ring, there is an independent, uniform probability $$p$$ of adding a shortcut between a pair of nodes that are chosen uniformly at random. A similar model was introduced concurrently by Monasson (Monasson, 1999).

One can see using numerical computations that WS networks have high $$C$$ and low $$\ell$$ for many values of $$p$$, but it is not easy to verify these features with rigorous calculations. In an NW network, the substrate ring remains intact, which simplifies calculations enormously. The downside is that $$C \not\rightarrow 0$$ as $$p \rightarrow 1$$ (for $$c > 2$$), as one no longer obtains an ER random graph in that limit. Hence, the NW variant of the WS model has similar behavior for small and intermediate values of $$p$$ but not for values of $$p$$ that are too close to $$1$$ (see Figure 2). As $$N \rightarrow \infty$$, the clustering coefficient of the NW model is (Newman, 2018; Barrat and Weigt, 2000) $$\begin{equation}\tag{4} C \sim \frac{3(c-2)}{4(c-1) + 8cp + 4cp^2}\,. \end{equation}$$

To examine the small-world property, it is desirable to have an expression for the mean geodesic distance $$\ell$$. Exact expressions have proven elusive, but it is possible to find approximate ones (Newman, 2018; Newman and Watts, 1999b; Newman et al., 2000). For example, in NW networks, one can show using scaling considerations and a mean-field approximation that $$\begin{equation}\tag{5} \ell \approx \frac{N}{c}f(Ncp)\,, \end{equation}$$ where $$\begin{equation}\tag{6} f(x) \approx \frac{2}{\sqrt{x^2 + 4x}}\mathrm{tanh}^{-1}\left(\sqrt{\frac{x}{x+4}}\right) \end{equation}$$ and $$Ncp$$ is the expected number of shortcut edges. Because of the nature of mean-field calculations, equations (5) and (6) work best either when $$Ncp$$ is very large or when it is very small. When $$Ncp \gg 1$$, for example, one finds that $$\begin{equation}\tag{7} \ell \sim \frac{\ln(Ncp)}{c^2p}\,. \end{equation}$$ In other words, as long as the number of shortcuts is much larger than $$1$$, the mean geodesic distance $$\ell$$ increases logarithmically with the number of nodes $$N$$. That is, these networks exhibit the small-world property.

Intuitively, one expects many real-world social networks to exhibit the small-world property (Newman, 2018). (Therefore, if one somehow is able to examine multiple realizations of similar social networks, one expects logarithmic or slower scaling of mean geodesic distance with the number of nodes.) Watts–Strogatz networks and their variants provide tractable models in which the small-world property and significant local clustering occur simultaneously. This is what makes them interesting to study, although neither the degree distribution of WS networks (and similar ensembles) nor their clustering properties resemble those of real social networks. As one can observe in Figure 2, small-world scaling of mean geodesic distance between nodes can occur for the same values of $$p$$ at which local clustering (and hence transitivity) in the network is significant. Sometimes, it really is a small world after all.

## Other examples

As I mentioned above, Ref. (Watts and Strogatz, 1998) has led to a huge amount of subsequent work (Newman, 2000; Strogatz, 2001; Newman, 2018; Newman, 2003), and the notion of small-world networks has been extremely influential in both theory and applications.

Some classes of networks can yield especially small worlds. For example, consider the construction of an unweighted, undirected, random network with a specified degree distribution (Aiello et al., 2001; Durrett, 2007) $$\begin{equation} \tag{8} p(k) = b k^{-\lambda}\,, \quad k = m\,, m+1 \,, \ldots \,, k_{\mathrm{max}}\,, \end{equation}$$ where $$b \approx (\lambda - 1)m^{\lambda - 1}$$ is a normalization factor and $$m$$ and $$k_{\mathrm{max}} = mN^{1/(\lambda - 1)}$$ provide lower and upper cutoffs for what is otherwise a power-law form. (The degree of a node is the number of edges connected to it, and the degree distribution of a network can either be determined based on a set of given node degrees or be given directly by some probability distribution from which node degrees are drawn.) If $$\lambda \in (2,3)$$, then the graph's mean geodesic distance $$\ell \sim a_0\ln \ln N$$ for some constant $$a_0$$ (Cohen and Havlin, 2003; Dorogovtsev et al., 2002; Chung and Lu, 2002; Durrett, 2007).

It is important to remark, especially in the context of the previous example, that the degree distribution does not tell us much by itself about geodesic distances in a network. Crucially, one must consider how nodes are connected. For example, Marek Biskup (Biskup, 2011) considered graph diameters by adding shortcut edges to the $$d$$-dimensional hypercube lattice $$\mathbb{Z}^d$$. Additionally, the idea of the "smallest" small-world network was explored in (Nishikawa et al., 2002), and the importance of the smallness of a small-world network for dynamical systems on networks was discussed in (Melnik et al., 2011).

The most amazing aspect of Milgram's experiments was not that the world is small but that people seemed to be very good at navigating a small world with almost entirely local information. (See my comments above concerning my attempt to find a short path of paper coauthorships between Paul Erdős and me.) To investigate network navigation, Jon Kleinberg has studied message passing on networks (Kleinberg, 2000; Easley and Kleinberg, 2010; Newman, 2018). Kleinberg introduced a model whose substrate is a ring of nodes with nearest-neighbor coupling. He incorporated geographic effects by assuming that the nodes around the ring are aware of how close they are to each other. Shortcuts are still placed between pairs of nodes at random, but now one chooses the particular pair of nodes so that the probability of a particular shortcut covering a distance $$r$$ (i.e., the two nodes in question are $$r$$ hops apart around the ring) is $$p_r = Kr^{-\alpha}$$, where $$\alpha \geq 0$$ is a bias parameter and the normalizing constant $$K$$ ensures that one has a well-defined probability. When placing a shortcut, one first chooses a distance $$r$$ from the probability distribution, and one then places a shortcut between a pair of nodes (chosen uniformly at random) that are exactly $$r$$ hops apart on the ring. The limiting case $$\alpha = 0$$ yields the NW model. When $$\alpha > 0$$, connections between nearby nodes are chosen preferentially, and the effect becomes progressively stronger as $$\alpha$$ becomes larger.

There is, of course, much more. For example, one interesting question to ask is which networks are easy to navigate using only local information, and one can also ask how small-world scaling might arise in the first place (Clauset and Moore, 2003, Mogren et al., 2009).