Scale-free networks

Post-publication activity

A network that has a power-law degree distribution, regardless of any other structure, is called a scale-free network.

Power-Law degree distribution

The degree of a node is the number of links adjacent to it. If we call the degree of a node \(k\ ,\) a scale-free network is defined by a power-law degree distribution, which can be expressed mathematically as \( P(k)\sim k^{-\gamma} \) From the form of the distribution it is clear that when:

\(\gamma<2\) the average degree diverges.
\(\gamma<3\) the standard deviation of the degree diverges.

It has been found that most scale-free networks have exponents between 2 and 3. Thus, they lack a characteristic degree or scale, and therefore their name. High degree nodes are called hubs.

Scale-free network models

There are several models that are able to create a scale-free network. But most of them introduce in one way or another two main ingredients, growth and preferential attachment. By growth we mean that the number of nodes in the network increases in time. Preferential attachment refers to the fact that new nodes tend to connect to nodes with large degree. One can naively argue that for large networks this is nonsense because it requires a global knowledge of the network, i.e., knowing which are the high degree nodes, but this is not the case. There are several local mechanisms that introduce preferential attachment (see below).

In mathematical terms, preferential attachment means that the probability that a node with degree \(k_i\) acquires a link goes as \(P(k_i)=\frac{k_i}{\sum_{i}k_i}\)

The Barabasi-Albert model

The Barabasi-Albert model (a.k.a. BA model) introduced in 1998 explains the power-law degree distribution of networks by considering two main ingredients: growth and preferential attachment (Barabasi and Albert 1999). The algorithm used in the BA model goes as follows.

Growth: Starting with a small number (\(m_0\)) of connected nodes, at every time step, we add a new node with \(m(<m_0)\) edges that link the new node to \(m\) different nodes already present in the network.
Preferential attachment: When choosing the nodes to which the new node connects, we assume that the probability \(P\) that a new node will be connected to node \(i\) depends on the degree \(k_i\) of node \(i\ ,\) such that

\[P\sim \frac{k_i}{\sum_{i}k_{i}}\]

Numerical simulations and analytic results indicate that this network evolves into a scale-invariant state with the probability that a node has \(k\) edges following a power law with an exponent \(\gamma=3\ .\) The scaling exponent is independent of \(m\ ,\) the only parameter in the model.

Analytical solution for the BA model

This model can be solved analytically by setting up a differential equation in which the rate at which a node acquires links is equal to the number of links added times the probability of acquiring a link \[ \frac{dk_i}{dt}= m\frac{k_i}{\sum_{j}k_j} \]

This equation can be simplified by realizing that at each time step \(m\) links are added, thus \[\sum_j k_j = 2mt\] and \[ \frac{dk_i}{dt}=\frac{k_i}{2t} \] \[ \ln{k_i}=\frac{1}{2}\ln{t} + C \]

where \(C\) is an integration constant that can be determined by using the fact that the \(i^{th}\) node arrived to the network at time \(t^i\) having degree \(m\ .\) Thus \[k_i = m(\frac{t}{t_i})^{1/2}\ .\] The degree distribution can be calculated by finding the probability that a node has a degree smaller than \(k\) \[ P(k_i(t) > k) = P(t_i < \frac{mt}{k^2}) = 1 - P(t_i > \frac{mt}{k^2}) \] Without loss of generality we can assume that nodes are added at a constant rate thus \[ P(t_i) =\frac{1}{m_0 + t} \] where \(m_0\) is the total degree of the nodes that got the network started. Using this distribution \[ P(k_{i} - k) = 1 -\frac{mt}{k^2}\frac{1}{m_0 + t} \] Finally we get the degree distribution by differentiating and conclude that \[ \frac{d}{dk}P(k_i < k) = P(ki = k) = \frac{2m^2 t}{k^3}\frac{1}{m_0 + t} \]

This mechanism was first introduced by Yule in the early 20th century to explain the distribution of different Taxa and was later generalized by Price in the 70s and coined as cumulative advantage. The example shown here is not the most general version of the Price model that can be found in the original paper as well as in Newman (2005). In any case, the lesson that should be learned from this is that whenever we found a system in which the probability of increasing is proportional to the actual value, we should expect its distribution to follow a power-law.

Local alternatives for preferential attachment

Following links

We could imagine that when nodes join the network they follow a link. For example, you move to a new town and an old friend tells you that you should visit a friend of him. If we consider that link to be a randomly chosen one, the probability \(\pi\) that you were referred to a person with degree \(k\) is \[\pi(k)= k P(k)\] which is the precise definition of preferential attachment. This is because \(k\) links end up in an agent of degree \(k\ .\)

Duplication and divergence model

In a biological context the "duplication and divergence" model has been proposed as an explanation for the scale-free nature of protein-protein interaction networks. Let us consider a protein that interacts with \(k_p\) other proteins. Eventually some of the genes that encode a protein get duplicated and now 2 copies become available. This redundancy allows one copy of the gene to mutate without changing the fitness of the organism. This process ends up with different proteins which are likely to share some interactions. If these duplication and divergence process occurs at random, proteins that have a high degree are more likely to have one of its neighbors change, and again, the probability of winning a neighbor through this process is proportional to the actual number of neighbors. This is another example of linear preferential attachment.

Limited information

A different scenario that one can imagine is the one in which a node incorporates to the network with a limited or local information about it. If linear preferential attachment is used as the rule to create links in the local information context, a power-law degree distribution is also recovered, regardless of having limited information.

Properties of scale-free networks

Scale-free networks have qualitatively different properties from strictly random, Erdos and Renyi, networks. These are:

Scale-free networks are more robust against failure. By this we mean that the network is more likely to stay connected than a random network after the removal of randomly chosen nodes.
Scale-free networks are more vulnerable against non-random attacks. This means that the network quickly disintegrates when nodes are removed according to their degree.
Scale-free networks have short average path lengths. In fact the average path length goes as \(L\sim \log{N}/\log{\log{k}} \)