Magnetism: mathematical aspects

From Scholarpedia

This article has not been published yet; It may be unfinished, contain inaccuracies or unapproved changes.

Author: Prof. Daniel C. Mattis, Dept. of Physics, University of Utah, Salt Lake City, UT 84112-0830

Already as an infant, Albert Einstein (Nobel prize winner, 1921) wondered about the physics of magnetic “action at a distance”. His was not the only brilliant mind entranced by magnetic phenomena. Among other Nobelists who have made incidental or even major contributions to our understanding of this field we note (along with the year of their prize) the names of H. Lorentz and P. Zeeman (1902), P. Curie (1903), W. Heisenberg (1932), W. Pauli (1945), F. Bloch (1952), C.N. Yang (1957), H. Bethe (1967), L. Néel (1970) and P.W. Anderson (1977). The list grows longer if we include the ancillary topics of magnetic resonance, Hall effect, and superconductivity or developments in unrelated fields, such as the concept of the Goldstone mode in high-energy physics inspired by the spin waves of the theory of magnetism. It is not a coïncidence that in "statistical mechanics", which comprises the study of physics at finite temperature, contemporary concerns such as phase transitions and their critical exponents evolved out of the corresponding microscopic properties of magnetic materials at the Curie point. In short, much of the mathematics that was originally developed to unveil the sources of magnetism found subsequent applications in other branches of theoretical physics and – in the case of the Ising model – far afield in cryptology, epidemiology, economics, political science and even sociology!

In this article we show how theories of magnetism are classified according to their internal and external symmetries, spatial dimensionality and various other physical properties. But from the outset, it is important to clarify to the reader that we are only seeking to understand the material origins of magnetic phenomena, insofar as they might originate in the many-body comportment of electrons in atoms, molecules and solids. This is quite distinct from studies of the resulting electromagnetic field, whether this last is treated in the classical version due to Maxwell, dating back to the mid-19th century, or in the quantized (QED) version developed in the middle of the 20th, the more so when we turn to antiferromagnets in which the concomitant magnetic fields cancel already at a microscopic level.

What we present below is a small survey of a few idealized models of magnetism culled from an incredibly large array; we discuss the motivation behind them together with the interesting mathematics that arises in the course of solving these many-body problems.

Effects of spatial dimensionality and of various symmetries

Setting aside Dirac's hypothetical magnetic monopole and the fragile current loop of Ampère, the leading physical source (and the unit) of magnetism has to be the permanent magnetic dipole, such as the elementary "Bohr magneton" carried by each and every electron. It is sometimes useful to idealize magnetic materials as periodic arrays of unit cells each of which contains one or more atoms or molecules sporting one or more Bohr magnetons.

As an example of this periodicity, imagine a family of "hypercubic solids" in arbitrary dimensions, ranging from d = 0 (just 1 cell), to N cells in d \ge 1, and up to unphysically high dimensions d \gg 1. Each cell has 2d nearest neighbors and N is large. The lattices are called the linear chain (lc) in 1D, the square (or simple quadratic, sq) in 2D, the simple cubic (sc) in 3D, etc. In the lc, identical cells lie along a given axis at points R_n = an with n = 1, 2, \ldots, N their label and a the lattice parameter. Thus d=1 applies to a hypothetical, ideal polymer of great length. Cells of the sq lattice are located at R_{n,m} = a(n,m) and those of the sc lattice are at R_{n,m,l} = a(n,m,l), etc.

It is found that the leading term of various correlation functions of particles confined to such lattices, when expanded in powers of 1/d, yields formulas identical to those obtained in the self-consistent "mean-field", aka "molecular field", approximation. "Critical properties", that is, the singular contributions to thermodynamic functions (specific heat, susceptibility, etc.) near a phase transition, can sometimes be extrapolated to 3D from d = 4 using the renormalization group (RG) rigged for d = 4 - \epsilon, upon expanding in powers of \epsilon. This has justified the study of leading large d approximations, even though physical limitations restrict magnetic systems as well as other materials to d \le 3.

To further define a physical model it is necessary to specify the contents of each cell and the interactions among neighboring cells. For many models it is possible to solve the multi-cell model in d dimensions (or at the very least, to analyze its thermodynamical properties) by a "transfer matrix approach" on a d-1 dimensional lattice. More precisely, the partition function in d dimensions is related to the largest eigenvalue of a transfer operator on a lattice in d-1 dimensions. The eigenvalue problem that needs to be solved is usually trivial in d=1 but is, with some exceptions, too cumbersome to carry out for d > 2.

On the other hand and at the opposite extreme, for d \ge 4 or 5 or even greater (d \gg 1), the statistical mechanics and phase diagram of many models of magnetism become easily solvable and predictable in leading 1/d approximation, just as is the case in quantum field theories.

Clearly, whether d is large or small is important. Just as single strand polymers differ from 3D solids even when the constituent atoms are identical, the properties of magnetic polymers differ from those of magnetic solids. Once d is specified, the dynamics of the individual magnetic moments (point-group symmetry) together with the symmetries of their interactions (bonds) are what determine the collective properties in the ground state and at finite T. These symmetries and dynamics fall into various classes or categories.

Typically, once the class or category of the model is given, it is in the three-dimensional world d = 3 in which we live that it is most difficult to find precise or merely reliable mathematical solutions. That is what keeps theoretical physicists in business. As an example let us next examine arguably the simplest model of anisotropic nearest-neighbor interactions among quantized spins on lattices in various dimensions d, the one named after E. Ising who first studied it in the 1920's.

The one-dimensional Ising model vs. other possibilities

Ising’s model was based on the “old” quantum theory in which spins, affixed to points on a space lattice, can only point “up” or “down.” This discrete algebra is denoted Z(2). In d=1 the Hamiltonian is H = -J \sum_n S_n S_{n+1} (where each S_n = \pm 1), so that J > 0 favors parallel neighbors.

The Ising model is better described in terms of Pauli matrices \vec S = \frac{\hbar}{2} (\sigma_x, \sigma_y, \sigma_z) with J absorbing the factor \left ( \frac{\hbar}{2} \right )^2 as follows: each spin at n is assumed to interact with nearest neighbors at n \pm 1 via a highly anisotropic 3 \times 3 “exchange” matrix characterizing the nearest-neighbor bond, which then assumes the appearance: -(\sigma_{x,n}, \sigma_{y,n}, \sigma_{z,n}) \cdot \begin{pmatrix}  0 & 0 & 0 \\  0 & 0 & 0 \\  0 & 0 & J  \end{pmatrix} \cdot \begin{pmatrix}  \sigma_{x,n+1} \\  \sigma_{y,n+1} \\ \sigma_{z,n+1}  \end{pmatrix}. The total interaction with an external or self-consistently generated magnetic field is given by -B \sum_n \sigma_{z,n} if it is “longitudinal” or -B \sum_n \sigma_{x,n} if “transverse.” Likewise, -(\sigma_{x,n}, \sigma_{y,n}, \sigma_{z,n}) \cdot \begin{pmatrix}  J & 0 & 0 \\  0 & J & 0 \\  0 & 0 & J  \end{pmatrix} \cdot \begin{pmatrix}  \sigma_{x,n+1} \\  \sigma_{y,n+1} \\ \sigma_{z,n+1}  \end{pmatrix} is the Hamiltonian of an individual nearest-neighbor bond in Heisenberg’s model, with -B \sum_n \sigma_{z,n} the interaction with an external field of any orientation (here, by symmetry, there can be no distinction between longitudinal and transverse).

Finally, the interaction -(\sigma_{x,n}, \sigma_{y,n}, \sigma_{z,n}) \cdot \begin{pmatrix}  J & 0 & 0 \\  0 & J & 0 \\  0 & 0 & 0  \end{pmatrix} \cdot \begin{pmatrix}  \sigma_{x,n+1} \\  \sigma_{y,n+1} \\ \sigma_{z,n+1}  \end{pmatrix} describes the “X-Y” model, in which we again need to distinguish “in-plane” from “out-of-plane” external fields. These three models are discussed separately below.

The most general bilinear Hamiltonian for a bond connecting two sites is given by a Hermitean 3 \times 3 matrix J_{\beta}^{\alpha}. This generalization has 9 independent parameters. It ranges from the Ising version, its most anisotropic limit, all the way to the Heisenberg model, J_{\beta}^{\alpha} = J \delta_{\beta}^{\alpha}, which represents the totally isotropic limit, given that it is explicitly invariant under arbitrary spatial rotations. (Here \delta_{\beta}^{\alpha} = 1 if \alpha = \beta and 0 otherwise is Kronecker’s delta.) Additionally, quartic forms for two-site interactions, and interactions that involve three sites, have sometimes been considered in the literature in connection with various physical applications, but they are not discussed further here.

Ground states, elementary excitations and Tc

The spin 1/2 Ising lc of N sites connected by N-1 (ferromagnetic) bonds J > 0 was described above. The ground state (lowest energy) solution of the Hamilton or Schrödinger equation (H \Psi = E \Psi) is E_0 = -J(N-1) for either of two ground state configurations: all spins “up” (+1) or all “down” (–1). Either configuration manifests perfect long-range order (LRO).

The next higher energy levels are associated with a single “domain wall” defined as follows: the first q spins are all parallel, say “up,” followed by spins numbered q+1, \ldots, N, that are all “down.” Because the q^{th} bond is promoted from energy -J to energy +J, the energetic cost of this defect is +2J. There are N-1 possible values of q, that is, N-1 distinct positions on which to place the break. A second break can occur at q + r (r \neq 0) where -q < r < N-q; a third one at some third distinct place, etc. Because a bond can only be broken once, a sort of “exclusion principle” prevents any two breaks from sharing the same bond. Thus it appears that the domain walls are fermions, unusual only in the sense that each such fermion adds a constant amount 2J to the total energy, regardless of how many others are present. The corresponding Boltzmann factor of each is e^{- \frac{2J}{k_B T}}. It follows that the entropy \mathcal{S} (\mathcal{S} = Boltzmann’s constant times the natural logarithm of the number of allowed configurations) is the sum of k_B \log 2 (recall: initially there were two configurations) and of (N-1)k_B \log (1+e^{-2J/k_B T}). The free energy F = E_0 - T\mathcal{S} becomes:

F = E_0 - (N-1)k_B T \log (1+e^{-2J/k_B T}) - k_B T \log 2 (1)

The last term can be neglected in the large N limit. From this one can infer that at temperature T the thermodynamic average of the number \mathcal{N} of domain walls is,

\mathcal{N}=(N-1) \times \frac{1}{e^{2J/k_B T}+1} (2)

in which the second factor is the “Fermi-Dirac distribution function” (being the average number of fermions that live on each of the N-1 bonds in thermal equilibrium at temperature T). Both F and \mathcal{N} are analytic in T except in the limit T \to 0, signaling there is an order-disorder thermodynamic phase transition at T=0. The model can be “solved” more formally and the same results obtained more directly following a nonlinear transformation to bond variables. Let s_1 \equiv S_1 (= \pm 1), s_2 = S_1 S_2, \ldots, s_n = S_{n-1} S_n, \ldots, each s_n = \pm 1 independent of the others. Then H = -J \sum_{n=2}^N s_n. This is just a Hamiltonian of N-1 noninteracting pseudo-spins in a pseudo external magnetic field J. Because s_1 has two values but does not explicitly enter H, each configuration \{ s_2, s_3, \ldots, s_N \} is two-fold degenerate. This accounts for k_B T \log 2 in (1).
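The domain-wall counting above can be checked by brute force on a short open chain. The following is a minimal sketch, not part of the original derivation: the function names and the choice of units (k_B = 1, so beta = 1/T) are assumptions made for illustration. It compares a direct sum over all 2^N spin configurations with the closed form 2 e^{\beta J (N-1)} (1+e^{-2 \beta J})^{N-1}, and the thermal average number of broken bonds with (N-1)/(e^{2 \beta J}+1).

```python
import itertools
import math

def brute_force(N, J, beta):
    """Sum over all 2^N configurations of an open Ising chain.

    Returns (Z, mean number of domain walls)."""
    Z = 0.0
    walls_weighted = 0.0
    for spins in itertools.product((1, -1), repeat=N):
        energy = -J * sum(spins[n] * spins[n + 1] for n in range(N - 1))
        walls = sum(1 for n in range(N - 1) if spins[n] != spins[n + 1])
        weight = math.exp(-beta * energy)
        Z += weight
        walls_weighted += walls * weight
    return Z, walls_weighted / Z

def domain_wall_Z(N, J, beta):
    """Two ground states; each of the N-1 bonds is independently intact or broken."""
    return 2.0 * math.exp(beta * J * (N - 1)) * (1.0 + math.exp(-2.0 * beta * J)) ** (N - 1)

def domain_wall_mean(N, J, beta):
    """Fermi-Dirac occupation per bond, times the number of bonds."""
    return (N - 1) / (math.exp(2.0 * beta * J) + 1.0)

if __name__ == "__main__":
    N, J, beta = 8, 1.0, 0.7
    Z, walls = brute_force(N, J, beta)
    print(Z, domain_wall_Z(N, J, beta))
    print(walls, domain_wall_mean(N, J, beta))
```

The two printed pairs should agree to rounding error, because in the bond variables the N-1 bonds are statistically independent.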

In d \geq 2, Ising’s model exhibits a genuine phase transition; the second derivative of F is singular at a finite temperature T identified as the “Curie point” T_c, a quantity proportional to J that is also a function of d. Table 1 lists T_c to three decimal places in terms of z, the number of nearest-neighbor sites, on hypercubic lattices.

Table 1: Critical temperature of Ising ferromagnet on hypercubic lattices
Lattice type | Coordination number z | k_B T_c / zJ
lc (d=1) | 2 | 0
sq (d=2) | 4 | 0.567
sc (d=3) | 6 | 0.752
hypercubic (d ≥ 4) | 2d | 1 - 0.596/d

More on Tc

The critical temperature of zero in d=1 shown in Table 1 was obtained trivially. The nonvanishing values of T_c on any of the three standard lattices in d=2 (the sq, honeycomb and triangular) can also be found exactly using the duality relations of Kramers and Wannier. Duality is what relates K = J/k_B T on a spin lattice to K^{*} = J/k_B T^{*} on a dual lattice that is constructed on the bonds of said lattice. The sq lattice is self-dual with coördination number z=4, and the triangular and honeycomb lattices are duals of each other with z=6 and 3 respectively. The duality relations \tanh K^{*} = e^{-2K} and \tanh K = e^{-2K^{*}} were originally derived by comparing high temperature series expansions of the partition function with low T expansions, without needing to actually evaluate either sum. Armed with just this sort of information, Onsager showed that T_c(z) is given by \sinh 2K_c = \tan \frac{\pi}{z} for Ising ferromagnets on any one of the 3 principal lattices in 2D.

There is no such formula for the values of T_c listed in Table 1 for the Ising ferromagnet in d=3 and d \geq 4 dimensions, so there the values of T_c have to be obtained numerically.
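For the two-dimensional lattices the relations quoted above are easy to evaluate. The sketch below (function names are illustrative assumptions, and k_B = 1) solves \sinh 2K_c = \tan(\pi/z) for z = 3, 4, 6, converts the result to the ratio k_B T_c / zJ used in Table 1, and checks that the duality map \tanh K^{*} = e^{-2K} leaves the sq critical point fixed while exchanging the triangular and honeycomb ones.

```python
import math

def critical_coupling(z):
    """K_c = J / (k_B T_c) from sinh(2 K_c) = tan(pi / z)."""
    return 0.5 * math.asinh(math.tan(math.pi / z))

def dual_coupling(K):
    """K* defined by tanh(K*) = exp(-2 K)."""
    return math.atanh(math.exp(-2.0 * K))

def reduced_tc(z):
    """k_B T_c / (z J), the quantity tabulated in Table 1."""
    return 1.0 / (z * critical_coupling(z))

if __name__ == "__main__":
    for name, z in (("honeycomb", 3), ("sq", 4), ("triangular", 6)):
        print(f"{name:10s} z={z}  K_c={critical_coupling(z):.4f}  kTc/zJ={reduced_tc(z):.3f}")
    # Self-duality of the sq lattice, and triangular <-> honeycomb duality.
    print(dual_coupling(critical_coupling(4)) - critical_coupling(4))
    print(dual_coupling(critical_coupling(6)) - critical_coupling(3))
```

Both printed differences should vanish to rounding error.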

Symmetry breaking

The exact eigenvalue of the transfer matrix of the Ising model on a sq lattice, hence the exact evaluation of the partition function and of the free energy in this model, including the particulars of the singular phase transition (both specific heat and magnetic susceptibility diverge at T_c), were first obtained by L. Onsager in the early 1940s by the use of spinor algebra. A subsequent version based on a more familiar fermion field theory was constructed by T.D. Schultz, D.C. Mattis and E.H. Lieb and will be sketched below. But it seems that some properties of these exact solutions generalize to all model ferromagnets in arbitrary dimensions, viz.:

Above T_c in zero external magnetic field, the “up-down” symmetry is maintained perfectly. When cooling below T_c this symmetry is spontaneously broken by the onset of LRO. It is noteworthy that the application of an homogeneous external magnetic field (B), which also breaks the symmetry of the ferromagnet, also creates LRO in any dimension d. It follows that a finite real external magnetic field pushes the critical temperature all the way up to T_c \to \infty.

Stated otherwise: regardless of the ground state and of the nature of the low-temperature phase of a magnetic substance (and regardless of spatial dimension d = 1, 2, 3, \ldots), in the presence of a finite, real, external field B \ne 0 there will be created LRO at all finite T \geq 0. The only interesting question is, how does this magnetic order behave as a function of \left\vert B \right\vert in the limit \left\vert B \right\vert \to 0? If it remains finite in that limit, we have spontaneous ferromagnetism; if in that limit it vanishes, the information to be sought is in the magnetic susceptibility, a quantity related to the short-range order.

Transfer matrix in d = 1 Ising model

In Gibbsian statistical mechanics, the partition function Z is related to the free energy F by Z = e^{-\beta F} = Tr \{ e^{-\beta H} \}, hence knowledge of the one yields the other. The inverse temperature is \beta = 1/k_B T. The trace (abbreviated: Tr) is defined as the sum over all diagonal elements of the argument, treated as a matrix. Because the number of such terms is exponential in N, this sum cannot be performed efficiently in the limit N \to \infty, especially near T_c. (Otherwise one could obtain, for example, the energy as a function of temperature \langle E \rangle = {\partial (\beta F) \over \partial \beta} and other thermodynamic quantities, numerically.) The following calculation, carried out explicitly for the Ising lc, shows how to get around this difficulty in 1D.

In the 1D model with B=0, the quantity inside the Tr \{ \} operation can be written as e^{-\beta H} = e^{\beta J S_1 S_2} e^{\beta J S_2 S_3} \ldots e^{\beta J S_n S_{n+1}} \ldots, upon ordering interactions consecutively. We note that each factor has exactly the same form, V = \begin{pmatrix} e^{\beta J} & e^{-\beta J} \\ e^{-\beta J} & e^{\beta J}  \end{pmatrix} = e^{\beta J} \mathbf{1} + e^{-\beta J} \sigma_x. (Here \mathbf{1} is the unit 2 \times 2 matrix.) It follows that,

Tr \{ e^{-\beta H} \} = Tr \{ V \cdot V \cdot \ldots V \} = Tr \{ V^N \} = \lambda_1^N + \lambda_2^N (3)

where a dot "\cdot" indicates ordinary matrix multiplication. The \lambda's are the two eigenvalues of the 2 \times 2 “transfer matrix” V, viz., \lambda_1 = 2 \cosh \beta J and \lambda_2 = 2 \sinh \beta J. Then, Z = \lambda_1^N + \lambda_2^N = \lambda_1^N \left( 1 + \left( \frac{\lambda_2}{\lambda_1} \right)^N \right). Thus only the larger eigenvalue survives, given that the contribution of the smaller one to the partition function (and to F) is exponentially smaller when we proceed to the thermodynamic limit N \to \infty.
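A short numerical check makes the role of the two eigenvalues concrete. The sketch below is illustrative only (the function names and the values of N, beta and J are assumptions): it builds the 2 \times 2 matrix V with NumPy, compares \lambda_1^N + \lambda_2^N with a direct sum over all configurations of a closed chain of N spins, which is the geometry that the trace of V^N corresponds to, and shows that the smaller eigenvalue becomes negligible as N grows.

```python
import itertools
import numpy as np

def transfer_matrix(beta, J):
    """The 2x2 matrix V with elements exp(beta * J * S * S')."""
    a, b = np.exp(beta * J), np.exp(-beta * J)
    return np.array([[a, b], [b, a]])

def Z_transfer(N, beta, J):
    """Z = Tr V^N = lambda_1^N + lambda_2^N."""
    eigenvalues = np.linalg.eigvalsh(transfer_matrix(beta, J))
    return float(np.sum(eigenvalues ** N))

def Z_brute(N, beta, J):
    """Direct sum over the 2^N configurations of a closed chain of N spins."""
    total = 0.0
    for s in itertools.product((1, -1), repeat=N):
        energy = -J * sum(s[n] * s[(n + 1) % N] for n in range(N))
        total += np.exp(-beta * energy)
    return float(total)

if __name__ == "__main__":
    beta, J = 0.5, 1.0
    print(Z_transfer(6, beta, J), Z_brute(6, beta, J))
    lam2, lam1 = np.linalg.eigvalsh(transfer_matrix(beta, J))  # ascending order
    for N in (10, 100, 1000):
        print(N, (lam2 / lam1) ** N)  # relative weight of the smaller eigenvalue
```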

The largest eigenvalue also acquires a special significance if we identify the spinor (p_j, 1-p_j) as the probability of the j^{th} spin being “up” (p_j) and “down” (1-p_j). With p_j in the interval [0,1], each of these probabilities is non-negative and they add up to 1. The corresponding probability of the (j+n)^{th} spin being “up” or “down” is then (p_j, 1-p_j) \cdot V^n \propto (p_{j+n}, 1-p_{j+n}). We present this as a proportionality and not as an equation because the probabilities at j+n also need to be normalized. (In probability theory, by normalization is meant that each entry is non-negative and the sum of all entries is unity.) Therefore the correct equation is:

(p_j, 1-p_j) \cdot V^n = z(n) (p_{j+n}, 1-p_{j+n}) (4)

with z(n) to be determined. In the present example it is found that the right-hand vector tends to (½, ½) asymptotically at large n, regardless of the initial p_j. If the initial state belonged to the largest eigenvalue of V, which is \lambda_1 = 2 \cosh \beta J, then p_j = ½ and the preceding result holds for all n and not just asymptotically. It follows that the function z(n) is z^n with z = \lambda_1, and that Z = z^N. The other eigenvalue of V, 2 \sinh \beta J, belongs to a spinor (½, –½) that cannot be interpreted in terms of probabilities.
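The convergence to (½, ½) can be watched directly. In the sketch below (an illustration under assumed parameter values; the function name is hypothetical) a row vector (p, 1-p) is multiplied by the n^{th} power of V and then normalized; the normalization constant plays the role of z(n).

```python
import numpy as np

def propagate(p, n, beta, J):
    """Apply the n-th power of V to (p, 1-p); return (normalized vector, z(n))."""
    a, b = np.exp(beta * J), np.exp(-beta * J)
    V = np.array([[a, b], [b, a]])
    row = np.array([p, 1.0 - p]) @ np.linalg.matrix_power(V, n)
    z_n = row.sum()
    return row / z_n, z_n

if __name__ == "__main__":
    beta, J = 0.5, 1.0
    for n in (1, 5, 20, 200):
        probs, z_n = propagate(0.9, n, beta, J)
        print(n, probs)
    # Starting from the eigenvector (1/2, 1/2), z(n) equals (2 cosh(beta J))^n for every n.
    _, z7 = propagate(0.5, 7, beta, J)
    print(z7, (2.0 * np.cosh(beta * J)) ** 7)
```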

Thus, the evaluation of Z can be reduced to an ordinary eigenvalue problem subject to the following famous, if obvious, theorem:

Frobenius’ Theorem: “The largest eigenvalue of a matrix of arbitrary dimension, all elements of which are positive, belongs to an eigenvector that has only non-negative elements.”

For want of a better name we shall call this the “largest eigenvector”. Because all other eigenvectors must have one or more changes of sign in order to be orthogonal to the “largest eigenvector,” it is the only eigenvector that can be normalized according to the following rule: each of its entries must lie in the interval [0,1] and the sum of all its entries must equal 1. Thus normalized, the “largest eigenvector” becomes the “reduced density matrix” and its entries are probabilities. Only this “largest eigenvector” is ever needed in the calculation of Z and of free energy F, whereas all eigenvectors are required in the evaluation of any nontrivial correlation function.

Application to Ising model on sq lattice

In 2D statistical mechanics, the matrix “transferring” the n^{th} column of spins to n+1 is a sort of “quantum” lc. The rows are labeled m = 1, 2, \dots, M. All bonds -J' \sum_{m=1}^M S_{n,m} S_{n,m+1} that connect spins on rows m and m+1 within a single vertical n^{th} column must be included. Combining these with horizontal transfers of the n^{th} column into the (n+1)^{st} we obtain a complete transfer operator V_n = \prod_m e^{\beta J' \sigma_{z,(n,m)} \sigma_{z,(n,m+1)}} (e^{\beta J} \mathbf{1}_{(n,m)} + e^{-\beta J} \sigma_{x,(n,m)}). The horizontal contributions are given by the V defined just above Eq. (3). Because all references in this operator are to the n^{th} column we can omit the column index n for the sake of notational simplicity.

The second factor in the V shown above is exponentiated as follows: (e^{\beta J} \mathbf{1}_{m} + e^{-\beta J} \sigma_{x,m}) \equiv \sqrt{2 \sinh 2 \beta J} \, e^{K^* \sigma_{x,m}} after defining K = \beta J, using a trivial Pauli operator identity, and defining \tanh K^* = e^{-2K} as before. After similarly defining K' = \beta J' for vertical bonds, we obtain the full 2D transfer matrix (all m) in the form:

W = C^M e^{K^* \sum_m \sigma_{x,m}} e^{K' \sum_m \sigma_{z,m} \sigma_{z,m+1}} where C = \sqrt{2 \sinh 2K} (5A)

It “transfers” all the spins on the n^{th} column to n + 1. The exponent is congruent to a d=1 Ising lc with nearest-neighbor bonds K' in a transverse magnetic field K^*, an exactly solvable model. We therefore seek to solve the eigenvalue problem: W \Psi = z \Psi for the largest possible value of z. This can be done following a sequence of simplifying transformations. The first of these is a global rotation about the y-axis by 90º, i.e., \sigma_{x,m} \Rightarrow \sigma_{z,m} and \sigma_{z,m} \Rightarrow - \sigma_{x,m} . After this (5A) becomes:

W = C^M e^{K^* \sum_m \sigma_{z,m}} e^{K' \sum_m \sigma_{x,m} \sigma_{x,m+1}} (5B)
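The operator W of Eq. (5A) can be tested on a very small lattice. The sketch below is an illustration only: the function names, the 3 \times 3 lattice, periodic boundary conditions in both directions, and the values of K and K' are all assumptions, and K > 0 is required so that K^* is real. It first verifies the single-row identity e^{K} \mathbf{1} + e^{-K} \sigma_x = \sqrt{2 \sinh 2K} \, e^{K^* \sigma_x}, then compares the trace of W^N with a direct sum over all 2^{NM} spin configurations. The global rotation leading to (5B) is a similarity transformation and does not change the trace.

```python
import itertools
import numpy as np

I2 = np.eye(2)
SX = np.array([[0.0, 1.0], [1.0, 0.0]])

def row_factor(K):
    """C * exp(K* sigma_x), using exp(K* sigma_x) = cosh(K*) 1 + sinh(K*) sigma_x."""
    K_star = np.arctanh(np.exp(-2.0 * K))          # tanh K* = exp(-2K)
    C = np.sqrt(2.0 * np.sinh(2.0 * K))
    return C * (np.cosh(K_star) * I2 + np.sinh(K_star) * SX)

def transfer_operator(M, K, Kp):
    """W of Eq. (5A) for one column of M spins, rows closed periodically."""
    horizontal = np.array([[1.0]])
    for _ in range(M):
        horizontal = np.kron(horizontal, row_factor(K))
    # Vertical bonds are diagonal in the sigma_z basis.
    columns = list(itertools.product((1, -1), repeat=M))
    weights = [np.exp(Kp * sum(s[m] * s[(m + 1) % M] for m in range(M)))
               for s in columns]
    return horizontal @ np.diag(weights)

def Z_transfer(N, M, K, Kp):
    """Partition function of an N-column, M-row torus as Tr W^N."""
    W = transfer_operator(M, K, Kp)
    return float(np.trace(np.linalg.matrix_power(W, N)))

def Z_brute(N, M, K, Kp):
    """Direct sum over all configurations; s[n, m] is the spin in column n, row m."""
    total = 0.0
    for flat in itertools.product((1, -1), repeat=N * M):
        s = np.array(flat).reshape(N, M)
        horizontal = np.sum(s * np.roll(s, -1, axis=0))
        vertical = np.sum(s * np.roll(s, -1, axis=1))
        total += np.exp(K * horizontal + Kp * vertical)
    return float(total)

if __name__ == "__main__":
    K, Kp = 0.4, 0.3
    print(np.allclose(row_factor(K), np.exp(K) * I2 + np.exp(-K) * SX))
    print(Z_transfer(3, 3, K, Kp), Z_brute(3, 3, K, Kp))
```

Brute-force enumeration grows as 2^{NM}, so the comparison is only practical for a handful of sites; the transfer operator itself grows as 2^M, which is the reason the fermionic diagonalization described next is needed for large M.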

It is possible to express both the spin operators \sigma_{z,m} = 2 \sigma_m^+ \sigma_m^- - 1 and \sigma_{x,m} = \sigma_m^+ + \sigma_m^- entirely in terms of the spinor raising/lowering operators, in such a way that the exponents are homogeneously quadratic in the \sigma^{\pm}'s. But because these operators are neither fermions (which anticommute) nor bosons (which commute), these exponents cannot be readily diagonalized. The eigenvalues of the exponentiated quadratic forms have no obvious significance.

In fact, the \sigma^{\pm}'s satisfy the following mixed commutation relations:

\sigma_j^+ \sigma_j^- + \sigma_j^- \sigma_j^+ = 1, whereas for j \ne l, \sigma_j^+ \sigma_l^- - \sigma_l^- \sigma_j^+ = 0. (6)

To proceed we make use of a highly nonlinear “Jordan-Wigner” transformation. Such a mapping of fermions onto spins was originally invented in the 1920’s to prove that it was mathematically possible to construct a fermionic field theory out of an array of spins ½. Here we invert the construction, expressing each spin by a fermion operator c that carries an exponential wake made up of “earlier” fermion operators. The algebra of the fermions is postulated to be pure anticommutation:

c_j^{\dagger} c_k + c_k c_j^{\dagger} \equiv \{ c_j^{\dagger}, c_k \} = \delta_j^k,  \{ c_j^{\dagger}, c_k^{\dagger} \} = \{ c_j, c_k \} = 0 (7)

We note the following trivial identities: c_j e^{\pm i \pi c_j^{\dagger} c_j} = -c_j, e^{\pm i \pi c_j^{\dagger} c_j} c_j = +c_j, e^{\pm i \pi c_j^{\dagger} c_j} = 1 - 2 c_j^{\dagger} c_j and e^{\pm 2 i \pi c_j^{\dagger} c_j} = 1.

We construct the Pauli spin operators in (5B) out of such fermion field operators.

\sigma_m^+ = c_m^{\dagger} e^{i \pi \displaystyle \sum_{j<m} c_j^{\dagger} c_j} and \sigma_m^- = c_m e^{-i \pi \displaystyle \sum_{j<m} c_j^{\dagger} c_j}. (8)

The reader will want to verify that this representation of the \sigma operators satisfies the mixed commutation relations in (6). Then, inserting (8) into (5B) with the aid of the above identities yields the transfer operator as a product of two exponential forms, each quadratic in fermion operators. Thus, the eigenvalues of the quadratic forms in the following expressions are useful in the evaluation of Z.

W = C^M e^{K^* \sum_m (2 c_m^{\dagger} c_m -1)} e^{K' \sum_m (c_m^{\dagger} - c_m)(c_{m+1}^{\dagger} + c_{m+1})} (9A)
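The verification left to the reader can also be carried out numerically on a short chain. The sketch below is an illustration under stated assumptions: it works with explicit 2^M \times 2^M matrices, uses 0-based site labels, and builds the fermion operators from the spins by inverting Eq. (8), c_m = \left[ \prod_{j<m} (-\sigma_{z,j}) \right] \sigma_m^-, which uses e^{i \pi c_j^{\dagger} c_j} = -\sigma_{z,j}. All function names are hypothetical. It then checks the anticommutation relations (7), recovers \sigma_m^+ as c_m^{\dagger} times the parity string, and confirms the two operator identities behind (9A): \sigma_{z,m} = 2 c_m^{\dagger} c_m - 1 and \sigma_{x,m} \sigma_{x,m+1} = (c_m^{\dagger} - c_m)(c_{m+1}^{\dagger} + c_{m+1}).

```python
import numpy as np

I2 = np.eye(2)
SZ = np.diag([1.0, -1.0])                  # sigma_z in the basis (up, down)
SP = np.array([[0.0, 1.0], [0.0, 0.0]])    # sigma^+ turns "down" into "up"
SM = SP.T                                  # sigma^-
SX = SP + SM

def site_op(op, m, M):
    """Embed a 2x2 operator at site m (0-based) of an M-site chain."""
    out = np.array([[1.0]])
    for j in range(M):
        out = np.kron(out, op if j == m else I2)
    return out

def fermions_from_spins(M):
    """c_m = [prod_{j<m} (-sigma_z,j)] sigma_m^-, the inverse of Eq. (8)."""
    ops = []
    for m in range(M):
        string = np.eye(2 ** M)
        for j in range(m):
            string = string @ (-site_op(SZ, j, M))
        ops.append(string @ site_op(SM, m, M))
    return ops

def anticommutator(a, b):
    return a @ b + b @ a

def check_jordan_wigner(M):
    """True if Eq. (7), Eq. (8) and the identities behind (9A) hold for M sites."""
    dim = 2 ** M
    one = np.eye(dim)
    c = fermions_from_spins(M)                 # real matrices: dagger = transpose
    ok = True
    for j in range(M):
        for k in range(M):
            expected = one if j == k else np.zeros((dim, dim))
            ok = ok and np.allclose(anticommutator(c[j].T, c[k]), expected)
            ok = ok and np.allclose(anticommutator(c[j], c[k]), 0.0)
    for m in range(M):
        string = one.copy()                    # exp(i pi sum_{j<m} n_j)
        for j in range(m):
            string = string @ (one - 2.0 * c[j].T @ c[j])
        ok = ok and np.allclose(c[m].T @ string, site_op(SP, m, M))
        ok = ok and np.allclose(2.0 * c[m].T @ c[m] - one, site_op(SZ, m, M))
    for m in range(M - 1):
        bond = site_op(SX, m, M) @ site_op(SX, m + 1, M)
        ok = ok and np.allclose(bond, (c[m].T - c[m]) @ (c[m + 1].T + c[m + 1]))
    return bool(ok)

if __name__ == "__main__":
    print(check_jordan_wigner(4))
```

The bond check covers only the open-chain bonds m, m+1 with m+1 inside the chain; the bond that closes the ring picks up an additional parity factor, which is the subtlety behind the remark on boundary conditions in the text that follows.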

We expand the local operators c_m in plane waves, c_m = \sqrt{\frac{1}{M}} \sum_{k=-\pi}^{\pi} e^{ikm} a(k). (For didactic reasons we have imposed periodic boundary conditions on the lc of fermions, setting c_{m+M} = c_m, but with little additional effort solutions can be found for more general or even for arbitrary boundary conditions, or for periodic boundary conditions on the original spin operators.) Because the Fourier expansion takes the form of a unitary transformation it preserves the algebra. Therefore the a’s satisfy the same set of anticommutation relations as the c’s in Eq. (7), viz.,

a^{\dagger}(k)a(q)+a(q) a^{\dagger}(k) \equiv \{ a^{\dagger}(k), a(q) \} = \delta_{k,q} with all other anticommutators = 0.

By translational invariance this procedure breaks the transfer matrix up into M/2 noninteracting sectors, each labeled by k and containing a form bilinear in fermions:

W = C^M e^{K^* \sum_k (2 a^{\dagger}(k)a(k)-1)} e^{K' \sum_k e^{-ik} ( a^{\dagger}(-k)-a(k) )( a^{\dagger}(k)+a(-k) )} = \prod_{k>0} W(k) (9B)

Each factor on the rhs takes the form,

W(k) \propto e^{2K^* \big( a^{\dagger}(k)a(k) +  a^{\dagger}(-k)a(-k) \big) } e^{K' \big( e^{-ik}(a^{\dagger}(-k)-a(k))(a^{\dagger}(k)+a(-k) ) + e^{ik} (a^{\dagger}(k)-a(-k))(a^{\dagger}(-k)+a(k)) \big) }

Because factors in different k-sectors commute, the individual 4 \times 4 W(k)’s can be diagonalized individually. We need to find the largest solution \lambda_k of the equation W(k) \Psi = \lambda_k \Psi in each separate k-sector. Their product yields the partition function Z. Consequently the free energy F, which is proportional to the logarithm of Z, is explicitly a sum which turns into an integral in the limit M \to \infty,

F = -kTN \sum_{k=0}^{\pi} \log \lambda_k = -kT \frac{NM}{2 \pi} \int_{0}^{\pi} dk \log \lambda_k

The thermodynamic properties are obtained from F by successive differentiations. These derivatives of F/T can be calculated in closed form in terms of elliptic functions. Note that F is (correctly) extensive (proportional to the area NM.) It is easily shown that the identical integral would have been obtained in the large NM limit, had we transferred the rows instead of the columns. So, even though the procedure might have seemed asymmetric, actually all symmetries are preserved.

Without going into details of the evaluation, we find the free energy F has singular derivatives at T_c (the actual value of T_c is easily calculated and agrees with the earlier estimates.) Above T_c there is no LRO and the magnetization is zero. At or below T_c there develops LRO because of a two-fold degeneracy of the ground state of the transfer operator. Correlations can be calculated with the aid of Toeplitz matrix theory. Both the magnetization and the LRO increase with decreasing T until, at T=0, all spins are precisely parallel, all “up” or all “down”.

The two-dimensional transfer matrix of the d = 3 dimensional Ising model can also be written in the form of Eq. (5B) provided the column label m is replaced by a planar label (n,m), with bonds to (n \pm 1,m) and (n,m \pm 1). That is, the transfer matrix for the 3D Ising model can be mapped onto a two-dimensional Ising model in a perpendicular field. Because only the largest eigenvalue is needed in the calculation of Z or F, a variational approximation is useful, as is the renormalization group (RG). Unfortunately the Jordan-Wigner transformation itself fails to be of help, because the exponential tails fail to cancel for half the bonds; therefore the quadratic form in spins on a 2D plane cannot be transformed into a quadratic form in either fermions or bosons. The transfer operator can, however, be reexpressed as a quartic form in fermions. Thus the 3D Ising model falls into the realm of problems (\Phi^4 field theories) that are generally well understood yet have not found an exact mathematical solution outside of approximate or RG procedures.

Other seemingly simple model ferromagnets that remain unsolved at the present time include the 2D Ising model in a real, finite, external magnetic field (whether homogeneous or staggered,) as well as most three-dimensional models of any kind.

More symmetry considerations

In the above, the spontaneous breaking of discrete up/down symmetry in the ground state at T = 0 was of no particular consequence. But what if the symmetry had been continuous? Let us consider spins that can point in any direction according to either the O(2) (circular) or O(3) (spherical) symmetries. Then the excitation spectrum above the ground state becomes gapless. Consider a magnetic polymer (lc) whose dynamics are described by the following Hamiltonian:

H = -J \sum_{n=1}^{N-1} \vec S_n \cdot \vec S_{n+1} (10)

in which we assume the individual spins are themselves classical two- or three-dimensional vectors of unit length (all \vec S_n^2 = 1) and not operators. The dot product (\cdot) ensures the bond energies are scalar under rotations. This H is known as the “classical” Heisenberg Hamiltonian if the spin vectors are three-dimensional or as the “classical” “X-Y” model if the spins are constrained to lie in the x,y plane. We distinguish the classical spins here from any of the quantum versions discussed supra, in which the components of the individual spins fail to commute.

(The distinction between ferromagnetism in the extreme quantum limit of s=½ and the classical models of ferromagnetism may, in fact, be academic, as all interesting properties, correlations, etc. are qualitatively independent of the magnitudes of the spins; not so, the difference between models with Z(2), O(2) and O(3) symmetries. Such rotational symmetries, and not the quantum mechanics, seem to be a determining factor in the thermodynamics.)

The classical O(2) X-Y model is also known as the “plane rotator” model, given that bonds connecting nearest-neighbor sites i,j take the form -J \cos (\vartheta_i - \vartheta_j) of rigid coupled pendulums.

On a d = 2 lattice, neither the O(2) nor the O(3) model can sustain LRO at any T > 0. The lack of spontaneous symmetry breaking at any finite T in both models is the result of a rigorous no LRO theorem first proved by Hohenberg and later generalized by Mermin and Wagner to all systems having a continuous symmetry in d \le 2 spatial dimensions. This theorem clearly does not apply to the Ising model because of its discrete symmetry; nor does it address the issues of the existence or nonexistence of a phase transition at finite T.

In fact, the X-Y model (but not the Heisenberg model!) does exhibit an unusual phase transition on a two-dimensional lattice, at a finite T_{K-T} approximately equal to 0.9 J/k_B, separating two disordered phases. This, the well-known “Kosterlitz-Thouless” phase transition, is a two-dimensional version of a liquid-vapor transition. There are many ways to examine its critical properties.

In one of them, the transfer matrix of the classical model is mapped onto a one-dimensional anisotropic Heisenberg lc of spins ½ in which the J-matrix takes the form \begin{pmatrix}  0.9 & 0 & 0 \\  0 & 0.9 & 0 \\  0 & 0 & g  \end{pmatrix}, with the effective parameter g varying as 1/k_B T. The critical point is thus at g = 0.9. Internal excitations (here physically interpreted as the spectrum of quantized clockwise or anticlockwise vortices) are gapped (bound) at low temperatures but have a continuous spectrum above T_{K-T}.

On the same two-dimensional lattice, the O(3) model remains in its high-temperature phase at all finite T without undergoing any phase transition whatever. The cause is, presumably, the high density of low-energy hedgehog-like excitations called skyrmions that are allowed in this model but not in the other.

Both models do support a gapless spectrum of spin waves at low T. In dimensions d \ge 3, both O(2) and O(3) models exhibit rather ordinary order-disorder second-order phase transitions at a finite T_c . Now let us examine some details in d=1 as the simplest example.

Here again the ground state energy is the same E_0 but the excitation spectrum can now be vanishingly small, as a small twist \phi in the orientation \vartheta of each spin relative to its neighbor (say, \vartheta_{n+1} = \vartheta_n + \phi with \vartheta_1 = 0) only costs an energy \frac{1}{2} J \phi^2 per bond. The imposition of any boundary conditions (say, requiring that the first and last spins be parallel) causes \phi to become discretized, e.g. \phi = 2n \pi/N, with n an integer \le N. The ratio N/n = \lambda is related to the wavelength of the excitation, in units of the lattice parameter a. We call this excitation a spin wave; its energy forms a quasi-continuum and vanishes as n^2/N. The lack of an energy gap in the large N limit is a feature of many field theories with continuous symmetries, as was first remarked in the 1950s by Nambu, Goldstone, et al., by analogy with the spin waves that Bloch found 20 years earlier. Where the “Goldstone mode” relates to the spin wave spectrum, the Goldstone boson relates to the “magnon”, which is an elementary bosonic particle that results from further quantization of spin dynamics. It carries 1 unit of angular momentum \hbar.
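The n^2/N scaling is easy to see numerically for the classical plane rotator. The sketch below is an illustration only: it assumes a closed ring of N bonds, each of energy -J \cos \phi, and the function names are hypothetical. It compares the exact energy of a uniform twist \phi = 2 \pi n/N above the ground state, N J (1 - \cos \phi), with the small-angle estimate N \cdot \frac{1}{2} J \phi^2 = 2 \pi^2 J n^2 / N.

```python
import math

def twist_energy(N, n, J=1.0):
    """Exact energy above the ground state for a uniform twist phi = 2 pi n / N."""
    phi = 2.0 * math.pi * n / N
    return N * J * (1.0 - math.cos(phi))

def spin_wave_estimate(N, n, J=1.0):
    """Small-angle form: N bonds, each costing J phi^2 / 2."""
    return 2.0 * math.pi ** 2 * J * n ** 2 / N

if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(N, twist_energy(N, 1), spin_wave_estimate(N, 1))
```

Both columns shrink in proportion to 1/N at fixed n, so the lowest excitation energy tends to zero in the large N limit, which is the gapless behavior described above.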

But just as there is an exception in particle physics, e.g. for the Higgs boson, there is one in magnetism for integer quantum spins in a 1D antiferromagnetic Heisenberg chain, where the lowest excitations have to surmount a “mass” gap. We turn to this interesting anomaly next.

Role of spin in d = 1 antiferromagnets

The antiferromagnetic Heisenberg Hamiltonian, i.e., the lc of Eq. (10) with J < 0, in which the spins are S = 1, 2, \ldots operators, as opposed to the classical vectors considered earlier, has a magnon spectrum exhibiting a finite excitation gap even at the longest wavelengths. This is quite unlike the gapless spin wave spectrum \omega \propto k^2 of the ferromagnet, J > 0, and unlike the excitations of the S=½ antiferromagnet, the spectrum of which, \omega \propto \left | k \right | , is calculated exactly by “Bethe’s ansatz” discussed at the end of this article. The spin wave excitations of S= 3/2, 5/2, \ldots lc antiferromagnets and of classical spin antiferromagnets also are all gapless. So what happens to integer spin operators of magnitude S(S+1) = 2, 6, 12, \ldots, to make them so different?

We start with the proof that the excitation spectrum for all half-odd-integer spins (1/2, 3/2, \ldots) is continuous and gapless, regardless of whether the sign of J is positive or negative.

The proof in 1D is as follows: take the ground state wave function \Psi_0 for a lc subject to periodic boundary conditions and operate on it by \Gamma = \prod_n \gamma_n in a way that distorts the n^{th} individual bond only by a small amount 1/N, so that each of the N bonds sees its energy rise by an amount O(1/N^2). If \Gamma \Psi_0 is orthogonal to \Psi_0 we have constructed an excited state of total energy O(1/N) above the ground state. One may consider this as the variational calculation of the energy of a 1-magnon state. (The proof in d = 2 and d = 3 just extends the proof for the lc to finite-width strips or cylinders.) If the spins are higher than ½ but still half-odd-integer, an operator \Gamma having these properties is easily constructed. In the case of integer spins, however, the corresponding operator, when applied to the ground state wave function, fails to yield a state orthogonal to the ground state and the proof fails.
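One explicit choice of \Gamma, given here as a sketch under stated assumptions rather than as the construction intended in the text, is the slow twist about the z axis used by Lieb, Schultz and Mattis (Ann. Phys. 16, 407, listed in the references). The assumptions are: N is even, \Psi_0 is an eigenstate of the one-site translation operator (written \mathcal{T} here, a symbol introduced only for this sketch), and S^z_{tot} \Psi_0 = 0, as for an antiferromagnetic ground state.

```latex
% twist operator: site n is rotated about z by the angle 2\pi n/N
\Gamma = \prod_{n=1}^{N} \gamma_n , \qquad
\gamma_n = \exp\!\left( \frac{2\pi i\, n}{N}\, \frac{S^z_n}{\hbar} \right)

% energy: neighboring spins are twisted relative to each other by 2\pi/N only
\langle \Gamma\Psi_0 | H | \Gamma\Psi_0 \rangle - E_0 = N \times O(1/N^2) = O(1/N)

% orthogonality: translate by one site, n \to n+1, with site N+1 identified with site 1
\mathcal{T}\, \Gamma\, \mathcal{T}^{-1}
  = \Gamma \, \exp\!\left( -\frac{2\pi i}{N}\, \frac{S^z_{tot}}{\hbar} \right)
    \exp\!\left( 2\pi i\, \frac{S^z_1}{\hbar} \right) ,
\qquad
\exp\!\left( 2\pi i\, \frac{S^z_1}{\hbar} \right) = (-1)^{2S}

% hence, using S^z_{tot}\Psi_0 = 0 and the translation invariance of \Psi_0
\langle \Psi_0 | \Gamma | \Psi_0 \rangle = (-1)^{2S}\, \langle \Psi_0 | \Gamma | \Psi_0 \rangle
```

For half-odd-integer S the factor (-1)^{2S} equals -1, so the overlap vanishes and \Gamma \Psi_0 is a genuine low-lying excited state. For integer S the factor is +1, the last line is an identity, and nothing can be concluded about orthogonality.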

This energy gap in the spectrum of integer spin antiferromagnetic lc’s was first conjectured by D. Haldane and it is named after him; it turns out to exist only in 1D antiferromagnets and only if S is an integer: S =1, 2 , 3, \ldots. The magnitude of Haldane’s gap goes to zero as S is increased. (This is expected if the model is to approach its gapless correspondence limit smoothly.) The gap also vanishes in any dimension d > 1. Thus we are dealing with a feature that is both interesting and fragile, which is optimum for spins S =1 in d = 1 and will be extensively revisited at the end of this article.

We can discern the dichotomy in a lc of as few as 3 spins S arrayed in a triangle. For 3 spins the Heisenberg Hamiltonian is diagonalizable: H = \frac{-J}{2} \left[ T(T+1)-3S(S+1) \right] , where T is the total combined spin. T has a maximum value 3/2 for spins S=½ and a maximum 3 for spins 1. In the ferromagnet (J > 0) the ground states for both these values of S belong to their respective maxima and are similar in all aspects.

In the antiferromagnetic (J < 0) triangle of 3 spins, the ground states belong to a total spin minimum. The minima are T = 0 for spins 1, and T = ½ for spins ½. After some back-of-the-envelope exact calculations, one determines that the ground state of the three spins ½ consists of two degenerate doublets, i.e., that it is 4-fold degenerate. Thus there is no energy gap separating the two lowest-lying states.

For the antiferromagnetic triangle made of spins S = 1, however, the ground state belongs to a unique T = 0 singlet state. All other eigenstates in this model lie at energies that are at least \left| J \right| higher, so that here the Haldane gap is \left| J \right|.
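The two triangles can be compared by brute-force diagonalization. The sketch below assumes the convention H = -J \sum_{i<j} \vec S_i \cdot \vec S_j with each of the three bonds counted once and \hbar = 1, which reproduces the closed form \frac{-J}{2} \left[ T(T+1)-3S(S+1) \right]; the helper names `spin_matrices`, `site_operator` and `triangle_levels` are hypothetical.

```python
import numpy as np

def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in units of hbar, basis m = s, s-1, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    sz = np.diag(m).astype(complex)
    # Raising operator: <m+1| S+ |m> = sqrt(s(s+1) - m(m+1)).
    sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    sx = (sp + sp.conj().T) / 2
    sy = (sp - sp.conj().T) / 2j
    return sx, sy, sz

def site_operator(op, site, n_sites=3):
    """Embed a single-site operator at position `site` of an n_sites-spin product space."""
    identity = np.eye(op.shape[0], dtype=complex)
    full = np.eye(1, dtype=complex)
    for k in range(n_sites):
        full = np.kron(full, op if k == site else identity)
    return full

def triangle_levels(s, J=-1.0):
    """Sorted eigenvalues of H = -J * sum over the 3 bonds of S_i . S_j (assumed convention)."""
    components = spin_matrices(s)
    size = components[0].shape[0] ** 3
    H = np.zeros((size, size), dtype=complex)
    for i, j in ((0, 1), (1, 2), (2, 0)):
        for op in components:
            H -= J * site_operator(op, i) @ site_operator(op, j)
    return np.linalg.eigvalsh(H)

for s in (0.5, 1.0):
    levels = triangle_levels(s)
    degeneracy = int(np.sum(np.isclose(levels, levels[0])))
    first_excitation = levels[degeneracy] - levels[0]
    print(f"S = {s}: ground-state degeneracy = {degeneracy}, "
          f"first excitation energy = {first_excitation:.3f} (units of |J|)")
```

With J = -1 the spin-½ triangle shows a 4-fold degenerate ground level, whereas the spin-1 triangle has a single ground state separated from the next level by |J|.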

Exercise for the reader: contrast the eigenstates of S= ½ and S=1 Heisenberg antiferromagnetic chains of 4 spins when laid out on a single square plaquette, with or without diagonal linkages. (Either geometry can be done analytically in closed form.) The conclusions are quite similar. Discussion of the Haldane gap in the limit of large N is reprised near the end of this article. References to early and pertinent literature are given in ref. 2.

The antiferromagnetic lc of spins S = 1 exhibits an additional idiosyncrasy at large N: the two ends, at n = 1 and N respectively, act like free spins ½ in their response to external fields, as in electron paramagnetic resonance (EPR). The characteristics of EPR allow one to determine the spin; the ends of a chain of spins 1 display spins ½! This surprising behavior was, in fact, predicted theoretically and confirmed by experiment almost simultaneously. Chains of S = 2 spins would, presumably, have ends that exhibit the properties of spins 1, etc. The situation is similar to that in the defective antiferromagnets discussed below, given that dangling ends of the chain at n=1 and N can be viewed as breaks in a longer chain, or as a symmetry-breaking disruption of translational invariance, caused by cutting the bond connecting the last spin at N to the first, in a chain with periodic boundary conditions.

No magnetism at finite T in 1D

In 1D one can also obtain the free energy of the classical O(3) Heisenberg model without separate calculations of energy and entropy, using a transfer matrix for the partition function Z = e^{-\beta F}. To within boundary terms its largest eigenvalue yields,

F=-(N-1)k_BT \log \Big( \frac{k_BT}{J} \sinh \frac{J}{k_BT} \Big) (11)

an expression that translates to a thermally averaged angle between neighboring spins of \langle \left\vert \phi \right\vert \rangle \approx \sqrt{\frac{k_BT}{J}} at low T. The correlations of two spins separated by a macroscopic distance na fall off \propto \exp \left( -n \langle \left\vert \phi \right\vert \rangle^2 \right) \approx \exp \left( -n k_BT/J \right). Thus, in this gapless one-dimensional model with continuous symmetry, the LRO of the ground state disappears exponentially at any finite T > 0. In view of the Mermin-Wagner theorem some such result was to be expected. It is also notable that many years earlier, L.D. Landau had already observed that no model with finite range interactions can sustain LRO in 1D at any finite T.
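Eq. (11) can be checked by direct quadrature of the single-bond Boltzmann weight. This is a minimal sketch, assuming unit classical spins with bond energy -J \cos\phi and an angular measure normalized to d\Omega/4\pi, so that each bond contributes a factor \sinh K / K with K = J/k_BT; the function names and the grid size are illustrative choices, not taken from the text.

```python
import numpy as np

def bond_partition(K, n_grid=200_000):
    """(1/2) * integral_0^pi exp(K cos(phi)) sin(phi) dphi, by the midpoint rule."""
    dphi = np.pi / n_grid
    phi = (np.arange(n_grid) + 0.5) * dphi
    return 0.5 * np.sum(np.exp(K * np.cos(phi)) * np.sin(phi)) * dphi

def mean_abs_angle(K, n_grid=200_000):
    """Thermal average of the angle between neighboring spins."""
    dphi = np.pi / n_grid
    phi = (np.arange(n_grid) + 0.5) * dphi
    # Shifting the exponent by -K avoids overflow at large K; it cancels in the ratio.
    weight = np.exp(K * (np.cos(phi) - 1.0)) * np.sin(phi)
    return np.sum(phi * weight) / np.sum(weight)

def free_energy(N, T, J=1.0, kB=1.0):
    """Eq. (11): F = -(N-1) kB T log( (kB T / J) sinh(J / kB T) )."""
    K = J / (kB * T)
    return -(N - 1) * kB * T * np.log(np.sinh(K) / K)

for K in (1.0, 5.0, 20.0, 100.0):
    print(K, bond_partition(K), np.sinh(K) / K, mean_abs_angle(K), np.sqrt(1.0 / K))
```

The quadrature reproduces \sinh K / K, and the mean angle tracks \sqrt{k_BT/J} up to a numerical factor of order one (the Gaussian low-temperature estimate gives \sqrt{\pi k_BT / 2J}).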

The nature of local moments

Given all these choices of models, what are the most physically plausible magnetic contents of a unit cell? Typically, spins and angular momenta that characterize an atom or ion disappear (are quenched) in solids. The rare earth elements are counter-examples, in that they have unfilled f-shells that can accommodate up to 7 unpaired electrons in an orbital that is somewhat smaller than typical inter-atomic distances and which are, therefore, not much disturbed by the crystal symmetry. A moment of up to 7 Bohr magnetons puts these spin magnitudes largely in the classical limit, such as was assumed in the classical Heisenberg model treated above.

Actually, the Hamiltonian in Heisenberg’s 1928 model of magnetism, or what is commonly understood today to be the Heisenberg Hamiltonian, is similar to Eq. (4) in form but uses operators for the components of individual vector spins. As we know, in quantum theory all angular momenta satisfy an algebra \vec S \times \vec S=i\hbar\vec S with 3 operator components \vec S = (S_x,S_y,S_z). In the extreme quantum limit, for a single electron, the S’s have an irreducible representation in 2 \times 2 Pauli spin matrices that anticommute with one another but commute with operators at all other sites. For higher spins the irreducible representations are (2s+1)-dimensional, where \vec S^2 = \hbar^2 s(s+1); the various components are operators that satisfy \left[ S_x,S_y \right] =i\hbar S_z and its cyclic permutations, which is equivalent to the generic \vec S \times \vec S=i\hbar\vec S.
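These statements are easy to verify with explicit matrices. The sketch below builds the standard (2s+1)-dimensional representation in units \hbar = 1 and checks one component of \vec S \times \vec S = i\vec S, namely S_x S_y - S_y S_x = i S_z, together with \vec S^2 = s(s+1). In terms of the Pauli matrices \vec\sigma = 2\vec S (for s = ½) the same relation reads \sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z. The helper name `spin_matrices` is hypothetical.

```python
import numpy as np

def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in units of hbar, basis m = s, s-1, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    sz = np.diag(m).astype(complex)
    # Raising operator: <m+1| S+ |m> = sqrt(s(s+1) - m(m+1)).
    sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    sx = (sp + sp.conj().T) / 2
    sy = (sp - sp.conj().T) / 2j
    return sx, sy, sz

for s in (0.5, 1.0, 1.5, 3.5):
    sx, sy, sz = spin_matrices(s)
    dim = sx.shape[0]
    commutator_ok = np.allclose(sx @ sy - sy @ sx, 1j * sz)
    casimir_ok = np.allclose(sx @ sx + sy @ sy + sz @ sz, s * (s + 1) * np.eye(dim))
    print(f"s = {s}: dimension {dim}, [Sx,Sy] = i Sz: {commutator_ok}, "
          f"S^2 = s(s+1): {casimir_ok}")

# For s = 1/2 only, different components on the same site also anticommute.
sx, sy, sz = spin_matrices(0.5)
print("spin-1/2 anticommutation:", np.allclose(sx @ sy + sy @ sx, 0))
```

The anticommutation property is special to s = ½; for higher spins only the commutation relations survive.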

Despite the introduction of operators...

Nature of magnetic interactions in cells and in metals

Antiferromagnetism and ferrimagnetism

Ferrimagnets

The Kondo lattice

Nagaoka mechanism of ferromagnetism

Frustration

Defective antiferromagnets

Decomposition of 1D quantum antiferromagnets into fermions

Conclusion

Physics and mathematics have always had a close relation, but none closer than the calculation of magnetism using concepts in analysis, number theory, algebra and group theory. We have tried to show this in the present article by concentrating on just a few of the topics that have come up in an evolving theory of magnetism. Hopefully, an even broader understanding of magnetic phenomena will follow new mathematics and mathematical concepts.

References

A. Levinovitz and N. Ringertz, Eds., The Nobel Prize, the first 100 Years, Imperial College Press, London, 2001

Daniel C. Mattis, The Theory of Magnetism Made Simple: an introduction to physical concepts and to some useful mathematical methods, World Scientific Publ. Co., Singapore, 2006; the development of various aspects of magnetism and of its associated theories is the subject of chapters 1 and 2 and an extensive bibliography leads to the original documentation.

Much of this material is discussed in greater depth in chapts. 3-9 of ref. 2.

P.A.M. Dirac, Proc. Roy. Soc. (London) A123, 60 (1931), also see pp. 76,78, ref. 2. The magnetic monopole has never been observed, but the existence of just one such monopole would necessitate quantization of all electric charges in the universe, a known fact of nature – and one that is otherwise unexplained.

Ref. 2, p. 31 recounts the 1920s history of this concept that culminated in the Bohr magneton \mu_B, a constant inversely proportional to the electron mass m_e.

ref. 2, chapt. 8.

H. Kramers and G. Wannier, Phys. Rev. 60, 252, 263 (1941)

L. Onsager, Phys. Rev. 65, 117 (1944)

T. Schultz, D. Mattis and E. Lieb, Rev. Mod. Phys. 36, 856 (1964)

see ref. 2, chapt. 3, §3.12

As we shall see later, this statement does not apply to the special case of one-dimensional antiferromagnets, in which the magnitude of the individual spins plays an important role.

N. Mermin and H. Wagner, Phys. Rev. Lett. 17, 1133 and 1307 (1966)

D. Mattis, Phys. Lett. 104, 357 (1984)

H. Bethe, Zeit. f. Physik 71, 205 (1931), reprinted in English translation in D. Mattis, The Many-Body Problem, an encyclopedia of exactly solved models in one dimension, World Scientific Publ., Singapore, 2009 (3rd Printing with revisions and corrections.)

E. Lieb, T. Schultz and D. Mattis, Ann. Phys. (NY) 16, 407 (1961), also reprinted in its entirety in The Many-Body Problem cited above. Among recent applications, note J. Jing and H. Ma, Level Crossing and Quantum Phase Transition of XY Ring, Mod. Phys. Lett. B22, 535 (2008)

W. Heisenberg, H. Wagner and K. Yamazaki, Nuov. Cim. LIX A, (1 Feb. 1969)

We list some early papers: P. W. Anderson, Heavy-electron superconductors, spin fluctuations and triplet pairing, Phys. Rev. B30, 1549 (1984), G. Baskaran and P.W. Anderson, Gauge theory of high-temperature superconductors and strongly correlated Fermi systems, Phys. Rev. B37, 580 (1988), P. Fazekas and E. Müller-Hartmann, Magnetic and nonmagnetic ground states of the Kondo lattice, Zeit. f. Phys. B85, 285 (1991), M. Sigrist, H. Tsunetsugu and K. Ueda, Rigorous results for the one-electron Kondo lattice model, Phys. Rev. Lett. 67, 2211 (1991), J.A. White, Numerical exact diagonalization of the one-dimensional symmetric Kondo lattice, Phys. Rev. B46, 13905 (1992), P. Paul and D. Mattis, Extinction of spin interactions in the 2D Kondo lattice, Int. J. Mod. Phys. B24, 3199 (1995), H. Tsunetsugu, M. Sigrist and K. Ueda, The ground state phase diagram of the one dimensional Kondo lattice model, Rev. Mod. Phys. 69, 809-864 (1997), S. Capponi and F.F. Assaad, Spin and charge dynamics of the ferromagnetic and antiferromagnetic two-dimensional half-filled Kondo model, Phys. Rev. B63, 155114 (2001)

Z. Gulacsi and D. Vollhardt find this in a similar model (the periodic Anderson model) that can be solved exactly; see arXiv:cond-mat/0504174v1 (7 April, 2005)

G. Forgacs, Phys. Rev. B22, 4473 (1980). See also D. Mattis and R. Swendsen, Statistical Mechanics Made Simple, 2nd Edition, World Scientific Publ. Co, Singapore, 2008, §8.12, for the explicit solution of the transfer matrix in this example and for a discussion of unfrustrated (separable model) spin glasses.

N. Nagaosa, Y. Hatsugai and M. Imada, J. Phys. Soc. Jpn. 58, 978 (1989), with further refs. given in ref. 2, chapt. 5.

F.D.M. Haldane, Phys. Lett. A93, 454 (1983) and Phys. Rev. Lett. 50, 1153 (1983), see also: D. Controzzi and E. Hawkins,


See also

Invited by: Prof. Vieri Mastropietro, Mathematics Department, Univ. Roma Tor Vergata, Italy
Assistant editor: Mr. Abdellatif Nemri, Department of biological sciences, University of Montreal, Canada