Coarse Grained

Mathematics and physics through TCM, in a non-representative stream

The polite fiction of symmetry breaking

The concept of spontaneous symmetry breaking (SSB) is well-studied and widely used in condensed matter. The prototypical textbook example is the weakly interacting Bose gas, where the global $U(1)$ symmetry is broken to give a Bose condensed phase, with massless Goldstone bosons as the residual evidence of that continuous symmetry. This is usually described by saying that the order parameter $\langle \phi \rangle$ becomes non-zero, where $\phi$ is the annihilation operator for a particle. The intuition is clear: given a potential like $V = |\phi|^4 - |\phi|^2$ there is a degenerate ring of non-zero $\phi$ which has the lowest energy, and by picking a particular phase we can minimise the energy.

The story becomes a bit messier upon closer inspection. The overall resolution is actually not that complex, but it seems to be mostly folklore, in that no-one has written it down.

To begin with, notice that the Hamiltonian $H$ commutes with the number operator $N = \phi^\dagger \phi$ — we do not, after all, randomly create or destroy rubidium atoms in a trap (losses are accounted for in a density matrix formalism, something we will come back to). Thus, all energy eigenstates, and particularly the ground state, will be an eigenstate of $N$. Thus by elementary arguments, $\langle \phi \rangle=0$, always. Fundamentally, by picking a definite phase for $\phi$, we introduce uncertainty in $N$.
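The trade-off between a definite phase and a definite $N$ can be checked numerically for a single bosonic mode. The following is a minimal sketch, not part of the original argument: it assumes a Fock space truncated at `nmax`, and the names `coherent_state`, `number_state`, `expect_phi` and `ALPHA` are illustrative only. A number eigenstate gives $\langle \phi \rangle = 0$ exactly, whereas a coherent state gives $\langle \phi \rangle = \alpha$ at the price of a variance $|\alpha|^2$ in $N$.

```python
import cmath
import math

ALPHA = 2.0 * cmath.exp(0.3j)  # illustrative coherent amplitude (an assumption)

def coherent_state(alpha, nmax=80):
    """Amplitudes c_n = exp(-|alpha|^2/2) alpha^n / sqrt(n!), truncated at nmax."""
    c = [complex(math.exp(-abs(alpha) ** 2 / 2))]
    for n in range(nmax):
        c.append(c[-1] * alpha / math.sqrt(n + 1))
    return c

def number_state(n0, nmax=80):
    """Amplitudes of the number eigenstate |n0>."""
    return [1 + 0j if n == n0 else 0j for n in range(nmax + 1)]

def expect_phi(c):
    """<psi| phi |psi> = sum_n conj(c_n) c_{n+1} sqrt(n+1)."""
    return sum(c[n].conjugate() * c[n + 1] * math.sqrt(n + 1)
               for n in range(len(c) - 1))

def number_mean_var(c):
    """Mean and variance of N in the state with amplitudes c."""
    p = [abs(x) ** 2 for x in c]
    mean = sum(n * pn for n, pn in enumerate(p))
    var = sum((n - mean) ** 2 * pn for n, pn in enumerate(p))
    return mean, var

if __name__ == "__main__":
    print("number state  <phi> =", expect_phi(number_state(4)))
    print("coherent state <phi> =", expect_phi(coherent_state(ALPHA)))
    print("coherent state (mean N, var N) =", number_mean_var(coherent_state(ALPHA)))
```

The coherent state is used here only as the simplest state with a definite phase; nothing in the text above says that the physical condensate is exactly of this form.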

There are some conventional ways around this. One is to say that what we really care about is off-diagonal long-range order (ODLRO), that is, $\langle \phi^\dagger (r) \phi(0) \rangle$ remaining non-zero as $r \rightarrow \infty$. This is completely well-defined... except for real systems. The first experimental systems were very small, a few hundred to a few thousand atoms. It is not clear that an infinite-range limit is applicable to them and, since they did in fact condense, it is evidently not necessary either. In addition, ODLRO is usually quite hard to compute.

The other standard method is to pass to a grand canonical ensemble, and argue that in the thermodynamic limit this will produce the same answers for means of macroscopic variables. This looks like it lets us have fluctuations in $N$. It shares the same problem as the previous method, but allows us to use easier mathematics. On the other hand, it's just wrong: the grand canonical ensemble is defined by $H - \mu N$, which again has eigenstates labelled by $N$, so $\langle \phi \rangle$ is still zero! Indeed, what we end up with is a density matrix which is diagonal in the $N$-basis, with eigenvalues proportional to $\exp[-\beta(H - \mu N)]$, where $\beta$ is the inverse temperature. Even there, the expectation value of $\phi$ is still zero.
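The vanishing can be spelled out in one line. As a sketch, with $\beta$ the inverse temperature, $Z$ the partition function and $|N, i\rangle$ the joint eigenstates of $H$ and $N$ with energies $E_{N,i}$ (all notation assumed here for illustration): $$\langle \phi \rangle = \frac{1}{Z} \mathrm{Tr}\left[ e^{-\beta(H - \mu N)} \phi \right] = \frac{1}{Z} \sum_{N,i} e^{-\beta(E_{N,i} - \mu N)} \langle N, i | \phi | N, i \rangle = 0,$$ since $\phi | N, i \rangle$ lies entirely in the sector with $N-1$ particles and is therefore orthogonal to $| N, i \rangle$. The ensemble has fluctuations in $N$, but only as a classical mixture of different $N$-sectors, never as a coherent superposition.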

Anthony Leggett, having thought about this quite hard, comes at the problem from the experimental side: for him the important thing is to make contact with the available experiments, which is quite reasonable. He works without second quantisation, and considers the single-particle density matrix; he shows that in the normal phase, this has eigenvalues of the same order, whereas in the broken symmetry phase, one eigenvalue is near unity, with other eigenvalues of order $1/N$ or smaller. From a theoretical point of view, however, this seems to throw away much of the elegance of SSB as a generic mechanism for describing critical behaviour.

A truly mathematically correct justification is to add a term $-\epsilon \phi$ to $H$, which explicitly breaks the symmetry. Then we carefully take the limit of $N \rightarrow \infty$ first, before $\epsilon \rightarrow 0$. Analysis of the Hilbert space shows that it breaks into separate super-selection sectors, indexed by the phase. Indeed, for things like ferromagnetic transitions, one can reasonably declare that it would be unphysical for there not to be some stray fields in an experiment which would provide a small but definite symmetry breaking potential. Unfortunately for Bose condensation, there is no way to introduce a non-Hermitian term such as $\phi$ into $H$ — physical loss of atoms from the trap needs to be accounted for by a density matrix again, and the loss term only appears in the Lindbladian, not the Hamiltonian. Indeed, even for the ferromagnetic transition, one could argue that this simply transfers the problem — the universe as a whole is presumably symmetric under $SO(3)$, so the symmetry breaking term which is classical is just an approximation to a properly invariant quantum interaction term with a quantum environment.

So far, the analysis is standard, and may be found in Leggett's book Quantum Liquids. What follows is the folklore.

The final analysis of the above is actually pretty close to the mark. The missing ingredient is decoherence. A useful viewpoint is that the symmetry breaking system acts as an amplifier: it turns a very small asymmetric potential into a macroscopically observable one, and fluctuations in the environment certainly count. More precisely, the ground state of the total system with its environment is a symmetric combination of states, with entanglement between the large coherent fluctuation in the system and the much smaller fluctuation in the environment. Upon taking the thermodynamic limit and tracing out the environment, we are left with a density matrix of states equivalent to the superselection sectors description above, justifying its use.
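Schematically, and only as a sketch under assumed notation (system states $|\theta\rangle_S$ of definite phase $\theta$, correlated environment states $|e_\theta\rangle_E$, normalisation ignored), the symmetric state and its reduced density matrix read $$|\Psi\rangle \propto \int_0^{2\pi} d\theta \, |\theta\rangle_S \otimes |e_\theta\rangle_E, \qquad \rho_S = \mathrm{Tr}_E |\Psi\rangle\langle\Psi| \propto \int_0^{2\pi}\!\!\int_0^{2\pi} d\theta \, d\theta^\prime \, \langle e_{\theta^\prime} | e_\theta \rangle \, |\theta\rangle\langle\theta^\prime|.$$ If the environment states for different phases become orthogonal, $\langle e_{\theta^\prime} | e_\theta \rangle \rightarrow 0$ for $\theta \neq \theta^\prime$, the off-diagonal terms drop out and $\rho_S$ reduces to an incoherent mixture of the $|\theta\rangle\langle\theta|$, which is the superselection picture.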


Topological insulators II: disorder

Last time we saw how a one particle, pure crystal theory of electrons might give rise to quantised transverse (i.e. Hall) conductance. The theory is simple, but rather leaves out crucial aspects of the phenomenon. Important aspects missing are interactions and disorder. Interactions lead to fractional quantum hall effects, and we do not wish to worry about those here. It's the usual post hoc justification where we see that a single particle theory seems to work, even though precisely why is mysterious — we ask for deliverance from the god of Landau, etc. As far as disorder goes however, things are not so simple. Loss of (lattice) translational invariance leads to a loss of the Brillouin zone, and its attendant topological structures. The classification of states by de Rham cohomology (the TKNN invariant) doesn't work if we don't have a manifold and a differential structure.

Disorder itself is a vast topic. One approach is to simply write down a Hamiltonian with a disorder potential written in, i.e. quenched, and work to compute averages over the disorder realisations. Powerful methods include using supersymmetric field theory and renormalisation. However, in keeping with the idea of simple physics, hard maths, we're going to hit the problem head on with noncommutative geometry.

Heuristically, we can build a picture via Anderson localisation. Although we can no longer use momentum to label states, we can still talk about the density of states for single particles. The disorder first causes a broadening of the Landau levels. Second, we know that in 2D arbitrarily weak disorder can cause localisation, albeit with a divergent localisation length. In a magnetic field the Landau levels are mobility edges, and so on approaching the level, the states start spanning the whole sample. Thus for a finite sample, around the Landau level there are extended states. It seems reasonable to hope that as long as the Fermi level is moving through the localised states, we still get a quantised transverse conductance. To show this, we just need to generalise the TKNN invariant to apply to a system which doesn't have momentum operators commute with the Hamiltonian. The full details can be found in arxiv:cond-mat/9411052.

First, we should say a few words about how noncommutative geometry works. On a usual space $X$ (a manifold, say), we can consider the set $C(X)$ of continuous functions from $X$ to (say) the reals. These functions form a commutative $C^*$-algebra. But we can also go backwards: start with an abstractly defined commutative algebra, and ask for the space on which its elements can be represented as functions. It turns out that these two views are exactly equivalent, and we can therefore calculate topological quantities (such as the Chern classes) of $X$ by computations on the algebra $C(X)$.

Noncommutative geometry is what you get if you remove the restriction to commutative algebras. It turns out that many of the geometric concepts can be pushed through, and we get novel spaces in which to play. In our case, the strategy is to take the TKNN invariant, replace anything which mentions geometry by its algebraic equivalent, and then remove the commutativity requirement, and hope that it remains well-defined and integer valued. The paper above by Bellissard et al. also goes through a first-principles derivation of the algebraicised Kubo conductance formula, shows that it is related to the Chern class in the same way as the TKNN invariant, and considers the corrections to the quantum hall regime (it's an excellent paper).

To give a flavour of how it works, recall that the TKNN invariant is basically a suitably integrated derivative of some operator by momentum. Integration can be replaced by a suitable trace; the fact that it is over the Brillouin zone can be considered a normalisation, i.e. it is actually an integral divided by the area. Differentiation can be treated algebraically by treating it as a derivation: $$ \partial_{k} A \rightarrow -i[x,A], $$ relying on the commutator obeying a Leibniz rule and $[x,k] = i$. Thus, we have again turned the problem into a mathematical one, which means we can find clever people to solve it for us. In this case, it can be shown that subject to states at the Fermi level being localised, the algebraicised TKNN invariant is still integer valued. Thus we have a very direct way to show that the topological order is robust in the face of disorder.
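As a quick sanity check of the replacement rule, take $A = k^n$ and use only the Leibniz rule and $[x,k] = i$: $$-i[x, k^n] = -i \sum_{j=0}^{n-1} k^{j} \, [x,k] \, k^{n-1-j} = -i \cdot n \cdot i \, k^{n-1} = n k^{n-1} = \partial_k k^n.$$ By linearity the same holds for any polynomial in $k$, so the commutator with $x$ reproduces the momentum derivative wherever the latter makes sense, and remains defined when it does not.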


Topological insulators

Currently, topological insulators are very fashionable. However, the usual discussions we've been enjoying (enduring?) have gone pretty much over the heads of most of us little ones. I think the primary problem so far has been that there are plenty of obvious things, which really should have been repeated, but have not. Part of the problem is that people seem to think that the integer quantum hall effect is well-understood by the audience. Below, we hopefully introduce the necessary context.

We are dealing with pure electrons in a lattice. No phonons, no impurities, no interactions. This means that the many-body theory reduces in the canonical way to single particle theory: the many body wavefunction is just a suitably antisymmetrised product of single particle states. Further, invariance under translation by lattice vectors implies that the single particle space separates into sectors labelled by momentum $\mathbf{k}$, which are not mixed by the Hamiltonian. Thus for each sector $\mathbf{k}$ we have a separate Hamiltonian $H_\mathbf{k}$, the total Hamiltonian $H$ being the direct sum of these, over all $\mathbf{k}$. We may assume that there are no degenerate energy levels in any of the sub-Hamiltonians, since either they are protected by some symmetry, or they will not be generic, i.e. we are not in a stable phase. This is essentially just band theory.

The lattice induces the topology of a torus in reciprocal space. The sub-Hamiltonians $H_\mathbf{k}$ can be seen as a mapping from the torus $T^d$ to self-adjoint operators on a suitable Hilbert space. For physically reasonable situations, the sub-Hamiltonians $H_\mathbf{k}$ should be continuous as functions of $\mathbf{k}$. We can now invoke some algebraic topology, and ask how many equivalence classes exist of such mappings, where equivalence is defined to be any continuous deformation without causing any of the $H_\mathbf{k}$ to become degenerate by having a level-crossing, i.e. how many phases exist. This is a well-defined mathematical question, and has a well-defined answer in terms of the Chern classes, which physically are called TKNN invariants. It is unimportant for us what they are; what matters is that they exist and may be computed for real situations.
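To show that such an invariant really can be computed, here is a minimal sketch for one concrete case. Everything specific in it is an assumption made for illustration and is not taken from the discussion above: the two-band model $H_\mathbf{k} = \sin k_x \, \sigma_x + \sin k_y \, \sigma_y + (m + \cos k_x + \cos k_y) \, \sigma_z$ (often called the Qi-Wu-Zhang model), the discretised-Brillouin-zone plaquette method of Fukui, Hatsugai and Suzuki, and the names `lower_band_state` and `chern_number`.

```python
import cmath
import math

def lower_band_state(kx, ky, m):
    """Normalised lower-band eigenvector of H_k = d(k) . sigma (assumed model)."""
    dx, dy = math.sin(kx), math.sin(ky)
    dz = m + math.cos(kx) + math.cos(ky)
    d = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Two unnormalised eigenvectors with eigenvalue -|d|; each vanishes at one
    # pole of the d-sphere, so keep whichever has the larger norm.
    v1 = (complex(dz - d), complex(dx, dy))
    v2 = (complex(-dx, dy), complex(dz + d))
    n1 = math.sqrt(abs(v1[0]) ** 2 + abs(v1[1]) ** 2)
    n2 = math.sqrt(abs(v2[0]) ** 2 + abs(v2[1]) ** 2)
    v, n = (v1, n1) if n1 >= n2 else (v2, n2)
    return (v[0] / n, v[1] / n)

def chern_number(m, n=24):
    """Sum of plaquette Berry phases over an n x n grid on the torus, over 2*pi."""
    ks = [2 * math.pi * i / n for i in range(n)]
    u = [[lower_band_state(kx, ky, m) for ky in ks] for kx in ks]

    def link(a, b):
        # Unit-modulus overlap <a|b>; gauge phases cancel around a closed plaquette.
        z = a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
        return z / abs(z)

    total = 0.0
    for i in range(n):
        for j in range(n):
            ip, jp = (i + 1) % n, (j + 1) % n
            plaquette = (link(u[i][j], u[ip][j]) * link(u[ip][j], u[ip][jp])
                         * link(u[ip][jp], u[i][jp]) * link(u[i][jp], u[i][j]))
            total += cmath.phase(plaquette)
    return total / (2 * math.pi)

if __name__ == "__main__":
    for m in (1.0, -1.0, 3.0):
        print(m, round(chern_number(m), 6))
```

The result is an integer up to rounding, and it only changes when the gap closes (in this model at $m = 0$ and $m = \pm 2$, where the method is not applicable). The overall sign depends on orientation conventions.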

Note that in a magnetic field the above is not strictly true. A uniform magnetic field causes orthogonal translations to become non-commutative. We are therefore unable to simultaneously label the states with $k_x$ and $k_y$ (say in 2D). However, at particular values of $B$-field the commutator vanishes: exactly those at which the quantum hall plateaus exist. The TKNN invariant also turns out to be proportional to the transverse conductance. The different integer filling factors are thus proper phases, separated by some complex quantum phase transition. In practice, impurities actually produce most of the observed phenomenon (i.e. the broad plateaus), but these can be thought of as simply fighting against the magnetic non-commutativity and maintaining the phase even for not exactly correct field strengths.

Now, since we are concerned with insulators, we may assume that our Fermi level lies in a gap; this is also true of the integer quantum hall states. The vacuum outside our sample is also such a state. It has, by assumption, a different value for the TKNN invariant. Thus somewhere on the edge, there will be a level crossing, and the gap will close forming a conducting edge state. Thus we see that topological phases will necessarily be accompanied by edge states.

This understanding is still deficient in a very crucial way. The construction above relies strongly on single particle behaviour and momentum (or at least pseudomomentum) being a good quantum number. As mentioned, magnetic fields which are not exactly integer quanta per unit cell cause problems; in fact we get fractional quantum hall effects, where the interaction leads to intrinsically many-body effects. Furthermore, it is not clear that these effects are impervious to impurities. To say that a phase is protected because it is topological is getting things mixed up; we can only declare it to be topological because it is insensitive to impurities. Indeed, in the case of the integer quantum hall effect impurities actually extend the "radius of convergence" of the phase.


Higher derivative theories

We tend to not use higher derivative theories. It turns out that there is a very good reason for this, but that reason is rarely discussed in textbooks. We will take, for concreteness, $L\left(q,\dot q, \ddot q\right)$, a Lagrangian which depends on the 2nd derivative in an essential manner. Inessential dependences are terms such as $q\ddot q$ which may be partially integrated to give $-{\dot q}^2$. Mathematically, essential dependence (non-degeneracy) is expressed through the necessity of being able to invert the expression $$P_2 = \frac{\partial L\left(q,\dot q, \ddot q\right)}{\partial \ddot q},$$ and get a closed form for $\ddot q \left(q, \dot q, P_2 \right)$. Note that usually we also require a similar statement for $\dot q \left(q, p\right)$, and failure in this respect is a sign of having a constrained system, possibly with gauge degrees of freedom.

In any case, the non-degeneracy leads to the Euler-Lagrange equations in the usual manner: $$\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} + \frac{d^2}{dt^2}\frac{\partial L}{\partial \ddot q} = 0.$$ This is then fourth order in $t$, and so requires four initial conditions, such as $q$, $\dot q$, $\ddot q$, $q^{(3)}$. This is twice as many as usual, and so we can get a new pair of conjugate variables when we move into a Hamiltonian formalism. We follow the steps of Ostrogradski, and choose our canonical variables as $Q_1 = q$, $Q_2 = \dot q$, which leads to \begin{align} P_1 &= \frac{\partial L}{\partial \dot q} - \frac{d}{dt}\frac{\partial L}{\partial \ddot q}, \\ P_2 &= \frac{\partial L}{\partial \ddot q}. \end{align} Note that the non-degeneracy allows $\ddot q$ to be expressed in terms of $Q_1$, $Q_2$ and $P_2$ through the second equation, and the first one is only necessary to define $q^{(3)}$.

We can then proceed in the usual fashion, and find the Hamiltonian through a Legendre transform: \begin{align} H &= \sum_i P_i \dot{Q}_i - L \\ &= P_1 Q_2 + P_2 \ddot{q}\left(Q_1, Q_2, P_2\right) - L\left(Q_1, Q_2,\ddot{q}\right). \end{align} Again, as usual, we can take the time derivative of the Hamiltonian to find that it is time independent if the Lagrangian does not depend on time explicitly, and thus it can be identified as the energy of the system.

However, we now have a problem: $H$ has only a linear dependence on $P_1$, and so can be arbitrarily negative. In an interacting system this means that we can excite positive energy modes by transferring energy from the negative energy modes, and in doing so we would increase the entropy — there would simply be more particles, and so a need to put them somewhere. Thus such a system could never reach equilibrium, exploding instantly in an orgy of particle creation. This problem is in fact completely general, and applies to even higher derivatives in a similar fashion.
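To make the linear dependence concrete, consider as an illustration (this particular Lagrangian is an assumption chosen for simplicity, with $\epsilon \neq 0$ and $\omega$ constants) $$L = \frac{1}{2}{\dot q}^2 - \frac{1}{2}\omega^2 q^2 + \frac{\epsilon}{2}{\ddot q}^2.$$ Then $P_2 = \epsilon \ddot q$, which inverts to $\ddot q = P_2/\epsilon$, and the Legendre transform gives \begin{align} H &= P_1 Q_2 + \frac{P_2^2}{\epsilon} - \left( \frac{1}{2} Q_2^2 - \frac{1}{2}\omega^2 Q_1^2 + \frac{P_2^2}{2\epsilon} \right) \\ &= P_1 Q_2 + \frac{P_2^2}{2\epsilon} - \frac{1}{2} Q_2^2 + \frac{1}{2}\omega^2 Q_1^2. \end{align} The only place $P_1$ appears is the term $P_1 Q_2$, so at fixed $Q_2 \neq 0$ the energy can be made as negative as we like by choosing $P_1$ large with the opposite sign, whatever the sign of $\epsilon$.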


Central Limit Theorem

There's a proof of the Central Limit Theorem which I am very fond of, which is not often seen in textbooks. It is a sort of renormalisation argument. In a way, it's not even a rigorous proof — often the way with renormalisation — but in conjunction with the more classical proofs it lends a powerful insight.

Without loss of generality, let's consider a whole bunch of independent, identically distributed variables $X_n$, each with zero mean and unit variance. Then, quite trivially, a sum of $N$ of them will still have zero expectation, and a variance of $N$.

Now, rather than summing all of them at the same time, we do it in steps, and renormalise along the way to see what happens. So concretely, let $X$ have a distribution given by $f$, which is assumed to be sufficiently well-behaved for whatever we need to do. Then the sum of two independent copies has distribution $X_1 + X_2 \sim f \star f$, where $\star$ denotes convolution. This is our coarse-graining step, so we still need to re-scale, so that we get back a unit-variance distribution: $$f^\prime(x) = \sqrt 2 (f \star f)(\sqrt 2 x).$$ Convolutions are easiest to take in Fourier space: $$\widetilde{f^\prime}(k) = \left[ \tilde{f}(k/\sqrt{2}) \right]^2.$$ It is then fairly trivial to check that the unit-variance Gaussian $\widetilde{f^*}(k) = e^{-k^2/2}$ is a fixed point.

We can view this step as a transform on the space of distributions, and so it makes sense to linearise about this fixed point and look at what happens to small perturbations: \begin{align*} \widetilde{f^*}(k) + \widetilde{g_n}(k) &\rightarrow \widetilde{f^*}(k) + 2 \widetilde{g_n}(k/\sqrt{2}) \widetilde{f^*}(k/\sqrt{2}) + \mathrm{h.o.t.} \\ &= \widetilde{f^*}(k) + \lambda_n \widetilde{g_n}(k) \end{align*} This has solutions $\widetilde{g_n}(k) = (ik)^n \widetilde{f^*}(k)$ with eigenvalues $\left|\lambda_n\right| = 2^{1-n/2}$; these correspond to mathematically meaningful perturbations if and only if $n$ is an integer greater than 0, for reasons of convergence and normalisation. That still leaves $n=1$ and $n=2$ as being relevant and marginal respectively; the former corresponds to shifting the mean, the latter to changing the variance. Since those are not allowed by assumption, we find that the Gaussian is a stable fixed point.
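The flow can be watched directly in Fourier space. The sketch below is illustrative only: it assumes a starting distribution that is uniform on $[-\sqrt 3, \sqrt 3]$ (zero mean, unit variance), and the names `uniform_cf`, `rg_step`, `iterate` and `error` are made up for this example. It applies the map $\tilde f(k) \rightarrow [\tilde f(k/\sqrt 2)]^2$ repeatedly and compares with the unit-variance Gaussian $e^{-k^2/2}$.

```python
import math

SQRT2 = math.sqrt(2.0)

def uniform_cf(k):
    """Fourier transform of the uniform density on [-sqrt(3), sqrt(3)]."""
    a = math.sqrt(3.0) * k
    return 1.0 if a == 0.0 else math.sin(a) / a

def gaussian_cf(k):
    """Fourier transform of the zero-mean, unit-variance Gaussian."""
    return math.exp(-k * k / 2.0)

def rg_step(cf):
    """One coarse-grain-and-rescale step: cf(k) -> cf(k / sqrt(2))**2."""
    return lambda k: cf(k / SQRT2) ** 2

def iterate(cf, steps):
    """Apply rg_step the given number of times."""
    for _ in range(steps):
        cf = rg_step(cf)
    return cf

def error(steps, k=1.0):
    """Distance from the Gaussian fixed point at wavenumber k after `steps` steps."""
    return abs(iterate(uniform_cf, steps)(k) - gaussian_cf(k))

if __name__ == "__main__":
    for steps in range(0, 11):
        print(steps, error(steps))
```

Because the starting distribution is symmetric, the $n=3$ perturbation is absent and the leading irrelevant direction is $n=4$, with $|\lambda_4| = 1/2$: the printed error should roughly halve with every step, which is the rate of convergence read off from the linearisation.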

Notice that this does not say anything about the size of basin of attraction, so if another fixed point existed it could cause finite perturbations to actually flow away. Indeed, this makes it not quite a proper proof. On the other hand, this procedure gives the actual rate of convergence to a Gaussian, something that the classical proofs do not give.


Product representation

In dealing with complex analytic functions, it is often quite handy to represent them as their Taylor expansions, i.e. a summation representation. We can generalise a little bit and expand even around non-essential, isolated singularities with a Laurent series. However, quite often we really care about the zeros of a function (e.g. Yang-Lee circle theorem on zeros of the grand partition function for Ising-esque models), and extracting those out of summations is unwieldy. Therefore, it would be much nicer to have a product representation instead.

For polynomials, such a representation is obvious, and unique — the fundamental theorem of algebra guarantees the factorisation into linear factors: $$p(z) = p(0) \prod_{\mathrm{finite~}n} \left(1 - \frac{z}{z_n}\right). $$ The various $z_n$ are then the location of the zeros. We would like to extend this to more general functions.

However, in general this is difficult, and non-unique (see Weierstrass factorization theorem for an existence statement). For one particular subset however, we can create an effective procedure for manufacturing these: functions with only simple isolated zeros and no poles.

So let $f(z)$ be such a function. If $f$ has a zero of order $m$ at $z=0$, then we can divide out by $z^m$ and get something without a zero at the origin, and so without loss of generality that's what we'll assume. Let $z_n$ be the location of remaining (infinite number of) zeros. Then $g(z) = [\ln f(z)]^\prime = f^\prime(z)/f(z)$ has only simple isolated poles with unit residues at $z_n$; thus if we find a summation representation of $g$ we could integrate and exponentiate to find a product representation for $f$.

Now consider: $$\frac{1}{2\pi i} \oint_{\Gamma_n} \frac{g(z^\prime)}{z^\prime-z} dz^\prime = g(z) + \sum^n_j \frac{1}{z_j - z}$$ where $\Gamma_n$ is a contour which encloses $z$ and the closest $n$ poles to the origin. Subtracting the same expression evaluated at $z=0$ gives $$g(z) - g(0) = \frac{z}{2\pi i} \oint_{\Gamma_n} \frac{g(z^\prime)}{z^\prime(z^\prime-z)} dz^\prime + \sum_j^n \left(\frac{1}{z-z_j} + \frac{1}{z_j}\right).$$ Thus if we can find a sequence of contours $\Gamma_n$ such that $g$ remains bounded on them, the integral will converge to zero as $n \rightarrow \infty$. In that case, we find $$g(z) = g(0) + \sum_n \left(\frac{1}{z-z_n} + \frac{1}{z_n}\right).$$

So now we can return to factorising $f$. Integrating $g$ gives $$\ln f(z) - \ln f(0) = cz + \sum_n \left[ \ln\left(1-\frac{z}{z_n}\right) + \frac{z}{z_n} \right]$$ where $c = g(0)$; re-exponentiating gives $$f(z) = f(0) e^{cz} \prod_n \left(1 - \frac{z}{z_n}\right) e^{z/z_n}.$$

As an example, consider $f(z) = \sin(z)/z$. Then $g(z) = -1/z + \cot(z)$; we can pick contours $\Gamma_n$ which sit between the poles in $\cot z$, and our procedure will converge. A bit of limit work shows that $g(0) = 0$ and $f(0) = 1$. The zeros sit at $z_n = n \pi$, with $n$ being any non-zero integer. Thus we find \begin{align*} \frac{\sin(z)}{z} &= \prod_{n \neq 0} \left(1 - \frac{z}{\pi n}\right) e^{z/n\pi} \\ &= \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{\pi^2 n^2}\right). \end{align*}
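The final product is easy to check numerically. This is a small sketch with illustrative names (`sinc_product`, and the cut-off `terms` are not from the text) that compares a truncated product with $\sin(z)/z$ for real $z$.

```python
import math

def sinc_product(z, terms):
    """Truncated product prod_{n=1}^{terms} (1 - z^2 / (pi n)^2)."""
    p = 1.0
    for n in range(1, terms + 1):
        p *= 1.0 - (z / (math.pi * n)) ** 2
    return p

if __name__ == "__main__":
    z = 1.0
    exact = math.sin(z) / z
    for terms in (10, 1000, 100000):
        print(terms, sinc_product(z, terms), exact)
```

The convergence is slow: the neglected factors differ from one by terms of order $z^2/\pi^2 n^2$, so the relative error after $N$ factors falls off only like $z^2/(\pi^2 N)$.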


$\mathbb{C} \succ \mathbb{R}$

Suppose I have two functions $f$ and $g$, on the real numbers (or some compact interval if it makes you feel warmer inside). Is it reasonable to expect that if their derivatives match at some point, i.e. $f(x)=g(x)$, $f^\prime(x)=g^\prime(x)$, and so on, then they are equal? Furthermore, suppose this is true for all values $y \le x$?

As a counter-example, consider the function: $$f(x) = \begin{cases}0 & x \le 0 \\ \exp(-1/x) & x \gt 0 \end{cases}$$ This function is continuous at $x=0$, and its derivatives there are all zero. In other words, at $x=0$ it "exactly looks like" the constant function $g(x)=0$.


In a way, it seems perverse that we can't "sense" the impending rise as we move through the origin. Another way to say it is that the derivatives of $f$ are well defined, and so can be used to form a Taylor series; however, the function does not equal its Taylor expansion, even though the latter exists.
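A numerical illustration of how flat the function is at the origin, as a minimal sketch (the helper names `f` and `ratio` are chosen for this example): for any fixed power $n$, the ratio $f(x)/x^n$ still tends to zero as $x \rightarrow 0^+$, which is what forces every Taylor coefficient at the origin to vanish, while the function itself is plainly non-zero a little further out.

```python
import math

def f(x):
    """The counter-example: 0 for x <= 0, exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def ratio(x, n):
    """f(x) / x^n, which tends to 0 as x -> 0+ for every fixed n."""
    return f(x) / x ** n

if __name__ == "__main__":
    for x in (0.1, 0.05, 0.01):
        print(x, [ratio(x, n) for n in (1, 5, 10)])
    print("f(0.5) =", f(0.5))
```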

This all makes a bit more sense when discussed in the context of complex analysis. The function $\exp(-1/z)$, regarded as a function over $\mathbb{C}$, has an essential singularity at $z=0$. This is an example of the fact that although smooth functions over the reals seem nicer from an elementary point of view, smooth functions over the complex numbers enjoy more "globally" nice properties, e.g. over $\mathbb{C}$, being differentiable on an open set already guarantees a convergent Taylor expansion there (see Wikipedia's entry on this theorem).


Let's get serious

The intention of this blog is to have a record of interesting papers, ideas and discussions. Regular features will probably be David's Fairy Tales, TCM seminars, correlated systems lunches and maybe even some biology.

A central part of this blog will be the existence of maths, so here are some rough tests. Inline: $\sqrt{1-\xi^2}$. Display: $$\zeta(2) = \sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}.$$