
Maximum Capacity of Hopfield Model

A Simple Way to Calculate the Maximum Capacity of the Hopfield Model

Here I follow the approach proposed by Peretto (1988). I also assume readers have basic knowledge of the Hopfield model; otherwise you may find this post difficult to follow.

Suppose we use \( N \) neurons to store \( p \) states, where \( N, p \to \infty \), and define \( \alpha = \frac{p}{N} \).

$$ S_i(t+1) = 1 \text{ with prob } \propto \exp\left(\beta \sum_j J_{ij} S_j\right) $$

$$ S_i(t+1) = -1 \text{ with prob } \propto \exp\left(-\beta \sum_j J_{ij} S_j\right) $$
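Normalizing the two Boltzmann factors above gives a logistic acceptance probability, \( P(S_i = +1) = 1/(1 + e^{-2\beta h_i}) \) with local field \( h_i = \sum_j J_{ij} S_j \). Here is a minimal Python sketch of one such stochastic update (my own illustration, not part of the derivation):

```python
import math
import random

def glauber_update(h, beta, rng=random):
    """One stochastic spin update given the local field h = sum_j J_ij * S_j.

    P(S = +1) is proportional to exp(beta*h) and P(S = -1) to exp(-beta*h);
    normalizing gives the logistic form 1 / (1 + exp(-2*beta*h)).
    """
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * h))
    return 1 if rng.random() < p_plus else -1

# the empirical mean over many updates should approach tanh(beta*h)
random.seed(0)
beta, h = 1.0, 0.5
avg = sum(glauber_update(h, beta) for _ in range(20000)) / 20000
print(avg)  # close to tanh(0.5)
```

Averaging the two outcomes analytically, \( (+1)p_+ + (-1)p_- = \tanh(\beta h) \), which is exactly the expectation value used next.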

The expectation value is $$ \langle S_i \rangle = \left\langle \tanh\left(\beta \sum_j J_{ij} S_j\right) \right\rangle $$

Applying mean-field theory, we get $$ \langle S_i \rangle = \tanh\left(\beta \sum_j J_{ij} \langle S_j \rangle\right) $$

This is a closed system of equations for the thermally averaged spins. From the Hebbian learning rule we know $$ J_{ij} = \frac{1}{N} \sum_\mu \xi_i^\mu \xi_j^\mu $$
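As a concrete sanity check on the Hebbian rule, here is a pure-Python sketch (my own illustration; the network size and pattern count are arbitrary choices, kept far below capacity) showing that a stored pattern is a fixed point of the zero-temperature dynamics:

```python
import random

random.seed(0)
N, p = 200, 5  # alpha = p/N = 0.025, well below the capacity derived below

# p random binary patterns xi[mu][i] in {-1, +1}
xi = [[random.choice([-1, 1]) for _ in range(N)] for _ in range(p)]

# Hebbian couplings J_ij = (1/N) sum_mu xi_i^mu xi_j^mu (zero diagonal)
J = [[0.0 if i == j else sum(xi[mu][i] * xi[mu][j] for mu in range(p)) / N
      for j in range(N)]
     for i in range(N)]

def update(S):
    # one synchronous zero-temperature sweep: S_i <- sign(sum_j J_ij S_j)
    return [1 if sum(J[i][j] * S[j] for j in range(N)) >= 0 else -1
            for i in range(N)]

# a stored pattern should be (very nearly) a fixed point of the dynamics
S = update(xi[0])
overlap = sum(a * b for a, b in zip(S, xi[0])) / N
print(overlap)  # close to 1
```

The printed overlap is the quantity \( m_\nu \) defined just below, evaluated for the retrieved pattern.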

Then, define overlaps between spins and patterns, which is our first order parameter, by $$ m_\nu = \frac{1}{N} \sum_i \xi_i^\nu \langle S_i \rangle $$

Combining the equations above, and using \( \xi_i^\nu \tanh(x) = \tanh(\xi_i^\nu x) \) for \( \xi_i^\nu = \pm 1 \) to move the pattern bit inside, we get $$ m_\nu = \frac{1}{N} \sum_i \tanh\left(\beta \sum_\mu \xi_i^\mu \xi_i^\nu m_\mu\right) $$

Here we make an important assumption: we retrieve a single pattern \( \xi^1 \), and the other patterns contribute noise. Since this noise is a sum of many small overlaps, it adds up to a Gaussian. For \( \nu = 1 \), $$ m_1 = \frac{1}{N} \sum_i \tanh\left(\beta \sum_\mu \xi_i^1 \xi_i^\mu m_\mu\right) = \int \frac{dz}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} \tanh\left(\beta(m_1 + \sqrt{\alpha r}\, z)\right) $$

where \( r \), known as the random overlap parameter, is defined as $$ r = \frac{1}{\alpha} \sum_{\nu \ne 1} m_\nu^2 $$

To calculate \( r \), we first calculate \( m_\nu \) for \( \nu \ne 1 \): $$ m_\nu = \frac{1}{N} \sum_i \tanh\left(\beta \sum_\mu \xi_i^\nu \xi_i^\mu m_\mu\right) = \frac{1}{N} \sum_i \xi_i^\nu \tanh\left(\beta \left( \xi_i^1 m_1 + \sum_{\mu \ne 1} \xi_i^\mu m_\mu \right)\right) $$

Expanding the \( \tanh \) to first order in the small overlap \( m_\nu \) (which is \( O(1/\sqrt{N}) \) for \( \nu \ne 1 \)) and solving for \( m_\nu \), we get $$ m_\nu = \frac{N^{-1} \sum_i \xi_i^\nu \xi_i^1 \tanh\left(\beta(m_1 + \sum_{\mu \ne 1, \nu} \xi_i^\mu \xi_i^1 m_\mu)\right)}{1 - \beta(1 - q)} $$ where $$ q = \int \frac{dz}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} \tanh^2\left(\beta(m_1 + \sqrt{\alpha r}\, z)\right) $$

After getting \( m_\nu \), we square it in order to calculate \( r \): $$ m_\nu^2 = N^{-2}[1 - \beta(1 - q)]^{-2} \sum_{i,j} \xi_i^\nu \xi_i^1 \xi_j^\nu \xi_j^1 \tanh\left(\beta(m_1 + \sum_{\mu \ne 1, \nu} \xi_i^\mu \xi_i^1 m_\mu)\right) \tanh\left(\beta(m_1 + \sum_{\mu \ne 1, \nu} \xi_j^\mu \xi_j^1 m_\mu)\right) $$

Averaging over the random patterns \( \xi \), the factor \( \xi_i^\nu \xi_j^\nu \) reduces to \( \delta_{ij} \), hence we get $$ m_\nu^2 = N^{-2}[1 - \beta(1 - q)]^{-2} \sum_i \tanh^2\left(\beta(m_1 + \sum_{\mu \ne 1, \nu} \xi_i^\mu \xi_i^1 m_\mu)\right) $$

Finally, since all \( p = \alpha N \) non-retrieved patterns are statistically equivalent and \( \frac{1}{N} \sum_i \tanh^2(\cdot) \to q \), \( r \) can be written as $$ r = \frac{1}{\alpha} \sum_{\nu \ne 1} m_\nu^2 \approx \frac{1}{\alpha} \sum_\nu m_\nu^2 = N m_\nu^2 = \frac{q}{[1 - \beta(1 - q)]^2} $$

Together with the two earlier equations $$ m_1 = \int \frac{dz}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} \tanh\left(\beta(m_1 + \sqrt{\alpha r}\, z)\right) $$ $$ q = \int \frac{dz}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} \tanh^2\left(\beta(m_1 + \sqrt{\alpha r}\, z)\right) $$ this closes the system: three equations for the three unknowns \( m_1 \), \( q \), and \( r \).
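At finite temperature this system has no closed form, but it can be solved by damped fixed-point iteration with a simple numerical Gaussian average. A sketch I wrote for illustration (the quadrature grid, damping factor, and iteration count are my own arbitrary choices):

```python
import math

def gauss_avg(g, n=400, zmax=8.0):
    # trapezoidal estimate of the Gaussian average of g:
    # integral of dz * exp(-z^2/2) * g(z) / sqrt(2*pi) over [-zmax, zmax]
    dz = 2.0 * zmax / n
    total = 0.0
    for k in range(n + 1):
        z = -zmax + k * dz
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-0.5 * z * z) * g(z)
    return total * dz / math.sqrt(2.0 * math.pi)

def solve(alpha, beta, iters=300, damp=0.5):
    # damped fixed-point iteration of the coupled equations for m1, q, r
    m, q, r = 1.0, 1.0, 1.0
    for _ in range(iters):
        s = math.sqrt(alpha * r)
        m_new = gauss_avg(lambda z: math.tanh(beta * (m + s * z)))
        q_new = gauss_avg(lambda z: math.tanh(beta * (m + s * z)) ** 2)
        m = damp * m + (1.0 - damp) * m_new
        q = damp * q + (1.0 - damp) * q_new
        r = q / (1.0 - beta * (1.0 - q)) ** 2
    return m, q, r

print(solve(0.05, 4.0))  # in the retrieval phase, m1 stays close to 1
```

In the \( \alpha \to 0 \) limit the noise term vanishes and the first equation reduces to \( m_1 = \tanh(\beta m_1) \), a useful check on the solver.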

In the zero-temperature limit of the Hopfield model, \( T \to 0 \) (\( \beta \to \infty \)), these equations reduce to: $$ m_1 = \text{erf}\left(\frac{m_1}{\sqrt{2\alpha r}}\right), \quad q = 1 - \frac{1}{\beta} \sqrt{\frac{2}{\pi \alpha r}} \exp\left(-\frac{m_1^2}{2\alpha r}\right) $$
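The error function arises because \( \tanh(\beta x) \to \text{sgn}(x) \) as \( \beta \to \infty \), so the Gaussian average of the \( \tanh \) becomes the Gaussian average of a sign:

$$ \int \frac{dz}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} \, \text{sgn}\left(m_1 + \sqrt{\alpha r}\, z\right) = \int_{-m_1/\sqrt{\alpha r}}^{\infty} \frac{dz}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} - \int_{-\infty}^{-m_1/\sqrt{\alpha r}} \frac{dz}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} = \text{erf}\left(\frac{m_1}{\sqrt{2\alpha r}}\right) $$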

Setting \( y = \frac{m_1}{\sqrt{2\alpha r}} \) and eliminating \( r \) via \( r = q/[1 - \beta(1 - q)]^2 \) with \( q \to 1 \), we obtain: $$ y = \frac{\text{erf}(y)}{\frac{2}{\sqrt{\pi}} e^{-y^2} + \sqrt{2\alpha}} $$

When \( \alpha < \alpha_C \approx 0.138 \), this equation has a nontrivial solution \( y \ne 0 \) corresponding to retrieval states; when \( \alpha > \alpha_C \), only the trivial solution \( y = 0 \) remains.
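One can check this number by bisecting on \( \alpha \) for the point where the nontrivial root disappears (a small script I wrote for verification; the search grids and brackets are arbitrary choices):

```python
import math

def f(y, alpha):
    # residual of y * ((2/sqrt(pi)) * exp(-y^2) + sqrt(2*alpha)) = erf(y)
    return y * ((2.0 / math.sqrt(math.pi)) * math.exp(-y * y)
                + math.sqrt(2.0 * alpha)) - math.erf(y)

def has_retrieval_solution(alpha):
    # a nontrivial root y > 0 exists iff the residual dips below zero,
    # since f starts positive near y = 0 and grows like y*sqrt(2*alpha)
    ys = [0.01 * k for k in range(1, 500)]
    return min(f(y, alpha) for y in ys) < 0.0

# bisect on alpha for the point where the nontrivial solution disappears
lo, hi = 0.05, 0.2
for _ in range(40):
    mid = 0.5 * (lo + hi)
    if has_retrieval_solution(mid):
        lo = mid
    else:
        hi = mid

print(f"alpha_c ~ {0.5 * (lo + hi):.4f}")  # close to 0.138
```

At the critical point the retrieval solution disappears discontinuously (the two nonzero roots merge), which is why a simple scan for a sign dip, rather than root-finding from zero, locates it.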

Haaa! We finally have the maximum capacity of the Hopfield model! If you want a fancier approach, look up the replica method!