Legendre Memory Unit (LMU)

What are LMUs?

The Legendre Memory Unit, or LMU, is a new kind of recurrent neural network that lies at the heart of Applied Brain Research’s time series technology. With the LMU, we have been achieving state-of-the-art results on a wide variety of tasks, ranging from language understanding to advanced signal processing. The aim of this post is to provide a non-technical explanation of how it works, why it’s unique, and who it’s most useful for.

Background of the LMU

The LMU is meant to solve time series problems – that is, problems where the order of data is important for understanding it. For example, this would be the case if you wanted to summarize long passages of text, or work with noisy measurements of someone’s pulse.

The challenge in dealing with such data is one of memory. To identify patterns or make good inferences about the future, it’s crucial to remember things about the past. Take the image below, for instance, and suppose we want to predict the growth of what is being measured in the graph. To do that, we would have to analyze how that quantity has behaved over time. Extrapolating from the past activity, we could then forecast the future activity.

Figure 1. A time series prediction problem.

This seemingly simple task is complicated by the fact that even a short continuous signal can carry an infinite amount of information. So, we face the issue of how to represent information accurately, but with finite resources. The LMU is designed to do exactly that.

Time series problems have traditionally been tackled with recurrent neural networks, which can feed their own output back into themselves as input. At each step, this lets the network pass along what it has already processed, so it can keep track of the order of the data. Because only this running summary is carried forward, rather than the entire history, it is an efficient way to process time series data.

Figure 2. A recurrent neural network. Neural networks like this one can feed their own output back into themselves as input, allowing them to have a memory.
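
For readers who like to see ideas in code, here is a minimal sketch of that feedback loop. It is a toy with a single number as its memory, not the LMU or any production network; the function names and the weights `w_state` and `w_input` are made up for illustration.

```python
import math


def rnn_step(state, x, w_state=0.9, w_input=0.5):
    """One recurrent update: mix the previous state with the new input."""
    # The weights are arbitrary illustrative values, not trained parameters.
    return math.tanh(w_state * state + w_input * x)


def run_rnn(inputs, state=0.0):
    """Feed a sequence through the toy network, one value at a time."""
    states = []
    for x in inputs:
        # The previous output is fed back in alongside the next input.
        state = rnn_step(state, x)
        states.append(state)
    return states


# The same values in a different order leave the network in a different state.
forward = run_rnn([1.0, 0.0, 0.0])
reverse = run_rnn([0.0, 0.0, 1.0])
```

Both sequences contain the same numbers, yet the final states differ because the state depends on when each input arrived. That sensitivity to order is what gives a recurrent network its memory.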

Long ago (in AI years) a type of recurrent neural network called long short-term memory, or LSTM, was the best game in town for dealing with time series problems. In more modern AI times, a different class of neural networks known as transformers has taken over, thanks to their ability to learn complex time series tasks well (no doubt you’ve heard of ChatGPT). Transformers are not recurrent neural networks; instead, they look at a full sequence of data all at once.

Both approaches have serious limitations. LSTMs are difficult to train at large scales, and cannot process long sequences effectively. Transformers, on the other hand, store and process information in ways that are computationally demanding. The LMU advances beyond both, boasting the efficiency of a recurrent neural network and the trainability of a transformer without the downsides of either.

State-of-the-art performance with the LMU

The secret behind the LMU’s superiority is how it handles compression. What the LMU does, in essence, is reduce one’s data set to a summary form that is more manageable. Importantly, for any given set of computational resources, the compression it performs can be mathematically proven to be optimal for time series.

Figure 3. An instance of data compression. After compression, the same data is represented in an abbreviated form. Depending on the task and resources available, the granularity of the compressed data will vary.

In other words, the LMU does the best possible job of remembering data given the constraints it has to work under. Few neural networks offer guarantees of this kind, which is part of what makes the LMU special.
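
To give a feel for what compressing a signal into a summary form can look like, the sketch below squeezes a window of 200 samples into 8 numbers by projecting it onto Legendre polynomials, the family the LMU is named after. This is only an illustration of the general idea under our own simplifying assumptions: it works on a fixed window rather than updating continuously as data arrives, and the function names, the example signal, and the choice of 8 coefficients are all hypothetical. It is not the LMU’s actual implementation.

```python
import math


def legendre_basis(x, order):
    """Return [P_0(x), ..., P_{order-1}(x)] using Bonnet's recurrence."""
    values = [1.0, x][:order]
    for k in range(1, order - 1):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values


def compress(samples, order):
    """Project a window of samples onto the first `order` Legendre polynomials."""
    n = len(samples)
    coeffs = [0.0] * order
    for i, value in enumerate(samples):
        x = -1.0 + (2 * i + 1) / n  # midpoint of the i-th cell on [-1, 1]
        for k, p in enumerate(legendre_basis(x, order)):
            # Midpoint-rule estimate of (2k + 1) / 2 * integral of f(x) P_k(x) dx.
            coeffs[k] += (2 * k + 1) / 2 * value * p * (2 / n)
    return coeffs


def reconstruct(coeffs, n):
    """Rebuild n samples of the window from its compressed coefficients."""
    out = []
    for i in range(n):
        x = -1.0 + (2 * i + 1) / n
        basis = legendre_basis(x, len(coeffs))
        out.append(sum(c * p for c, p in zip(coeffs, basis)))
    return out


# A made-up signal: one cycle of a sine wave on top of a slow upward trend.
n = 200
window = [math.sin(2 * math.pi * (i + 0.5) / n) + 0.5 * (i + 0.5) / n
          for i in range(n)]
coeffs = compress(window, order=8)   # 200 samples summarized by 8 numbers
approx = reconstruct(coeffs, n)
max_error = max(abs(a - b) for a, b in zip(window, approx))
```

Raising `order` keeps more detail at the cost of more memory, while lowering it gives a coarser summary. That is the same trade-off between granularity and resources described in Figure 3.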

As a result, the LMU is highly accurate. With the same data set and model size, it outperforms rival time series AI. Alternatively, the LMU can match its competition with less data or a smaller model (or sometimes both).

Thus, the LMU can promise substantial improvements to larger systems, and also excels in contexts with limited memory or compute, such as small offline devices. Imagine full natural language processing on a wearable device that isn’t connected to the cloud, for example. Wherever the LMU is implemented, it does more with less, bringing down power demands and energy costs.

The LMU is therefore ideal for a wide variety of users working with time series. Indeed, it has already been put to effective use in speech recognition for hospitals and smart wearables processing health data.

The LMU is for you if:

  • You are dealing with time series data and looking to make accuracy gains. Problems like long time sequences and noisy data are our speciality.
  • You have concerns about the power consumption and energy costs of your large AI models. The bigger your model, the more you could be saving with an LMU-based alternative.
  • You want the best possible performance from your smaller, well-curated data sets.
  • You are looking to run highly functional AI in resource-constrained settings.
  • You want to run your cloud model locally to increase privacy, decrease latency, and avoid ongoing fees from computing in the cloud.

To start leveraging our algorithmic advantage in your own technology and products, send us a message!


Sr Research Scientist II

Travis has a PhD in systems design engineering with a focus on computational neuroscience from the University of Waterloo. He is one of the co-founders of ABR, and is an expert in designing neuromorphic and edge AI applications and in building computational models of the motor control system in the brain.

Strategy Analyst

Christian brings a diverse academic background from institutions like the University of Toronto and University of Waterloo. He specializes at the intersection of science, technology, and ethics, and his role at ABR focuses on market strategy and on making intricate AI concepts, like Legendre Memory Units (LMUs), accessible to a wider audience.