We sat down with Dr. Madeleine Bartlett, a researcher in the field of human-robot interaction, to delve into her paper, "Estimating Levels of Engagement for Social Human-Robot Interaction using Legendre Memory Units." Madeleine's work with Legendre Memory Units (LMUs) has opened new avenues in the realm of human-robot interaction and artificial intelligence. Join us as we explore her journey, the challenges she encountered, and the implications of her research, shedding light on the role of LMUs in revolutionizing the way we understand and employ technology.
My name is Madeleine Bartlett, and my journey into the world of research began at the University of Exeter in the UK, where I pursued both my bachelor's and master's degrees in Psychology. This period marked the beginning of my fascination with human and animal learning, and more specifically, the application of computational models in understanding cognitive processes. This fascination guided me towards the intersection of human-robot interaction and human-computer interaction. I saw immense potential in exploring how our comprehension of human and animal information processing could enhance the design of intelligent technologies.
The transition from experimental psychology to technology was a natural progression in my academic journey. After obtaining my master's degree, I pursued my doctoral research at the University of Plymouth, where I had the opportunity to work with LMUs. This experience allowed me to delve deeper into human-robot interaction and signal processing. It was during my postdoctoral research at the University of Waterloo that I further explored the field of reinforcement learning, which opened up a whole new world of opportunities for me. I began to see how it was possible to transpose our human social abilities onto artificial agents, like robots and computers.
As social beings, we constantly interpret the emotional and engagement states of those around us. We pick up on visual cues and other subtle signals to understand if someone is interested in a conversation or how they are feeling. It was this nuanced skill that I wanted to embed within artificial agents. The primary goal was to understand levels of engagement.
My research essentially builds on observable behavioural cues. Although the primary focus is on engagement rather than emotional states, I firmly believe it's a step towards creating a world where artificial intelligence can truly understand and react to human social cues. The exploration of this idea is what propelled my doctoral research.
The primary objective of our study was to create a classifier capable of recognizing a student's level of engagement in a game played on a tabletop device. Our source of information was visual cues captured on video, including body postures, movements, and some details of facial expressions. One of our key areas of interest was to see whether we could minimize the volume of training data required for a network to perform this task.
This possibility stemmed from the understanding that many internal states, such as engagement, operate on a continuum of intensity. You can visualize this as a scale where on one end, a person is deeply engrossed in a task, while on the other, they're barely involved, and of course, there are various levels in between. Our aim was to exploit this continuum of intensity in gauging engagement.
At the time of this study, many of my colleagues were working on implementing robots in classroom environments. This influenced my decision to focus on this particular area, as there was an increasing interest in using robots as tools to assist teachers and provide tailored learning experiences.
These classroom robots are envisioned as more than just pre-programmed machines with a 'one-size-fits-all' approach.
The ideal is to develop robots that can adapt in real-time to a student's needs. For instance, if a student appears bored, the robot could re-engage their attention, or if they are struggling with a task, it could provide an easier challenge or offer hints.
This responsive adaptability could revolutionize the learning experience offered by robotic/artificial tutors, making it more personalized and effective. It's important to clarify that these robots are tools for teachers, not replacements. Bridging the gap between human interaction and technological intervention in education was the core focus of our study.
Legendre Memory Units, or LMUs, are a type of recurrent neural network exceptionally adept at representing continuous-time input signals. Imagine these units as a vault storing the time history of an input signal over a moving window, regardless of the duration.
In our research, we utilized LMUs to capture a representation of events transpiring in our videos. Instead of merely focusing on the current frame, the LMU would output a vector carrying information about the entire three-second window, thereby incorporating historical context into our data.
This vector was then fed into a multi-layer perceptron, which was assigned the task of classifying student engagement levels based on the provided information.
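To make the pipeline described above concrete, here is a minimal sketch in NumPy. The LMU state-space matrices follow the standard construction from Voelker et al.; everything else (one input feature, 30 fps sampling, a 12-dimensional memory, three engagement classes, random untrained MLP weights) is an illustrative assumption, not the configuration used in the paper.

```python
import numpy as np

def lmu_matrices(order, theta):
    """Continuous-time LMU state-space matrices.

    The memory vector m(t) holds Legendre-polynomial coefficients that
    approximate the input over a sliding window of length theta.
    """
    q = np.arange(order)
    col, row = np.meshgrid(q, q)  # row[i, j] = i, col[i, j] = j
    A = np.where(row < col, -1.0, (-1.0) ** (row - col + 1)) * (2 * row + 1)
    B = ((-1.0) ** q * (2 * q + 1)).reshape(-1, 1)
    return A / theta, B / theta

# Hypothetical setup: one engagement-related feature sampled at 30 fps,
# compressed into a 12-dimensional memory of the last 3 seconds.
order, theta, dt = 12, 3.0, 1.0 / 30.0
A, B = lmu_matrices(order, theta)

rng = np.random.default_rng(0)
signal = rng.standard_normal(300)     # 10 s of a toy input feature
m = np.zeros((order, 1))
for u in signal:
    m = m + dt * (A @ m + B * u)      # Euler-discretized memory update

# m now summarizes the last ~3 s of input. A small MLP maps it to
# engagement classes (weights are random placeholders, not trained).
W1 = rng.standard_normal((32, order)); b1 = np.zeros((32, 1))
W2 = rng.standard_normal((3, 32));     b2 = np.zeros((3, 1))
h = np.maximum(0.0, W1 @ m + b1)      # ReLU hidden layer
logits = W2 @ h + b2
probs = np.exp(logits - logits.max())
probs /= probs.sum()                  # softmax over 3 engagement levels
```

The key point the sketch illustrates is that the classifier never sees raw frames: it only ever sees the fixed-size memory vector, which carries the whole window's history.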
The choice to use LMUs was primarily driven by two factors. First, their ability to represent windows of time was invaluable. We were keen to explore whether incorporating a history of engagement—rather than relying solely on individual frames—would lead to a more precise classification system.
Second, LMUs are lightweight in terms of computational demands. Given that our research operates in the field of human-robot interaction, it's crucial to consider the possibility of embedding this technology within a robot. In the future, we want our robots to function autonomously without the need to constantly communicate with a larger external computer.
Owing to their compactness, LMUs promise great potential for application in low-power systems. Their capacity to represent time windows and their lightweight nature make them a pivotal tool in our research.
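A quick back-of-the-envelope calculation shows why this compactness matters. All the numbers below (frame rate, feature count, LMU order) are illustrative assumptions chosen for the arithmetic, not figures from the paper.

```python
# Buffering a raw 3-second window vs. holding an LMU's fixed-size memory.
fps = 30          # assumed video frame rate
window_s = 3      # length of the history window
features = 100    # assumed pose/expression features per frame
lmu_order = 12    # assumed Legendre coefficients kept per feature

raw_values = fps * window_s * features  # values a naive buffer must store
lmu_values = lmu_order * features       # values the LMU state holds

print(raw_values, lmu_values)           # 9000 vs 1200
compression = raw_values / lmu_values   # 7.5x smaller state
```

Under these assumptions the LMU state is several times smaller than the raw buffer, and crucially its size is fixed: lengthening the window changes `theta`, not the amount of memory the robot has to carry.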
The success of our model is, to a large extent, attributable to the LMUs. By offering us a historical timeline of engagement rather than isolated frames, LMUs significantly enhanced the performance of our model. We validated this enhancement by comparing the model's performance when fed with individual frames versus when it utilized the historical information provided by the LMUs.
Interestingly, the use of LMUs not only improved our model's accuracy but also cut the training and testing time by half. This outcome was an unexpected but welcome surprise. A model that requires less time and energy for training becomes significantly more efficient and practical.
Thus, the incorporation of LMUs offered dual benefits: it not only enhanced the accuracy of our model but also contributed to its overall efficiency.
Incorporating LMUs into our model presented its own set of challenges, but I was fortunate to have a strong support system. I was in direct communication with Dr. Terry Stewart, a co-author on our paper, who was deeply involved in the development of the LMUs. His insights and guidance were invaluable in resolving any issues we encountered.
Personally, the most intimidating aspect was the complexity of the mathematical object that the LMU represents, especially given that I don't have a particularly strong background in mathematics. The major challenge was not just understanding what the LMUs were, but also discerning their usefulness in our context and anticipating what they could potentially achieve.
Once I developed this understanding, the process became significantly smoother. I cannot overstate the importance of Dr. Stewart's assistance throughout this journey. His expertise and support were instrumental in successfully incorporating the LMUs into our model and ultimately achieving our research objectives.
The key contribution of our work, I believe, lies in the lightweight nature of the network. This means it runs swiftly and could potentially be implemented on a robot directly, although we didn't test this specifically in our research. What makes this particularly exciting is that because LMUs are part of the Neural Engineering Framework, they can be implemented in spiking neurons and hosted on neuromorphic hardware, which are renowned for their low power consumption.
This opens up immense possibilities for autonomous systems that can function in real-world environments.
The use of spiking neural models on neuromorphic hardware holds great promise for developing adaptive, energy-efficient, and autonomous robots, and is an active area of research. I find this potential for future work tremendously exciting.
I believe the most immediate application of our research would be in the realm of neuromorphic hardware, driving robots and other artificial systems. I'm particularly intrigued by the potential of a teaching robot that can adapt in real-time to a student's engagement level or the difficulty they're experiencing with a task.
This ability to dynamically adjust the learning experience could greatly enhance educational outcomes, personalizing the teaching process to each individual learner's needs and responses. One of the aspects I most enjoy about human-robot and human-computer interaction research is its tangible connection to real-world settings.
The practical implications of our findings are quite evident and exciting. The potential to directly impact real-world scenarios makes this field of research not only stimulating but also immensely rewarding.
Certainly. My future research plans are rooted in a return to human and animal learning and cognition. I'm keen to investigate how Legendre Memory Units can assist in processing continuous-time inputs for such models. LMUs represent a step towards more biologically plausible models of cognition, as they facilitate operation in continuous time in a manner that mirrors how humans and animals learn and interact.
In addition, the prospect of using LMUs in neuromorphic hardware to drive artificial systems holds immense fascination for me. I see it as a compelling pathway to explore. Through this, I plan to continue leveraging the strengths of LMUs, aiming to improve their use in creating more responsive, efficient, and potentially autonomous systems in the future.