New University of Maryland research could fundamentally improve the ability of artificial intelligence to control how robots and other “agents” translate what they know and sense into what they do.
When a baseball or softball player hits a fastball, their brain (human intelligence) seamlessly, and almost instantly, combines sensory input (sight and sound information about the pitcher’s release, the speed and movement of the ball) with knowledge (memories of the pitcher’s tendencies, information on the batter’s own abilities, the number of balls and strikes, the game situation, etc.) to send nerve signals (motor commands) to muscles, resulting in a powerful mid-air impact of bat and ball.
Even with the best current AI and sensory capabilities, a robot facing the same pitcher would have no chance. Today’s AI relies on a linkage system that slowly coordinates sensor data and stored data with the robot’s motor capabilities and actions. In addition, robots can’t remember anything.
However, for robots with big league aspirations, hope may be found in a new paper by University of Maryland researchers that was just published in the journal Science Robotics. Their work introduces a new way of combining, or integrating, AI perception and motor commands using what’s called hyperdimensional computing theory.
The authors (UMD computer science Ph.D. students Anton Mitrokhin and Peter Sutor, Jr.; Cornelia Fermüller, an associate research scientist with the University of Maryland Institute for Advanced Computer Studies; and Computer Science Professor Yiannis Aloimonos) say that such integration is the most important challenge facing the robotics field, and their new paper marks the first time that perception and action have been integrated.
Currently, a robot’s sensors and the actuators that move it are separate systems, linked together by a central learning mechanism that infers a needed action given sensor data, or vice versa. This cumbersome three-part AI system—each part speaking its own language—is a slow way to get robots to accomplish sensorimotor tasks. The next step in robotics will be to integrate a robot’s perceptions with its motor capabilities. This fusion, known as “active perception,” would provide a more efficient and faster way for the robot to complete tasks.
Hyperdimensional Computing for Active Perception and Memory
In the authors’ new hyperdimensional computing theory, a robot’s operating system would be based on hyperdimensional binary vectors (HBVs), which exist in a sparse and extremely high-dimensional space. HBVs can represent disparate discrete things (for example, a single image, a concept, a sound or an instruction), sequences made up of discrete things, and groupings of discrete things and sequences. They can account for all these types of information in a meaningfully constructed way, binding each modality together in long vectors of 1s and 0s with equal dimension. In this system, action possibilities, sensory input and other information occupy the same space, are in the same language, and are fused, creating a kind of memory for the robot.
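To make the idea of a shared space concrete, here is a minimal Python sketch. It is not the authors’ implementation. It assumes random 10,000-bit vectors and uses bitwise XOR as the binding operation, a common choice in hyperdimensional computing. The function names (random_hbv, bind, hamming) and the stand-in vectors (image, command) are hypothetical and chosen only for illustration.

```python
import random

D = 10_000  # assumed vector length in bits; not a value taken from the paper


def random_hbv(rng):
    """Return a random D-bit vector, stored as a Python int."""
    return rng.getrandbits(D)


def bind(a, b):
    """Bind two vectors with bitwise XOR; the result is still D bits long."""
    return a ^ b


def hamming(a, b):
    """Normalized Hamming distance: the fraction of bit positions that differ."""
    return bin(a ^ b).count("1") / D


rng = random.Random(0)
image = random_hbv(rng)    # hypothetical stand-in for an encoded sensor reading
command = random_hbv(rng)  # hypothetical stand-in for an encoded motor command

# Perception and action fused into one vector of the same length.
fused = bind(image, command)

# XOR is its own inverse, so binding again with the image returns the command.
recovered = bind(fused, image)

print(recovered == command)       # True
print(hamming(image, command))    # near 0.5: unrelated vectors differ in about half their bits
print(hamming(fused, image))      # also near 0.5: the fused vector resembles neither input
```

Under these assumptions, a sensor reading and a motor command live in the same kind of vector, and either one can be recovered from the fused vector when the other is known.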
A hyperdimensional framework can turn any sequence of “instants” into new HBVs, and group existing HBVs together, all in the same vector length. This is a natural way to create semantically significant and informed “memories.” The encoding of more and more information in turn leads to “history” vectors and the ability to remember. Signals become vectors, indexing translates to memory, and learning happens through clustering.
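The ideas of sequences, groupings and recall can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method from the paper: it uses cyclic rotation to mark position in a sequence, a bitwise majority vote to group vectors, and nearest-neighbor lookup by Hamming distance as the “memory.” All names (encode_sequence, bundle, recall, see_ball, swing, hit) are hypothetical.

```python
import random

D = 10_000  # assumed vector length; every vector below has exactly this many bits


def random_hbv(rng):
    """Return a random binary vector as a list of 0s and 1s."""
    return [rng.randint(0, 1) for _ in range(D)]


def bind(a, b):
    """Elementwise XOR of two vectors."""
    return [x ^ y for x, y in zip(a, b)]


def permute(v, k=1):
    """Cyclically rotate a vector by k positions."""
    k %= D
    return v[-k:] + v[:-k]


def bundle(vectors, rng):
    """Group vectors by bitwise majority vote; ties are broken at random."""
    n = len(vectors)
    out = []
    for bits in zip(*vectors):
        ones = sum(bits)
        if 2 * ones > n:
            out.append(1)
        elif 2 * ones < n:
            out.append(0)
        else:
            out.append(rng.randint(0, 1))
    return out


def hamming(a, b):
    """Fraction of positions where the two vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / D


def encode_sequence(items):
    """Rotate the running result, then XOR in the next item, so order matters."""
    acc = [0] * D
    for item in items:
        acc = bind(permute(acc), item)
    return acc


rng = random.Random(1)
see_ball, swing, hit = (random_hbv(rng) for _ in range(3))

seq_a = encode_sequence([see_ball, swing, hit])
seq_b = encode_sequence([hit, swing, see_ball])  # same items, reversed order

# A "history" vector grouping two instants and one sequence, still D bits long.
history = bundle([see_ball, swing, seq_a], rng)

memory = {"see_ball": see_ball, "swing": swing, "hit": hit,
          "seq_a": seq_a, "seq_b": seq_b}


def recall(query):
    """Return the name of the stored vector closest to the query."""
    return min(memory, key=lambda name: hamming(memory[name], query))


# Corrupt 20% of the bits of seq_a and look it up anyway.
noisy = list(seq_a)
for i in rng.sample(range(D), D // 5):
    noisy[i] ^= 1

print(recall(noisy))                # seq_a
print(hamming(history, see_ball))   # roughly 0.25: the history resembles what it contains
print(hamming(history, hit))        # roughly 0.5: "hit" was not bundled in on its own
```

In this sketch, “remembering” is a lookup rather than a retraining step: a noisy query still lands closest to the stored vector it came from.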
The robot’s memories of what it has sensed and done in the past could lead it to anticipate future perceptions and could influence its future actions. This active perception would enable the robot to become more autonomous and better able to complete tasks.
“An active perceiver knows why it wishes to sense, then chooses what to perceive, and determines how, when and where to achieve the perception,” says Aloimonos. “It selects and fixates on scenes, moments in time, and episodes. Then it aligns its mechanisms, sensors, and other components to act on what it wants to see, and selects viewpoints from which to best capture what it intends.”
“Our hyperdimensional framework can address each of these goals,” he says.
Applications of the Maryland research could extend far beyond robotics. The ultimate goal is to be able to do AI itself in a fundamentally different way: from concepts to signals to language. Hyperdimensional computing could provide a faster and more efficient alternative model to the iterative neural net and deep learning AI methods currently used in computing applications such as data mining, visual recognition and translating images to text.
“Neural network-based AI methods are big and slow, because they are not able to remember,” says Mitrokhin. “Our hyperdimensional theory method can create memories, which will require a lot less computation, and should make such tasks much faster and more efficient.”
Combining Hyperdimensional Computing with Better Motion Sensing
The authors also note that one of the most important improvements needed to integrate a robot’s sensing with its actions is better motion sensing. Using a dynamic vision sensor (DVS) instead of conventional cameras for this task has been a key component of testing their hyperdimensional computing theory.
Most computer vision techniques use images whose quality is determined by pixel density. Such images represent moments in time well, but they are not ideal for representing motion because motion is continuous. A DVS is different. It does not “take pictures” in the usual sense, but captures motion, particularly the edges of objects as they move. DVS imaging is thus better suited to robotic needs for “seeing” motion. Inspired by mammalian vision, DVS accommodates a large range of lighting conditions, from dark to bright, and can resolve very fast motion with little delay in transmission (low latency). These are ideal properties for real-time applications in robotics, such as autonomous navigation. The data a DVS accumulates are also much better suited to the integrated environment of the hyperdimensional computing theory.
“The data from this sensor, the event clouds, are much sparser than sequences of images,” says Fermüller. “Furthermore, the event clouds contain the essential information for coding space and motion, conceptually the contours in the scene and their movement.”
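Why event data are sparse can be shown with a small simulation in Python. It is only a sketch: a real DVS is asynchronous hardware in which each pixel reports changes independently, whereas this example approximates that behavior by comparing two synthetic frames. The contrast threshold, the frame size and the names make_frame and events_between are assumptions made for illustration.

```python
import math

THRESHOLD = 0.2  # assumed contrast threshold on log intensity


def make_frame(width, height, edge_x):
    """Synthetic scene: dark pixels left of edge_x, bright pixels from edge_x on."""
    return [[10.0 if x < edge_x else 200.0 for x in range(width)]
            for _ in range(height)]


def events_between(prev, curr, t):
    """Return (x, y, t, polarity) for each pixel whose log intensity changed enough."""
    events = []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_prev, row_curr)):
            delta = math.log(c) - math.log(p)
            if abs(delta) >= THRESHOLD:
                events.append((x, y, t, 1 if delta > 0 else -1))
    return events


W, H = 32, 8
frame0 = make_frame(W, H, edge_x=10)
frame1 = make_frame(W, H, edge_x=12)  # the edge has moved two pixels to the right

events = events_between(frame0, frame1, t=1)
print(len(events), "events vs", W * H, "pixels per frame")
# prints: 16 events vs 256 pixels per frame
```

Only the pixels the moving edge passed over produce events, and each event carries a position, a time and a polarity (brighter or darker). The static background produces nothing.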