
Robots learn from dogs to understand human gestures and can now locate objects with 89% success.

Published on 01/06/2026 at 23:58

Research from Brown University combines language, human gestures, and computer vision to improve how robots search for objects. The approach reached 89% average success in simulations and draws on how dogs interpret pointing, gaze, and intentions when interacting with people.

Robots capable of locating objects through language, gestures, and vision achieved 89% average success in simulations at Brown University, in a study accepted for HRI 2026, scheduled for March in Edinburgh.

Robots learn from dogs to interpret human commands

The advancement addresses a common difficulty in the domestic and professional use of machines: understanding incomplete requests. For a person, asking for a key, a cup, or a tool seems simple. For a robotic system, the task involves ambiguity, movement, similar objects, and imperfect clues.

The team at Brown University developed LEGS-POMDP, a system that combines language, human pointing, and visual observation. The inspiration came from research at the Brown Dog Lab on how dogs interpret gestures and gaze, especially when humans point to something.

The proposal does not treat the gesture as an exact line. The pointing is modeled as a probability cone, closer to real human behavior. Thus, the robot estimates a probable target area, rather than assuming the finger indicates a perfectly precise direction.
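To make the cone idea concrete, here is a minimal sketch in Python. It is not the researchers' implementation: the function name `cone_likelihood`, the Gaussian falloff, and the 15-degree spread are assumptions chosen only for illustration. The sketch scores each candidate object by how far it sits from the pointing axis, so nearby objects still receive some probability instead of being ruled out.

```python
import math

def cone_likelihood(origin, direction, target, sigma_deg=15.0):
    """Unnormalized score that `target` is what a pointing gesture indicates.

    Hypothetical model: the score decays with the angle between the pointing
    direction and the line from the hand to the target. The Gaussian shape
    and `sigma_deg` are illustrative assumptions, not values from the study.
    """
    # Vector from the pointing origin (for example, the hand) to the candidate.
    to_target = [t - o for t, o in zip(target, origin)]
    norm_t = math.sqrt(sum(c * c for c in to_target))
    norm_d = math.sqrt(sum(c * c for c in direction))
    if norm_t == 0.0 or norm_d == 0.0:
        return 0.0  # degenerate geometry: no usable direction
    cos_angle = sum(a * b for a, b in zip(to_target, direction)) / (norm_t * norm_d)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    angle_deg = math.degrees(math.acos(cos_angle))
    # Soft falloff: objects near the axis score close to 1, distant ones near 0.
    return math.exp(-0.5 * (angle_deg / sigma_deg) ** 2)

# Hypothetical scene: a person at the origin points along the x axis.
origin = (0.0, 0.0, 0.0)
direction = (1.0, 0.0, 0.0)
candidates = {"cup": (2.0, 0.2, 0.0), "keys": (1.0, 1.0, 0.0)}
scores = {name: cone_likelihood(origin, direction, pos)
          for name, pos in candidates.items()}
```

In this toy scene the cup, which lies a few degrees off the pointing axis, scores far higher than the keys at 45 degrees, yet neither is excluded outright.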

This detail is central because people rarely communicate like technical manuals. They speak in an abbreviated manner, point approximately, change positions, and may partially hide the object they desire. The system attempts to transform this unstable scenario into calculated decisions.

How the system decides where to search

The POMDP in LEGS-POMDP stands for partially observable Markov decision process, a probabilistic framework for planning under uncertainty. In practice, it helps the machine act when it does not have all the necessary information about the environment, the object, or human intention.

Instead of deciding too quickly, the system maintains hypotheses about the identity and location of the sought item. These hypotheses are updated as new clues appear, including verbal description, gesture direction, and visual reading of the scene.
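The idea of hypotheses that are revised as clues arrive can be illustrated with a simple Bayesian update. The sketch below is an assumption-laden toy, not the study's code: the locations, the cue scores, and the `update_belief` helper are all hypothetical. It only shows how language, gesture, and vision could each reweight the same set of candidate locations.

```python
def update_belief(belief, likelihood):
    """Apply Bayes' rule over discrete hypotheses: posterior ∝ prior × likelihood."""
    unnormalized = {h: belief[h] * likelihood.get(h, 0.0) for h in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The cue contradicts every hypothesis; keep the prior unchanged.
        return dict(belief)
    return {h: p / total for h, p in unnormalized.items()}

# Hypothetical starting point: the object could be in any of three places.
belief = {"counter": 1 / 3, "table": 1 / 3, "shelf": 1 / 3}

# Made-up scores for how well each location matches each cue.
language_cue = {"counter": 0.6, "table": 0.3, "shelf": 0.1}
gesture_cue = {"counter": 0.5, "table": 0.45, "shelf": 0.05}
vision_cue = {"counter": 0.7, "table": 0.2, "shelf": 0.1}

for cue in (language_cue, gesture_cue, vision_cue):
    belief = update_belief(belief, cue)

most_likely = max(belief, key=belief.get)
```

No single cue is decisive here, but after all three updates most of the probability mass sits on one location, which mirrors the article's point that the signals work together.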

The combination allows the robot to better explore the space before concluding the search. It can adjust the viewpoint, review a possibility, and delay the final choice until gathering stronger evidence about where the correct object is.
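Delaying the final choice can be pictured as a rule that commits only once one hypothesis is sufficiently probable. A real POMDP planner weighs the expected value of further observation rather than using a fixed cutoff, so the threshold rule below is a deliberately simplified stand-in. The function name `choose_action` and the 0.9 threshold are assumptions for illustration only.

```python
def choose_action(belief, commit_threshold=0.9):
    """Return ("declare", target) when confident, else ("look", target).

    Simplified stand-in for a planner: a fixed confidence threshold decides
    between committing to the best hypothesis and gathering more evidence.
    """
    best, probability = max(belief.items(), key=lambda item: item[1])
    if probability >= commit_threshold:
        return ("declare", best)
    # Not confident yet: take another look at the leading candidate.
    return ("look", best)

# Hypothetical beliefs before and after additional observations.
early = choose_action({"counter": 0.6, "table": 0.4})
later = choose_action({"counter": 0.95, "table": 0.05})
```

With the early belief the rule asks for another look at the counter; with the later one it commits to the counter as the answer.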

In the experiments, multimodal integration outperformed approaches based solely on language or gestures. The result reinforces the idea that human communication depends on the sum of signals, not a single isolated instruction.

Tests indicate progress, but still with limits

An average success rate of 89% was recorded in simulations described as demanding. The team also conducted tests with a real quadruped robot, used as qualitative validation of the approach. The research will be presented at HRI 2026, from March 16 to 19, 2026.

The use of a vision-language model expands the system’s ability to interpret scenes. Thus, the machine can relate verbal descriptions, spatial constraints, and visible objects, even when there is disorganization, similarity between items, or obstacles in the way.

The suggested applications involve everyday and industrial environments. In a home, robots could search for medications on a cluttered counter or find glasses among scattered items. In a workshop, they could retrieve parts and tools without excessively precise commands.

Even so, the results do not mean that fully intuitive mechanical assistants are already available. The 89% figure comes from simulations, while physical tests indicate robustness but do not eliminate the challenges of real, varied, and unpredictable environments.

The progress helps bring laboratory research closer to everyday situations, where even simple requests carry noise, pauses, and inaccuracies.

The main advance lies in how the system handles uncertainty. By drawing on dog behavior, human gestures, and natural language, robotics gains a path toward machines that depend less on rigid commands and are better able to interpret intentions in context.


Fabio Lucas Carvalho

Journalist specializing in a wide variety of topics, such as cars, technology, politics, naval industry, geopolitics, renewable energy, and economics. Active since 2015, with prominent publications on major news portals. My background in Information Technology Management from Faculdade de Petrolina (Facape) adds a unique technical perspective to my analyses and reports. With over 10,000 articles published in renowned outlets, I always aim to provide detailed information and relevant insights for the reader.
