Shengdong Zhao, Koichi Nakamura, Kentaro Ishii, Takeo Igarashi

Magic Cards: A Paper Tag Interface for Implicit Robot Control


A design trend in modern computing is to create calm technology, in which machines assist people without explicit interaction. Similar concepts can also benefit Human-Robot Interaction (HRI). Instead of explicitly interacting with robots to complete tasks (the robot-centric design approach), users can interact with tasks directly (the task-centric design approach), while robots perform these tasks in the background. We implemented a paper-tag-based interface to support task-centric HRI in a sensor-augmented home environment. Tasks (housework) are assigned using paper tags placed on the corresponding real-world objects and in the environment. Robots avoid disturbing the hosts and work while the hosts are away. Our initial user studies show that paper-tag-based task-centric HRI is particularly simple to use.

Motivation and Vision

It has long been our dream to have things done “magically”. In the West, the Brothers Grimm told a famous fairy tale about an honest and hardworking shoemaker getting “magical” help from two little dwarfs with his cobbling while sleeping. In the East, an ancient Chinese story fantasized about a girl hidden inside an oyster shell, completing housework while her host was away. Modern society also envisions a world embedded with “calm” technologies that will work in the background without requiring active attention.

In the emerging field of Human-Robot Interaction (HRI), most work has focused on designing techniques for interaction between humans and robots, hence the term HRI. We acknowledge that explicit interactions between humans and robots are essential, especially when working with entertainment and social robots. However, in many other cases, such as domestic housework, having the tasks done is the ultimate goal, rather than interacting with the robots. This motivated us to investigate a new task-centric HRI paradigm for completing housework. We used paper tags (cards) as the interaction technique to systematically manage robots that complete housework in a sensor-augmented home environment. With this new approach, instead of interacting with robots, users give instructions directly on tasks and objects.

Fig. 2: Robot-centric approach (left) and task-centric approach (right). In the robot-centric approach, the user explicitly interacts with the robot. In contrast, in the task-centric approach the user interacts only with the task while the robot works in the background.

An initial prototype system was developed in which users place paper cards (tags) on the floor to give instructions (such as vacuuming or delivering an object to a specific target location), and the system drives robots to complete the tasks by recognizing the tags using cameras fixed in the environment. Paper cards were selected as the interactive medium because of their long history of use, lack of battery requirements, and low cost. Paper cards are also tangible artifacts, which have many advantages for interaction design. To show how the system works from the users’ point of view, we present several usage scenarios.

Usage Scenario


John and Mary are a working couple without children. During weekdays, John and Mary wake up early, and leave for work at 7 am. In the evening, Mary usually arrives home around 6:30 pm while John does not return until 8 to 9 pm. Mary is responsible for most of the housework in their two-bedroom apartment. Maintenance tasks, such as vacuuming and trash management, are typically done during the weekdays while major cleaning is left to weekends.


Now, let us see how the Robot Housework System with its magic card interface can assist their lives. The installation process is handled by professionals and involves mounting cameras on the ceiling and setting up the software. Object tags are placed on the objects Mary wants the system to uniquely identify and handle, such as trash bins. Finally, she receives a set of paper tags, organized in a binder, for giving instructions to the system, along with a basic guide on how to use them.

Fig.3: Visual illustration of user scenario for set up.

First day

In the evening, the system is ready. Mary starts to use it to plan housework tasks for the next day. One task she wants done tomorrow is vacuuming, since the installation process left dust and dirt behind. To have the robots carry out the task, she places a “Vacuum this room” tag and a “Finish by evening” tag in each of the two rooms. In addition, she needs the trash bin to be placed at the door before 8 am the following morning for pickup. However, since more trash will be produced in the morning, she does not want it to be moved now. She also wants the empty trash bin to be returned to its original location in the evening. To do that, she places a “Take me to ‘destinations’” tag beside the trash bin, the “First destination” tag near the apartment door, and a “Start at 7:30 am” tag next to the delivery tag to specify when the delivery should start. The “Second destination” tag and a “Start at evening” tag are placed at the current location of the trash bin.

The next day, Mary goes to work at 7 as usual, and when she returns home in the evening, she finds that the tags have been removed. The trash bin has been emptied and returned to its original location, and both rooms appear clean.

Fig.4: Visual illustration of user scenario for first day.

Another day

One day, Mary is alone at home and plans to watch a movie in the living room. She decides she wants the bedroom to be cleaned while she is watching the movie. She places a “Vacuum this room” tag with a “Do this first” tag beside it. She then places a “Mop this room” tag and a “Do this second” tag on the ground. There is one especially dirty spot in the bedroom, so she places a “Mop this location” tag at that spot to ensure that the location will be cleaned multiple times. She also has an expensive vase near the corner of the room, and she is afraid that the robot’s movements may accidentally break it, so she places an “Avoid this object” tag next to it. Finally, she places a “Finish by noon” tag on the floor, and starts to watch the movie in the other room.

Fig.5: Visual illustration of user scenario for another day.
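The scenarios above rely on modifier tags (such as “Do this first”) being placed beside the action tag they qualify. The source does not describe how the system associates them, so the following is only a minimal sketch under one assumption: each modifier tag attaches to the nearest action tag on the floor. The names `Tag`, `Task`, `group_tags`, `execution_order`, and the "action"/"modifier" labels are hypothetical and are not part of the described system.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Tag:
    text: str   # human-readable label printed on the card
    x: float    # floor position in metres (as a ceiling camera might report)
    y: float
    kind: str   # hypothetical label: "action" or "modifier"


@dataclass
class Task:
    action: Tag
    modifiers: list = field(default_factory=list)


def group_tags(tags):
    """Attach each modifier tag to the nearest action tag (assumed rule)."""
    tasks = [Task(t) for t in tags if t.kind == "action"]
    for tag in tags:
        if tag.kind != "modifier" or not tasks:
            continue
        nearest = min(
            tasks,
            key=lambda task: hypot(task.action.x - tag.x, task.action.y - tag.y),
        )
        nearest.modifiers.append(tag.text)
    return tasks


# Hypothetical ranking of the ordering tags mentioned in the scenario.
ORDER = {"Do this first": 1, "Do this second": 2}


def execution_order(tasks):
    """Sort tasks by their ordering modifier; unordered tasks go last."""
    def rank(task):
        ranks = [ORDER[m] for m in task.modifiers if m in ORDER]
        return min(ranks) if ranks else len(ORDER) + 1
    return sorted(tasks, key=rank)


# Mary's bedroom, with made-up coordinates.
bedroom = [
    Tag("Mop this room", 3.0, 1.0, "action"),
    Tag("Do this second", 3.1, 1.0, "modifier"),
    Tag("Vacuum this room", 1.0, 1.0, "action"),
    Tag("Do this first", 1.1, 1.0, "modifier"),
]
plan = [task.action.text for task in execution_order(group_tags(bedroom))]
print(plan)
```

With these coordinates the vacuum task is planned before the mop task, matching Mary's intent. A real system would also need a rule for tags that apply to the whole room, such as “Finish by noon”, which this sketch does not model.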

User Interface: Paper Tags

The layout of the cards is similar to that of Collaborage. The paper tags in our system serve a dual purpose: they inform humans, so that they can assign tasks easily and accurately, and they inform the system about the nature of the tasks and exactly where these tasks are located. Natural language and intuitive image icons are used to communicate with humans, while 2-D id-markers are used to instruct the system. An example of this design is shown in Fig. 6.

Fig. 6: Front and back of a tag, with its different regions highlighted.

We use proprietary 2-D planar id-markers, which are very similar to those in earlier work such as CyberCode and ARTag. A marker consists of a 3 x 3 black-and-white matrix pattern within a black border, with a white margin around it (Fig. 4 top left). Each marker is about 5 x 5 cm, a size that we can recognize stably using a 960 x 720 resolution ceiling camera (2.5 m high) covering a 3 x 3 m region on the floor. Our system can uniquely identify 120 patterns together with their orientations. We observed that it worked robustly under various illumination conditions, which was sufficient for our pilot study.
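The figure of 120 patterns is consistent with a simple counting argument, although the source does not say how it was derived. If a marker is a 3 x 3 binary matrix and a pattern is usable only when its four 90-degree rotations are all distinct (so that orientation can be read unambiguously), then rotations of the same pattern must share one identity. The sketch below counts such identities by brute force; the function names are illustrative and are not taken from the authors' software.

```python
from itertools import product


def rotate(grid):
    """Rotate a 3x3 grid (tuple of row tuples) 90 degrees clockwise."""
    return tuple(zip(*grid[::-1]))


def orientations(grid):
    """Return the grid followed by its three successive clockwise rotations."""
    out = [grid]
    for _ in range(3):
        out.append(rotate(out[-1]))
    return out


def count_oriented_ids():
    """Count 3x3 binary patterns, up to rotation, whose 4 rotations all differ."""
    seen = set()
    ids = 0
    for bits in product((0, 1), repeat=9):
        grid = (bits[0:3], bits[3:6], bits[6:9])
        if grid in seen:
            continue  # a rotation of this pattern was already handled
        rots = orientations(grid)
        seen.update(rots)
        if len(set(rots)) == 4:  # orientation is unambiguous
            ids += 1
    return ids


print(count_oriented_ids())  # prints 120
```

Of the 512 possible matrices, 32 are unchanged by a 180-degree rotation; the remaining 480 fall into 120 groups of four rotations each.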

Fig.6: These figures show all tags for current prototype. Tags are first classified into 4 main types, and further divided into 14 functional sets within types.

Note that 2-D planar id-markers are not the only way to provide the system with identity, position, and orientation information for objects and robots. Other methods, including magnetic fields, radio, active LEDs, and laser beacons, can also be used. Nevertheless, we chose passive marker patterns that can be detected by computer vision because they are the simplest and least expensive option.

After using the system for a while, Mary finds that some sets of tasks recur and could be reused. To reuse them, after planning a set of tasks such as vacuuming first and then mopping, she places an additional “Memorize this set of tasks for future reuse” tag. When these tasks are finished, a new “Memorized set of tasks” tag representing the combination of these tasks appears (it is produced automatically by a printer robot) and is ready to be reused on similar occasions in the future.

System architecture and implementation

Overall system architecture

The system hardware consists of sensors, computers, speakers, and robots (Fig. ?). The sensors are Logitech QuickCam® Pro for Notebooks cameras installed on the ceiling, 2.5 meters above the floor. In the current implementation, they are connected to the computers using extended USB cables; in the future, a wireless connection would be preferable. Each camera covers an area of 2.5 x 2 m; with a combination of four cameras, a total area of 20 m² (5 x 4 m) is covered. Each camera has 960 x 720 resolution and is capable of detecting markers 4.5 x 4.5 cm in size. The cameras are set up so that their images overlap slightly, allowing image calibration and combination using the markers. The current implementation uses two computers connected via a TCP/IP network. The first has wireless receivers and is responsible for communicating with the robots via Bluetooth.
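The numbers above allow a rough estimate of how many pixels a marker occupies in a camera image. The sketch below is back-of-the-envelope arithmetic only. It assumes that the 960-pixel image axis spans the 2.5 m side of the covered area, that lens distortion and the overlap between cameras can be ignored, and that a marker is five cells across (three data cells plus a one-cell border on each side). None of these assumptions is stated in the source.

```python
def pixels_per_cm(image_px: int, covered_m: float) -> float:
    """Ground sampling density along one image axis, in pixels per centimetre."""
    return image_px / (covered_m * 100.0)


IMAGE_WIDTH_PX = 960     # horizontal camera resolution, from the text
COVERED_WIDTH_M = 2.5    # long side of the floor area seen by one camera
COVERED_DEPTH_M = 2.0    # short side of the floor area seen by one camera
MARKER_SIDE_CM = 4.5     # smallest detectable marker, from the text
NUM_CAMERAS = 4
CELLS_ACROSS = 5         # assumption: 3 data cells + 1 border cell per side

density = pixels_per_cm(IMAGE_WIDTH_PX, COVERED_WIDTH_M)
marker_px = density * MARKER_SIDE_CM
cell_px = marker_px / CELLS_ACROSS
total_area_m2 = NUM_CAMERAS * COVERED_WIDTH_M * COVERED_DEPTH_M

print(f"{density:.2f} px/cm, marker {marker_px:.1f} px, cell {cell_px:.1f} px")
print(f"nominal coverage: {total_area_m2:.0f} m^2")
```

Under these assumptions a marker spans roughly 17 pixels, or between 3 and 4 pixels per cell, which suggests why markers much smaller than 4.5 cm would be hard to read at this mounting height.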

Fig.6: Overall system architecture.

Types of robots

All robots are products of iRobot Corporation. Of these, the vacuum robot (Roomba) and the mopping robot (Scooba) are commercial robots (Fig. ?, top right), while the others are modified versions of the iRobot Create (Fig. ?, top left). According to their functionality, they can be roughly divided into working robots (robots that do the actual housework) and administrative robots (robots that support other robots).

Fig.6: At top are photos of various robots from iRobot used in project. At bottom are card pickup robot and mobile printer robot modified using iRobot Create.

Paper and Presentation

Selected media coverage: Nikkei Sangyo Shinbun on May 22 (Japanese Industrial Newspaper); Japanese nation-wide TV on May 29.


Written by Shengdong Zhao

Shen is an Associate Professor in the Computer Science Department, National University of Singapore (NUS). He is the founding director of the NUS-HCI Lab, specializing in research and innovation in the area of human computer interaction.