Case Study: Hors d’Oeuvres, Anyone?
The University of South Florida’s (USF) entry in the 1999 AAAI Mobile Robot Competition Hors d’Oeuvres, Anyone? event provides a case study of
selecting sensors, constructing reactive behaviors, and using behavioral sensor
fusion. The entry used two cooperative robots. The goal was to push the
envelope in robotic sensing by using six different sensing modalities with
40 different physical devices on one robot and four modalities with 23 devices
on the other. Although the robots executed under a hybrid deliberative/
reactive style of architecture (covered later in Ch. 7), the basic design
process followed the steps in Ch. 5, and produced a largely behavioral system.
The entry won a Technical Achievement award for Best Conceptual and
Artistic Design after a series of hardware and software timing failures prevented
it from working at the competition.
Step 1: Describe the task. The “Hors d’Oeuvres, Anyone?” event required
fully autonomous robots to circulate in the reception area at the AAAI conference
with a tray of finger food, find and approach people, interact with them,
and refill the serving tray. Each robot was scored on covering the area, noticing
when the tray needed refilling, interacting with people naturally, having
a distinct personality, and recognizing VIPs. The USF entry used two robots,
shown in Fig. 6.30, costumed by the USF Art Department in order to attract
attention. The Borg Shark was the server robot, navigating through the
audience along a pre-planned route. It would stop and serve at regular
intervals or whenever a treat was removed from the tray. It used a DEC-talk
synthesizer to broadcast audio files inviting audience members to remove a
treat from its mouth, but it had no way of hearing and understanding natural
language human commands. In order to interact more naturally with
people, the Borg Shark attempted to maintain eye contact with people. If it
saw a person, it estimated the location in image coordinates of where a VIP’s
colored badge might be, given the location of the face.
When the Borg Shark was almost out of food, she would call her assistant
robot, Puffer Fish, over radio Ethernet. Puffer Fish would be stationary in
sleep mode, inhaling and exhaling through her inflatable skirt and turning
her cameras as if avoiding people crowding her. When Puffer Fish awoke,
she would head with a full tray of food (placed on her stand by a human)
to the coordinates given to her by Borg Shark. She would also look for
Borg Shark’s distinctive blue costume, using both dead reckoning and visual
search to move to the goal. Once within 2 meters of Borg Shark, Puffer Fish
would stop. A human would physically swap trays, then kick the bumpers
to signal that the transfer was over. Borg Shark would resume its serving
cycle, while Puffer Fish would return to its home refill station.
Both robots were expected to avoid all obstacles: tables, chairs, people.
Since there was a tendency for people to surround the robots, preventing
coverage of the area or refilling, the robots had different responses. Borg
Shark, who was programmed to be smarmy, would announce that it was
coming through and begin moving. Puffer Fish, with a grumpy, sullen personality,
would vocally complain, then loudly inflate her skirt and make a
rapid jerk forward, usually causing spectators to back up and give her room.
Step 2: Describe the robots. The robots used for the entry were Nomad 200
bases, each with a unique sensor suite.
A better solution would be to detect a person using vision. Notice that
detection is not the same thing as recognition. Detection means that the robot
is able to identify a face, which is reactive. Recognition would label the face
and be able to recognize it at a later time, which is a deliberative function, not
reactive. Is there a simple visual affordance of a face? Actually, to a vision
system human skin is remarkably similar in color, regardless of ethnicity.
Once the robot had found a colored region about the size and shape of a
head, it could then more reliably find the VIP badges.
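The skin-color affordance can be sketched as a simple chromaticity test followed by a blob centroid. The thresholds and the toy image below are illustrative assumptions, not the values or resolution used on the USF robots:

```python
def is_skin(r, g, b):
    """Classify a pixel as skin using normalized (r, g) chromaticity.

    Skin clusters in a small chromaticity region regardless of
    ethnicity; these thresholds are illustrative only.
    """
    total = r + g + b
    if total == 0:
        return False
    rn, gn = r / total, g / total
    return 0.35 < rn < 0.55 and 0.25 < gn < 0.37

def find_head_centroid(image):
    """Return the (row, col) centroid of skin pixels, or None."""
    hits = [(i, j)
            for i, row in enumerate(image)
            for j, px in enumerate(row)
            if is_skin(*px)]
    if not hits:
        return None
    n = len(hits)
    return (sum(i for i, _ in hits) / n, sum(j for _, j in hits) / n)

# A toy 3x3 "image": one skin-toned pixel surrounded by blue.
img = [[(0, 0, 255)] * 3,
       [(0, 0, 255), (200, 140, 110), (0, 0, 255)],
       [(0, 0, 255)] * 3]
print(find_head_centroid(img))  # -> (1.0, 1.0)
```

Given the centroid of a head-sized skin region, the robot can then search a fixed offset below it for the expected location of a VIP badge.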
The other opportunity for an affordance was Puffer Fish’s navigation to
Borg Shark. Although Puffer Fish would receive Borg Shark’s coordinates,
it was unlikely that Puffer Fish could reliably navigate to Borg Shark using
only dead reckoning. The coordinates were likely to be incorrect from Borg
Shark’s own drift over time. Then Puffer Fish would accumulate dead reckoning
error, more so if it had to stop and start and avoid people. Therefore,
it was decided that Puffer Fish should look for Borg Shark. Borg Shark’s
head piece was deliberately made large and a distinctive blue color to afford
visibility over the crowd and reduce the likelihood of fixating on someone’s
shirt.
Steps 4-7: Design, test, and refine behaviors. The choice of sensors for other
behaviors, such as treat removal, was influenced by the physical location of
the sensors. For example, the SICK laser for Borg Shark came mounted on the
research platform as shown in Fig. 6.30b. The research platform, nominally
the top of the robot, was at hand height, making it a logical place to attach a
tray for holding food. It was obvious that the laser could be used to monitor
the food tray area. Other teams tried various approaches such as having a
colored tray and counting the amount of that color visible (the more of the
tray’s color visible, the fewer treats covering it). Another approach was to build
a scale and monitor the change in weight.
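The colored-tray approach used by other teams can be sketched as counting tray-colored pixels and treating a high visible fraction as an empty tray. The tray color, tolerance, and refill threshold below are hypothetical values for illustration:

```python
TRAY_COLOR = (255, 0, 0)  # hypothetical bright-red tray

def tray_fraction_visible(image, tol=30):
    """Fraction of pixels within `tol` of the tray color per channel."""
    matches = total = 0
    for row in image:
        for px in row:
            total += 1
            if all(abs(c - t) <= tol for c, t in zip(px, TRAY_COLOR)):
                matches += 1
    return matches / total

def tray_needs_refill(image, threshold=0.8):
    """More visible tray color means fewer treats covering it."""
    return tray_fraction_visible(image) >= threshold

empty_tray = [[(250, 10, 5)] * 4] * 4   # nearly all tray color visible
full_tray  = [[(120, 90, 60)] * 4] * 4  # treats cover the tray
print(tray_needs_refill(empty_tray), tray_needs_refill(full_tray))  # -> True False
```

The laser-based approach the USF team chose avoids this color calibration entirely, at the cost of depending on where the beam sweeps over the tray.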
An interesting aspect of the robots that indirectly impacted the sensor suite
was the costumes. As part of giving the robots personality, each robot
had a costume. The Puffer Fish had an inflatable skirt that puffed out when
the robot was crowded or annoyed. The team had to empirically test and
modify the skirt to make sure it would not interfere with the sonar readings.
Fig. 6.30c shows the profile of the skirt.
As seen in the behavior table below, the only behavior using any form of
sensor fusion was move-to-goal in Puffer Fish; because it ran two competing
instances of the goal, it is more properly termed sensor fission.
At this point it is helpful to step back and examine the sensing for the
Hors d’Oeuvres, Anyone? entry in terms of the attributes listed in Sec. 6.3.
Recall that the attributes for evaluating the suitability of an individual sensor
were field of view, range, accuracy, repeatability, resolution, responsiveness in
target domain, power consumption, reliability, and size. The field of view and
range of the sensors were an issue, as seen by the differences in vision and
thermal sensors for the face-finding behavior. The camera had a much better
field of view than the thermal sensor, so it was used to focus the attention of
the heat sensor. Repeatability was clearly a problem for the laser, with its high
false positive/false negative rate. The sonars could not be used for estimating
the location of a face because the resolution was too coarse. Each of the
sensors had reasonable responsiveness from a hardware perspective, though
the algorithms may not have been able to take advantage of them. Power
consumption was not an issue because all sensors were on all the time due
to the way the robots were built. Reliability and size of the hardware were
not serious considerations since the hardware was already on the robots.
The algorithmic influences on the sensor design were computational complexity
and reliability. Both were definitely factors in the design of the perceptual
schemas for the reactive behaviors. The robots had the hardware
to support stereo range (two cameras with dedicated framegrabbers). This
could have been used to find faces, but given the high computational complexity,
even a Pentium class processor could not process the algorithm in
real-time. Reliability was also an issue. The vision face finding algorithm
was very unreliable, not because of the camera but because the algorithm
was not well-suited for the environment and picked out extraneous blobs.
Finally, the sensing suite overall can be rated in terms of simplicity, modularity,
and redundancy. The sensor suite for both Nomad robots can be considered
simple and modular in that it consisted of several separate sensors,
mostly commercially available, able to operate independently of each other.
The sensor suite did exhibit a high degree of physical redundancy: one robot
had dual sonar rings, and the sonars, laser, and camera pair could have
been used for range, ignoring the placement of the shark teeth and the computational
complexity. There was also a large amount of logical redundancy,
which was exploited through the use of behavioral sensor fusion.