We have introduced a paradigm called the Interactive Visual Dialog (IVD) as a
means of facilitating a system's ability to recognize objects presented to it
by a human. The presentation centers around a supermarket checkout scenario
in which an operator presents an item to be tallied to a stationary television
camera. An active vision approach is used to provide feedback to the operator
in the form of an image (or images) depicting what the system thinks the
operator is most likely holding, shown from a viewpoint that suggests how the
object should next be presented to improve the certainty of interpretation.
Interaction proceeds iteratively until the system converges on the correct
interpretation. The IVD can be implemented using an entropy-based gaze
planning strategy and a sequential Bayes recognition system using optical flow
as input. Experimental results show that the system does, in practice,
improve recognition accuracy, leading to convergence to a correct solution in
a minimal number of iterations.
A presentation on this topic was given at the Eleventh British Machine Vision
Conference, Bristol, UK, 11-14 September 2000.
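
To make the interaction loop described above more concrete, the following is a minimal sketch of sequential Bayes recognition combined with entropy-based view selection. It is an illustration under stated assumptions, not the implementation evaluated in the paper: observations are reduced to discrete symbols standing in for optical-flow measurements, the likelihood table `lik[view][obj][obs]` is assumed to be known in advance, and every function name (`entropy`, `bayes_update`, `expected_entropy`, `choose_next_view`, `recognize`) as well as the `threshold` and `max_iters` parameters is hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def bayes_update(prior, lik_view, obs):
    """One sequential Bayes step: posterior over objects given one observation.

    lik_view[k][obs] is the assumed P(obs | object k) for the chosen view.
    """
    unnorm = [prior[k] * lik_view[k][obs] for k in range(len(prior))]
    total = sum(unnorm)
    if total == 0:
        # Observation is impossible under every hypothesis; keep the prior.
        return list(prior)
    return [u / total for u in unnorm]


def expected_entropy(prior, lik_view):
    """Expected posterior entropy if the object were shown from this view."""
    n_obs = len(lik_view[0])
    exp_h = 0.0
    for obs in range(n_obs):
        p_obs = sum(prior[k] * lik_view[k][obs] for k in range(len(prior)))
        if p_obs > 0:
            exp_h += p_obs * entropy(bayes_update(prior, lik_view, obs))
    return exp_h


def choose_next_view(prior, lik):
    """Pick the view expected to reduce uncertainty the most."""
    return min(range(len(lik)), key=lambda v: expected_entropy(prior, lik[v]))


def recognize(prior, lik, observe, threshold=0.95, max_iters=10):
    """Iterate: suggest a view, observe, update, until the belief is confident.

    observe(view) is a hypothetical callback returning an observation index;
    in the scenario above it would correspond to the operator re-presenting
    the item as suggested by the feedback image.
    """
    post = list(prior)
    iterations = 0
    for _ in range(max_iters):
        if max(post) >= threshold:
            break
        view = choose_next_view(post, lik)
        post = bayes_update(post, lik[view], observe(view))
        iterations += 1
    best = max(range(len(post)), key=lambda k: post[k])
    return best, post, iterations
```

In this sketch the suggested view is simply the one with the lowest expected posterior entropy, a greedy one-step criterion chosen for brevity. The stopping rule (a fixed confidence threshold on the largest posterior) is likewise an assumption made for the example.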