The Vision Engine couples a pipelined early vision architecture (Datacube image processors) to a MIMD intermediate vision system (a set of Transputers) and is designed for general vision tasks.
To respond actively to a dynamic environment, a vision system must process perceptual data in real time and in multiple modalities. The structure of the computational load varies across the levels of vision, so more than one architecture is required. We describe the Vision Engine, a system with a pipelined early vision architecture (Datacube image processors) connected to a MIMD intermediate vision system (a set of Transputers). The system uses a controllable eye/head for tasks involving motion, stereo, and tracking. A simple pipeline model describes image transformation through multiple functional stages in early vision. Later processing (e.g., segmentation, edge linking, perceptual organization) cannot easily proceed on a pipeline architecture; a MIMD architecture is more appropriate for the irregular data and functional parallelism of later visual processing. The Vision Engine is designed for general vision tasks. Early vision processing, both optical flow and stereo, is implemented in near real time on the Datacube, producing dense vector fields with confidence measures that are transferred at near video rates to the Transputer subsystem. We describe a simple implementation that combines, in the Transputer system, stereo and motion information from the Datacube.
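As an illustration only, the following minimal Python sketch shows one way dense vector fields with per-pixel confidence measures might be represented and combined. It is not the implementation described here: the names `Measurement` and `combine`, the thresholds, and the rule "confident, moving, and near" are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Measurement:
    """One pixel of a dense field: a 2-D vector plus a confidence in [0, 1]."""
    dx: float
    dy: float
    confidence: float


# A dense field is a row-major grid of per-pixel measurements (hypothetical layout).
Field = List[List[Measurement]]


def combine(flow: Field, disparity: Field, min_conf: float = 0.5,
            min_speed: float = 1.0, min_disparity: float = 2.0) -> List[List[bool]]:
    """Mark pixels that are both moving and near, using only confident data.

    The combination rule and all thresholds are illustrative assumptions.
    """
    mask = []
    for flow_row, disp_row in zip(flow, disparity):
        row = []
        for f, d in zip(flow_row, disp_row):
            # Require both modalities to be trustworthy at this pixel.
            confident = f.confidence >= min_conf and d.confidence >= min_conf
            # Flow magnitude in pixels per frame.
            moving = (f.dx ** 2 + f.dy ** 2) ** 0.5 >= min_speed
            # Only the horizontal disparity component is used here.
            near = abs(d.dx) >= min_disparity
            row.append(confident and moving and near)
        mask.append(row)
    return mask
```

In this sketch a pixel is kept only when both the flow and the stereo measurement are confident, so a low-confidence estimate in either modality suppresses the combined result at that pixel.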