This book presents a thorough overview of fusion in computer vision, from an interdisciplinary and multi-application viewpoint, describing successful approaches, evaluated in the context of international benchmarks that model realistic use cases. Features: examines late fusion approaches for concept recognition in images and videos; describes the interpretation of visual content by incorporating models of the human visual system with content understanding methods; investigates the fusion of multi-modal features of different semantic levels, as well as results of semantic concept detections, for example-based event recognition in video; proposes rotation-based ensemble classifiers for high-dimensional data, which encourage both individual accuracy and diversity within the ensemble; reviews application-focused strategies of fusion in video surveillance, biomedical information retrieval, and content detection in movies; discusses the modeling of mechanisms of human interpretation of complex visual content.