Professor of Artificial Intelligence in the School of Computing
Pro-Vice-Chancellor for Research and Innovation
List of Publications
Statistical learning has become central to research on computational models of intelligence, with important but largely independent developments on perceptual tasks such as object recognition and on conceptual reasoning. Typically the former involves clustering and regression over quantitative representations based on real linear spaces, whereas the latter involves symbolic procedures over qualitative representations that emphasise the relationships between objects (e.g. next-to, occurs-before, overlaps, similar-shape).
Our research has been looking for ways to bring these developments together. We began by exploring integration in the context of table-top games, using clustering to learn attribute categories for game-objects (e.g. playing cards) and inductive logic programming (Progol) to learn how to play the game. The key to integration was to form a time series of qualitative descriptions of objects on the table-top at salient times. An unexpected finding was that the emergent rules of the game could be used to refine attribute categories in a top-down fashion.
C. J. Needham, P. E. Santos, D. R. Magee, V. Devin, D. C. Hogg, and A. G. Cohn, Protocols from Perceptual Observations, Artificial Intelligence, 167(1-2), 103-136, 2005.
More recently, we have used a more expressive representation for the changing spatial relationships between objects. The idea is to represent the spatial relationships between all pairs of objects within a scene (e.g. touching, disconnected) and the temporal relationships between the intervals over which these relationships hold (e.g. before, during). Events are represented by subsets of spatial and temporal facts about groups of objects. We have explored two approaches. In the first, we attempt to 'explain' a video corpus in terms of a set of repeated events from a small number of classes; this is an unsupervised procedure:
M. Sridhar, A. G. Cohn, and D. C. Hogg, Unsupervised Learning of Event Classes from Video, in AAAI 2010. Atlanta, Georgia: AAAI Press, 2010.
Once the event classes have been inferred, we also obtain a set of functional object classes by virtue of the role played by objects within the observed events.
In the second approach, we use inductive logic programming to learn event classes from a ground-truth of labelled videos:
K. S. R. Dubba, A. G. Cohn, and D. C. Hogg, Event Model Learning from Complex Videos using ILP, in European Conference on Artificial Intelligence (ECAI). Lisbon, Portugal, 2010.
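The qualitative spatio-temporal representation described above can be illustrated with a small sketch. It is not the method used in the cited papers; it only shows the general idea under simplifying assumptions. Objects are taken to be axis-aligned bounding boxes, every object is assumed to appear in every frame, "touching" is assumed to mean that two boxes share at least one point, and only the two temporal relations named above are distinguished (so adjacent intervals are counted as "before"). All function names and the example data are hypothetical.

```python
from itertools import combinations


def spatial_relation(a, b):
    """Qualitative relation between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # A gap along either axis means the boxes share no point.
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disconnected"
    # Assumption: any shared point (edge contact or overlap) counts as touching.
    return "touching"


def relation_intervals(frames):
    """Convert per-frame boxes into maximal intervals over which a relation holds.

    frames: list of dicts mapping object name -> box, one dict per frame.
    Returns tuples (obj_a, obj_b, relation, start_frame, end_frame).
    """
    intervals = []
    open_runs = {}  # (obj_a, obj_b) -> (relation, start_frame)
    for t, boxes in enumerate(frames):
        for a, b in combinations(sorted(boxes), 2):
            rel = spatial_relation(boxes[a], boxes[b])
            current = open_runs.get((a, b))
            if current is None:
                open_runs[(a, b)] = (rel, t)
            elif current[0] != rel:
                # The relation changed: close the old run and open a new one.
                intervals.append((a, b, current[0], current[1], t - 1))
                open_runs[(a, b)] = (rel, t)
    last = len(frames) - 1
    for (a, b), (rel, start) in open_runs.items():
        intervals.append((a, b, rel, start, last))
    return intervals


def temporal_relation(i, j):
    """Coarse temporal relation between inclusive frame intervals (start, end)."""
    (s1, e1), (s2, e2) = i, j
    if e1 < s2:
        return "before"
    if s1 > s2 and e1 < e2:
        return "during"
    return "other"


# Hypothetical four-frame scene: a hand approaches a cup, touches it, then leaves.
frames = [
    {"cup": (0, 0, 2, 2), "hand": (5, 0, 7, 2)},
    {"cup": (0, 0, 2, 2), "hand": (2, 0, 4, 2)},
    {"cup": (0, 0, 2, 2), "hand": (2, 0, 4, 2)},
    {"cup": (0, 0, 2, 2), "hand": (6, 0, 8, 2)},
]
intervals = relation_intervals(frames)
for fact in intervals:
    print(fact)
```

Each tuple produced here is one spatial fact with the interval over which it holds, and pairs of such intervals can be compared with temporal_relation to yield temporal facts. An event would then correspond to a subset of these facts about a group of objects.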
I've had a longstanding interest in finding better ways to model the shapes and behaviours of objects within a scene. In early work, we focussed on modelling the projected shapes of moving objects and their trajectories through a scene (see the publications by Adam Baumberg and Neil Johnson). This approach was extended to track the hand in 3D and, using a related approach, to acquire 3D models of moving objects (see the publications by Tony Heap and Shen Xinquan). More recently, we have taken a different approach in attempting to explain the movements of people within wide-area scenes in terms of their intentions and social groups (see the AIJ paper by Hannah Dee and the recent paper by Jan Sochman).
Back in 1998 we developed a way to synthesise an interactive agent through unsupervised learning of a joint model of interactive behaviour (see the CVPR98 paper by Neil Johnson and Aphrodite Galata). This has since been developed to model a reactive face and a talking head.
Full details of most of these projects and others can be found at the vision group's website.