We position recent and emerging research in cognitive vision and perception addressing three key questions: (1) What kinds of relational abstraction mechanisms are needed to perform (explainable) grounded inference (e.g., question-answering, qualitative generalisation, hypothetical reasoning) relevant to embodied multimodal interaction? (2) How can such abstraction mechanisms be founded on behavi
