From pixels to categories: the representational dynamics of real-world scene understanding
Although human visual categorization is characterized by impressive speed and accuracy, the neural mechanisms supporting this ability remain elusive. Previous work has demonstrated that scenes can be categorized via a number of different features, including low- to mid-level visual features, objects, spatial layout, and a scene’s functions. Critically, these features are not independent of one another.
For example, changing a scene’s spatial layout changes both its affordances and its low-level visual features. To what extent do these features independently contribute to scene categorization, and when do they do so? Using hierarchical linear regression, we examined the contribution of a number of features to scene categorization behavior in a massive online experiment, finding that functions captured 85.5% of overall explained variance, with nearly half of the explained variance captured only by functions. Then, using representational similarity analysis (RSA) in conjunction with variance partitioning and high-density event-related potentials, we examined the use of these features over time in the brain.
Surprisingly, we found that exclusively visual features and seemingly higher-level features, such as functions and semantics, developed in parallel. Taken together, these results argue for the seamless integration of visual and conceptual processing in scene categorization.
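
The idea of variance partitioning over correlated feature models can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the data are synthetic, and the predictor names, the number of item pairs, and the helper functions `r_squared` and `unique_variance` are all assumptions made for illustration. Each predictor stands in for the vectorized pairwise distances of one feature model, and the unique variance of a feature is taken as the drop in R² when that feature is removed from the full regression.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def unique_variance(predictors, y):
    """Return full-model R^2 and, per predictor, full R^2 minus the R^2
    of the model that leaves that predictor out."""
    names = list(predictors)
    full = r_squared(np.column_stack([predictors[n] for n in names]), y)
    unique = {}
    for n in names:
        others = [predictors[m] for m in names if m != n]
        reduced = r_squared(np.column_stack(others), y) if others else 0.0
        unique[n] = full - reduced
    return full, unique


# Synthetic stand-ins for vectorized dissimilarity matrices (hypothetical sizes).
rng = np.random.default_rng(0)
n_pairs = 435
functions = rng.normal(size=n_pairs)
objects = 0.5 * functions + rng.normal(size=n_pairs)  # correlated with functions
visual = rng.normal(size=n_pairs)

# Synthetic target: driven mostly by "functions", weakly by "visual".
target = functions + 0.3 * visual + rng.normal(scale=0.5, size=n_pairs)

full_r2, unique = unique_variance(
    {"functions": functions, "objects": objects, "visual": visual}, target
)

# Variance explained by the full model that no single predictor owns uniquely.
shared = full_r2 - sum(unique.values())

print(f"full R^2: {full_r2:.3f}")
for name, value in unique.items():
    print(f"unique to {name}: {value:.3f} ({value / full_r2:.1%} of explained)")
print(f"not uniquely attributable: {shared:.3f}")
```

In a time-resolved setting, the same partitioning would be repeated with the target replaced by a neural dissimilarity vector at each time point, yielding one unique-variance curve per feature; that extension is likewise an assumption of this sketch rather than a description of the reported method.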