From early biological models to CNNs: do they look where humans look? / Cadoni, Mi; Lagorio, A; Grosso, E; Huei, Tj; Seng, Cc. - (2021), pp. 6313-6320. [10.1109/ICPR48806.2021.9412717]

From early biological models to CNNs: do they look where humans look?

Cadoni, MI; Lagorio, A; Grosso, E; Huei, TJ; Seng, CC
2021-01-01

Abstract

Early hierarchical computational visual models, as well as recent deep neural networks, have been inspired by the functioning of the primate visual cortex. Although much effort has been made to dissect neural networks and visualize the features learned by individual units, the scope of these visualizations has been limited to categorizing the features by semantic level. Given the human ability to select regions of high semantic level in a scene, two questions naturally arise: whether neural networks can match this ability, and whether similarity to human attention correlates with network performance. To address these questions, we propose a pipeline that selects sets of feature points that maximally activate individual network units and compares them to human fixations. We extract features from a variety of neural networks, from early hierarchical models such as HMAX to recent deep convolutional neural networks (CNNs) such as DenseNet, and compare them to human fixations. Experiments on the ETD database show that human fixations correlate with CNN features from deep layers significantly better than with random sets of points, whereas they do not with features extracted from the first layers of CNNs or with HMAX features, which appear to have a low semantic level compared with the features that respond to the automatically learned filters of CNNs. We also find a correlation between a CNN's similarity to human fixations and its classification performance.
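Year: 2021
Language: English
Conference: International Conference on Pattern Recognition
First page: 6313
Last page: 6320
Number of pages: 8
ISBN: 978-1-7281-8808-9
Publisher: IEEE COMPUTER SOC
Publisher address: 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA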
4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11388/298757