EYE-TRACKING STUDY OF INANIMATE OBJECTS

Unlike the animate objects, where participants were consistent in their looking patterns, for inanimates it was difficult to identify both consistent areas of fixations and a consistent order of fixations. Furthermore, in comparison to animate objects, inanimates received significantly shorter total looking time, shorter longest looks and a smaller number of overall fixations. However, as with animates, looking patterns did not systematically differ between the naming and non-naming conditions. These results suggest that animacy, but not labelling, impacts on looking behaviour in this paradigm. In the light of feature-based accounts of semantic memory organization, one could interpret these findings as suggesting that processing of the animate objects is based on the saliency/diagnosticity of their visual features (which is then reflected in participants' eye-movements towards those features), whereas processing of the inanimate objects is based more on functional features (which cannot be easily captured by looking behaviour in such a paradigm).

The modality-specific theory of conceptual representations suggests discrete localisation of different types of semantic knowledge (such as animate and inanimate objects). In particular, Warrington & McCarthy (1987) and Warrington & Shallice (1984) proposed a feature-based account according to which animate objects are more easily recognised and described by visual features, whereas inanimates rely more on functional features. Support for the category-based, modality-specific semantic organisation of conceptual memory comes from patients with selective deficits of particular object categories as well as from fMRI and PET studies with normal populations (Martin, 2001; Martin & Chao, 2001). In contrast, Tyler et al. (2000) and Devlin et al. (1998) proposed an alternative account which suggests that all semantic information is processed within a unitary neural system, since they failed to replicate category-specific effects reported by proponents of the modular account of semantic organisation.
Recent behavioural studies have demonstrated that responses to animate objects are ~50 ms faster and more accurate than to inanimates (Proverbio et al., 2007) and that participants are poorer at naming nonliving compared to living things when presented with silhouettes (Thomas & Forde, 2006). This advantage of processing animates in comparison to inanimates may be explained by higher within-category similarity for animates in comparison to inanimates, as suggested by some researchers (Gerlach, 2001; Lag, 2005; Lag et al., 2006). Furthermore, Laws and Neve (1999) argued that this advantage arises because inanimates have higher 'intra-item representational variability', whereas animates are more structurally similar.
Given that the body of previous research suggests that inanimates are structurally different, more easily described and recognised by functional features than by visual features, and processed differently from animate objects, the current study examined visual exploration of 24 inanimate categories of objects in the naming and non-naming conditions.
The questions addressed in the current study were: which visual features do people attend to in the early stages of inanimate object processing, and can language mediate looking behaviour when the objects participants are looking at are named? And, importantly, are animate objects processed differently from inanimate objects when assessed using an eye-tracking methodology?
Adult participants were expected to be faster at initiating and programming their eye-gaze towards the visual features of inanimate objects when the objects are named than when they are not. Participants were also expected to focus their attention on the more diagnostic dimensions of inanimate objects in the naming condition, where image presentation occurred directly after the inanimate object was named. Evoking mental representations of the inanimate objects by naming them before presenting them to the participants in a visual form was expected to result in a less diffuse pattern of eye-movements compared to the non-naming conditions, similar to the prediction of Kovic et al. (2009). Finally, according to the modality-specific approach, animate objects should be processed differently from inanimate objects. Feature-based theory would suggest increased attention to animate objects, which are better described by visual features, as opposed to inanimate objects, which are better described by functional features. In contrast, the distributed account of semantic representation would suggest similar processing of both animate and inanimate objects.

METHOD

Participants
Thirty-six participants were recruited for the present study. They were all right-handed, native English-speaking, first-year Oxford University undergraduate students with normal hearing and normal or corrected-to-normal vision. They were all given credits for their participation. Two of them were excluded from the analysis due to the failure of the calibration procedure.

Stimuli
Visual stimuli: High-quality static photographs of real inanimate objects were chosen from the CD-ROM Graphic Interchange Format Data (Hemera, 2000) and edited using Adobe Photoshop CS software. As in the Kovic et al. (2009) study, three versions of the corresponding static computer image were chosen for each of the inanimate categories, so that the whole sample consisted of 72 (24 x 3) images in total. All of the chosen images were presented in a profile view. They were all of the same size (400 x 400 pixels), with no background scene, but set on a ten percent grey background to avoid brightness on the screen. In this experiment the visual and auditory stimuli were presented using Presentation software, rather than the Preferential Looking Program ('Look'), to avoid the conversion of the still images into video format (AVI files). Presentation software introduced no time lag (short dark intervals) between successive images.
Auditory stimuli: The selected labels (apple, ball, banana, bicycle, bottle, bracelet, bus, car, card, chair, clock, frame, glasses, guitar, hamburger, hammer, key, lamp, leaf, phone, pipe, scissors, shoe, table) were digitally recorded in stereo in the same session at 44.11 kHz sampling rate into signed 16-bit files, within the carrier phrase 'Look at the <target>'. Using the GoldWave 5.10 software, the files were edited to remove background noise and head and tail clicks, and to match for peak-to-peak amplitude. The utterances for the non-naming conditions ('Look at the picture!' and 'What's this?') were recorded in the same session.

Experimental design
The experimental design was the same as in Kovic et al. (2009) except that the visual and auditory stimuli were from the category of inanimate objects (see Figure 1).

Procedure
The experimental procedure for the present study involving inanimate objects was exactly the same as for the Kovic et al. (2009) eye-tracking study with animate objects.

Eye-tracking methodology
The eye-tracking methodology and procedure were the same as in the Kovic et al. (2009) experiment with animate objects.

Measurements
The same measures used in the Kovic et al. (2009) eye-tracking experiment to explore temporal and spatial processing of the visual images were used in the present study: the first look, longest look, total looking time and number of fixations as well as clustering of the fixations for identifying regions of interest which were subsequently plotted on top of the pictures.
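As a minimal sketch of how the four temporal measures could be derived from one trial's fixation record, the following Python fragment may be useful. The `Fixation` structure and its field names are hypothetical and not taken from the original analysis; in particular, it is assumed here that the first look is the onset latency of the earliest fixation and that the longest look is the longest single fixation.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    """One fixation on the picture (hypothetical record layout)."""
    onset_ms: float     # time from picture onset to the start of the fixation
    duration_ms: float  # duration of the fixation
    x: float            # horizontal gaze position
    y: float            # vertical gaze position


def looking_measures(fixations):
    """Return the four temporal measures for a single trial."""
    if not fixations:
        raise ValueError("a trial needs at least one fixation")
    ordered = sorted(fixations, key=lambda f: f.onset_ms)
    return {
        # Assumption: first look = latency of the earliest fixation.
        "first_look_ms": ordered[0].onset_ms,
        # Assumption: longest look = longest single fixation.
        "longest_look_ms": max(f.duration_ms for f in ordered),
        "total_looking_time_ms": sum(f.duration_ms for f in ordered),
        "n_fixations": len(ordered),
    }


# Illustrative values only, not data from the study.
trial = [
    Fixation(onset_ms=130.0, duration_ms=170.0, x=200.0, y=210.0),
    Fixation(onset_ms=320.0, duration_ms=150.0, x=260.0, y=180.0),
    Fixation(onset_ms=490.0, duration_ms=90.0, x=150.0, y=240.0),
]
print(looking_measures(trial))
```

Per-trial values of this kind would then be averaged within each auditory condition and picture profile before the analyses reported below.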

Analysis of the first look
A 3x2 ANOVA with factors Auditory Condition and picture Profile revealed no significant main effects regarding the initiation time of the first saccade: Auditory Condition (F(2,414)=2.299, p=0.102) and picture Profile (F(1,414)=0.492, p=0.484). There was no significant interaction effect either.
The average amount of time participants took to program and initiate their first eye-gaze towards the inanimate objects in the naming condition ('Look at the <target>!') was M=130.60 ms (s.e.m.=1.43), which did not differ from the other two, non-naming conditions ('Look at the picture!' and 'What's this?'): M=131.73 ms (s.e.m.=1.59) and M=135.43 ms (s.e.m.=1.93), respectively (see Figure 2).
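The logic of a 3x2 ANOVA with two between-observation factors can be illustrated with the following sketch for a balanced design. The function name and the toy data are assumptions made for illustration; the original analysis was presumably run in a statistics package, and the source does not state how observations were aggregated. The sketch returns F ratios and degrees of freedom only, since p-values require the F distribution.

```python
from statistics import mean


def two_way_anova(cells):
    """F ratios for a balanced two-way design.

    cells maps (level_a, level_b) -> list of observations; every cell
    must hold the same number of observations.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(obs) != n for obs in cells.values()):
        raise ValueError("balanced design required")

    grand = mean(x for obs in cells.values() for x in obs)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(obs) for key, obs in cells.items()}

    na, nb = len(a_levels), len(b_levels)
    # Sums of squares for the two main effects, the interaction and the error.
    ss_a = nb * n * sum((m - grand) ** 2 for m in a_mean.values())
    ss_b = na * n * sum((m - grand) ** 2 for m in b_mean.values())
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels for b in b_levels
    )
    ss_err = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_a, df_b = na - 1, nb - 1
    df_ab, df_err = df_a * df_b, na * nb * (n - 1)
    ms_err = ss_err / df_err
    return {
        "F_a": (ss_a / df_a) / ms_err,
        "F_b": (ss_b / df_b) / ms_err,
        "F_ab": (ss_ab / df_ab) / ms_err,
        "df": (df_a, df_b, df_ab, df_err),
    }


# Toy data only: 3 auditory conditions x 2 profiles, 2 observations per cell.
toy = {
    ("target", "left"): [1, 3], ("target", "right"): [3, 5],
    ("picture", "left"): [2, 4], ("picture", "right"): [4, 6],
    ("what", "left"): [3, 5], ("what", "right"): [5, 7],
}
print(two_way_anova(toy))
```

Under this formulation the error degrees of freedom equal the number of observations minus the number of cells, so the reported value of 414 would correspond to 420 observations spread over the six cells.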

Figure 2. Average time of initiation of the first look across the conditions ("What's this?", "Look at the <target>!", "Look at the picture!")

Analysis of the longest look
A 3x2 ANOVA of the longest look demonstrated no significant effect of Auditory Condition (F(2,414)=0.270, p=0.764), but a significant effect of picture Profile (F(1,414)=5.184, p<.023).
The left-oriented inanimate pictures received a longer longest look than the pictures presented in the right profile view (M(left)=171.023 ms, s.e.m.=2.28; M(right)=164.147 ms, s.e.m.=1.98; see Figure 3). Inanimate objects presented in the naming condition received an average longest look of M=166.03 ms (s.e.m.=2.71), and the objects in the other two conditions received longest looks of similar duration: M('Look at the picture!')=168.16 ms (s.e.m.=2.64) and M('What's this?')=168.56 ms (s.e.m.=2.52). The interaction effect was not significant.

Analysis of the total looking time
A 3x2 ANOVA comparing the total amount of time participants spent looking at the pictures of inanimate objects, with the factors Auditory Condition and picture Profile, revealed neither significant main effects of Auditory Condition (F(2,414)=0.198, p=0.821) and picture Profile (F(1,414)=0.239, p=0.625), nor a significant interaction effect. Participants spent less than half a second in total looking at the pictures in the 'Look at the <target>!' condition (M=408.78 ms, s.e.m.=12.25) as well as in the non-naming 'Look at the picture!' (M=400.97 ms, s.e.m.=12.65) and 'What's this?' (M=411.98 ms, s.e.m.=12.87) conditions (see Figure 4).

Analysis of number of fixations
A 3x2 ANOVA revealed no significant effects of naming (F(2,414)=0.179, p=0.836) or profile (F(1,414)=0.002, p=0.963) on the number of fixations. The interaction effect was not significant either. The average number of fixations in the 'Look at the <target>!' condition was M=2.90 (s.e.m.=0.08) and similarly, in the 'Look at the picture!' and 'What's this?' conditions it was M=2.89 (s.e.m.=0.07) and M=2.96 (s.e.m.=0.08), respectively (see Figure 5).

Cluster analysis
Unlike the animate objects, which participants tended to examine in rather consistent ways (looking at the eyes first before moving to some other part of the animal; Kovic et al., 2009), inanimate objects generally elicited much more dispersed patterns of looking. Using Ward's method (Ward, 1963) and Clustan software (Wishart, 2004), the fixations for each of the 24 inanimate objects across all of the experimental conditions (3 auditory conditions x 3 visual conditions x 2 side profiles) for each of the participants were clustered in order to identify the areas of the objects which predominantly attracted participants' attention. The clusters of fixations are plotted on top of the pictures and presented in different colours for easier interpretation.
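Ward's method merges, at each step, the pair of clusters whose union produces the smallest increase in within-cluster sum of squares. The sketch below is a naive pure-Python illustration of that criterion applied to fixation coordinates; it is not the Clustan implementation used in the study, and the number of clusters is passed in as a hypothetical parameter because the source does not state how it was chosen.

```python
def centroid(points):
    """Mean (x, y) position of a list of points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    squared_distance = (ax - bx) ** 2 + (ay - by) ** 2
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


def ward_clusters(points, n_clusters):
    """Agglomerate points until n_clusters remain (naive, for small inputs)."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and the number of points")
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters that is cheapest to merge.
        _, i, j = min(
            (ward_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Illustrative fixation coordinates on a 400 x 400 image, not study data.
fixations = [(100, 100), (300, 300), (102, 101), (301, 302), (98, 99), (299, 298)]
for cluster in ward_clusters(fixations, n_clusters=2):
    print(centroid(cluster), cluster)
```

A dispersed looking pattern of the kind reported for inanimate objects would show up here as clusters with large internal spread and no clear separation between them.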
Participants' looking behaviour, examined through the cluster analysis, revealed no areas of the objects which particularly attracted their attention. Their eye-movements to the inanimate objects were rather inconsistent. A comparison of fixation distances from cluster centroids across the naming and non-naming conditions revealed no significant differences (F(1,507)=0.395, p=0.674; M("Look at the picture!")=42.53, s.e.m.=3.414; M("Look at the <target>!")=43.39, s.e.m.=2.502; M("What's this?")=40.02, s.e.m.=2.603). As in Kovic et al. (2009), additional quantification of the amount of time or number of fixations participants made within the clusters was difficult to perform given that there was no clear-cut boundary between areas of interest.
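The distance of fixations from their cluster centroids can serve as a simple index of how diffuse a looking pattern is. The following sketch shows one way such a measure might be computed, assuming Euclidean distance in the units of the fixation coordinates; the function names are illustrative and the source does not specify the exact distance measure used.

```python
from math import hypot
from statistics import mean


def centroid(points):
    """Mean (x, y) position of a list of points."""
    return (mean(x for x, _ in points), mean(y for _, y in points))


def mean_distance_to_centroids(clusters):
    """Average Euclidean distance of each fixation from its own cluster centroid."""
    distances = []
    for cluster in clusters:
        cx, cy = centroid(cluster)
        distances.extend(hypot(x - cx, y - cy) for x, y in cluster)
    return mean(distances)


# Illustrative clusters only, not study data.
clusters = [[(0, 0), (0, 10)], [(100, 100), (106, 108)]]
print(mean_distance_to_centroids(clusters))
```

Smaller values indicate tighter clusters, so a less diffuse looking pattern in the naming condition would have appeared as a smaller mean distance than in the non-naming conditions.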

Figure 6. Plotting clusters of fixations on top of the pictures
Spearman's correlation between the order of fixations and cluster membership revealed an overall significant correlation (r=.025, p<.005), but this correlation was not significant when examined within the naming and non-naming conditions separately ('Look at the <target>!': r=.029, p>.05; 'Look at the picture!': r=.006, p>.05; and 'What's this?': r=.039, p>.05).
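Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. A minimal sketch relating fixation order to cluster membership is given below; coding cluster membership as an ordered integer label is an assumption made here for illustration, and significance testing is omitted.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))


# Illustrative trial: fixation order 1..4 and the cluster each fixation fell in.
fixation_order = [1, 2, 3, 4]
cluster_label = [1, 1, 2, 2]
print(round(spearman(fixation_order, cluster_label), 3))
```

A coefficient near zero, as reported for the inanimate objects, indicates that the position of a fixation in the sequence says little about which cluster it lands in.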

DISCUSSION
The initiation of the first fixation, the single longest look measure, total looking time and number of fixations that participants made while processing pictures of inanimate objects again revealed no systematic differences across the naming and non-naming conditions. Contrary to expectations, labelling of the object prior to visual presentation did not change participants' looking behaviour towards pictures of the inanimate objects as far as the above-mentioned measures were concerned.
The duration of the single longest look differed across the inanimate pictures presented in the left and right profile. Unlike the pictures of the animate objects in Kovic et al. (2009), pictures of the inanimate objects presented in the left profile received longer longest looks than did pictures presented in the right profile. Again, this finding may be explained by left-to-right processing preferences or handedness (given that all of the participants in the current study were right-handed). Further examination of this result might motivate testing Arabic- or Hebrew-speaking participants, as well as a systematic comparison between right- and left-handed participants. However, these interesting questions are not addressed in the research presented here.
Furthermore, as in Kovic et al. (2009), a cluster analysis was run in order to examine whether any specific regions/features of inanimate objects tended to attract participants' attention, as well as to reveal whether looking behaviour in the naming condition tended to be less diffuse. The results revealed no differences across the naming and non-naming conditions regarding the distribution of fixations. Also, the distance of fixations from cluster centroids was similar across the "Look at the <target>!", "Look at the picture!" and "What's this?" conditions. Moreover, the correlation between the order of fixations and cluster membership, although weakly positive overall, did not turn out to be significant when assessed within the naming and non-naming conditions separately. This result suggests that participants' processing of inanimate objects was rather inconsistent. The weak overall positive correlation may have been driven by the first and last fixations, given that the participants started exploration of the pictures from the centre of the screen, where a fixation cross was presented prior to presentation of the inanimate object. Hypothetically, this cluster would not have been present if the position of the fixation cross had differed from trial to trial. However, for the purposes of the current study, it was important that participants had the same starting point for all of the presented pictures across the three experimental conditions.
The cluster analysis demonstrated that participants tend to look at the pictures of animate objects in a much more consistent way than at the pictures of inanimate objects (Kovic et al., 2009). Furthermore, the correlation between a fixation's cluster membership and the order in which the clusters were looked at was significant for animates, but not for inanimates. These findings motivated a direct comparison between animate and inanimate objects on the other four measurements: the initiation of the first look, the single longest look, total looking time and number of fixations.

A comparison of the initiation of the first saccade between animate and inanimate objects showed that the average onset of the first saccade for the inanimate pictures in the 'Look at the picture!' condition (M=131.73 ms, s.e.m.=1.59) was marginally faster (t(1,287)=1.768, p=0.078) than for the animate pictures (M=138.97 ms, s.e.m.=3.76). There was no significant difference in terms of initiation of the first saccade for the inanimate objects (M=130.60 ms, s.e.m.=1.43) in comparison to the animate objects (M=137.20 ms, s.e.m.=3.57) in the 'Look at the <target>!' condition (t(1,268)=1.714, p=0.088). There was no systematic difference (t(1,286)=0.388, p=0.699) between inanimate (M=135.43 ms, s.e.m.=1.93) and animate (M=137.53 ms, s.e.m.=5.05) objects regarding the first look measure in the 'What's this?' condition (see Figure 7).

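The reported first-look comparisons can be approximately re-derived from the published means and standard errors. The sketch below computes an unequal-variance t statistic from summary values; this is an assumption about the test variant, which the source does not name, so the result is a plausibility check rather than a reproduction of the original analysis.

```python
from math import sqrt


def t_from_summary(mean_1, sem_1, mean_2, sem_2):
    """t statistic for two independent means given their standard errors."""
    return (mean_1 - mean_2) / sqrt(sem_1 ** 2 + sem_2 ** 2)


# Means and s.e.m. values as reported for the first look (animate, inanimate).
reported = {
    "Look at the picture!": (138.97, 3.76, 131.73, 1.59),
    "Look at the <target>!": (137.20, 3.57, 130.60, 1.43),
    "What's this?": (137.53, 5.05, 135.43, 1.93),
}
for condition, (m_anim, se_anim, m_inan, se_inan) in reported.items():
    t = t_from_summary(m_anim, se_anim, m_inan, se_inan)
    print(f"{condition}: t = {t:.3f}")
```

The three values come out close to the reported statistics of 1.768, 1.714 and 0.388, which suggests that the tabulated means and standard errors are internally consistent with the reported tests.
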
CONCLUSION
Contrary to the predictions, processing of the animate and inanimate objects, assessed through the initiation of the first look, single longest look, total looking time and number of fixations, showed no systematic differences across the auditory conditions under which they were presented ('Look at the <target>!', 'Look at the picture!' and 'What's this?'). Participants took approximately the same amount of time to initiate their eye-gaze towards animate and inanimate pictures irrespective of the naming condition (130-140 ms on average). Direct comparison between animate and inanimate objects revealed that the animate objects received significantly longer total looking time, longer looks and a larger number of fixations than inanimates. The single longest look lasted for about 300 ms for animates and below 200 ms for inanimates. Average total looking time was around 400 ms for inanimates and around 750 ms for animates, and the average number of fixations was 2.9 and 3.7 for inanimates and animates, respectively.
Furthermore, cluster analyses demonstrated that eye fixations were evenly distributed across inanimates, but clustered around particular features for animates (see Kovic et al., 2009). Crucially, processing of both animate and inanimate objects revealed that the naming condition had no effect on looking patterns, demonstrating that animacy, but not labelling, impacts looking behaviour in this paradigm. Besides the more dispersed pattern of fixations for inanimate objects in comparison to animate objects, inanimate objects also showed a less consistent order of fixations to specific object regions/features. Unlike the animate objects, where participants allocated their attention towards head, tail, udder etc. in a consistent way, for the inanimate objects it was very hard to identify consistent areas which tended to receive participants' attention, or a consistent order in which those features were processed.
In the light of the modality-specific, feature-based theory, one could claim that the observed differences between animate and inanimate objects regarding looking behaviour are due to the saliency of visual features in animate objects, which tend to attract participants' attention. Thus, one possible interpretation of these results is that animate objects have more visually salient, diagnostic features on which participants focus (such as eyes, ears, tail, udder), unlike inanimate objects, which have no salient visual features and therefore elicited a more dispersed pattern of fixations. An alternative explanation might be that in the animate object processing task participants only demonstrated strategic, task-specific looking behaviour, given that most of the animate objects were from the same category (mammals) and that there was more inter-group variability within inanimate objects. In order to test this interpretation further, a follow-up study involving random presentation of both animates and inanimates within a mixed design is needed to clarify whether the reported difference reflected merely strategic responding in the Kovic et al. (2009) study.
A possible reason for not finding differences between naming and non-naming conditions is that the interval between the presentation of the auditory and visual stimuli was too short, i.e., the offset of the auditory presentation was right-aligned with the onset of the visual object presentation. In such an experimental design participants possibly did not have enough time to evoke a mental representation of the object when it was named, and thus the looking behaviour did not differ across naming and non-naming conditions. In order to test this explanation, the following study introduced a 500 ms inter-stimulus interval between the offset of the auditory stimulus and the onset of the visual stimulus, allowing more time for the auditory stimuli to be processed and a mental image to be evoked.

Figure 1. The three experimental conditions

Figure 3. Average longest look for the left and right picture profiles

Figure 4. Average total looking time across the three experimental conditions

Figure 5. Average number of fixations across the three experimental conditions

Figure 7. Average initiation of the first look: animates vs. inanimates

Figure 9. Average total looking time: animates vs. inanimates