Research Article | EVOLUTIONARY BIOLOGY

Intrinsic spatial knowledge about terrestrial ecology favors the tall for judging distance

Science Advances  31 Aug 2016:
Vol. 2, no. 8, e1501070
DOI: 10.1126/sciadv.1501070

Abstract

Our sense of vision reliably directs and guides our everyday actions, such as reaching and walking. This ability is especially fascinating because the optical images of natural scenes that project into our eyes are insufficient to adequately form a perceptual space. It has been proposed that the brain makes up for this inadequacy by using its intrinsic spatial knowledge. However, it is unclear what constitutes intrinsic spatial knowledge and how it is acquired. We investigated this question and found evidence of an ecological basis, which uses the statistical spatial relationship between the observer and the terrestrial environment, namely, the ground surface. We found that in dark and reduced-cue environments, where intrinsic knowledge makes a greater contribution, perceived target location is more accurate when referenced to the ground than to the ceiling. Furthermore, taller observers localized the target more accurately. Their superior performance was also observed in the full-cue environment, even when we compensated for the observers’ heights by having the taller observers sit on a chair and the shorter observers stand on a box. This finding dovetails with the prediction of the ecological hypothesis for intrinsic spatial knowledge. It suggests that an individual’s accumulated lifetime experiences of being tall and his or her constant interactions with ground-based objects not only determine intrinsic spatial knowledge but also endow him or her with an advantage in spatial ability in the intermediate distance range.

Keywords
  • Human behavior
  • ecology
  • perception
  • adaptation

INTRODUCTION

Visual space appears veridical. Because it reliably directs and guides actions in our daily activities, we are often tempted to conclude that the visual system faithfully reconstructs the physical space despite the fact that the raw optical images captured by our eyes are ambiguous. Less obvious, perhaps because it seems “automatic,” is the fact that the visual system relies on its internal spatial knowledge to disambiguate the retinal information (1). Therefore, discovering the nature of internal spatial knowledge is critical for understanding how the brain creates visual space.

Studies of space perception in the intermediate distance range using impoverished visual stimuli provide insight into the visual system’s internal knowledge (2–7). We previously showed that a dimly lit target in the dark beyond 3 to 4 m (Fig. 1A, yellow target) is perceived at the intersection between its projection line from the eyes and an imaginary curved surface (Fig. 1A, dashed white curve) (4, 5). The shape and location of this imaginary curve are quite stable for the individual observer. We coin this imaginary curved surface the “intrinsic bias” because without any biases, the visual system could place the target at any arbitrary location along the projection line. This is because the visual system’s ocular motor signals from vergence (pointing the visual axis of each eye onto the target) and accommodation (focusing the eyes onto the target) cannot reliably provide external depth information on a target located beyond 2 to 3 m. At the functional level, we consider the intrinsic bias a default supporting surface on which the target rests when external cues for distance are unavailable; in the lower visual field, it acts like a ground surface that supports the target.

Fig. 1 Proposed ecological basis of the visual system’s intrinsic bias.

(A) A dimly lit target in the dark is perceived at the intersection between the projection line from the eye and the intrinsic bias (curve). (B) The ground surface representation is approximated as a slant (dashed curve) when the floor is weakly delineated by texture elements, due to the integration of the intrinsic bias and external depth (texture) information. The ground representation is less slanted than that of the intrinsic bias, leading to the target location being more accurately judged. (C) An observer encounters multiple objects at various locations along a projection line as he or she interacts with the environment. Over time, one location (represented by the white circle) emerges as having the peak probability of encountering the most objects. (D) We propose that the visual system adopts the peak probability locations from all viewing directions to define its intrinsic bias. This provides the visual system with the “best guess” of where objects are located (that is, the intrinsic bias) when external visual information is impoverished. (E) Our ecological existence causes the intrinsic bias to skew toward the ground surface. (F) Prediction 1: There is an asymmetry in the shapes of the intrinsic biases in the upper and lower visual fields (yellow curves). (G) Prediction 2: A taller observer perceives a target with the same angular declination as farther than does a shorter observer. Note that the two targets in the figure have the same angular declination.

The intrinsic bias also affects space perception in the reduced-cue environment, which carries some depth cues but not the wide range of rich cues that are available in the full-cue viewing environment. Figure 1B illustrates one example in which the horizontal ground in the dark is only delineated by several dimly lit elements to specify a texture gradient. In this reduced-cue condition, the observer perceives a target on the ground as if located on an imaginary representation of the ground surface, leading to distance being underestimated (Fig. 1B) (7). Notably, the ground surface representation in the reduced-cue condition (Fig. 1B) is less slanted than that of the intrinsic bias in the dark (Fig. 1A). This difference causes less underestimation of distance under the reduced-cue condition. It indicates that the contribution of the intrinsic bias to space perception decreases when some external depth cues are available, as in the reduced-cue condition. It has been shown that the contribution of the intrinsic bias to space perception decreases as the amount of external depth information in the environment increases, and its contribution becomes much smaller in the full-cue environment (4–7).

Despite the importance of the intrinsic bias in defining our perceptual space, it is unclear why the intrinsic bias adopts its particular configuration. One hypothesis is that the intrinsic bias is derived from past experiences of the statistical spatial relationship between the observer and natural scenes, especially those on the ground surface (4, 5, 8, 9). Let us consider the statistical spatial relationship between the observer’s eyes and all possible objects along a projection line (Fig. 1C). Clearly, over one’s lifetime, an observer existing in his or her natural habitat will have encountered a variety of objects (red shapes) located at various distances (nearer or farther) along the same projection line. From these accumulated experiences, the visual system acquires a probability distribution function that specifies the probability of an object occupying each location along the projection line, with the peak of the probability function being the location an object (gray in Fig. 1C) is most likely to occupy. Thus, by acquiring the peak probability for each projection line, the visual system can define a profile of peak probabilities (Fig. 1D, gray curve). This profile represents the visual system’s best guess or internal spatial knowledge (that is, the intrinsic bias) of the location of a target if the target’s true location cannot be ascertained because of the lack of external depth information. Thus, by adopting this strategy in a reduced-cue environment, such as in the dark where there are no reliable visual cues, the visual system represents a dimly lit target on the ground at the location of its intrinsic bias, as shown in Fig. 1A. Although the intrinsic bias locates the target on the basis of a probability assumption, which does not always identify the perceived target at the physical target location, it reveals a visual strategy of making the best guess of the target’s location when external cues to depth are sparse.
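
To make this statistical account concrete, the following minimal simulation sketch (our illustration, not the authors’ model; the eye height, declination, and encounter distributions are all assumed) accumulates hypothetical object encounters along one downward projection line, histograms their distances, and reads off the peak-probability location:

```python
import numpy as np

rng = np.random.default_rng(0)

def peak_probability_location(samples, bin_width=0.25):
    """Mode of the encountered object distances along one projection line."""
    bins = np.arange(0.0, samples.max() + bin_width, bin_width)
    counts, edges = np.histogram(samples, bins=bins)
    i = np.argmax(counts)
    return 0.5 * (edges[i] + edges[i + 1])  # center of the densest bin

eye_height = 1.6                    # assumed adult eye height (m)
declination = np.deg2rad(15.0)      # one projection line below eye level
ground_distance = eye_height / np.tan(declination)  # line-ground intersection

# Hypothetical lifetime encounters: ground-resting objects cluster near the
# intersection of the projection line with the ground; a sparser set of
# off-ground objects occupies assorted nearer distances.
on_ground = rng.normal(loc=ground_distance, scale=0.5, size=800)
off_ground = rng.uniform(0.5, ground_distance, size=200)
samples = np.clip(np.concatenate([on_ground, off_ground]), 0.1, None)

print(f"line-ground intersection:  {ground_distance:.2f} m")
print(f"peak-probability location: {peak_probability_location(samples):.2f} m")
```

Because ground-resting objects dominate the encounters, the recovered peak lies near the line-ground intersection; repeating the procedure over all viewing directions would trace out a profile like the gray curve in Fig. 1D.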

We can further hypothesize that the intrinsic bias is affected by the spatial relationship between the observer and his or her ecological environment. A principal ecological environment for humans, being terrestrial, is the ground surface, where the objects we most frequently interact with are located. For example, the increased frequency of encountering objects on or closer to the ground effectively biases the peak probability function along a projection line toward the ground. Consequently, the shape of the intrinsic bias will be skewed toward the far distance, as shown in Fig. 1E. Important empirical support for this curve-like characteristic was provided by Yang and Purves (9), who derived a profile of peak probability locations from statistical analysis of natural scenes. The hypothesis for the shape of the intrinsic bias outlined above makes two empirically testable predictions.

Prediction 1

If the ecological environment is important, as the ground surface is for humans, then the hypothesis predicts a less accurate intrinsic bias in the upper visual field, because we are less spatially attuned to surfaces or objects located above our heads (for example, ceilings, skies, and treetops), with which we infrequently interact. We therefore predicted that the lack of a supportive background surface in the upper visual field, in contrast to the ground surface below, leads to the shape of the intrinsic bias in the upper visual field differing from that in the lower visual field. As shown in Fig. 1F, instead of being symmetrical about the eye level (gray curve), the intrinsic bias in the upper visual field is shifted downward (yellow curve). Presumably, this is due to our frequent experiences of rooms with low ceilings, where the distance between the eyes and the ceiling is shorter than the distance between the eyes and the floor. Specifically, the ceiling height of a typical room is 2.4 or 2.7 m, so an upright average adult’s eyes are farther from the ground than from the ceiling (for example, under a 2.4-m ceiling, eyes 1.6 m above the floor are only 0.8 m from the ceiling). Consequently, we can predict that the asymmetry should be more obvious in a taller observer than in a shorter observer.

Prediction 2

The observer’s eye height also affects the shape of the intrinsic bias. Consider the two observers with different eye heights in Fig. 1G. The two projection lines (green and blue dashed lines) are parallel to each other (that is, they have the same angular declination). Recall that according to our hypothesis, the location of the peak probability function that defines the intrinsic bias is shifted toward the ground surface. Therefore, because the ground surface is farther away from the eyes of the taller observer, his or her peak probability function (green circle) is more distant from the eyes in contrast to that of the shorter observer (blue circle). Consequently, if the peak probability location (intrinsic bias) is used as the default location of a dimly lit target in the dark, we predict that a target with the same angular declination is perceived as farther by the taller observer than by the shorter observer. It thus follows that for a target located at the same physical distance, and hence having two different angular declinations for the taller and shorter observers, the taller observer will perceive it as farther than the shorter observer will.
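
The geometry behind this prediction can be illustrated with a limiting case in which the ground-skewed peak is placed exactly where the projection line meets the ground (a sketch of ours; the 12° declination is arbitrary, and the eye heights are the group means from Materials and Methods):

```python
import numpy as np

def default_target_distance(eye_height_m, declination_deg):
    """Limiting case of the ground-skewed prior: the default target location
    is where the projection line intersects the ground plane, so a higher
    eye projects the same declination onto a farther ground point."""
    return eye_height_m / np.tan(np.deg2rad(declination_deg))

for label, h in [("shorter (1.49-m eye height)", 1.49),
                 ("taller  (1.73-m eye height)", 1.73)]:
    print(f"{label}: {default_target_distance(h, 12.0):.2f} m")
# shorter: ~7.01 m; taller: ~8.14 m for the same 12-deg declination
```

The actual intrinsic bias peaks nearer than the true ground intersection, but the ordering by eye height is the same.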

We evaluated these two predictions by testing space perception in three different environments: (i) dark condition (where the room was totally darkened and only the dimly lit target was visible), (ii) reduced-cue condition (where the ground or ceiling was sparsely defined by a texture gradient in an otherwise dark room), and (iii) full-cue condition (where the environment was fully lighted and rich in depth cues).

RESULTS

We conducted three experiments and obtained results that are consistent with the two predictions (that is, the taller observers showed a larger vertical asymmetry in their intrinsic biases, and their intrinsic biases extended farther from the eyes than those of the shorter observers).

Experiment 1: Dark environment

Observers judged the location of a dimly lit target in the dark using the blind walking-gesturing task (4, 10–13). During the task, the observer judged the location of a briefly displayed dimly lit target. He or she then responded by walking in the dark to traverse the remembered judged distance. Upon arriving at the remembered judged location, he or she used either a 1-m rod held in the right hand or the right hand alone, depending on whether the estimated height was within reach, to indicate the remembered height of the judged target. [Although a previous report showed that holding a rod can affect the reported distance (14), this is unlikely to significantly affect our comparison of the judgments made by both groups of observers because both groups performed the same task in the upper and lower fields.] Together, the walked distance and gestured height are taken as the judged target location. The various physical target locations are depicted with plus symbols in Fig. 2A, whereas the average results are depicted with filled symbols (blue symbols for the shorter group and green symbols for the taller group). Clearly, the data deviate from the physical target locations and define four different solid curved lines, which together estimate the average intrinsic bias curves of the taller and shorter groups in the lower and upper fields. Confirming the first prediction (Fig. 1F), the taller group (solid green curves) showed a large vertical asymmetry, with the upper field’s intrinsic bias being lower than the theoretical symmetry (green dashed curve). This is verified by an analysis of variance (ANOVA) on the walked distances [visual field: F(1, 11) = 136.47, P < 0.001; distance: F(2.14, 23.58) = 234.72, P < 0.001; interaction: F(4, 44) = 8.52, P < 0.001] and on the gestured heights [visual field: F(1, 11) = 7.29, P < 0.05; distance: F(4, 44) = 198.26, P < 0.001; interaction: F(4, 44) = 26.44, P < 0.001]. The shorter group (solid blue curves) showed a smaller vertical asymmetry, with the upper field’s intrinsic bias almost overlapping the theoretical symmetry (blue dashed curve). We performed a similar ANOVA on the walked distances [visual field: F(1, 11) = 10.97, P < 0.01; distance: F(1.35, 14.83) = 126.76, P < 0.001; interaction: F(4, 44) = 4.43, P < 0.005] and on the gestured heights [visual field: F(1, 11) = 2.30, P = 0.158; distance: F(4, 44) = 147.40, P < 0.001; interaction: F(1.96, 21.53) = 13.57, P < 0.001]. Further, in the lower field, the taller group judged the dimly lit targets as more distant than did the shorter group, indicating that the intrinsic bias curves of the taller group are shifted farther to the right of those of the shorter group [distance: F(2.41, 52.93) = 283.10, P < 0.001; group: F(1, 22) = 14.02, P = 0.001; interaction: F(4, 88) = 7.50, P < 0.001]; the taller group’s gestured heights were significantly lower than those of the shorter group at the farther distances [distance: F(2.28, 50.25) = 204.10, P < 0.001; group: F(1, 22) = 2.80, P = 0.109; interaction: F(4, 88) = 12.47, P < 0.001].
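
For readers wishing to reproduce this style of analysis, below is a minimal sketch of a two-way repeated-measures ANOVA (visual field × target distance) on walked distances. The data are synthetic because the raw data are not published, and note that statsmodels’ AnovaRM applies no sphericity correction, whereas the fractional degrees of freedom reported above [for example, F(2.14, 23.58)] indicate that the authors used one (for example, Greenhouse-Geisser):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
observers = [f"s{i:02d}" for i in range(1, 13)]  # 12 observers per group
distances = [1.5, 3.25, 4.5, 5.75, 7.0]          # physical target distances (m)

rows = []
for s in observers:
    subject_offset = rng.normal(0, 0.2)          # per-observer idiosyncrasy
    for field in ["lower", "upper"]:
        for d in distances:
            # Toy generative rule (illustrative numbers only): walked distance
            # compresses with distance, more strongly in the upper field.
            gain = 0.75 if field == "lower" else 0.60
            walked = gain * d + 0.5 + subject_offset + rng.normal(0, 0.15)
            rows.append((s, field, d, walked))

df = pd.DataFrame(rows, columns=["observer", "field", "distance", "walked"])

# One observation per observer x field x distance cell, as AnovaRM requires.
res = AnovaRM(df, depvar="walked", subject="observer",
              within=["field", "distance"]).fit()
print(res.anova_table)  # F, num/den df, and P for main effects + interaction
```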

Fig. 2 Results of experiment 1 in the dark environment.

(A) The plus symbols depict the physical target locations. The triangles and circles plot the average judged locations of the taller (green) and shorter (blue) observers. The solid curves represent the intrinsic biases of the taller and shorter observers in the upper and lower visual fields. The dashed curves depict the predicted locations of the intrinsic biases of the taller and shorter groups in the upper visual field, had they been symmetrical to those in the lower visual field. (B) Plotting the average judged eye-to-target distance as a function of the physical angular declination/elevation reveals that judged distance was longer for the taller observers. (C) Plotting the average judged angular declination/elevation as a function of the physical angular declination/elevation reveals that judged direction was accurate.

A similar trend is observed for the targets at the eye level (the average results are depicted as filled circles). For the 1.5-m target at the eye level, both the shorter (1.98 ± 0.08 m) and taller (2.30 ± 0.25 m) groups overestimated the distance. This overestimation tendency is in agreement with previous observations that the visual system overestimates the distance of near targets (<3 to 4 m) in a dark environment (2–4). The taller group judged the target as farther than did the shorter group; however, the difference failed to reach significance [t(22) = 1.28, P = 0.215]. For the 7.0-m target, both groups underestimated the distance, again confirming previous findings that the visual system foreshortens target distance beyond 3 to 4 m in the dark. The taller group also judged the target as farther, although the difference is not statistically significant [shorter, 4.26 ± 0.19 m; taller, 4.79 ± 0.26 m; t(22) = 1.71, P = 0.101]. Note that although the values are not significantly different, both groups judged the target locations as slightly above the physical eye level (dashed horizontal line) [at 1.5 m, 1.32° ± 0.31° for the taller group and 1.86° ± 0.42° for the shorter group; at 7 m, 1.13° ± 0.24° for the taller group and 1.28° ± 0.19° for the shorter group; P > 0.05]. This suggests that the eye level or horizon representation, which can affect distance perception in dark and reduced-cue environments (4, 15, 16), was similar in the two groups.

The second prediction, depicted in Fig. 1G, indicates that for a target with the same physical angular declination, the taller group will perceive it as farther than will the shorter group. It follows that when the two groups are tested with a target at the same physical distance, as in our experiment, so that the target subtends two different angular declinations for the taller and shorter groups, the taller group would judge the target as farther than the shorter group would. In Fig. 2B, we plotted the mean judged eye-to-target distance as a function of physical angular declination. The taller group (green symbols) showed a strong vertical asymmetry, with judged eye-to-target distances being longer in the lower field [visual field: F(1, 11) = 137.93, P < 0.001; angular declination/elevation: F(4, 44) = 184.10, P < 0.001; interaction: F(4, 44) = 6.45, P < 0.001]. The shorter group (blue symbols) showed a moderate vertical asymmetry [visual field: F(1, 11) = 10.29, P < 0.01; angular declination: F(4, 44) = 101.17, P < 0.001; interaction: F(4, 44) = 5.26, P < 0.005]. A comparison between the taller and shorter groups reveals that the taller group judged distances as longer than did the shorter group, confirming our second prediction (as depicted in Fig. 1G).

Figure 2C shows the relationship between the physical angular declination and the mean judged angular declination of the target, which is derived using the equation α = arctan[(H − h)/x], where H is the eye height, and x and h are the walked horizontal distance and gestured target height, respectively. It is clear that all data points cluster along the diagonal line. The slopes of the regression lines are close to unity for both the shorter group (lower field: y = 0.983x − 0.709, R2 = 0.997; upper field: y = 1.028x + 0.660, R2 = 0.999) and the taller group (lower field: y = 1.080x − 3.148, R2 = 0.996; upper field: y = 1.079x − 0.375, R2 = 0.999). This indicates accurate perception of angular declination despite the intrinsic bias differences between the two groups (4, 5). Furthermore, the slopes of the regression lines in the upper and lower visual fields are similar. This is confirmed by ANOVA, which fails to reveal a significant interaction effect for both the shorter group [visual field: F(1, 11) = 17.54, P < 0.005; angular declination/elevation: F(1.88, 6.69) = 647.87, P < 0.001; interaction: F(4, 44) = 2.51, P = 0.055] and the taller group [visual field: F(1, 11) = 10.89, P < 0.01; angular declination/elevation: F(1.30, 14.25) = 568.11, P < 0.001; interaction: F(2.16, 23.72) = 2.15, P = 0.136].
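
A small sketch of this computation (ours; the response values and regression inputs below are illustrative, not the paper’s data) recovers the judged angular declination from a walked distance and a gestured height, and fits the judged-versus-physical regression whose near-unity slope indicates accurate perceived direction:

```python
import numpy as np

def judged_declination_deg(eye_height_m, walked_m, gestured_height_m):
    """alpha = arctan[(H - h) / x]; positive below eye level, negative above."""
    return np.degrees(np.arctan2(eye_height_m - gestured_height_m, walked_m))

# Worked example with assumed values for one lower-field response.
H, x, h = 1.73, 4.8, 0.65  # eye height, walked distance, gestured height (m)
print(f"judged declination: {judged_declination_deg(H, x, h):.1f} deg")  # ~12.7

# Regression of judged on physical declination (illustrative values).
physical = np.array([8.0, 12.0, 16.0, 22.0, 30.0])
judged = np.array([7.4, 11.9, 15.7, 21.5, 29.6])
slope, intercept = np.polyfit(physical, judged, 1)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f} deg")
```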

Experiment 2: Reduced-cue environment

The experimental setup was similar to that in the dark environment except that the ground and ceiling surfaces were now delineated with a 2 × 3 array of dimly lit texture elements (see side view in Fig. 3A and top view in Fig. 3B). The filled triangles in Fig. 3C (blue for the shorter and green for the taller group) show that the mean judged locations in both the upper and lower fields deviated from the physical target locations plotted using plus symbols. The judged locations are also fitted by curved lines, which reflect the visual representations of the textured surfaces of the taller and shorter observers in the upper and lower fields. Clearly, in comparison to the dark condition (Fig. 2A), the judged target locations in the current reduced-cue environment were closer to the physical target locations (Fig. 3C), indicating that the introduction of texture information led to more accurate distance perception. We previously showed that the deviation of the represented texture surface from the horizontal planar surface in the reduced-cue environment is due to the influence of the intrinsic bias (7). Confirming our first prediction (Fig. 1F), neither the shorter nor the taller group’s data (texture surface representations) overlap with the theoretical symmetry (dashed blue and green curves), revealing a vertical asymmetry. This is confirmed by ANOVA of the gestured heights of the taller group [visual field: F(1, 11) = 28.54, P < 0.001; distance: F(1.73, 19.04) = 63.53, P < 0.001; interaction: F(2.31, 25.44) = 3.86, P < 0.05] and of the shorter group [visual field: F(1, 11) = 0.004, P = 0.949; distance: F(1.61, 17.75) = 40.58, P < 0.001; interaction: F(4, 44) = 3.87, P < 0.01]. The curves also show that the represented texture surfaces are farther from the observer in the lower than in the upper visual field. This is confirmed by ANOVA of the walked distances of the taller group [visual field: F(1, 11) = 137.06, P < 0.001; distance: F(2.14, 23.55) = 562.05, P < 0.001; interaction: F(4, 44) = 26.92, P < 0.001] and of the shorter group [visual field: F(1, 11) = 37.32, P < 0.001; distance: F(4, 44) = 457.11, P < 0.001; interaction: F(4, 44) = 15.42, P < 0.001]. A visual field asymmetry in judged location also exists for the targets located at the eye level (open triangles). Targets were judged as farther when the texture elements were seen in the lower (inverted open triangles) rather than the upper visual field (upright open triangles) for both the taller group [at 4.5 m, t(11) = 5.595, P < 0.001; at 7.0 m, t(11) = 13.817, P < 0.001] and the shorter group [at 4.5 m, t(11) = 6.363, P < 0.001; at 7.0 m, t(11) = 9.716, P < 0.001]. Also confirming the second prediction (Fig. 1G), the taller group judged targets at the same distances in the lower field as farther than did the shorter group [distance: F(2.72, 59.89) = 895.57, P < 0.001; group: F(1, 22) = 12.51, P < 0.005; interaction: F(4, 88) = 3.83, P < 0.01]. Additionally, consistent with a farther judged location, the taller group gestured target heights lower than did the shorter group [distance: F(1.94, 42.73) = 57.61, P < 0.001; group: F(1, 22) = 0.66, P = 0.43; interaction: F(4, 88) = 3.33, P < 0.05]. A similar trend was observed in the upper visual field. This can be better appreciated by plotting our data on the basis of eye-to-target distance and angular declination (Fig. 3, D and E) because the test targets in the upper field were located at different physical heights for the two groups of observers.

Fig. 3 Results of experiment 2 in the reduced-cue environment.

(A) Side view of the dimly lit texture elements that delineated the floor and ceiling surfaces. (B) Top view of the same texture display with the test target locations added (green plus symbols). (C) The plus symbols depict the physical target locations. The triangles plot the average judged locations of the taller (green) and shorter (blue) observers. The solid curves represent the intrinsic biases of the taller and shorter observers in the upper and lower visual fields. The dashed curves depict the predicted locations of the intrinsic biases of the taller and shorter groups in the upper visual field, had they been symmetrical to those in the lower visual field. (D) Plotting the average judged eye-to-target distance as a function of the physical angular declination/elevation reveals that judged distance was longer for the taller observers. (E) Plotting the average judged angular declination/elevation as a function of the physical angular declination/elevation reveals that judged direction was accurate.

We plotted the mean judged eye-to-target distances as a function of the physical angular declination in Fig. 3D. Both groups judged the eye-to-target distances as longer in the lower than in the upper field [for the shorter group: visual field: F(1, 11) = 35.06, P < 0.001; angular declination/elevation: F(4, 44) = 412.80, P < 0.001; interaction: F(4, 44) = 15.41, P < 0.001; for the taller group: visual field: F(1, 11) = 130.63, P < 0.001; angular declination/elevation: F(4, 44) = 481.88, P < 0.001; interaction: F(4, 44) = 25.84, P < 0.001]. It is also clear that overall, the taller group judged distances as longer than did the shorter group for the same angular declination or elevation (second prediction, Fig. 1G).

As shown in Fig. 3E, the mean judged target angular declination as a function of the physical angular declination largely clusters along the diagonal line for both the shorter group (lower field: y = 0.904x + 0.404, R2 = 0.996; upper field: y = 1.019x + 1.294, R2 = 0.999) and the taller group (lower field: y = 0.980x − 1.111, R2 = 0.996; upper field: y = 1.055x − 2.121, R2 = 0.999). The slopes are slightly steeper in the upper field. ANOVA reveals a significant interaction effect in the shorter group [visual field: F(1, 11) = 25.12, P < 0.001; angular declination/elevation: F(1.58, 17.33) = 840.41, P < 0.001; interaction: F(1.97, 21.66) = 4.85, P < 0.05], but not in the taller group [visual field: F(1, 11) = 0.21, P = 0.657; angular declination/elevation: F(1.74, 19.12) = 752.19, P < 0.001; interaction: F(1.34, 14.68) = 3.60, P = 0.068].

Experiment 3: Full-cue environment

Unlike the dark and reduced-cue environments, the full-cue environment is rich in depth cues that can be used for accurate egocentric distance judgment, which reduces the need for the visual system to depend on its internal spatial knowledge (intrinsic bias). However, the intrinsic bias still influences distance judgment when the depth cues are not fully used. This occurs when performing the exocentric task of judging successive intervals (Fig. 4A), in which distance intervals are underestimated (10, 11, 17–19). To show that the differential shapes of the intrinsic biases of taller and shorter people also affect their exocentric distance judgments, we measured their performance when naturally standing on the ground (Fig. 4A) and when their eye heights were compensated for by having the shorter group stand on a box (Fig. 4B) and the taller group sit on a chair (Fig. 4C). This manipulation of the observer’s physical eye height is also related to our second prediction (Fig. 1G), which states that when the target viewed by the taller or shorter observer has the same angular declination, the taller observer will judge the target as farther than will the shorter observer. Here, we achieved equal angular declination for a target viewed by the taller and shorter observers by appropriately adjusting their physical eye heights. On the basis of 13 successive exocentric distance interval judgments for each height condition, we obtained a function of the inferred distance (Fig. 4D). This function reveals the observer’s exocentric distance (interval) perception over a large distance range. It does not necessarily reflect the mechanisms underlying egocentric distance perception (10, 11, 19). As shown in Fig. 4D, with natural eye height (Fig. 4A), the shorter group (blue circles) underestimated inferred distance more than the taller group (green circles) [eye height: F(1, 22) = 22.39, P < 0.001; distance: F(11, 242) = 95.74, P < 0.001; interaction: F(11, 242) = 15.41, P < 0.001]. When the shorter group stood on a 30-cm-high box (Fig. 4B) so that their eye height above the ground became similar to the taller group’s natural eye height (Fig. 4A), the shorter group’s inferred distances (blue triangles) remained less than those of the taller group, who stood on the ground (green circles) [eye height: F(1, 22) = 5.48, P < 0.05; distance: F(11, 242) = 92.44, P < 0.001; interaction: F(11, 242) = 5.31, P < 0.001]. Conversely, when the taller group sat on a chair to lower their average eye height (Fig. 4C) to approximately the shorter group’s natural eye height (Fig. 4A), the taller group still performed better than the shorter group (green triangles above blue circles) [eye height: F(1, 22) = 4.76, P < 0.05; distance: F(11, 242) = 116.80, P < 0.001; interaction: F(11, 242) = 2.10, P < 0.025]. Thus, even when the eye heights above the ground are similar (that is, for the same angular declination of the target), the shorter group made a larger underestimation error than did the taller group. This confirms our prediction that a difference in internal knowledge due to eye height [that is, the shape of the intrinsic bias (Fig. 1F)], in addition to a difference in external stimulation, causes the observed difference in exocentric distance judgment between the two groups. Previous studies showed that changing the observer’s physical eye height relative to the visible ground affects judged size and depth (18–24). Our study extends these findings by revealing that in addition to the physical eye height, the observer’s intrinsic bias contributes to perceived depth. Although we only tested two physical eye heights for each group of observers, it would be interesting for future studies to test more eye heights over a larger range to obtain a quantitative relationship among perceived depth, physical eye height, and intrinsic bias.

Fig. 4 Results of experiment 3 in the full-cue environment.

(A to C) Procedures for testing an observer on the Gilinsky successive equal-appearing intervals task while (A) standing on the ground, (B) standing on a box to raise the eye height by 30 cm, and (C) sitting on a chair to lower the eye height by 30 cm. (D) Plotting the average inferred distance as a function of the physical distance reveals that inferred distance was more accurate (closer to the equidistant dashed line) for the taller observers, even when they sat on a chair.

DISCUSSION

To reveal why the intrinsic bias, which reflects the visual system’s internal spatial knowledge, adopts its particular shape and position, we measured the differential influence of the intrinsic biases of taller and shorter observers on distance judgments. Tests were conducted in dark, reduced-cue, and full-cue environments. Overall, we found that the intrinsic bias is (i) vertically asymmetrical, with the bias in the lower field being farther away from the eyes, and (ii) affected by the observer’s eye height, with a taller observer’s intrinsic bias lying farther from his or her eyes. These characteristics, also evident in the reduced-cue and full-cue environments, result in the taller observers having an advantage (that is, better accuracy) over the shorter observers in judging distance in the intermediate distance range. Thus, our study confirms the hypothesis that the visual system’s intrinsic bias is obtained from past experiences of the statistical spatial relationship between the observer and the ground surface (4, 5, 9).

A comparison of the results from the test environments (Figs. 2 to 4) in our study reveals that the differences between the taller and shorter groups’ data are largest in the dark environment, followed by those in the reduced-cue and full-cue environments. This trend is consistent with the fact that the contribution of the intrinsic bias to space perception is largest in the dark, where there is no external depth information on the ground. Its impact on space perception is reduced when more external depth cues become available, as in the full-cue environment (experiment 3).

The current findings invite further investigation of the spatial relationship between the observer and the ground surface and of its role in determining the observer’s intrinsic bias. One possibility, depicted in Fig. 1E, is the frequency of encountering objects on or closer to the ground surface in one’s natural environment. A second possibility is the optical slant at the viewed location on the ground (that is, the angle formed between the line of sight to the viewed location and the ground surface) (1, 5, 18, 19, 22). When the optical slant is too small, the visual system cannot spatially resolve the texture gradient on the ground surface and thus fails to reliably represent the target location. The optical slant decreases as the distance of the viewed location on the ground increases, which means that a minimally resolvable optical slant sets the limit on the extent of ground surface that the visual system can reliably represent. Because taller observers have relatively larger optical slant angles on the ground than shorter observers, the former can better represent the ground at a farther distance and thus have more accurate distance perception. This could lead to the taller observers having intrinsic biases that extend farther from the eyes than do those of shorter observers.
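
To illustrate the optical slant account, the sketch below (ours; the eye heights are the group means from Materials and Methods, and the distances are illustrative) computes the angle between the line of sight to a ground location and the ground surface, arctan(eye height / horizontal distance):

```python
import numpy as np

def optical_slant_deg(eye_height_m, ground_distance_m):
    """Angle between the line of sight to a ground location and the ground."""
    return np.degrees(np.arctan2(eye_height_m, ground_distance_m))

for d in [2.0, 5.0, 10.0, 20.0]:
    shorter = optical_slant_deg(1.49, d)
    taller = optical_slant_deg(1.73, d)
    print(f"{d:5.1f} m: shorter {shorter:5.2f} deg, taller {taller:5.2f} deg")
```

At every distance the taller eye height yields a larger optical slant, so whatever minimum slant the visual system needs to resolve the texture gradient is reached at a farther distance for taller observers.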

Our findings provide support for the ecological approach to understanding visual perception that emphasizes the biological significance of the ground surface in space perception (1, 4, 9, 25–29). We live in an environment where the ground surface is a constant presence. We stand and walk on the ground, and most objects and animals we interact with also rely on the ground. Thus, using a ground-based reference frame to guide and direct actions can improve the visual system’s efficiency in coding locations. We previously proposed that with the ground-based two-dimensional (2D) coordinate coding system, representations of objects on the ground and of our body (feet) become 2D instead of 3D (30). This coding scheme requires fewer computational resources than coding with a 3D Cartesian coordinate system. Furthermore, with the 2D ground surface reference frame, the visual system can rely on the accurately coded angular declination of the target and on the observer’s eye height to determine the target’s location (3–5, 15, 31–33). The visual system can also obtain an accurate eye height by using the near-depth information on the ground. Even when the visual information is poor or absent, the visual system can rely on stored eye height knowledge, which is relatively constant, particularly for adults (1, 15, 31). In addition, when visual cues specifying the geographical slant of the ground surface are less than optimal, the visual system can represent the ground surface with aid from its internal spatial knowledge, namely, the intrinsic bias.
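
As a concrete illustration of the 2D coding idea, the sketch below (ours, not from the paper; the function names and the 1.6-m eye height are assumptions) encodes a ground target by just two angles, azimuth and angular declination, and recovers its location from those angles plus the stored eye height:

```python
import numpy as np

def encode_ground_target(eye_height, x, y):
    """2D ground-based code for a target on the floor at horizontal offsets
    (x, y) from the feet: azimuth plus angular declination below eye level."""
    ground_dist = np.hypot(x, y)
    azimuth = np.degrees(np.arctan2(y, x))
    declination = np.degrees(np.arctan2(eye_height, ground_dist))
    return azimuth, declination

def decode_ground_target(eye_height, azimuth, declination):
    """Recover the ground location from the two-angle code plus the stored
    eye height -- no third coordinate is needed for objects on the ground."""
    ground_dist = eye_height / np.tan(np.deg2rad(declination))
    a = np.deg2rad(azimuth)
    return ground_dist * np.cos(a), ground_dist * np.sin(a)

eye_h = 1.6                                   # assumed adult eye height (m)
code = encode_ground_target(eye_h, 4.0, 1.0)  # target 4 m ahead, 1 m aside
print(code)                                   # two angles suffice...
print(decode_ground_target(eye_h, *code))     # ...to recover (4.0, 1.0)
```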

MATERIALS AND METHODS

Observers

Twenty-four naïve observers, 12 in the shorter group (6 females and 6 males) and 12 in the taller group (6 females and 6 males), participated in all three experiments after giving informed consent. They were undergraduate or graduate students at the East China Normal University (ECNU) who responded to subject recruitment flyers posted on campus. All had normal, or corrected-to-normal, visual acuity (at least 20/20) and a stereoscopic resolution of 20 arc sec or better. The average eye heights of the shorter and taller groups were 149.3 ± 1.2 cm with a range of 143 to 158 cm (females, 147.5 ± 1.3 cm; males, 151.0 ± 2.1 cm) and 173.4 ± 1.1 cm with a range of 169 to 179 cm (females, 170.0 ± 0.4 cm; males, 176.8 ± 0.7 cm), respectively. The mean age of the taller group was 21.1 ± 0.7 years (females, 20.3 ± 0.5 years; males, 21.8 ± 1.4 years), and the mean age of the shorter group was 24.3 ± 1.0 years (females, 22.2 ± 1.2 years; males, 26.5 ± 0.9 years). All observers were compensated monetarily for their time. One male observer suffered a sports-related injury (unrelated to the study) after the first test session and withdrew from the study; another participant was recruited as a replacement. All experiments were performed following the Institutional Review Board guidelines.

Test room and stimuli (experiments 1 and 2)

All experiments were run in a dark room (8 m × 13 m) in the Institute of Cognitive Neuroscience Building at ECNU. The layout and dimensions of the room were unknown to the observers. Its ceiling (3.1 m) and walls were painted black, and the floor had black carpeting. A 15-m-long rope (0.8 m above the floor), tied across the two ends of the room, was used to guide the observer during blind walking. Music was played aloud during testing to prevent acoustic cues from revealing the target location. The test target was constructed from a ping-pong ball that was internally illuminated by a green light-emitting diode (LED) (0.16 cd/m2) and controlled by a computer. An iris-diaphragm aperture on the front of the ping-pong ball kept its visual angle at 0.20° when measured at the eye level.

Two experimental tasks

Blind walking-gesturing task for experiments 1 and 2. To report the judged target location, the observer walked blindly to the remembered target distance and then gestured the remembered target height, using either his or her right hand alone or, if the remembered height was beyond reach, a 1-m rod held in the right hand (13, 33).

Gilinsky’s successive interval task for experiment 3. The observer set successive intervals equal to a perceived 1 m by verbally instructing the experimenter to adjust the separation between two targets (17, 18). The observers were not given any feedback regarding their performance during the experiments.

Experiment 1: Stimuli and procedures

The suspended test target was located at one of 12 locations specified by a combination of distances (1.5, 3.25, 4.5, 5.75, and 7.0 m) and heights [lower visual field, 0.65 m above the floor; at the eye level (1.5 and 7.0 m only); upper visual field, 2 × observer’s eye height − 0.65 m]. For each observer, the heights in the upper and lower visual fields were symmetrical about the observer’s eye level (plus symbols in Fig. 2). Each target location was measured four times. Six other locations were used for the catch trials [(3.25 m, 0.35 m), (5.75 m, 0.95 m), (7.0 m, 0.95 m), (3.25 m, 2 × observer’s eye height − 0.95 m), (5.75 m, 2 × observer’s eye height − 0.95 m), and (7.0 m, 2 × observer’s eye height − 0.95 m)].

To begin the experiment, the observer was blindfolded and brought to a waiting area housed within the test room, where he or she removed the blindfold, sat on a chair facing away from the test area, and was informed whether the target would be below, above, or at the eye level. The observer then waited for a computer-generated tone to cue him or her to turn off the room’s light and walk to the starting point (observation position) with the aid of the guidance rope. To begin the trial, he or she stood upright and called out “ready.” After a 5-s delay, the test target, flickering at 5 Hz, was turned on for 2 s for the observer to judge its location. The observer shifted his or her gaze to the target and estimated its location (slight vertical head rotations were allowed). He or she then put on the blindfold and verbally indicated readiness to walk. The experimenter immediately removed the target and shook the guidance rope to indicate that the course was clear for walking. The observer walked to the remembered target location while sliding his or her left hand along the guidance rope. Upon arriving at the remembered target location, he or she indicated the remembered target height with the 1-m rod and called out “done.” The experimenter marked the location of the observer’s feet, measured the height indicated by the tip of the rod, and told the observer to turn around and walk back to the waiting area. [We used the tip of the 1-m rod to indicate the judged height because a target above the eye level could be perceived at a location beyond the observer’s reach by hand. If the perceived target height was reachable by the right hand (for example, around the eye height), the observer could gesture the target location with the right hand alone or with the rod. Separately, it has been shown that the expected reach when wielding a tool or rod differs from the expected reach without a tool (14). Nevertheless, the use of the rod in our study was unlikely to significantly affect our conclusions, which are based on comparisons between the two groups and between the upper and lower fields for the same observer.] Upon reaching the waiting area, he or she switched on the lamp, removed the blindfold, and sat down to wait for the next trial. Meanwhile, the experimenter prepared for the next trial. A total of 60 trials (12 test locations × 4 repeats and 6 catch trials × 2 repeats) were run in two test sessions over 2 days. The order of stimulus presentation was randomized, with the second session using the reversed randomization sequence of the first. In each session, five practice trials were given before formal data collection.

Experiment 2: Stimuli and procedures

The texture background was constructed from six ping-pong balls internally illuminated by red LEDs (0.08 cd/m2). Each ball was housed inside a small box with a 2.5-cm-diameter circular opening. The balls were arranged in a 2 × 3 formation (Fig. 3A) at 1.5, 3, and 4.5 m from the observation point. The texture background was placed 0.65 m above the floor in the lower visual field and at 2 × eye height − 0.65 m in the upper visual field, in the same plane as the test target. The target locations were the same as in experiment 1, except that the viewing distance of the eye-level target was now 4.5 m instead of 1.5 m. During the test, if the target was not at the eye level, only the texture background in the same field was presented simultaneously (flickering at 5 Hz for a 5-s duration). If the target was at the eye level, the texture background in either the upper or the lower field was displayed.

The test procedure was the same as in experiment 1, except for the following modifications. The observer was not informed whether the upcoming target would be in the upper or lower field before each trial. Because the stimulus duration was 5 s, observers had no difficulty locating the target. A total of 68 trials (14 test locations × 4 repeats and 6 catch trials × 2 repeats) were run over 2 days (sessions). The order of stimulus presentation was randomized, with the second session using the reversed randomization sequence of the first. Observers were given five practice trials before each session.

Experiment 3: Stimuli and procedures

Experiment 3 was conducted within the ECNU campus on a large horizontal grass field that provided a full-cue environment. Two red rectangular pieces of cardboard (20 cm × 4 cm × 2 cm) served as the targets. The shorter group was tested in two eye height conditions: (i) stand-on-ground (baseline) and (ii) stand-on-box, where they stood on a 0.3-m-high box. The taller group was also tested in two eye height conditions: (i) stand-on-ground (baseline) and (ii) sit-on-chair, where they sat on a height-adjustable chair that reduced their eye height by 0.3 m. Thus, the average eye heights of the shorter group were 149.3 ± 1.2 cm for the stand-on-ground condition and 179.3 ± 1.2 cm for the stand-on-box condition, and those of the taller group were 173.4 ± 1.1 cm for the stand-on-ground condition and 143.4 ± 1.1 cm for the sit-on-chair condition.

The task of successive equal-appearing intervals, similar to Gilinsky’s study (17), was used to measure the observers’ judgments of distance. At the start of the experiment, the first target was placed 1 m in front of the observer’s midline, with its widest horizontal edge facing the observer. The second target, also with its widest horizontal edge facing the observer, was placed on the observer’s midline and parallel to the first target, at a random distance of 0.5 to 3 m beyond the first target. The observer’s task was to judge whether the interval between the first and second targets was equivalent to 1 m. If not, he or she instructed the experimenter to move the second target nearer to or farther from the first target, in steps expressed as multiples of the target’s horizontal edge, until the interval was perceived to be 1 m. He or she then closed his or her eyes and waited for the experimenter to finish moving the second target, after which he or she took another look at the interval setting. This response routine was repeated until the observer was satisfied with the setting and called out “done.” He or she then closed his or her eyes again to allow the experimenter to measure the set interval and prepare for the next trial. For the next (second) trial, the experimenter created a new interval by removing the first target from its place and placing it at some random distance between 0.5 and 3 m distal to the second target. The observer then began the second trial by responding with the routine above. Gradually, from near to far distance, the experimenter obtained 13 measurements of physical intervals that the observer perceived as equal to 1 m.
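
On one plausible reading of this procedure (our reconstruction; the paper does not spell out the aggregation, and the interval values below are invented for illustration), the inferred distance function of Fig. 4D treats the nth set marker as perceptually 1 + n meters away, while its physical position is the initial 1 m plus the cumulative sum of the measured intervals:

```python
import numpy as np

# Physical lengths (m) of 13 successive intervals, each perceived as 1 m
# (invented values; intervals grow as perceptual compression increases).
intervals = np.array([1.02, 1.04, 1.07, 1.10, 1.14, 1.19, 1.24,
                      1.30, 1.37, 1.45, 1.54, 1.64, 1.75])

physical = 1.0 + np.cumsum(intervals)              # marker positions (m)
inferred = 1.0 + np.arange(1, len(intervals) + 1)  # intended distances (m)

for p, i in zip(physical, inferred):
    print(f"physical {p:6.2f} m -> inferred {i:4.1f} m")
```

Because each interval must be physically longer than 1 m to appear 1 m long, the physical positions run ahead of the intended distances, which places the data below the equidistant line in Fig. 4D, with smaller deviations for the taller group.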

REFERENCES AND NOTES

Acknowledgments: Funding: This study was supported by grants from the NIH (EY023374 and EY023561) to T.L.O. and Z.J.H. Author contributions: L.Z. conducted the experiments, performed data analysis, and wrote the paper. Z.J.H. and T.L.O. provided the theoretical motivations, performed data analysis, and wrote the paper. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper. Additional data related to this paper may be requested from the authors.