Research Article | Neuroscience

Visual number sense in untrained deep neural networks

Gwangsu Kim, Jaeson Jang, Seungdae Baek, Min Song, Se-Bum Paik

Science Advances, 01 Jan 2021:
Vol. 7, no. 1, eabd6127
DOI: 10.1126/sciadv.abd6127
  • Fig. 1 Spontaneous emergence of number selectivity in untrained neural networks.

    (A) Examples of the stimuli used to measure number tuning (21). Set 1 contains dots of the same size. Set 2 contains dots with a constant total area. Set 3 contains items of different geometric shapes with an equal overall convex hull (white dashed pentagon). (B) Top: Architecture of the untrained AlexNet, where the weights in each convolutional layer were randomly initialized from a controlled normal distribution (28). Bottom: Examples of tuning curves of individual number-selective network units observed in the untrained AlexNet. A.U., arbitrary unit. (C) Left: The preferred numerosities (PNs) measured under different stimulus conditions are significantly correlated with each other, implying that the preferred numerosity is consistent across conditions. Right: The average numerical distance between the PNs of each number-selective neuron measured under different stimulus conditions is close to zero. Dashed lines indicate the average. (D) The proportion of number-selective neurons is consistently observed even when the weight variation is substantially reduced from the original random initialization condition (28), suggesting that the emergence of number-selective neurons does not strongly depend on the initialization condition. Black and gray triangles indicate the degree of weight variation for the standard random initializations suggested previously (28, 29).
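
A minimal sketch of the setup described in (B), assuming a PyTorch AlexNet: convolutional weights are drawn from a zero-mean normal distribution scaled by fan-in (a common He-style variance heuristic; the exact constants of refs. 28 and 29 may differ), and unit responses are read out from the convolutional stack. The stimulus tensor here is a random placeholder, not the paper's dot-array sets.

```python
# Sketch: an "untrained" AlexNet with controlled normal initialization.
import torch
import torch.nn as nn
from torchvision.models import alexnet

model = alexnet(weights=None)  # architecture only, no pretrained weights

for m in model.modules():
    if isinstance(m, nn.Conv2d):
        # He-style scaling: std proportional to 1/sqrt(fan_in) (assumed here).
        fan_in = m.in_channels * m.kernel_size[0] * m.kernel_size[1]
        nn.init.normal_(m.weight, mean=0.0, std=(2.0 / fan_in) ** 0.5)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

with torch.no_grad():
    stimulus = torch.rand(1, 3, 224, 224)  # placeholder for a dot-array image
    responses = model.features(stimulus)   # activations through Conv5
```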

  • Fig. 2 Tuning properties of number-selective neurons in the untrained network.

    (A) Distribution of preferred numerosity in the network and the observation in monkeys (27). Inset: The root mean square error between the red and green curves (untrained network versus data) is significantly lower than that of the control with a shuffled preferred-numerosity distribution (tall pink line; P < 0.05, n = 100). (B) Left and middle: Average tuning curves of different numerosities on a linear scale and on a logarithmic scale. Right: The goodness of the Gaussian fit (r²) is greater on a logarithmic scale, as reported previously (27). *P < 10⁻⁴⁰, Wilcoxon rank sum test. (C) The tuning width (sigma of the Gaussian fit) increases proportionally on a linear scale and remains constant on a logarithmic scale, as predicted by the Weber-Fechner law (n = 100) (27).
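
To illustrate the analysis in (B), a sketch that fits a Gaussian to one synthetic tuning curve on linear and logarithmic numerosity axes and compares r². The fake data, initial guesses, and choice of log₂ are assumptions for illustration, not the paper's fitting protocol.

```python
# Sketch: Gaussian fit of a tuning curve on linear vs. logarithmic scales.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, a, mu, sigma):
    return a * np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def r_squared(y, y_fit):
    ss_res = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

numerosities = np.arange(1, 31)
# Synthetic tuning curve peaked at numerosity 6 (illustrative only).
tuning = gaussian(numerosities, 1.0, 6.0, 4.0) + 0.05 * np.random.rand(30)

for scale, x in [("linear", numerosities.astype(float)),
                 ("log2", np.log2(numerosities))]:
    p0 = [tuning.max(), x[np.argmax(tuning)], 1.0]
    params, _ = curve_fit(gaussian, x, tuning, p0=p0, maxfev=10000)
    print(scale, "r^2 =", r_squared(tuning, gaussian(x, *params)))
```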

  • Fig. 3 Number neurons can perform numerosity comparison, reproducing statistics observed in animal behaviors.

    (A) Number comparison task using an SVM. (B) Task performance when the responses of number-selective neurons, the responses of nonselective neurons, or the pixel values of the raw stimulus images were used to train the SVM. The dashed line indicates the chance level. (C) Performance of numerosity comparison across different combinations of numerosities. (D) Left: Performance as a function of the difference between the two numbers. Performance increases as the numerical difference increases (numerical distance effect) and is significantly above chance in all cases (*P = 8.70 × 10⁻¹⁷, Wilcoxon rank sum test). Right: Even when the difference between the two numbers is identical [e.g., 12 versus 2 and 26 versus 16; black versus white squares in (C)], performance is greater for pairs of small numbers (numerical size effect; *P = 3.25 × 10⁻¹⁷, Wilcoxon rank sum test). (E) Left: Average activity of number-selective units as a function of numerical distance. Right: Response to the preferred numerosity; the response during correct trials is significantly higher than that during incorrect trials, as observed in neurons recorded from the monkey prefrontal cortex during a numerosity matching task (*P = 2.82 × 10⁻³⁹, Wilcoxon rank sum test) (27).
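
A sketch of the comparison task in (A) and (B), assuming scikit-learn: an SVM is trained to report which of two stimuli is more numerous from the concatenated unit responses. The placeholder response arrays, the feature construction, and the train/test split are assumptions; the paper's exact decoding protocol may differ.

```python
# Sketch: two-alternative numerosity comparison decoded with a linear SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_units, n_pairs = 100, 500

# Placeholder "number neuron" responses to the two stimuli of each trial.
resp_a = rng.random((n_pairs, n_units))
resp_b = rng.random((n_pairs, n_units))
labels = rng.integers(0, 2, n_pairs)  # 1 if stimulus A is more numerous

X = np.hstack([resp_a, resp_b])       # concatenate both response vectors
clf = SVC(kernel="linear")
clf.fit(X[:400], labels[:400])
print("comparison accuracy:", clf.score(X[400:], labels[400:]))
```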

  • Fig. 4 Abstract number sense independent of low-level visual features.

    (A) Correlation between the numerosity and total area of the stimuli used for the task in Fig. 3. (B) Newly designed stimulus set with greater variation of the total area (eightfold variation; 120π to 960π pixels²) (16). (C) Image pairs were grouped as congruent or incongruent, depending on whether the values of numerosity and total area were correlated positively or negatively. (D) In the revised task, the SVM was trained on responses to congruent pairs but tested on incongruent pairs. (E) Task performance when the SVM was trained with the responses of number-selective neurons, the responses of nonselective neurons, or the pixel values of the raw stimulus images (*P = 1.28 × 10⁻³⁴, Wilcoxon rank sum test). (F) Similar classification of image pairs by dot size and density. (G) Task performance with the image pairs in (F) suggests that number-selective neurons encode an abstract number sense independent of low-level visual features of the stimulus (*P = 1.28 × 10⁻³⁴, Wilcoxon rank sum test). (H) Task performance with logistic regression as the classifier instead of the SVM (*P = 1.28 × 10⁻³⁴, Wilcoxon rank sum test).
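
A sketch of the generalization control in (D) and (H): the decoder is fit only on congruent pairs and scored on incongruent pairs, with either a linear SVM or logistic regression. The congruent/incongruent arrays below are hypothetical placeholders standing in for the network responses.

```python
# Sketch: train on congruent pairs, test on incongruent pairs.
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X_congruent = rng.random((400, 200))    # responses to congruent pairs
y_congruent = rng.integers(0, 2, 400)
X_incongruent = rng.random((200, 200))  # responses to incongruent pairs
y_incongruent = rng.integers(0, 2, 200)

for clf in (SVC(kernel="linear"), LogisticRegression(max_iter=1000)):
    clf.fit(X_congruent, y_congruent)
    print(type(clf).__name__, "generalization accuracy:",
          clf.score(X_incongruent, y_incongruent))
```

Above-chance accuracy on incongruent pairs is what rules out a decoder that merely tracks total area rather than numerosity.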

  • Fig. 5 Emergence of number tuning from the weighted sum of increasing and decreasing unit activities.

    (A) Summation coding model (16–18, 26). (B) Neuronal activities that monotonically decrease or increase as numerosity increases were observed in earlier layers. Inset: The sigma of the Gaussian fit. Red solid lines indicate the average. (C) Number tuning arises as the weighted summation of decreasing and increasing units. Black solid lines, the average of individual tuning curves. (D) Left: Distributions of the preferred numerosity from the model simulation and from observations in monkeys (27). Inset: The similarity test as performed in Fig. 2A (P < 0.01, n = 100). Right: The tuning width increases proportionally on a linear scale and remains constant on a logarithmic scale, as predicted by the Weber-Fechner law (27). (E) In the model simulation, neurons tuned to smaller numbers receive strong inputs from the decreasing units and weak inputs from the increasing units, and vice versa. Right: The average weights of units preferring 4 and 24. (F) Weight bias of all number-selective neurons observed in Conv5 of the untrained AlexNet. As predicted by the model simulation, neurons tuned to smaller numbers receive stronger inputs from decreasing units, and vice versa.
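
A toy sketch of the summation coding idea in (A) to (C) and (E): a downstream unit sums one monotonically increasing and one monotonically decreasing input, and the weight ratio sets its preferred numerosity. The sigmoidal input shapes and weight values are illustrative assumptions; in the paper the monotonic units and weights come from the untrained network itself.

```python
# Sketch: number tuning from a weighted sum of monotonic units.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

numbers = np.arange(1, 31, dtype=float)
# Illustrative monotonic inputs: one rising, one falling with numerosity.
increasing = sigmoid((numbers - 5.0) / 2.0)
decreasing = sigmoid((20.0 - numbers) / 2.0)

def number_unit(w_inc, w_dec):
    # Weighted summation of the two monotonic inputs.
    return w_inc * increasing + w_dec * decreasing

# A stronger decreasing-unit weight yields a small preferred numerosity,
# and vice versa, as in (E) and (F).
for w_inc, w_dec in [(0.1, 1.0), (1.0, 0.1)]:
    tuning = number_unit(w_inc, w_dec)
    print(f"w_inc={w_inc}, w_dec={w_dec} -> PN =", numbers[np.argmax(tuning)])
```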

Supplementary Materials


    Visual number sense in untrained deep neural networks

    Gwangsu Kim, Jaeson Jang, Seungdae Baek, Min Song, Se-Bum Paik


    This PDF file includes:

    • Figs. S1 to S4
    • Table S1
