Report

Teaser, summary, work performed and final results

Periodic Reporting for period 2 - STATLEARN (The reading brain as a statistical learning machine)

Teaser

Although written language is not part of our genetic endowment, literate adults process an impressive amount of information as they read, and do so fluently and almost without error. How this happens is largely unknown, and represents a fundamental issue for theories of...

Summary

Although written language is not part of our genetic endowment, literate adults process an impressive amount of information as they read, and do so fluently and almost without error. How this happens is largely unknown, and represents a fundamental issue for theories of human learning. Building on data from nonhuman primates, human infants and psycholinguistic experiments on word-internal structure, STATLEARN tests the hypothesis that statistical learning is one of the fundamental cognitive mechanisms underlying visual word identification and reading. Human infants learn to chunk smaller perceptual units (e.g., oriented lines) into larger, meaningful objects (e.g., tools, faces), taking advantage of recurrent patterns in their distribution. As developing readers, they would apply these very same mechanisms to a newly encountered type of visual object, i.e., letters. On this basis, they would progressively build higher-order orthographic units, which eventually make their visual word identification as adult readers astonishingly efficient.
STATLEARN tests this conjecture by combining techniques from Computational Linguistics, Experimental Psychology and Neuroscience. We carry out experiments with adults and children, using both behavioural methods and state-of-the-art technologies such as EEG, eye tracking and MEG. We test natural reading, as well as simulated learning of new writing systems in the lab.
Given that reading is one of the most widespread human activities and is critical for navigating modern society, this project promises to have far-reaching impact. By providing new insight into how we acquire literacy, STATLEARN may inform how we diagnose and treat developmental and acquired dyslexia, and how written language is taught in schools. More generally, the project may shed new light on the remarkable learning and information-processing abilities of the human brain.

Work performed

Words that share part of their internal structure (e.g., [mind]ful and [mind]less) are connected in the human brain and cognitive system. We have carried out a series of chronometric experiments to test whether this is related to letter co-occurrence statistics: on this account, we would notice the presence of “mind” in “mindful” and “mindless” because the letters m, i, n and d often occur together in the language. The data gathered so far suggest that this is not the case.
In a second series of experiments, we asked our participants to learn a set of novel words in the lab, and tested whether they relied on letter co-occurrence statistics in doing so. When the experiments involved familiar letters, participants tended to apply the statistics of their native language, rather than learning new regularities based on the novel words. When we used an unfamiliar alphabet instead, participants did seem to capture the co-occurrence patterns among the novel characters.
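To make the notion of letter co-occurrence statistics concrete, here is a minimal sketch in Python. It counts adjacent letter pairs (bigrams) in a word list and derives a transitional probability from those counts. The word list, the function names and the choice of bigram transitional probability as the statistic are illustrative assumptions; the report does not state which measures or corpora the project actually used.

```python
from collections import Counter

def bigram_counts(words):
    """Count how often each pair of adjacent letters occurs across `words`."""
    counts = Counter()
    for word in words:
        for first, second in zip(word, word[1:]):
            counts[first + second] += 1
    return counts

def transitional_probability(words, first, second):
    """Estimate P(second | first): the share of bigrams starting with
    `first` that continue with `second`."""
    counts = bigram_counts(words)
    total = sum(c for bigram, c in counts.items() if bigram[0] == first)
    return counts[first + second] / total if total else 0.0

# Hypothetical toy lexicon, for illustration only (not project data).
toy_lexicon = ["mindful", "mindless", "kindness", "minder", "finding"]

counts = bigram_counts(toy_lexicon)
tp_n_d = transitional_probability(toy_lexicon, "n", "d")

print(counts.most_common(3))
print(f"P(d | n) = {tp_n_d:.3f}")
```

In a real analysis the counts would come from a large corpus of the language, and each word would typically be weighted by its frequency of occurrence.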
We also looked for brain signatures of sensitivity to recurring chunks of letters. We did so by presenting these chunks periodically within a stream of visual events, and assessing whether the brain synchronises its rhythm to this same periodicity. The data suggest that this is the case, at least in areas typically devoted to higher-level vision (i.e., the left occipito-temporal cortex). We also obtained evidence that this sensitivity is enhanced by meaning: the brain responds more to recurring clusters of letters that also carry a consistent meaning (e.g., “ness” in “kindness”, “fairness” and “bitterness”, or “er” in “driver”, “dealer” and “baker”).
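The logic of this periodic-presentation approach can be sketched with a small simulation: if a response recurs at a fixed rate, the recorded signal should show a peak at that frequency that stands out from neighbouring frequencies. The sketch below is only an illustration under stated assumptions. The stimulation rates (6 Hz for the stream, 1.2 Hz for the recurring chunk), the sampling rate, the noise level and the signal-to-noise measure are all hypothetical, and the code is not the project's analysis pipeline.

```python
import cmath
import math
import random

def simulate(sample_rate, seconds, components, noise_sd, seed=0):
    """Sum of sinusoids given as (frequency_hz, amplitude) plus Gaussian noise."""
    rng = random.Random(seed)
    n = int(sample_rate * seconds)
    return [
        sum(a * math.sin(2 * math.pi * f * k / sample_rate) for f, a in components)
        + rng.gauss(0.0, noise_sd)
        for k in range(n)
    ]

def amplitude_at(signal, sample_rate, freq):
    """Amplitude of the `freq` component, computed as a single-bin DFT."""
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq * k / sample_rate)
        for k, x in enumerate(signal)
    )
    return 2 * abs(acc) / n

def snr_at(signal, sample_rate, freq, step, n_neighbours=4):
    """Amplitude at `freq` divided by the mean amplitude of the
    `n_neighbours` frequency bins on either side of it."""
    neighbours = [
        freq + i * step
        for i in range(-n_neighbours, n_neighbours + 1)
        if i != 0
    ]
    noise = sum(amplitude_at(signal, sample_rate, f) for f in neighbours) / len(neighbours)
    return amplitude_at(signal, sample_rate, freq) / noise

# All values below are assumptions chosen for illustration.
SAMPLE_RATE = 100        # samples per second
SECONDS = 20             # recording length; frequency resolution = 1 / SECONDS
STEP = 1 / SECONDS       # spacing between frequency bins, in Hz
BASE_HZ = 6.0            # rate at which items appear in the stream
CHUNK_HZ = 1.2           # rate at which the recurring chunk appears

# "Tagged" signal: responds to the stream and to the recurring chunk.
tagged = simulate(SAMPLE_RATE, SECONDS, [(BASE_HZ, 1.0), (CHUNK_HZ, 0.5)], noise_sd=1.0)
# Control signal: responds to the stream only.
control = simulate(SAMPLE_RATE, SECONDS, [(BASE_HZ, 1.0)], noise_sd=1.0)

snr_tagged = snr_at(tagged, SAMPLE_RATE, CHUNK_HZ, STEP)
snr_control = snr_at(control, SAMPLE_RATE, CHUNK_HZ, STEP)
print(f"SNR at chunk frequency, tagged signal:  {snr_tagged:.2f}")
print(f"SNR at chunk frequency, control signal: {snr_control:.2f}")
```

A ratio well above 1 at the chunk frequency indicates that the signal follows the periodicity of the recurring chunk, while a ratio near 1 indicates that it does not.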
We complemented this evidence from adults by investigating when sensitivity to letter statistics emerges in children learning to read. Eye-tracking data on text reading suggest that, by Grade 3, children are already sensitive to how frequently given letter combinations occur in the language, and that this information guides their visual exploration of written text.

Final results

The data gathered so far clearly indicate that readers are sensitive to the statistical structure generated by how letters co-occur in written language, and to how informative these co-occurrences are about word meaning. This calls for a paradigm shift in reading research: the brain and cognitive processes behind reading are related not only to the fact that reading involves written language, but also to the fact that it generates a novel and somewhat independent visual domain, which the human brain/mind captures through general-purpose learning algorithms based (at least in part) on probabilistic associations between (chunks of) letters.
In the remainder of the project, we hope to further clarify the mechanisms behind this probabilistic learning (e.g., which specific statistical cue, if any, is actually captured by the brain, and which conditions need to be satisfied for this learning to happen). We also hope to learn more about the developmental trajectory of this learning, by looking more closely at its brain signatures in children learning to read.

Website & more info

More info: https://lrlac.sissa.it/projects/statistical-learning-and-reading.