Early parafoveal semantic integration in natural reading

eLife assessment

Reading is a remarkable human skill that requires rapid processing of written words. We typically fixate each word for only 225–250 ms, but nevertheless manage to encode its visual information, extract its meaning, and integrate it into the larger context, while also planning the next saccade (Rayner, 2009). To overcome the tight temporal constraints during reading, we preview the next word in the parafovea before moving our eyes to it (Jensen et al., 2021; Reichle and Reingold, 2013; Schotter, 2018). Substantial evidence suggests that parafoveal information can be extracted at various linguistic levels, including orthography (Drieghe et al., 2005; Inhoff, 1989; Johnson et al., 2007; White, 2008; Williams et al., 2006), phonology (Ashby et al., 2006; Ashby and Rayner, 2004; Chace et al., 2005; Miellet and Sparrow, 2004; Pollatsek et al., 1992; Rayner et al., 1995), lexicality (Kennedy and Pynte, 2005; Kliegl et al., 2006), syntax (Snell et al., 2017; Wen et al., 2019), and semantics (Rayner and Schotter, 2014; Schotter, 2013; Schotter et al., 2015; Schotter and Jia, 2016); for a comprehensive review see Schotter et al., 2012. However, for semantics in particular, controversy remains about the extent and type of information extracted from parafoveal processing under various conditions. Moreover, it is unknown when and how the previewed semantic information can be used – i.e., integrated into the evolving sentence context – which is an integral component of the ongoing reading process.

For some time, it was claimed that parafoveal preview was limited to perceptual features of words and did not extend to semantics (Inhoff, 1982; Inhoff and Rayner, 1980; Rayner et al., 2014; Rayner et al., 1986). However, eye tracking-based evidence for the extraction of parafoveal semantic information began to emerge from studies that used languages other than English, including Chinese (Tsai et al., 2012; Yan et al., 2012; Yan et al., 2009; Zhou et al., 2013) and German (Hohenstein et al., 2010; Hohenstein and Kliegl, 2014), and was eventually extended into English (Rayner and Schotter, 2014; Schotter et al., 2015; Schotter and Jia, 2016; Veldre and Andrews, 2018; Veldre and Andrews, 2017; Veldre and Andrews, 2016a; Veldre and Andrews, 2016b). For example, Schotter and Jia, 2016 showed preview benefits on early gaze measures for plausible compared to implausible words, even for plausible words that were unrelated to the target. These results demonstrate that semantic information can indeed be extracted from parafoveal words. However, because the boundary paradigm only assesses effects after target words have been fixated, it is challenging to determine precisely when and how parafoveal semantic processing takes place. Furthermore, it is generally hard to distinguish between the effects of cross-saccade integration (e.g. the mismatch between the preview and the word fixated) and the effects of how differing words fit into the context itself (Veldre and Andrews, 2016a; Veldre and Andrews, 2016b).

Complementary evidence showing that semantic information can be extracted parafoveally, even in English, comes from electrophysiological studies. Context-based facilitation of semantic processing can be observed as reductions in the amplitude of the N400 component (Kutas and Hillyard, 1984; Kutas and Hillyard, 1980), a negative-going event-related potential (ERP) response observed between about 300 and 500 ms after stimulus onset, which has been linked to semantic access (DeLong et al., 2014; Federmeier, 2022; Federmeier et al., 2007; Kutas and Federmeier, 2011; Lau et al., 2008). Basic effects of contextual congruency on the N400 – smaller responses to words that do versus do not fit a sentence context (e.g. to ‘butter’ compared to ‘socks’ after ‘He spread the warm bread with …’) – are also observed for parafoveally-presented words (Antúnez et al., 2022; Barber et al., 2013; Barber et al., 2010; López-Peréz et al., 2016; Meade et al., 2021) and, even when all words are congruent, N400 responses to words in parafoveal preview, like those to foveated words, are graded by increasing context-based predictability (Payne et al., 2019; Payne and Federmeier, 2017; Stites et al., 2017). Although many of these effects have been measured in the context of unnatural reading paradigms (e.g. the ‘RSVP flanker paradigm’), similar effects are obtained during natural reading. Using the stimuli and procedures from Schotter and Jia, 2016, Antúnez et al., 2022 showed that N400 responses, measured relative to the fixation before the target words (i.e., before the boundary change, while the manipulated words were in parafoveal preview), were sensitive to the contextual plausibility of these previewed words. These studies suggest that semantic information is available from words before they are fixated, even if that information does not always have an impact on eye fixation patterns.

Thus, both eye tracking and electrophysiological studies have provided evidence suggesting that semantic information is extracted from words in parafoveal preview. However, most of these studies have been limited to measuring parafoveal preview from fixations to an immediately adjacent word, raising questions about exactly how far in advance semantic information might become available from parafoveal preview. Moreover, important questions remain about the extent to which parafoveally extracted semantic information can be functionally integrated into the building sentence-level representation. Although some ERP studies have found that the semantic information extracted from parafoveal preview is carried forward, affecting semantic processing when that same word is later fixated (Barber et al., 2010; Payne et al., 2019; Stites et al., 2017), other studies have not observed any downstream impact (Barber et al., 2013; Li et al., 2015). Furthermore, post-N400 ERP components, linked to more attentionally-demanding processes associated with message-building and revision, do not seem to be elicited during parafoveal preview (Li et al., 2023; Milligan et al., 2023; Payne et al., 2019; Schotter et al., 2023). Therefore, critical questions remain about the time course and mechanisms by which semantic information is extracted and used during reading.

Answering those questions requires an approach that allows a more continuous and specific assessment of sensitivity to target word semantics during parafoveal processing across multiple fixations, and, in particular, that can speak to how attention is allocated across words during natural reading. We tackle these core issues using a new technique that combines the use of frequency tagging and the measurement of magnetoencephalography (MEG)-based signals.

Frequency tagging, also known as steady-state visually evoked potentials, involves flickering a visual stimulus at a specific frequency and then measuring the neuronal response associated with processing the stimulus (Norcia et al., 2015; Vialatte et al., 2010). It has been widely used to investigate visuospatial attention (Gulbinaite et al., 2019; Kritzman et al., 2022; Müller et al., 2003; Müller et al., 1998; Norcia et al., 2015; Vialatte et al., 2010) and has recently been applied to language processing (Beyersmann et al., 2021; Montani et al., 2019; Wu et al., 2023). However, the traditional frequency tagging technique flickers visual stimuli at a low-frequency band, usually below 30 Hz, such that the flickering can be visible and may interfere with the ongoing task. To address this limitation, we developed the rapid invisible frequency tagging (RIFT) technique, which involves flickering visual stimuli at a frequency above 60 Hz, making it invisible and non-disruptive to the ongoing task. Responses to RIFT have been shown to increase with the allocation of attention to the stimulus bearing the visual flicker (Brickwedde et al., 2022; Drijvers et al., 2021; Duecker et al., 2021; Ferrante et al., 2023; Gutteling et al., 2022; Zhigalov et al., 2021; Zhigalov et al., 2019; Zhigalov and Jensen, 2022; Zhigalov and Jensen, 2020). In our previous study, we adapted RIFT to a natural reading task and found temporally-precise evidence for parafoveal processing at the lexical level (Pan et al., 2021). The RIFT technique provides a notable advantage by generating a signal, the tagging response signal, specifically yoked to just the tagged word. This ensures a clear separation in processing the tagged word from the ongoing processing of other words, addressing a challenge faced by eye tracking and ERP/FRP approaches. Moreover, RIFT enables us to monitor the entire dynamics of attentional engagement with the tagged word, which may begin a few words before the tagged word is fixated.
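To make the logic of frequency tagging concrete, the following minimal sketch simulates a single sensor that carries a 60 Hz tagging component buried in noise and reads out the amplitude at the tagging frequency with a single-bin Fourier projection. It is an illustration only, not the analysis pipeline used in this study: the sampling rate, noise level, the two gain values standing in for stronger and weaker attention, and the function names `amplitude_at` and `simulate_sensor` are all assumptions introduced here.

```python
import math
import random


def amplitude_at(signal, freq_hz, sample_rate_hz):
    """Estimate the amplitude of the sinusoidal component at freq_hz.

    Projects the signal onto a cosine and a sine at the target frequency
    (a single-bin discrete Fourier transform).
    """
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def simulate_sensor(tag_gain, freq_hz=60.0, sample_rate_hz=1000.0,
                    duration_s=1.0, noise_sd=1.0, seed=0):
    """Simulated sensor trace: tagged sinusoid plus Gaussian noise.

    tag_gain is a hypothetical stand-in for how strongly the flickering
    stimulus drives the recorded response.
    """
    rng = random.Random(seed)
    n = int(sample_rate_hz * duration_s)
    return [tag_gain * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            + rng.gauss(0, noise_sd)
            for i in range(n)]


# Two hypothetical conditions that differ only in the tagging gain.
attended = simulate_sensor(tag_gain=0.5)
unattended = simulate_sensor(tag_gain=0.1)

amp_attended = amplitude_at(attended, 60.0, 1000.0)
amp_unattended = amplitude_at(unattended, 60.0, 1000.0)
print(f"attended: {amp_attended:.3f}  unattended: {amp_unattended:.3f}")
```

Because the readout is locked to the tagging frequency, it reflects only the flickered stimulus; activity at other frequencies contributes little to the estimate, which is the property that lets the tagging response be attributed to the tagged word.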

In the current study, RIFT was utilised in a natural reading task to investigate parafoveal semantic integration. We recruited participants (n=34) to silently read one-line sentences while their eye movements and brain activity were recorded simultaneously by an eye-tracker and MEG. The target word in each sentence was always unpredictable (see Behavioural pre-tests in Methods) but was semantically congruent or incongruent with the preceding sentence context (for the characteristics of words, see Table 1). The target words were tagged by flickering an underlying patch, whose luminance kept changing in a 60 Hz sinusoid throughout the sentence presentation. The patch was perceived as grey, the same colour as the background, making it invisible. To ensure that the flicker remained invisible across saccades, we applied a Gaussian transparent mask to smooth out sharp luminance changes around the edges (Figure 1A). Parafoveal processing of the target word was indexed by the RIFT responses recorded using MEG during fixations of pre-target words.
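The tagging stimulus described above can be sketched as a luminance function: a 60 Hz sinusoid centred on the background grey, scaled by a Gaussian transparency mask that fades the flicker towards the edges of the patch. This is a hedged illustration rather than the presentation code used in the experiment. The luminance scale, modulation depth, mask width, the 1440 Hz refresh rate, and the names `mask_alpha` and `patch_luminance` are all hypothetical values chosen for the example.

```python
import math

BACKGROUND = 0.5      # mid-grey on a 0..1 luminance scale (assumed)
TAG_FREQ_HZ = 60.0    # tagging frequency stated in the text
REFRESH_HZ = 1440.0   # hypothetical display refresh rate


def mask_alpha(dx, dy, sigma):
    """Gaussian transparency: 1 at the patch centre, near 0 at the edges."""
    return math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))


def patch_luminance(frame, dx, dy, sigma, depth=0.4):
    """Luminance at offset (dx, dy) from the patch centre on a given frame.

    depth is the assumed peak deviation from the background grey.
    """
    t = frame / REFRESH_HZ
    flicker = depth * math.sin(2 * math.pi * TAG_FREQ_HZ * t)
    return BACKGROUND + mask_alpha(dx, dy, sigma) * flicker


frames_per_cycle = int(REFRESH_HZ / TAG_FREQ_HZ)  # 24 frames per 60 Hz cycle

# One full flicker cycle at the patch centre and at a point 3 sigma away.
centre_cycle = [patch_luminance(f, 0, 0, sigma=20.0)
                for f in range(frames_per_cycle)]
edge_cycle = [patch_luminance(f, 60, 0, sigma=20.0)
              for f in range(frames_per_cycle)]

mean_centre = sum(centre_cycle) / frames_per_cycle
print(f"mean centre luminance: {mean_centre:.3f}")
print(f"centre swing: {max(centre_cycle) - min(centre_cycle):.3f}")
print(f"edge swing:   {max(edge_cycle) - min(edge_cycle):.3f}")
```

In this sketch the luminance averaged over one cycle equals the background grey, consistent with the patch being perceived as grey, while the mask shrinks the modulation near the border so that no sharp luminance edge is introduced.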

	Pre-target	Target	Post-target
Word frequency	124.5 (310.9)	62.2 (77.0)	3619.8 (6725.2)
Word length	5.5 (1.1)	5.3 (1.1)	5.4 (2.0)
Position in the sentence	5.4 (3.0)	6.4 (1.3)	7.4 (1.3)

Table 1. Characteristics of pre-target, target, and post-target words.

Figure 1. The paradigm and the eye movement metrics. (A) After the presentation of a fixation cross at the screen centre for 1.2–1.6 s, a gaze-contingent box appeared near the left edge of the screen. Fixating the box for 0.2 s triggered the full sentence presentation. Participants (n=34) read 160 one-line sentences silently while brain activity and eye movements were recorded. Each sentence was embedded with one congruent or incongruent target word (see the dashed rectangle; not shown in the actual experiment). The target words could not be predicted based on the sentence context, and word-level properties of congruent and incongruent targets were balanced by swapping them between two sentence frames. The target words were tagged by changing the luminance of the underlying patch (with a Gaussian mask) in a 60 Hz sinusoid throughout the sentence presentation (depicted as a bright blob, not shown in the actual experiment). Additionally, we included a small disk at the bottom right of the screen that displayed the tagging signal and was recorded by a photodiode throughout each trial. After reading, gazing at the bottom box for 0.2 s triggered the sentence offset. Twelve percent of the sentences were followed by a simple yes-or-no comprehension question. (B) The first fixation durations on the pre-target and target words when the target words were incongruent (in blue) or congruent (in orange) with the sentence context. Each dot indicates one participant (n=34). ***p<0.001; n.s., not statistically significant; ITI, inter-trial interval.

This paradigm allows us to address three questions. First, we aimed to measure when in the course of reading people begin to direct attention to parafoveal words. Second, we sought to ascertain when semantic information obtained through parafoveal preview is integrated into the sentence context in a manner that affects reading behaviours. Modulations of pre-target RIFT responses by the contextual congruity of target words would serve as evidence that parafoveal semantic information has not only been extracted and integrated into the sentence context but that it is affecting how readers allocate attention across the text. Third, we explored whether these parafoveal semantic attention effects have any relationship to reading speed.
