Double-R Grammar

A Computational Cognitive Grammar of English
Last Updated 3 September 2015

Jerry Ball
with Mary Freiman, Stu Rodgers and Alan Ball

Table of Contents:

Chapter 1: Introduction

Double-R Grammar is a computational cognitive grammar of English that details a system of grammatical representation focused on capturing two key dimensions of meaning — referential and relational meaning. Double-R identifies the referring expressions in the input (e.g. object referring expression or nominal, situation referring expression or clause) and the relationships between these referring expressions (e.g. transitive verb relating a subject and an object). Double-R representations are linguistic, or grammatical in the traditional sense, but not purely syntactic — there is no autonomous syntax component. Grammatical representations are assumed to be semantically motivated, subject to the challenge of encoding multiple dimensions of meaning in a linear code. Double-R includes a cognitively motivated language analysis mechanism that adheres to two well-established constraints on Human Language Processing (HLP) — incremental and interactive processing. Double-R incrementally analyzes the written linguistic input one word or multi-word unit at a time, using all available grammatical (and eventually semantic) information interactively (in parallel) to make the best choice at each choice point. Lexical items in the input project constructions which set up expectations that drive processing. Once a choice is made, it is assumed to be correct, and Double-R proceeds incrementally forward. However, the subsequent input may require modification of the evolving representation via a non-monotonic mechanism of context accommodation. Overall, the processing mechanism is pseudo-deterministic in that it pursues the single best analysis given the current input and context, but accommodates the subsequent input and context when necessary. 
Double-R is implemented as a large-scale cognitive model in the ACT-R cognitive architecture and is approaching the grammatical breadth of leading computational linguistic systems, without being tuned to a specific corpus or being limited to purely syntactic analysis.
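
The pseudo-deterministic strategy with context accommodation can be illustrated with a toy sketch (Python rather than the ACT-R productions of the actual model; the analysis structure and the "the old man" example are purely illustrative): at each word the model commits to the single best choice, and a later word can force a local restructuring rather than a full backtrack.

```python
# Toy sketch of pseudo-deterministic incremental analysis with context
# accommodation. NOT the actual Double-R productions -- just the control
# strategy: commit to the single best choice at each word, and repair
# the evolving representation non-monotonically when later input demands it.

def analyze(words):
    """Incrementally build a simple nominal analysis, one word at a time."""
    analysis = {"spec": None, "mods": [], "head": None}
    for w in words:
        if w == "the":
            analysis["spec"] = w          # determiner specifies the nominal
        elif analysis["head"] is None:
            analysis["head"] = w          # commit: treat this word as the head
        else:
            # context accommodation: a subsequent word demotes the current
            # head to a modifier instead of backtracking from scratch
            analysis["mods"].append(analysis["head"])
            analysis["head"] = w
    return analysis

# "old" is first committed as head, then accommodated as a modifier of "man"
print(analyze(["the", "old", "man"]))
```

Note that no alternative analyses are carried in parallel; the single committed analysis is modified in place when the subsequent input requires it.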

Although most of the examples in this document show the representation and processing of isolated sentences, Double-R accepts input from single words up to an entire document of text. Current capabilities for resolving cross-sentential dependencies (e.g. anaphora) are limited, as are discourse capabilities more generally. During the processing of an input text, Double-R creates a collection of nested ACT-R chunks (i.e. frame-like representations consisting of a collection of slot-value pairs where the value of a slot may itself be a chunk). At the end of processing, these ACT-R chunks are converted into tree diagrams for visualization. The diagram creation capability consists of Lisp code that generates bracketed structures from the ACT-R chunks, and phpSyntaxTree, which generates diagrams from the bracketed structures. PhpSyntaxTree is a product of Mei and André Eisenbach. The bracketing code was developed by Andrea Heiberg. In linearizing the slot values of the nested ACT-R chunks, the bracketing code provides the rough equivalent of a language generation capability. Andrea worked with Jack Harris to interface the bracketing code to phpSyntaxTree. The tree diagrams are customizable to some extent: grammatical features may be displayed or suppressed, and when displayed they appear only at the level of the clause or nominal. Some elements of the underlying chunk representation are not displayed in the diagrams — especially slots lacking a value.
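
The chunk-to-bracket linearization can be sketched in a few lines (Python here for illustration; the actual bracketing code is Lisp, and the chunk layout below is a simplified assumption, not Double-R's actual slot inventory):

```python
# Minimal sketch of linearizing nested slot-value chunks into the labeled
# bracket notation that phpSyntaxTree renders as a tree diagram.
# The {type, slot: value, ...} layout is illustrative, not Double-R's.

def bracket(chunk):
    """Recursively convert a chunk into [label child child ...] form."""
    if isinstance(chunk, str):                    # a leaf word
        return chunk
    label = chunk["type"]
    parts = [bracket(v) for k, v in chunk.items()
             if k != "type" and v is not None]    # skip slots lacking a value
    return "[" + " ".join([label] + parts) + "]"

nominal = {"type": "nominal",
           "spec": {"type": "det", "word": "the"},
           "head": {"type": "noun", "word": "dog"}}
print(bracket(nominal))
```

phpSyntaxTree accepts labeled bracket strings of this general form and renders them as tree diagrams; slots lacking a value are skipped, in the spirit of the display behavior described above.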

We refer to Double-R representations as grammatical or linguistic (but not syntactic) representations. By this we mean that they include information about the grammatical function as well as the form of the linguistic elements in the input. We assume that grammatical functions like subject, object, specifier, head, and modifier, and grammatical features like animacy, gender and number are semantically motivated. We also assume, as in Langacker's Cognitive Grammar and traditional grammar, that parts of speech like noun, verb and adjective are semantically motivated. We use the term grammatical as it is traditionally used, to reflect these assumptions. These grammatical representations map into non-linguistic representations of the situations and objects that they describe within the context of a situation model. These non-linguistic representations are in the spirit of Jackendoff's Conceptual Semantics and are under development by Stu Rodgers.

The grammatical representations encode two key dimensions of meaning: referential and relational — hence the name Double-R. Double-R identifies the referring expressions in the input (e.g. object referring expression or nominal, situation referring expression or clause) and the relationships between these referring expressions. The key relational elements include verbs, adjectives, adverbs, prepositions and conjunctions (but not nouns). The processing of relational elements leads to projection of constructions which predict the occurrence of the elements they relate.

Double-R is most closely aligned with Langacker's Cognitive Grammar and, more broadly, with Construction Grammar (most recently, Sag's shift from Head-Driven Phrase Structure Grammar (HPSG) to Sign-Based Construction Grammar (SBCG), and Jackendoff's shift in this direction as well). Double-R can best be viewed as a formalization and computational implementation of ideas from Cognitive and Construction Grammar.

From the perspective of English grammar, Double-R aligns with Huddleston and Pullum's The Cambridge Grammar of the English Language and, to a lesser extent, Quirk, Greenbaum, Leech and Svartvik's A Comprehensive Grammar of the English Language, and Biber, Johansson, Leech, Conrad & Finegan's Longman Grammar of Spoken and Written English.

From the perspective of formal linguistics, Double-R aligns with the Simpler Syntax of Culicover and Jackendoff.

Double-R adheres to well-established cognitive constraints on Human Language Processing (HLP), including the incremental and interactive nature of HLP. Double-R processes the input incrementally, one word or multi-word unit at a time, using a perceptual span of 12 characters to perceive the current input. No preprocessing of the input is required. There is no separate tokenizing, part of speech tagging or syntactic parsing. All processing occurs interactively within Double-R.

The word recognition subcomponent, which is fully integrated with the rest of the system, is being developed by Mary Freiman. The contents of the perceptual span activate words and multi-word units in the Mental Lexicon component of declarative memory. Activation spreads from the letters, trigrams and space delimited units in the perceptual span. The most highly activated declarative memory element, which need not be an exact match, is retrieved from memory and used in subsequent processing.
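
A toy sketch of this activation-based retrieval (Python for illustration; the real model uses ACT-R's spreading activation and partial matching, and the simple overlap score and example words below are assumptions):

```python
# Toy sketch of activation-based word recognition: features (letters and
# trigrams) in the perceptual span spread activation to lexicon entries,
# and the most active entry is retrieved even without an exact match.
# The feature scheme and equal weighting are illustrative assumptions.

def features(s):
    """Letters plus letter trigrams of a string."""
    trigrams = {s[i:i + 3] for i in range(len(s) - 2)}
    return set(s) | trigrams

def retrieve(span, lexicon):
    """Return the lexicon entry sharing the most features with the span."""
    span_feats = features(span)
    return max(lexicon, key=lambda w: len(span_feats & features(w)))

lexicon = ["aircraft", "airspeed", "altitude"]
print(retrieve("aircaft", lexicon))   # a misspelled span still retrieves
```

Because retrieval selects the most highly activated entry rather than demanding an exact match, the misspelled span "aircaft" still retrieves the intended word.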

The current computational implementation comprises ~1400 productions and ~58,000 lexical items, and is capable of processing a broad range of English language constructions. The mental lexicon increased from ~8000 to ~58,000 lexical items in 2011. This expansion was made possible by use of the Corpus of Contemporary American English (COCA) which provides information about the frequency of occurrence of words in their various parts of speech. We are working on stabilizing the behavior of Double-R with this comprehensive lexicon and on extending the grammatical coverage to encompass the expanded lexicon.

On a 64-bit quad core computer with 8 GB of RAM, Double-R is capable of processing ~150 words per minute (wpm) with the 58,000 word mental lexicon; in simulated ACT-R cognitive processing time, the rate is ~140 wpm. By comparison, fluent adult reading rates are in the range of 200-300 wpm. The key to achieving adult reading rates is the ability to process multi-word units, including units that are larger than a single perceptual span. Such units not only speed up processing, but are also less ambiguous than individual words and so facilitate determination of meaning. We are working on extending the number of multi-word units in the mental lexicon and adding the capability to recognize units larger than a single perceptual span. This view of multi-word units as crucial for rapid comprehension contrasts with that of Sag et al. (2002), where they are viewed as A Pain in the Neck for NLP.

Double-R is a work in progress. It currently has broad enough coverage of English to be used in the development of a synthetic teammate capable of communicating via text chat with two human teammates in an Unmanned Aerial Vehicle (UAV) reconnaissance mission simulation. Although we are making incremental progress in expanding Double-R's grammatical coverage, it does not currently match the grammatical coverage of leading computational linguistic systems, which use automated machine learning techniques operating over annotated corpora like the Penn Treebank to bootstrap grammar development. Much of the current capability of Double-R has been manually encoded, although we also use automated techniques where practical (e.g. in creation of the mental lexicon). The downside is that this is a slower development process than using fully automated techniques. The upside is that we have full control and can incrementally improve the behavior of Double-R in ways that are not available to approaches which rely exclusively on automated techniques. For example, it is straightforward in Double-R to add a new grammatical category to facilitate the processing of some previously unhandled construction. This is not possible in systems which rely on use of an annotated corpus with a fixed set of grammatical categories — at least not without first updating the annotated corpus to reflect the new category.

Although updating an annotated corpus is possible, the updated corpus is no longer compatible with the original. Given the way that annotated corpora like the Penn Treebank are used to evaluate competing systems, changing the annotated corpus is not typically an option. This inability to update the corpus has the negative effect of stifling innovation in representations — at least incremental innovation of a particular annotated corpus — even when such innovations might improve processing. For example, it is not possible to train grammars on proper prepositional phrase modifier attachment in nominals, since modifiers are simply listed at a fixed level in the Penn Treebank. If the Penn Treebank were updated to support this, the updated corpus would be incompatible with the original corpus. It would be necessary to get researchers who use the Penn Treebank for comparison with competing approaches to adopt the update en masse. No one researcher or institution has the influence to make this happen.

Since Double-R does not rely on Penn Treebank annotations — although the Penn Treebank was used to determine the subcategorization frames of verbs (e.g. intransitive, transitive, ditransitive) — changing the grammatical categories is not a problem. What is a problem is comparing the performance of Double-R to Penn Treebank based parsers. It is simply not possible to use established techniques to do this. Despite this difficulty, given the current capabilities and the ability to incrementally improve Double-R, we expect Double-R to eventually match or exceed the state-of-the-art performance of leading computational linguistic systems.

To test the performance of Double-R against leading computational linguistic systems, we use the MSR Splat parser as an exemplar since it is available online. However, since Double-R generates representations which are not fully compatible with MSR Splat (or other systems based on the Penn Treebank), it is difficult to do automated comparisons over large corpora (typically a test set from the Penn Treebank) as is the current standard for comparison. For this reason, we rely on sampling techniques to do the comparisons. We take a random sample from a given corpus, run the sample through Double-R and MSR Splat, and manually evaluate the results. Note that this approach avoids the need for a gold standard like the Penn Treebank to do the evaluation automatically. We recently evaluated a random sample taken from the text chat corpus we are using in the development of the synthetic teammate. Although this corpus is guiding development of Double-R, it is not annotated and Double-R is not specifically trained to handle just this corpus, although the mental lexicon does include the proper nouns from this corpus. Based on our analysis of a random sample of 51 text chat messages, Double-R correctly processed 24 messages (about 47% correct) and MSR Splat correctly processed 16. The details of this analysis are provided in Appendix E. The success rates are low largely due to the nature of the text chat corpus combined with a definition of correctness which applies to the entire message.

Although this document is primarily focused on describing the capabilities of Double-R, it is also intended as a resource for identifying gaps in coverage and areas in need of improvement. We will not hesitate to point out such gaps. As those gaps are filled and improvements made, this document will be updated.


Chapter 2: Methodological Commitments

A key commitment of our language comprehension research is development of a computational model which is at once cognitively plausible and functional. We believe that adherence to well-established cognitive constraints will facilitate the development of functional systems by pushing development in directions that are more likely to be successful. There are short-term costs associated with adherence to cognitive constraints; however, we have already realized longer-term benefits. For example, the integration of a word recognition capability with ACT-R's perceptual system and higher-level linguistic processing has facilitated the recognition and processing of multi-word expressions and multi-unit words in ways that are not available to systems with separate word tokenizing and part of speech tagging processes. Using an available tokenizer and part of speech tagger would have initially facilitated development, but the cognitive implausibility of using staged tokenizing and part of speech tagging led us to reject this approach. The benefits that we have realized as a result of this decision are described below and elsewhere in this document. As another example, we rely on use of a mental lexicon of word knowledge based on our understanding of the human mental lexicon. During language analysis, words in the input activate corresponding words in the mental lexicon. The most highly activated word is retrieved and guides subsequent processing. Words in the mental lexicon encode knowledge about their form (e.g. letters and trigrams), their part of speech (e.g. noun, verb), their grammatical features (e.g. number, animacy, gender; tense, aspect, voice), their grammatical function (e.g. head of object referring expression, specifier of clause), and, to some extent, their meaning (e.g. semantic type, referential type). In addition, the mental lexicon encodes knowledge about the frequency of use of words in their various parts of speech. 
In Double-R, the mental lexicon is a much richer resource of word knowledge than are the simple word/part of speech lists that are typically used in computational linguistic systems. Creation of the mental lexicon relied on use of a combination of automated and manual techniques. Overall, it was a resource intensive effort, and we continue to make improvements. As the mental lexicon has expanded and improved, so have the language analysis capabilities of Double-R.
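
The contrast with a simple word/part-of-speech list can be sketched as a data structure (Python for illustration; the slot names and frequency counts below are illustrative assumptions, and the actual entries are ACT-R declarative memory chunks):

```python
# Sketch of the kind of rich lexical entry described in the text, as
# opposed to a bare word/part-of-speech pair. Slot names and the
# frequency counts are illustrative, not taken from the actual lexicon.

from dataclasses import dataclass, field

@dataclass
class LexicalEntry:
    word: str
    trigrams: tuple                  # form knowledge used in word recognition
    pos_freq: dict                   # frequency of use per part of speech
    features: dict = field(default_factory=dict)  # e.g. number, animacy
    function: str = ""               # e.g. head of an object referring expr

    def preferred_pos(self):
        """Most frequent part of speech, absent any biasing context."""
        return max(self.pos_freq, key=self.pos_freq.get)

bank = LexicalEntry("bank", ("ban", "ank"),
                    {"noun": 25000, "verb": 3000},       # invented counts
                    {"number": "sing"},
                    "head of object referring expression")
print(bank.preferred_pos())
```

The frequency slot is what lets a context-free retrieval default to the most common part of speech, while the form, feature, and function slots support recognition and grammatical integration.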

Cognitive Plausibility

There is extensive psycholinguistic evidence that human language processing is incremental and interactive (Gibson & Pearlmutter, 1998; Altmann, 1998; Tanenhaus et al., 1995; Altmann & Steedman, 1988). Garden-path effects, although infrequent, strongly suggest that processing is essentially serial at the level of phrasal and clausal analysis (Bever, 1970). Consider Bever's classic example "The horse raced past the barn fell." Sentences like this lead the reader down a garden path that results in a disruption of normal incremental processing at the disambiguating word or the end of the sentence. This disruption should not occur if alternative representations were maintained in parallel, suggesting that human language processing (HLP) is essentially serial and incremental at the phrase and clause level. Humans appear to pursue a single analysis which is only occasionally disrupted, requiring reanalysis. One of the great challenges of psycholinguistic research is to explain how humans can process language effortlessly and accurately given the complexity and ambiguity that is attested (Crocker, 2005). Boden (2006, p. 407) makes a similar observation. However, given the rampant ambiguity of natural language, a deterministic mechanism would need access to the entire input before making a decision. Marcus (1980) proposed a deterministic parser with a limited lookahead capability to capture the trade-off between the efficiency of human parsing and the limitations with respect to garden-path inputs. However, there is considerable evidence that HLP is inconsistent with extensive lookahead, delay or underspecification — the primary serial mechanisms for dealing with ambiguity without backtracking or reanalysis. Instead of lookahead, the HLP engages in thinkahead, biasing and predicting what will come next, rather than waiting until the next input is available before deciding on the current input.

Summarizing the psycholinguistic evidence, Altmann & Mirkovic (2009, p. 605) reach a similar conclusion. Although phrase and clause level processing is serial and incremental, lower level processes of word recognition suggest parallel, activation-based processing mechanisms (McClelland & Rumelhart, 1981; Paap et al., 1982). Determining the part of speech of a word is strongly constrained by the preceding context. Consider the word bank: in the context of the, bank-the-noun is strongly preferred; in the context of to, bank-the-verb is strongly preferred. Retrieval of the appropriate part of speech of bank is highly context dependent and interactive.
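
This interaction between frequency and context can be sketched as follows (a toy illustration, not ACT-R's actual activation equation; the bias values are invented):

```python
# Toy sketch of interactive part-of-speech retrieval: the preceding
# context biases which reading of an ambiguous word is most activated.
# Frequency and context weights here are invented for illustration.

BASE = {("bank", "noun"): 0.8, ("bank", "verb"): 0.2}   # frequency bias
CONTEXT = {"the": "noun", "to": "verb"}                 # what each word predicts

def retrieve_pos(prev, word):
    """Combine base frequency with the preceding word's prediction."""
    best, best_act = None, -1.0
    for (w, pos), freq in BASE.items():
        if w != word:
            continue
        act = freq + (1.0 if CONTEXT.get(prev) == pos else 0.0)
        if act > best_act:
            best, best_act = pos, act
    return best

print(retrieve_pos("the", "bank"))
print(retrieve_pos("to", "bank"))
```

With no biasing context, the higher-frequency noun reading wins; a predictive context such as "to" overrides the frequency preference, which is the interactive effect described above.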

These cognitive constraints militate against staged analysis models — the current standard in computational linguistics. All levels of analysis must at least be tightly pipelined (i.e. word by word, or morpheme by morpheme), if not also allow feedback from higher to lower levels. They also suggest the need for hybrid systems which incorporate a mixture of parallel and serial mechanisms, with lower levels of processing being primarily parallel, probabilistic and interactive, while higher levels of analysis are primarily serial, deterministic and incremental.

Functional Language Analysis

Double-R Grammar is an attempt to build a broadly functional model of language analysis — and ultimately language comprehension — that is also cognitively plausible. In attempting to be cognitively plausible, we adhere to well established cognitive constraints on human language processing (HLP) and do not adopt any computational techniques which are obviously not cognitively plausible. For example, we attempt to model the real-time language processing behavior of humans using a pseudo-deterministic, serial processing mechanism operating over a parallel, probabilistic, activation substrate. The parallel, probabilistic substrate activates constructions corresponding to the linguistic input — constrained by the current context — and the serial processing mechanism selects from among the activated constructions and integrates them into a coherent representation. Overall, the processing mechanism is highly interactive and incremental, allowing whatever grammatical or semantic information is most relevant to be brought to bear in making a decision that will usually be correct at each choice point. The language analysis model does not make use of computational techniques like a first pass part-of-speech tagger that operates independently of a second pass parser. Being non-incremental, such an approach is not cognitively plausible.

It might be assumed that the commitments to building a functional and cognitively plausible model are incompatible. However, it is a basic claim of this research that a non-functional model cannot be cognitively plausible. That is, a cognitive model that doesn't actually perform the cognitive task it purports to model is at best suspect, even if the model fits some empirical measure like reaction time. As an example, the EZReader model (Reichle, Rayner & Pollatsek, 2003) — a model of lexical access during reading — is not a cognitively plausible model of reading since it doesn't perform the reading task. Although EZReader is capable of modeling a wide range of empirical phenomena related to lexical access, since it doesn't actually complete the reading task, it is clearly not a model of reading — the developers of EZReader are quick to acknowledge this point (despite the model's name). Further, any claims about lexical access must be taken with a grain of salt, since lexical access is an important subcomponent of reading which cannot easily be studied in isolation from the overall reading task. As a concrete example of where the attempt to study the lexical access subcomponent in isolation goes astray, the developers of EZReader claim that, in the processing of each word, it takes only 25 msec on average for all higher level processing associated with reading to influence the next eye movement. In ACT-R terms, there is insufficient time for even a single higher-level production to fire and influence the programming of the next eye movement. Such a claim is obviously false. It arises because the EZReader model makes no attempt to actually read the linguistic input.

The use of the term functional to describe a model like Double-R Grammar can be further elaborated. Ultimately, a functional model of reading should do just that — read and understand the linguistic input. In this respect, Double-R Grammar currently falls short, since it is not yet capable of full language comprehension. Rather, Double-R Grammar is a model of language analysis which creates linguistic representations that encode some aspects of meaning — especially referential and relational meaning. While these two dimensions of meaning are important, they are not the whole story. Until Double-R Grammar is capable of full language comprehension, claims of cognitive plausibility can be criticized.

There are two directions in which Double-R Grammar must be advanced in order to demonstrate full language comprehension: deeper semantic analysis and broader coverage. Research is ongoing in both these directions.

With respect to deeper semantic analysis, efforts to map Double-R representations into a Situation Model are underway. At present, these efforts are restricted to the domain of an Unmanned Aerial Vehicle (UAV) reconnaissance task. The Situation Model represents the objects and situations that are relevant to this domain and provides the referential grounding for the referring expressions in Double-R's linguistic representations. Efforts are also underway to support Discourse Modeling within this domain. Other aspects of pragmatic analysis are yet to be implemented. These capabilities provide the deeper understanding that is needed for development of a synthetic teammate that is capable of performing the piloting task within the UAV reconnaissance domain, and get us closer to the ultimate objective of a model that is capable of full language understanding (at least within the UAV reconnaissance domain).

With respect to breadth of coverage, a model that is limited to some specialized collection of inputs designed to test some isolated psycholinguistic behavior is empirically lacking. For example, a system which models garden-path phenomena, but can't model common-or-garden sentences, is not considered functional. In addition, the term functional applies to the addition of mechanisms, as needed, to model a broad range of inputs. For example, the modeling of wh-questions requires the addition of mechanisms to support the fronting of a wh-expression and the binding of this fronted expression with the trace of an implicit argument or adjunct (or alternative mechanisms for indicating this relationship). Likewise, the modeling of yes-no questions requires mechanisms to support the inversion of the subject with the first auxiliary (relative to declarative sentences). The overall functional goal is to be able to handle the basic grammatical patterns of English such that the model can be used in a real world application like the UAV reconnaissance task. Although the language analysis capabilities of Double-R Grammar are quite broad, the domain specificity of the Situation Model still limits overall functionality and limits claims of cognitive plausibility.

There is some acknowledgement within the cognitive modeling community that we need to be building larger-scale models with broader cognitive capabilities. This acknowledgement is reflected in the frequent reference to Newell's 20 questions critique (Newell, 1973) of cognitive science research (cf. Anderson & Lebiere, 2003; Anderson, 2007; Byrne, 2007). Anderson & Lebiere (2003) argue that the ACT-R cognitive architecture answers Newell's 20 questions critique in assessing ACT-R's capabilities with respect to the Newell Test (Newell, 1990) for a theory of cognition. The Newell Test lists twelve functional criteria considered essential for a human cognitive architecture. Although ACT-R does not completely satisfy all twelve criteria, it does well enough to merit serious consideration as a functional architecture.

On the other hand, although the ACT-R cognitive architecture addresses Newell's 20 questions criticism, cognitive models developed within ACT-R typically address specific, limited, cognitive phenomena tied closely to simple laboratory experiments. The typical study involves the development of a cognitive model that matches the human data from some laboratory experiment, demonstrating that the ACT-R cognitive architecture provides the needed cognitive mechanisms — when combined with task specific knowledge — to model the human data. In addition, Young's (2003) notion of compliancy is satisfied if the model was developed without excessively challenging the cognitive architecture. A few studies attempt to model more complex phenomena (Gray & Schoelles, 2003; Fu et al., 2006) and there is also some hope that smaller scale models can be integrated into more complex composite models (Gray, 2007). But cognitive modelers are loath to distance themselves from matching human experimental data and this commitment methodologically differentiates cognitive modeling from other types of computational modeling. (Cognitive modelers might argue that this is what makes their efforts scientific in the Popperian sense.) Further, within the ACT-R community, matching human data typically means matching data from reaction time studies, since ACT-R was specifically developed to support this kind of modeling (Anderson & Lebiere cite this as the best functional feature of ACT-R). Note that it is the cognitive models developed within ACT-R which actually provide the empirical validation of the cognitive architecture, since the cognitive architecture itself is not capable of modeling human behavior (although some steps have been taken to automate the learning of experimental tasks so that the architecture can directly model human behavior without the intervention of creating a cognitive model).

Despite the functionalist claims of Anderson & Lebiere (2003), recent variants of the ACT-R cognitive architecture are (in part) motivated on minimalist principles by which the architecture is only extended if extensive empirical evidence is provided to motivate the extension. According to Anderson & Taatgen (2008), from a theoretical perspective, it is important to keep the architecture tightly constrained. Otherwise, the architecture underconstrains models developed in the architecture, and that makes it difficult to falsify either the model or the architecture. As evidence of this minimalist stance, some functional mechanisms available in earlier ACT-R variants, like the unbounded goal stack and multi-level activation spread, have been removed from the architecture. The unbounded goal stack has not held up to minimalist arguments and is inconsistent with empirical results which show decayed memory for previous goals. While the removal of the unbounded goal stack is supported by empirical evidence, the replacement of the goal stack by a single chunk goal buffer appears to be more a reflection of the minimalist bent than it is empirically motivated. Until there is sufficient empirical evidence that a multiple chunk goal buffer is needed, the minimalist argument suggests it be limited to a single chunk. Ockham's razor is hard at work. Not only the goal buffer, but all buffers in ACT-R are limited to a single chunk. Functionally, the language analysis model appears to need more than single chunk buffers. To overcome this limitation, the language analysis model uses a combination of grammatical function specific buffers and multiple buffers linked together to create bounded buffer stacks (limited to 4 chunks, consistent with empirical evidence of short-term working memory capacity, cf. Cowan, 2001). 
The grammatical function specific buffers provide links to the outermost element of a deeply nested linguistic structure — supporting primacy effects, whereas the buffer stacks provide access to the most recent linguistic elements — supporting recency effects. Likewise, the elimination of multi-level activation spread is based on empirical evidence against priming from the word bull to the word milk via the intermediate word cow. However, limiting activation to the spread from slots in chunks in buffers to slots in declarative memory chunks, with no subsequent spread of activation from slots in declarative memory to other declarative memory chunks, imposes a hard constraint on possible systems of representation. For example, in a declarative memory in which a cow chunk is not directly linked to a bull chunk, no activation spread is possible. In small-scale systems, this is not a problem, but in large-scale systems, the proliferation of direct links required to support single-level activation is explosive. Further, chunks must have separate links for all possible forms of activation, including semantic, syntactic, morphologic, phonologic, orthographic, etc., resulting in a proliferation of links within individual chunks and making it difficult to spread activation across levels (e.g., letters can activate syllables and words directly, but how can syllables, once activated by letters, spread activation to words without multi-level activation?).
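
A bounded buffer stack of this kind can be sketched as follows (Python for illustration; the actual mechanism links multiple single-chunk ACT-R buffers, which this toy class only approximates):

```python
# Toy sketch of a bounded buffer stack: at most four chunks are held,
# so only the most recent linguistic elements remain accessible.
# The capacity of 4 follows the working memory evidence in Cowan (2001).

class BufferStack:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.chunks = []

    def push(self, chunk):
        self.chunks.append(chunk)
        if len(self.chunks) > self.capacity:
            self.chunks.pop(0)        # the oldest chunk becomes inaccessible

    def pop(self):
        """Retrieve the most recent chunk, if any."""
        return self.chunks.pop() if self.chunks else None

stack = BufferStack()
for c in ["np1", "vp1", "np2", "pp1", "np3"]:
    stack.push(c)
print(stack.chunks)
```

Pushing a fifth chunk silently displaces the oldest, so only the four most recent elements remain accessible, yielding the recency effect; the primacy effect comes from the separate grammatical function specific buffers, which this sketch does not model.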

In sum, there appear to be competing motivations influencing the development of ACT-R. On the one hand is the desire to satisfy the Newell Test of functionality; and, on the other hand is the small-scale approach to science adopted within Popperian cognitive psychology and against which Newell's 20 questions critique is addressed.

In our view, the key to bridging the gap between current small-scale cognitive modeling and the development of large-scale functional systems is to adopt a functionalist perspective at the level of cognitive models (as well as the cognitive architecture) — without giving up on cognitive plausibility. Given the complexity of the cognitive systems we are modeling, it may not be feasible to pursue low-level empirical studies — at least not until we have a working cognitive model built from the functional perspective. Once a working cognitive model is available, the functional mechanisms proposed in the development of the model can be subjected to empirical validation and Ockham's razor (i.e. can a model with fewer mechanisms model the same complex behavior?). From the functionalist perspective, it is premature to enforce minimalist assumptions in the absence of a functional model. Further, empirical validation of small pieces of a complex system in the absence of a working model is of limited value (as suggested by Newell's 20 questions critique). Mechanisms which are sufficient in a small-scale model of a single cognitive phenomenon are unlikely to be sufficient in a large-scale functional model of complex cognitive behavior. Scaling up to complex cognitive phenomena means adding additional mechanisms and integrating these mechanisms in complex ways which cannot be predicted on the basis of small-scale models. Ockham's razor may well be counter-productive in such contexts. As Roelofs (2005) notes, although Ockham's razor favors the simplest model that covers a set of phenomena, it does not simultaneously favor modeling the simplest set of phenomena. Further, Tenenbaum (2007) argues that it is important to consider the trade-off between simplicity and fit (in the development of models of language acquisition). The simplest model which covers a set of phenomena is unlikely to be the best fit to the data, and the best fitting model is unlikely to be the simplest. 
The preferred model will necessarily trade off simplicity and fit. In addition, as the set of phenomena to be modeled is increased, a more complex model will be required to provide the same degree of fit. Further, increases in complexity are likely to be exponential, rather than linear, with increases in the number of objects in a model.

What is not being proposed is an approach to cognitive modeling research which ignores well-established cognitive constraints on human behavior. While such an approach is acceptable in some Artificial Intelligence circles, where the goal is to develop intelligent systems using advanced computational techniques regardless of the cognitive plausibility of those techniques, the approach being proposed here accepts the validity of cognitive constraints and integrates them into the development of complex cognitive mechanisms which are at once functional and cognitively plausible. What is proposed is a shift in methodology in which the conduct of small-scale empirical studies is delayed or marginalized until a working model of a complex task has been developed on functionalist principles. Once a functional model is in place, small-scale empirical validation of specific components of the model and the application of minimalist principles like Ockham's razor become relevant and important. Until a functional model is in place, model development is guided by cognitive constraints and empirical validation at a gross level, without being constrained to match specific data sets, which would likely derail, rather than facilitate, progress. A small-scale model tuned to a specific data set is unlikely to generalize to meet the larger functional requirements of a complex system.

Natural Language Processing → Human Language Processing

Marr (1982, 1992) put forward a strongly functionalist approach to modeling biological information processing problems in arguing that we should first identify the computational mechanisms and constraints that are needed to compute the complex phenomena being studied (computational and algorithmic levels), before worrying about how these mechanisms might be implemented in the brain or other hardware (implementation level). As Boden (1992) notes in describing Marr's position (Marr, 1992), although Marr prefers an approach to research which focuses on the development of Type-1 theories which are explicit and computational, he acknowledges that this is not always possible, and often Type-2 theories are the best explication of a complex information processing problem that can be developed. Marr places human language processing in this latter category, suggesting that a Type-1 theory corresponding to Chomsky's notion of competence may not be possible, with only Type-2 theories, which consider the process of mapping a multidimensional (mental) representation in the head of the speaker into a linear code, being attainable.

Left unstated in Marr (1992) is the methodology by which models of complex cognitive systems are empirically validated. Within AI, it is often assumed that the primary empirical goal is to model input-output behavior. However, as argued in Ball (2006), we do not believe it is possible to model the input-output behavior of complex cognitive systems like language without serious consideration and computational implementation of the internals of language processing in humans. If we are to delve inside the black box of cognition, then we need a methodology for empirically validating the representations and mechanisms proposed for inclusion in the black box. However, as argued above, small-scale Popperian falsification of isolated hypotheses is likely to derail progress in the development of functional systems. For those of us who are interested in building large-scale models, such an approach is not viable (although we are happy to consider the small-scale experimental results of other researchers). Instead, we should focus on identifying empirical phenomena which can be validated at a gross level, helping to focus development in promising directions without side-tracking it.

One good example of matching a cognitive constraint at a gross level within NLP is the requirement to be able to process language incrementally in real-time. At Marr's algorithmic level, where parallel and serial processing mechanisms are relevant, a language processing system should be capable of incremental, real-time language processing. For language processing, this means that the performance of the system must not deteriorate significantly with the length of the input, just as it demonstrably does not in humans. The simplest means of achieving this is in a deterministic system (cf. Marcus, 1980). To the extent that the system is not deterministic, parallelism (or some other nonmonotonic mechanism) is required to overcome the non-determinism at the algorithmic level. Ball (2007a, 2011) describes a language processing model based on a pseudo-deterministic, serial processing mechanism operating over a probabilistic, parallel processing substrate. A basic element of the serial processing subsystem is a mechanism of context accommodation wherein the current input is accommodated without backtracking, if need be. For example, in the incremental processing of the airspeed restriction, when airspeed is processed it is integrated as the head of the nominal the airspeed, but when restriction is subsequently processed, airspeed is moved into a modifier function, allowing restriction to function as the head of the airspeed restriction. Interestingly, context accommodation gives the appearance of parallel processing within a serial processing mechanism (i.e. at the end of processing, it appears that airspeed was considered a modifier all along). Context accommodation is a cognitively plausible alternative to the less cognitively plausible lookahead mechanism of the Marcus parser (and the cognitively implausible mechanism of algorithmic backtracking). There is little psychological evidence that humans are aware of the right context of the current input (cf. 
Kim, Srinivas & Trueswell, 2002) as is strongly implied by a lookahead mechanism. In support of his lookahead mechanism, Marcus (1980) argues that strict determinism which eschews all non-determinism cannot be achieved without it. The context accommodation mechanism violates Marcus' notion of strict determinism in that it allows for the modification of existing structure and is nonmonotonic (i.e. capable of simulating non-determinism), but whereas exploring the feasibility of strict determinism for language processing may have been an important goal of Marcus' research, its reliance on a lookahead capability appears not to be cognitively viable — the right context of the input is simply not available to the human language processor (patently so in spoken language). Besides the context accommodation mechanism, the parallel, probabilistic spreading activation mechanism violates Marcus' notion of strict determinism. However, parallel processes are well attested in human cognitive and perceptual processing, and are well motivated for handling non-determinism probabilistically — especially at lower levels of cognitive processing like word recognition and grammatical construction selection. At Marr's algorithmic level, a non-deterministic language processing system may still be cognitively plausible and capable of operating in real-time if the non-determinism can be handled using parallel, probabilistic processing mechanisms. The ACT-R cognitive architecture, which is based on 40+ years of cognitive psychological research, provides just this combination of a serial, feed-forward production system combined with a parallel, probabilistic spreading activation mechanism which provides the context for production selection and execution.
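The context accommodation mechanism described above can be sketched in a few lines. This is a hypothetical illustration with invented names (Nominal, parse_nominal), not the actual ACT-R/Double-R implementation:

```python
# Hypothetical sketch of context accommodation (invented names; not the
# actual ACT-R/Double-R code). A nominal is built incrementally; when a
# second potential head arrives, the current head is shifted into a
# modifier function without backtracking.

class Nominal:
    def __init__(self, specifier=None):
        self.specifier = specifier  # e.g. "the"
        self.modifiers = []         # pre-head modifier functions
        self.head = None            # head function

    def integrate(self, word):
        if self.head is None:
            self.head = word        # "airspeed" becomes head of "the airspeed"
        else:
            # Context accommodation: demote the current head to a modifier
            # and let the new word function as the head.
            self.modifiers.append(self.head)
            self.head = word        # "restriction" becomes the new head

def parse_nominal(words):
    nominal = Nominal(specifier=words[0])
    for word in words[1:]:
        nominal.integrate(word)
    return nominal

n = parse_nominal(["the", "airspeed", "restriction"])
print(n.specifier, n.modifiers, n.head)  # the ['airspeed'] restriction
```

Note that no previously built structure is discarded: the earlier head attachment is reused as a modifier, which is why the final result looks as if airspeed had been a modifier all along.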

Another important advantage of the ACT-R cognitive architecture is that it provides a virtual machine (i.e. the cognitive architecture) which supports an executable algorithmic level description of solutions to complex cognitive problems (i.e. the cognitive model). The ACT-R virtual machine also provides execution time information which makes it possible to determine if the cognitive model is capable of operating in real-time at the algorithmic level. The execution of the language analysis model — which contains 58,000 lexical items — demonstrates that it is capable of operating incrementally in real-time at the algorithmic level (i.e. the performance of the model does not degrade with the length of the input). In addition, the language analysis model currently operates near real-time on the implementation hardware. However, the language analysis model does not yet generate representations of meaning comparable to humans, currently generating only a linguistic representation that can be mapped into a Situation Model for a single domain. The model also does not make extensive use of the parallel, spreading activation mechanism, which is computationally explosive on serial hardware. Instead, Double-R relies on a disjunctive retrieval mechanism (Freiman & Ball, 2008) that avoids the need to compute the activation of 58,000 lexical items on each lexical retrieval.

Two examples of computational linguistic systems which take into consideration cognitive plausibility are the Eager parser of Shen & Joshi (2005) and the supertagger of Kim, Srinivas & Trueswell (2002). The Shen & Joshi parser is designed to be incremental and only considers the left context in making parsing decisions. However, this parser performs less well than a less cognitively plausible bi-directional parser to which it is compared in Shen (2006). The supertagger of Kim, Srinivas & Trueswell is concerned with modeling several psycholinguistic phenomena in a large-scale system based on a Constraint-Based Lexicalist (CBL) theory of human sentence processing. Operating incrementally, left-to-right, the trained connectionist model selects the sequence of supertags (i.e. lexically specific syntax treelets) which is most consistent with the input — where supertag selection is based on a linking hypothesis involving the mapping of the activated output units to the supertag which is most consistent with them. The theoretical mechanism by which the selected supertags are integrated — within the parallel CBL framework — is not explored.

Most large-scale computational linguistic systems perform only low-level linguistic analysis of the input. As Shen (2006) notes, building a language comprehension system from existing computational linguistic techniques would require extensive modification of those techniques to make the system capable of comprehending language as humans do.

Pitfalls of a Functionalist Approach

A primary risk of a functionalist approach to research is that it can become largely detached from empirical reality. This appears to be what has happened in generative grammar following the ill-advised introduction of functional heads (cf. Abney, 1987; for a critique, see Ball, 2007b). Recent linguistic representations within generative grammar do not pass the face validity test — they are too complex and unwieldy, with too many levels and hidden elements, to be cognitively plausible. These complex representations have been motivated on functional grounds stemming from requirements for increasing the grammatical coverage to an ever wider range of linguistic phenomena while at the same time providing a maximally general theory. The primary empirical methodology driving generative grammar is judgements of grammaticality — often by the generative grammarian him or herself. While grammaticality judgements may be a reasonable (gross level) empirical method — if applied judiciously — the cognitive implausibility of the proposed representations suggests the need for alternative empirical methods of validation.

On the basis of grammaticality judgments on ever more esoteric linguistic expressions, more and more linguistic mechanisms and entities have been proposed within generative grammar for which there is no explicit evidence in the linguistic input. The introduction of all these implicit linguistic entities and mechanisms created a challenge for theories of language acquisition and led to a reformation of opinion within generative grammar with the introduction of the Minimalist Program (Chomsky, 1995). The Minimalist Program is (in part) an attempt to simplify generative grammar (in the pursuit of a perfect computational system), reducing the number of implicit linguistic entities. Unfortunately, although the Minimalist Program has been very successful in reducing linguistic entities and mechanisms, as Culicover & Jackendoff (2005) argue, it has done so at the expense of being able to model the broad range of linguistic phenomena covered in earlier generative theories. Essentially, the Minimalist Program has defined away all the linguistic variability that it no longer attempts to model, making that variability external to the core grammar that is of theoretical interest. The Minimalist Program has thereby renounced most functionalist claims in pursuit of a perfect system of core grammar. The result is a system that is functionally and empirically incomplete. In pursuit of explanatory adequacy (how language can be learned), the Minimalist Program has de-emphasized descriptive adequacy, pushing many linguistic phenomena to the uninteresting periphery. In Tenenbaum's (2007) terms, it is a simpler theory which is a poor fit to much of the available linguistic data. 
Culicover & Jackendoff (2005) provide an alternative within generative grammar called the Simpler Syntax which retains a strong functionalist orientation while at the same time challenging the proliferation of linguistic entities and mechanisms within the syntactic component of non-minimalist generative grammar. Essentially, the syntactic component is simplified by introducing a compositional semantic component with which the syntactic component interfaces. The syntactic component is no longer required to support all the grammatical discriminations that need to be made without recourse to semantic information (although semantic information is still isolated in a separate component). Chater & Christiansen (2007) contrast the simplicity of the Minimalist Program and the Simpler Syntax, favoring the latter.

Double-R is founded on a linguistic theory (Ball, 2007b) which goes a step further in arguing that the functional need for a distinct syntactic component and purely syntactic representations can be disposed of in favor of linguistic representations and mechanisms which integrate structural, functional and grammatically relevant semantic information — although it has not yet been demonstrated that the model can cover the full range of linguistic phenomena addressed in the non-computational theory of Culicover and Jackendoff. As the rise of the Minimalist Program and the Simpler Syntax demonstrate, it is important to reevaluate purported functional mechanisms in light of theoretical and empirical advances, applying Ockham's razor judiciously.

Although it is important for a functionalist approach to be theoretically and empirically validated at reasonable points to avoid the proliferation of functional entities, it should be noted that the small-scale empirical method is not impervious to the proliferation of functional elements that threatens the functionalist approach. As Gray (2007) notes, the divide and conquer approach of experimental psychology has led to a proliferation of purported mechanisms within individual cognitive subsystems without due consideration of how these purported mechanisms can be integrated into a functional cognitive system. It is avoidance of this proliferation of mechanisms within individual subsystems that presumably motivates the minimalist bent within the development of ACT-R. Alternative cognitive modeling environments like COGENT (Cooper, 2002) are non-minimalist in that they support the exploration of multiple cognitive mechanisms without necessarily making a commitment to a coherent set of mechanisms for the architecture as a whole. It might be thought that COGENT would be a more compliant architecture for building functional systems. However, to the extent that a functional cognitive model needs to be coherent, COGENT functions more like a programming language and less like a cognitive architecture than ACT-R. The trade-off is an important one. The coherency of ACT-R constrains the range of possibilities for cognitive models more so than COGENT. Such constraint is functional if it pushes model development in the direction of likely solutions to complex cognitive problems without being overly constraining. As I have argued elsewhere (Ball, 2006), I view the constraints provided by ACT-R as largely functional and I consider the current level of success of Double-R to have been facilitated by the ACT-R cognitive architecture.

Besides a functionalist approach being at risk of becoming detached from empirical reality, to the extent that a complex cognitive system is being modeled, there is a risk of the complexity overwhelming development. It may be argued that the past failures of explicitly developed NLP systems have stemmed from the inability to manage this complexity. At the Cognitive Approaches to NLP AAAI symposium in fall 2007, Mitchell Marcus argued that large-scale NLP systems could not be developed without recourse to automated machine learning techniques. Indeed, most computational linguistic research aimed at development of large-scale systems has come to rely on the use of machine learning techniques. A side effect of this research direction is that it is more difficult to enforce cognitive constraints, since the machine learning computations are outside the direct control of the researcher. Further, it is not unusual for NLP systems created using machine learning techniques to contain thousands (or tens of thousands) of distinct linguistic categories, many of which have no mapping to commonly accepted linguistic categories. These systems perform extremely well on the corpora they were trained on. However, the underlying models are extremely complex and it looks suspiciously like they are overfitting the data (i.e. ignoring Tenenbaum's trade-off between simplicity and fit). That the test set for such models often comes from the same corpus as the training set (the annotated Penn Treebank Wall Street Journal Corpus) does not provide an adequate test of the generalizability of such models. As Fong & Berwick (2008) demonstrate, the Bikel reimplementation of the Collins parser is quite sensitive to the input dataset, making prepositional phrase attachment decisions that reflect lexically specific occurrences in the dataset (e.g. "if the noun following the verb is 'milk' attach low, else attach high").

The simplest rejoinder to the position put forward by Marcus is to develop a functionally motivated and explicit NLP system that proves him wrong. Easier said than done! Statistical systems developed using machine learning techniques dominate computational linguistic research because they outperform competing explicitly developed functional systems when measured on large annotated corpora like the Penn Treebank (Marcus, et al., 1993). However, there are reasons for believing that an explicitly developed functional system might eventually be developed which outperforms the best machine learning systems. In the first place, an explicitly developed functional system can take advantage of statistical information. Once an appropriate ontology of linguistic categories has been functionally identified, statistical techniques can be used to compute the probabilities of occurrence of the linguistic categories, rather than using brute force machine learning techniques to identify the categories purely on the basis of low-level distributional information. Instead of having categories like on the and is a identified on the basis of pure statistical cooccurrence in unsupervised systems, supervised systems can use phrase boundaries and functional categories (e.g. subject, object, head, specifier, modifier) to segment and categorize word sequences prior to computing cooccurrence frequencies. Statistical systems based on the annotated Penn Treebank corpus already make use of phrase boundary information, but these systems typically ignore the functional category information (including traces) provided in the annotations (Manning, 2007; Gabbard, Marcus & Kulick, 2006 is an exception). In general, the more high level functional information that can be incorporated into the supervised machine learning system, the better. The value of doing so is a more coherent system. 
Low-level statistical regularities may be useful for low-level linguistic analyses like part-of-speech tagging (and maybe even syntactic parsing), but to the extent that they are not functionally motivated, they are likely to impede the determination of higher-level representations.

A good way to overcome complexity is to base development on a sound theory (back to Marr). The failure of earlier functional NLP systems may be due in large part to the weak or inappropriate linguistic representation and processing theory on which they were based. Staged models of language processing with autonomous lexical, syntactic, semantic and pragmatic components were never practical for large-scale NLP systems. The amount of nondeterminism they engender is fatal. For a system to be pseudo-deterministic, it must bring as much information to bear as possible at each decision point. The system must be capable of making the correct choice for the most part; otherwise, it will be overwhelmed. The system must not be based on a strong assumption of the grammaticality of the input, nor assume a privileged linguistic unit like the sentence will always occur. Yet these are all typical assumptions of earlier systems which are often violated by the linguistic input. Psycholinguistics is currently dominated by a number of constraint based theories of language processing. These theories are largely valid; however, they tend to ignore the overriding serial nature of language processing. There must be some serial selection and integration mechanism operating over the parallel substrate of constraints, lest the system be incapable of making decisions until the entire input has been processed. Carrying multiple choices forward in parallel is only feasible if the number of choices selected at each choice point is kept to a minimum, preferably one, very infrequently more. Otherwise, the number of choices will proliferate beyond reasonable bounds and performance will degrade with the length of the input. Parallel, constraint based psycholinguistic models typically focus on the choice point of interest, often ignoring the possibility of other choice points (cf. 
Kim, Srinivas & Trueswell, 2002) and delaying selection until the end of the input when all constraints have had their effect (typically within a connectionist network). Even the large-scale system of Kim, Srinivas and Trueswell (2002) leaves unexplained how the supertags get incrementally integrated. Parallel computational linguistic systems — which cannot assume away choice points — typically impose a fixed-size beam on the number of choices carried forward, often much larger than is cognitively feasible, to reduce the risk of pruning the correct selection before the end of the input, when all co-occurrence probabilities can be computed.

In an integrated system it is possible to ask what is driving the interpretation of the current input and encode that information into the system. Is it the previous word which forms an idiom with the current word? Is it the part of speech of the previous word which combines with the part of speech of the current word to form a phrasal unit? Is it the preceding phrasal unit which combines functionally with the phrasal unit of the current word to form some higher level functional category? Utilities can be assigned to the different possibilities and the assigned utilities can be tested out on a range of different inputs to see if the system performs the appropriate integration in different contexts. If not, the system can be adjusted, adding functional categories as needed to support the grammatical distinctions that determine appropriate structures. For example, the word the, a determiner, is a strong grammatical predictor of a nominal. To model this, allow the to project a nominal construction, setting up the expectation for the head of the nominal to follow. On the other hand, the word the is a poor grammatical predictor of a sentence. Unlike left-corner parsers which typically have the project a sentence for algorithmic reasons, wait for stronger grammatical evidence for a sentence (or clause). If the word red follows the, in the context of the and the projected nominal, red is a strong predictor of a nominal head modifier. Allow the adjective red to project a nominal head with red functioning as a modifier of the head and predicting the occurrence of the head. If the word is follows, is is a strong predictor of a clause. Allow is to project a clause with a prediction for the subject to precede the auxiliary is and a clausal head to follow. Since the nominal the red has been projected, allow the red to function as the subject, even though a head has not been integrated into the nominal. 
Note that the words the red are sufficient to cause human subjects to look for red objects in a visual scene in Visual World Paradigm experiments (e.g. Tanenhaus et al., 1995) — providing strong evidence for the incremental and integrated nature of language comprehension. Further, if there is only one red object, the red suffices to pick it out and the expression is perfectly intelligible, although lacking a head (and it is certainly a nominal despite the lack of a head noun). If the word nice follows is, in the context of is and the projected clause, allow the adjective nice to function as the clausal head. Let the lexical items and grammatical cues in the input drive the creation of a linguistic representation (cf. Bates & MacWhinney, 1987). When processing a simple noun like ball, in the absence of a determiner, allow ball to project a nominal in which it functions as the head. Both a determiner and a noun (in the absence of a determiner) are good predictors of a nominal, but they perform different functions within the nominal (i.e., specifier vs. head). Both an auxiliary verb and a regular verb (in the absence of an auxiliary verb) are good predictors of a clause. Allow the grammatical cues in the input and a functional ontology to determine which higher level categories get projected. This is the basic approach being followed in the language analysis model development.
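The cue-driven projection strategy for the red is nice can be sketched as follows. This is a simplified illustration with invented data structures (a list used as a stack of dictionaries); the actual model uses ACT-R productions and a full functional ontology:

```python
# Hypothetical sketch of cue-driven projection for "the red is nice"
# (invented data structures; the actual model uses ACT-R productions).

def project(word, stack):
    if word == "the":
        # A determiner projects a nominal and predicts a head to follow.
        stack.append({"type": "nominal", "specifier": "the",
                      "modifiers": [], "head": None})
    elif word == "is":
        # An auxiliary projects a clause; the preceding nominal functions
        # as subject even though its head has not (yet) been integrated.
        subject = stack.pop() if stack else None
        stack.append({"type": "clause", "subject": subject,
                      "aux": "is", "head": None})
    elif word in {"red", "nice"}:  # adjectives (toy lexicon)
        top = stack[-1] if stack else None
        if top and top["type"] == "nominal" and top["head"] is None:
            top["modifiers"].append(word)  # pre-head modifier; head still predicted
        elif top and top["type"] == "clause" and top["head"] is None:
            top["head"] = word             # predicative adjective as clausal head

stack = []
for w in "the red is nice".split():
    project(w, stack)
print(stack[-1]["subject"]["modifiers"], stack[-1]["head"])  # ['red'] nice
```

Notice that the nominal the red is bound as subject without a head ever having been integrated, mirroring the treatment of headless nominals in the text above.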


A cognitively plausible, functional approach to the modeling of language comprehension has much to recommend it. Adhering to cognitive constraints on language processing moves development in directions which are more likely to be successful at modeling human language processing capabilities than competing approaches. Modeling a complex cognitive system has the potential to overcome the functional shortcomings of small-scale cognitive modeling research in addressing Newell's 20 questions critique. However, from the perspective of cognitive modeling, the approach may appear to be insufficiently grounded in empirical validation, and from the perspective of computational linguistics, the approach may appear to be computationally naïve and unlikely to succeed. What is needed is a demonstration that the approach is capable of delivering a functional system that is cognitively plausible. Lacking that demonstration, one can only conjecture about the feasibility of the methodology proposed in this section. It is hoped that Double-R Grammar will soon provide that demonstration.


Chapter 3: Pseudo-Deterministic Human Language Processing


The theoretical commitments underlying Double-R align with current linguistic theory in Cognitive Grammar (Langacker, 1987, 1991), Sign-Based Construction Grammar (Sag, 2010) and Conceptual Semantics (Jackendoff, 2002), and borrow ideas from Preference Semantics (Wilks, 1975) and Tree Adjoining Grammar (Joshi, 1987). A key goal of the research is development of a functional model that adheres to well-established cognitive constraints. Such constraints have evolved to be largely functional in humans (Ball et al., 2010). Double-R also borrows heavily from the comprehensive grammar of Huddleston & Pullum (2002, 2005) and the Simpler Syntax of Culicover & Jackendoff (2005; Culicover, 2009). A key feature of the grammar of Huddleston & Pullum (henceforth H&P) is the introduction of phrase internal grammatical functions like head, determiner (or specifier) and modifier. Lexical items and phrases may have alternative functions in different grammatical contexts. For example, a prepositional phrase may function as a modifier (or adjunct) in one context (e.g. He will eat dinner in a minute), and as a verbal complement in a different context (e.g. He put the book on the table). Although the typical subject (a clause level grammatical function) is a noun phrase, various clausal forms can also function as subject (e.g. That he likes you is true, Going to the movies is fun).

Differences from these grammatical treatments are largely motivated by constraints imposed by the incremental and interactive nature of HLP as reflected in the computational implementation. For example, wh-words occurring at the beginning of a sentence are uniformly assigned a wh-focus function that is distinct from the subject function. In Who is he talking to?, who functions as the wh-focus and he functions as the subject of the wh-question construction that is projected during the processing of who is…. In addition, who is secondarily bound to the object function of the locative construction projected during processing of the preposition to. Likewise, in Who is talking?, who again functions as the wh-focus, but in this case who is secondarily bound to the subject function. In contrast, H&P treat who as the subject in Who is talking? and as a pre-nucleus which is external to the main clause in Who is he talking to?. However, at the point where who is processed, an incremental processor cannot determine which of these H&P functions applies, whereas who is uniformly treated as the wh-focus in the pseudo-deterministic model. Further, the pseudo-deterministic model projects a uniform wh-question construction with both a wh-focus and subject function (allowing the subject to be bound to the wh-focus), whereas the grammar of H&P needs two different representations: one with a clause external pre-nucleus when the wh-word is not the subject, and one that is a simple clause when the wh-word is the subject. An incremental processor would need to project both alternatives in parallel to be able to efficiently process wh-questions beginning with who. Although this is possible, parallel projection of alternative structures must be highly constrained to avoid a proliferation of alternatives within the serial processing mechanism which has limited capacity to maintain alternative structures in parallel.

Parallel, Probabilistic Activation and Selection

Based on the current input (constrained to a 12 character perceptual window), current context and prior history of use, a collection of DM elements is activated via the parallel, spreading activation mechanism of ACT-R. The selection mechanism is based on the retrieval mechanism of ACT-R. Retrieval occurs as a result of selection and execution of a production — only one production can be executed at a time — whose right-hand side provides a retrieval template that specifies which type of DM chunk is eligible to be retrieved. The single, most highly activated DM chunk matching the retrieval template is retrieved. Generally, the largest DM element matching the retrieval template will be retrieved, be it a word, multi-unit word (e.g. a priori, none-the-less), multi-word expression (e.g. pick up, go out), or larger phrasal unit.
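The retrieval step might be sketched as follows. The chunk contents and activation values are invented for illustration, and ACT-R's actual activation computation involves base-level learning and spreading activation rather than fixed numbers:

```python
# Hypothetical sketch of template-based retrieval from declarative memory (DM).
# Chunks and activations are invented. Only chunks matching every slot of the
# retrieval template are eligible, and the single most highly activated
# eligible chunk is retrieved.

dm = [
    {"form": "pick",    "type": "word",       "activation": 1.2},
    {"form": "pick up", "type": "multi-word", "activation": 1.5},
    {"form": "a",       "type": "word",       "activation": 0.9},
]

def retrieve(template):
    matches = [chunk for chunk in dm
               if all(chunk.get(slot) == value
                      for slot, value in template.items())]
    # Retrieval failure (None) if no chunk matches the template.
    return max(matches, key=lambda chunk: chunk["activation"], default=None)

print(retrieve({"type": "multi-word"})["form"])  # pick up
```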

To see how the spreading activation mechanism can bias retrieval, consider the processing of the speed vs. to speed. Since speed can be both a noun and a verb, we need some biasing mechanism to establish a context sensitive preference. In these examples, the word the establishes a bias for a noun to occur, and to establishes a bias for a verb to occur (despite the ambiguity of to itself). These biases are a weak form of prediction. They differ from the stronger predictions that result from projection of constructions from lexical items, although in both cases the prediction may not be realized. In addition to setting a bias for a noun, the projects a nominal construction which establishes a prediction for a head, but does not require that this head be a noun. If the is followed by hiking, hiking will be identified as a present participle verb since there is no noun form for hiking in the mental lexicon. There are two likely ways of integrating hiking into the nominal construction projected by the:
  1. hiking can be integrated as the head as in the hiking of Mt. Lemmon
  2. hiking can project a modifying structure and set up the expectation for a head to be modified as in the hiking shoes
Since it is not possible to know in advance which structure will be needed, Double-R must choose one and be prepared to accommodate the alternative (accommodation may involve parallel projection of the alternative). Based on history of use (derived from the Corpus of Contemporary American English), hiking has a strong preference to function as a nominal head, so Double-R initially treats hiking as the head and accommodates shoes in the same way as noun-noun combinations (discussed below). This is in contrast to adjectives which have a strong preference to function as modifiers in nominals. Adjectives project a structure containing a pre-head modifying function and head, with the adjective integrated as the modifier and a prediction for a subsequent head to occur.
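The biasing mechanism can be sketched as a weighted choice over lexical entries. The lexicon entries and frequency numbers below are invented for illustration (in the model itself such preferences derive from history of use):

```python
# Toy lexicon: parts of speech with invented base frequencies standing
# in for history of use.
LEXICON = {
    "speed":  {"noun": 0.6, "verb": 0.4},
    "hiking": {"verb": 1.0},   # no noun entry, as noted in the text
}

# Function words establish a soft bias, not a hard constraint.
BIAS = {"the": "noun", "to": "verb"}

def disambiguate(prev_word, word, boost=0.5):
    """Pick the part of speech whose base frequency plus contextual
    bias is highest; the bias is a weak prediction that may go unmet."""
    biased = BIAS.get(prev_word)
    entries = LEXICON[word]
    return max(entries,
               key=lambda pos: entries[pos] + (boost if pos == biased else 0.0))
```

Note that the hiking comes out as a verb despite the noun bias of the, since the bias is a preference rather than a requirement.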

Although the parallel, probabilistic mechanism considers multiple alternatives in parallel, the output of this parallel mechanism is a single linguistic unit. For motivation at the lexical level, consider the written input car. Although this input may activate lots of words in memory, ultimately, the single word car is brought into the focus of attention (retrieved from memory and put in the retrieval buffer in ACT-R terms). If instead, the input is carpet or carpeting, a single, but different, word enters the focus of attention. If car were initially retrieved during the processing of car… (perhaps more likely in the case of spoken input), then it is simply overridden in the focus of attention if the input turns out to be carpet. Likewise for carpet… if it turns out to be carpeting. The processing of carpeting does not lead to car, carp, pet, and carpet all being available in the focus of attention along with carpeting (although these words may all be activated in DM). The single word that is most consistent with the input enters the focus of attention.
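The single-output behavior at the lexical level can be approximated by a longest-match rule over the perceived input. The lexicon here is a plain set of word forms, a deliberate simplification of DM:

```python
LEXICON = {"car", "carp", "pet", "carpet", "carpeting"}

def focus_of_attention(perceived, lexicon):
    """Return the single word most consistent with the perceived input:
    the longest lexicon entry matching from the start of the input.
    An earlier retrieval (e.g. "car" while reading "carpet") is simply
    overridden, never held alongside the later one."""
    matches = [w for w in lexicon if perceived.startswith(w)]
    return max(matches, key=len) if matches else None
```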

Serial, Pseudo-Deterministic Structure Building and Context Accommodation

To capture the incremental and immediate nature of HLP, we propose a serial, pseudo-deterministic processor that builds and integrates linguistic representations. As part of normal processing, this processor relies on a non-monotonic mechanism of context accommodation, with limited parallelism, to handle cases where an incompatibility complicates integration of the current input.

The primary monotonic mechanisms for building structure within the serial mechanism include:
  1. integration of the current input into an existing construction which predicts its occurrence (substitution)
  2. projection of a new construction and integration of the input into this construction (Ball, 2007b).
For example, given the input the pilot, the processing of the will lead to projection of a nominal construction and integration of the as the specifier of the nominal. In addition, the prediction for a head to occur will be established. When pilot is subsequently processed, it is biased to be a noun and integrated as the head of the nominal construction projected by the.
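The projection and substitution mechanisms can be sketched as follows. The `Nominal` structure and the `process` loop are illustrative stand-ins for the construction chunks and productions of the implemented model:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Nominal:
    spec: Optional[str] = None
    head: Optional[str] = None   # predicted by the specifier, filled later

def process(words):
    """Minimal sketch of projection and substitution: a determiner
    projects a nominal construction and fills the specifier function;
    the next word is substituted into the predicted head slot."""
    nominal = None
    for w in words:
        if w in ("the", "a"):            # determiner projects the construction
            nominal = Nominal(spec=w)
        elif nominal is not None and nominal.head is None:
            nominal.head = w             # substitution into the open slot
    return nominal
```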

Besides predicting the occurrence of an upcoming linguistic element, projected constructions may predict the preceding occurrence of an element. If this element is available in the current context, it can be integrated into the construction. For example, given the pilot flew the airplane, the processing of flew will lead to projection of a declarative clause construction which predicts the preceding occurrence of a subject. If a nominal is available in the context (as in this example), it can be integrated as the subject of the declarative clause construction.

In addition to these monotonic mechanisms, a projected construction may non-monotonically override an existing construction. For example, in the processing of the pilot light, the incremental integration of pilot as the head of the nominal construction will subsequently be overridden by a construction in which pilot functions as a modifier and light functions as the head.

The structure building mechanism involves the serial execution of a sequence of productions that determine how to integrate the current linguistic unit into an existing representation and/or which kind of higher level linguistic structure to project. These productions execute one at a time within ACT-R, which incorporates a serial bottleneck for production execution.

The structure building mechanism uses all available information in deciding how to integrate the current linguistic input into the evolving representation. The mechanism is deterministic in that it builds a single representation which is assumed to be correct, but it relies on the parallel, probabilistic mechanism to provide the inputs to this structure building mechanism. In addition, structure building is subject to a mechanism of context accommodation capable of making modest adjustments to the evolving representation. Although context accommodation is part of normal processing and does not involve backtracking or reanalysis, it is not, strictly speaking, deterministic, since it can modify an existing representation and is therefore non-monotonic.

Context accommodation makes use of the full context to make modest adjustments to the evolving representation or to construe the current input in a way that allows for its integration into the representation. It allows the processor to adjust the evolving representation without lookahead, backtracking or reanalysis, and limits the need to carry forward multiple representations in parallel or rely on delay or underspecification in many cases.

As an example of accommodation via construal, consider the hiking of Mt. Lemmon. In this example, hiking is construed objectively and integrated as the head of a nominal even though it is a present participle verb.

As an example of accommodation via function shifting, consider the airspeed restriction. When airspeed is processed, it is integrated as the head of the nominal projected by the. When restriction is subsequently processed, there is no prediction for its occurrence. To accommodate restriction, airspeed must be shifted into a modifying function to allow restriction to function as the head. This function shifting mechanism can apply iteratively as in the processing of the pressure valve adjustment screw, where screw is the ultimate head of the nominal, but pressure, valve and adjustment are all incrementally integrated as the head prior to the processing of screw. Note that at the end of processing it appears that pressure, valve and adjustment were treated as modifiers all along, giving the appearance that these alternatives were carried along in parallel with their treatment as heads.
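The iterative function shifting mechanism can be sketched as a loop in which each newly processed noun takes over the head function and bumps the previous head into a modifier function. The data structures are purely illustrative:

```python
def build_nominal(words):
    """Iterative function shifting: each newly processed noun is
    integrated as the head, and the previous head (if any) is shifted
    into the modifier list."""
    spec, rest = words[0], words[1:]
    modifiers, head = [], None
    for noun in rest:
        if head is not None:
            modifiers.append(head)   # accommodation: old head becomes a modifier
        head = noun
    return {"spec": spec, "mod": modifiers, "head": head}
```

At the end of processing, the earlier nouns appear to have been modifiers all along, exactly the effect described above.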

At a lower level, there are accommodation mechanisms for handling conflicts in the grammatical features associated with various lexical items. For example, the grammatical number feature singular is associated with a and the number feature plural is associated with few and pilots. In a few pilots, the singular feature of a is overridden by the plural feature of few and pilots and the nominal is plural overall (Ball, 2010a).
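This feature-level accommodation can be sketched as an override rule during incremental processing. The feature table is invented for illustration; in Double-R such features come from the mental lexicon:

```python
# Invented number-feature table for illustration.
NUMBER = {"a": "singular", "few": "plural", "pilots": "plural", "pilot": "singular"}

def nominal_number(words, features=NUMBER):
    """Resolve the nominal's number feature incrementally, letting a
    plural element override an earlier singular (as in "a few pilots")."""
    number = None
    for w in words:
        feat = features.get(w)
        if feat is None:
            continue
        if number is None or feat == "plural":
            number = feat   # accommodation: plural overrides singular
    return number
```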

The preceding text argued for a parallel mechanism for selecting between competing structures combined with a serial mechanism for building structure given the parallel selection. The architectural mechanism which supports selection is ACT-R’s DM retrieval mechanism which returns a single structure. However, is it always the case that the input to the serial, structure building mechanism is a single structure? Just & Carpenter (1992) provide evidence that good readers (among CMU subjects) can maintain two alternative (syntactic) representations of ambiguous inputs in parallel during the processing of sentences which may contain a dispreferred reduced relative clause (as opposed to the preferred main verb reading), whereas less good readers are limited to a single representation. So long as the preferred representation at the verb (i.e., the main verb reading) is ultimately correct, less good readers do well relative to good readers. But if the preferred representation at the verb is incorrect for a given input, less good readers do significantly worse than good readers at the point of disambiguation (i.e. less good readers are garden-pathed). However, according to the authors, good readers are slower on ambiguous inputs vs. unambiguous inputs — e.g. the soldiers warned... vs. the soldiers spoke… — relative to less good readers.

Reduced relative clauses are special constructions which have generated a large amount of psycholinguistic research. Bever's (1970) famous example of a garden-path sentence, The horse raced past the barn fell, stumps even good readers. Garden-path effects are explained as a disruption of normal processing requiring introduction of reanalysis mechanisms. Such disruption should not occur if competing alternatives are available in parallel. Other types of garden-path inputs exist. A classic example is the old train the young (Just & Carpenter, 1987). The garden-path effect after train suggests that readers make a strong commitment to use of train as a noun and do not have parallel access to the strongly dispreferred verb use during normal processing of this simple sentence. It is especially revealing that the garden-path effect occurs immediately after the processing of train, implying severe limits on parallel structures.

However, there are examples of the need for parallelism in structure building which have small but cumulative effects on normal processing (Freiman & Ball, 2010). Such examples provide evidence for a mechanism like context accommodation combined with a limited capacity to maintain multiple structures in parallel for efficiency. We have already briefly discussed the example the airspeed restriction where it was suggested that the processing of restriction causes airspeed to be shifted into a modifying function to allow restriction to be the head. There are at least three mechanisms for achieving this within the constraints of ACT-R. The first approach involves parallel projection of the structure needed to support the accommodation at the time airspeed is processed. The second approach involves projection of the needed structure at the processing of restriction. The third approach involves making the extra structure globally accessible. In the first approach, the processing of airspeed leads to its integration as the head of the nominal projected by the. In parallel, a structure which supports both a pre-head modifier and head is projected and made separately available (called an obj-head in Double-R). When restriction is processed, the initial integration of airspeed as the head of the nominal is overridden by this alternative structure. Within this structure, airspeed is shifted into the modifying function and restriction is integrated as the head. In ACT-R, this is accomplished in a single computational step via execution of a production which makes the needed adjustments. In the second approach, when restriction is processed in the context of the airspeed, a structure with a pre-head modifier function, in addition to a head, is projected. Restriction is integrated as the head of this structure and airspeed is shifted into the modifying function. This new structure then overrides airspeed as the head of the nominal. 
Within ACT-R, the second approach requires an additional computational step relative to the first approach. It is not possible to project the needed structure — which requires creation or retrieval of a DM chunk — and integrate that structure into another structure in a single procedural step. To avoid this extra computational step and bring Double-R into closer alignment with adult human reading rates (Freiman & Ball, 2010), Double-R originally adopted the first approach, and has more recently switched to the third approach. The rapidity with which humans process language (200-300 wpm for fluent adult readers) suggests that humans can learn to buffer needed information for efficiency. That information can be buffered either globally or locally (as needed). Global buffering of information has the advantage of simplifying local processing. When a noun is processed, it can be integrated into an object referring expression without the need to project an alternative object head (as in the first approach), since an object head with empty grammatical functions is globally available. When the globally accessible object head is used and the grammatical functions instantiated, the globally accessible object head need only be replaced with another uninstantiated object head. Currently, Double-R makes use of globally accessible structures to support the efficient processing of optional elements (e.g. modifiers), and relies on local projection for the processing of non-optional elements (e.g. arguments).

If the globally accessible object head structure supports both a pre- and post-head modifier, then post-head modifiers can also be accommodated. For example, in the book on the table, if integration of book as the head of the nominal projected by the occurs in parallel with a globally accessible object head structure with a prediction for a post-head modifier, then this structure can override the treatment of book as the head when a post-head modifier like on the table occurs. The primary alternative is to have the post-head modifier project the structure needed to accommodate both the head and the post-head modifier, and then override the previous head. Within ACT-R, this latter approach requires an extra computational step and is less efficient.

As another example of the need for context accommodation in incremental HLP, consider the processing of ditransitive verb constructions. Given the input he gave the…, the incremental processor doesn't know if the is the first element of the indirect or direct object. In he gave the dog the bone, the introduces the indirect object, but in he gave the bone to the dog, it introduces the direct object. How does the HLP proceed? Delay is not a generally viable processing strategy since the amount of delay is both indeterminate and indecisive as shown by:
  1. he gave the very old bone to the dog
  2. he gave the very old dog the bone
  3. he gave the very old dog collar to the boy
  4. he gave the old dog on the front doorstep to me
In 1, the inanimacy of bone, the head of the nominal, suggests the direct object as does the occurrence of to the dog which is the prepositional form of the indirect object, called the recipient in Double-R. In 2, the animacy of dog in the first nominal, and the inanimacy of bone in the second nominal suggest the indirect object followed by the direct object. Delaying until the head occurs would allow the animacy of the head to positively influence the integration of the nominal into the ditransitive construction in these examples. However, in 3, the animacy of dog also suggests the indirect object, but dog turns out not to be the head. In 4, the animacy of dog which is the head, suggests the indirect object, but this turns out not to be the case given the subsequent occurrence of the recipient to me. There are just too many alternatives for delay to work alone as an effective processing strategy. Although there are only two likely outcomes — indirect object followed by direct object or direct object followed by recipient — which outcome is preferred varies with the current context and no alternative can be completely eliminated. And there is also a dispreferred third alternative in which the direct object occurs before the indirect object as in he gave the bone the dog. In Double-R, ditransitives are handled by projecting an argument structure from the ditransitive verb which predicts a recipient in addition to an indirect and direct object (this might be viewed as a form of underspecification). Although it is not possible for all three of these elements to occur together, it is also not possible to know in advance which two of the three will be needed. So long as Double-R can recover from an initial mistaken analysis without too high a cost, early integration is to be preferred. Currently, Double-R projects a nominal from the following the ditransitive verb and immediately integrates the nominal as the indirect object of the verb. 
Once the head of the nominal is processed, if the head is inanimate, the nominal is shifted to the direct object. If the first nominal is followed by a second nominal, the second nominal is integrated as the direct object, shifting the current direct object into the indirect object, if necessary. This argument shifting is in the spirit of slot bumping as advocated by Yorick Wilks (p.c.). If the first nominal is followed by a recipient to phrase, the first nominal is made the direct object, if need be. If the first nominal is inanimate and made the direct object and it is followed by a second nominal that is animate, the second nominal is integrated as the indirect object. It is important to note that the prediction of all three elements by the ditransitive verb supports accommodation at no additional expense relative to a model that predicted only one or the other of the two primary alternatives. However, unlike a model where one alternative is selected and may turn out to be incorrect, necessitating retraction of the alternative, there is no need to retract any structure when all three elements are simultaneously predicted, although it is necessary to allow for a prediction to be left unsatisfied and for the function of the nominals to be accommodated given the actual input.
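The argument shifting described above can be sketched as a slot-bumping procedure over the three predicted slots. The function signature, the animacy heuristic, and all names are illustrative assumptions about the mechanism, not the model's actual productions:

```python
def ditransitive_slots(heads, animate, to_head=None):
    """Slot-bumping sketch for ditransitives: all three slots are
    predicted up front; nominals are integrated eagerly and shifted as
    the input unfolds. heads = post-verbal nominal heads in order;
    animate = set of animate nouns; to_head = head of a recipient
    to-phrase, if any."""
    slots = {"iobj": None, "obj": None, "recipient": None}
    first = heads[0]
    if first in animate:
        slots["iobj"] = first        # eager integration as indirect object
    else:
        slots["obj"] = first         # inanimate head shifts it to direct object
    if len(heads) > 1:
        second = heads[1]
        if slots["obj"] is None:
            slots["obj"] = second    # iobj followed by obj
        else:
            slots["iobj"] = second   # dispreferred obj-before-iobj pattern
    if to_head is not None:
        if slots["obj"] is None:     # recipient phrase bumps first nominal
            slots["obj"], slots["iobj"] = slots["iobj"], None
        slots["recipient"] = to_head
    return slots
```

Because all three slots are predicted, no structure is ever retracted; an unfilled slot is simply left unsatisfied, as in he gave the old dog on the front doorstep to me, where the animate dog is first made the indirect object and then bumped to direct object by the recipient phrase.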

The processing of ditransitive verbs is complicated further within a relative clause construction which contains an implicit complement (either the object or indirect object) that is bound to the nominal head. Consider
  1. the book_i that I gave the man [obj_i]
  2. the man_i that I gave [iobj_i] the book
  3. the man_i that I gave the book to [obj_i]
In 1, the book is bound to the implicit object of gave within the relative clause based on the inanimacy of book. In 2, the man is bound to the implicit indirect object of gave based on the animacy of man. Note that animacy is the determining factor here. There is no structural distinction to support these different bindings. These bindings are established at the processing of gave without delay when the ditransitive structure is first projected. In 3, the man is initially bound to the indirect object, but this initial binding must be adjusted to reflect the subsequent occurrence of to which indicates a recipient phrase even though no explicit object follows the preposition.

Things get even more interesting if we combine a ditransitive verb construction with a wh-question and passive construction. Consider
  1. what_i could he_j have been given [iobj_j] [obj_i]
In this case, neither the object nor indirect object of given occurs in canonical position within the ditransitive verb construction. In this example, the wh-focus what is bound to the implicit object, and the subject he is bound to the implicit indirect object. Again, the inanimacy of what and the animacy of he are the determining factors.

As a final example, consider the processing of the ambiguous word to. Since to can be both a preposition (e.g. to the house) and a special infinitive marker (e.g. to speed) it might seem reasonable to delay the processing of to until after the processing of the subsequent word. However, to provides the basis for biasing the subsequent word to be an infinitive verb form (e.g. to speed vs. the speed) and if its processing is delayed completely there will be no bias. How should the HLP proceed? If the context preceding to is sufficiently constraining, to can be disambiguated immediately as when it occurs after a ditransitive verb (e.g. He gave the bone to…). Lacking sufficient context, to can set a bias for an infinitive verb form to follow even though the processing of to is itself delayed until after the next word is processed. This is the default behavior of Double-R. However, Double-R also supports the recognition of multi-word units using a perceptual span for word recognition that can overlap multiple words (Freiman & Ball, 2010). With this perceptual span capability, an expression like to speed can be recognized as a multi-word infinitival unit and the processing of to need not be delayed in this context. Similarly, to the can be recognized as a prepositional phrase lacking a nominal head. Although not typically considered a grammatical unit in English, to the is grammaticalized as a single word form in some Romance languages and its frequent occurrence in English suggests unitization. The perceptual span is roughly equivalent to having a limited lookahead capability. Overall, the processing of to draws on a range of different mechanisms. Some of these mechanisms are specific to to, and others are more general.
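The perceptual-span behavior can be sketched as a longest-match rule over a multi-word lexicon. The unit inventory and labels are hypothetical; the span of two to three words loosely reflects the limited window described in the text:

```python
# Hypothetical multi-word lexicon; the unitization of "to the" is
# suggested in the text, not an established grammatical unit.
UNITS = {"to speed": "infinitival-unit", "to the": "prep-phrase-no-head"}

def next_unit(words, i, units=UNITS):
    """Perceptual-span sketch: prefer the longest known multi-word unit
    starting at position i; otherwise fall back to the single word.
    Returns the recognized unit and the next input position."""
    for span in (3, 2):
        if i + span <= len(words):
            candidate = " ".join(words[i:i + span])
            if candidate in units:
                return candidate, i + span
    return words[i], i + 1
```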

Summary & Conclusions

This section proposes, empirically motivates and describes the implementation of a pseudo-deterministic model of HLP. The use of the term pseudo-deterministic reflects the integration of a parallel, probabilistic activation and selection mechanism, and non-monotonic context accommodation mechanism (with limited parallelism), with what is otherwise a serial, deterministic processor. The serial mechanism proceeds as though it were deterministic, but accommodates the changing context, as needed, without backtracking and with limited parallelism, delay and underspecification. The overall effect is an HLP which presents the appearance and efficiency of deterministic processing, despite the rampant ambiguity which makes truly deterministic processing impossible.

Extended Example Representation and Processing

This section presents an extended example that highlights many of the grammatical characteristics of Double-R. The rest of the document provides a phased introduction to these characteristics and more. The representations shown below and in the rest of this document were generated automatically by Double-R and are not hand-crafted.

What could he have been given to be eaten?

This diagram shows the integration of several different constructions and demonstrates many of the features of the grammar:
Double-R representations follow the grammar of Huddleston & Pullum (2002) in encoding both the grammatical type (or category) and the grammatical function of words and expressions. For example, the wh-word what has the grammatical type of a wh-nominal-determiner (wh-nominal-det) and functions as the grammatical head of a wh object referring expression (wh-obj-refer-expr). The grammatical type is a part of speech in the case of words. The part of speech of what is a composite part of speech that combines the wh-nominal part of speech with the determiner part of speech. Wh-nominal is a subtype of nominal (i.e. a word that functions like a nominal expression). This composite part of speech captures the full range of behavior of what. Wh object referring expression is a phrase level grammatical type consisting of the single word what which functions as the wh-focus of a wh-question construction. Wh-question construction is a clause (or sentence) level grammatical type.

Grammatical types are organized into a multiple inheritance hierarchy such that the grammar can represent words and expressions at different levels of abstraction. (Support for multiple inheritance in ACT-R was provided in December 2013 by Mark Burstein of SIFT.) For example, wh-nominal-det is a subtype of the more general grammatical types wh-nominal and determiner, and wh object referring expression is a subtype of object referring expression and wh referring expression. For some grammatical purposes (e.g. determining the wh-focus) the grammar needs to know that a word is a wh-nominal; for other grammatical purposes (e.g. combining with a noun head) the grammar needs to know that the word is a determiner. The hierarchy in Double-R aligns with the hierarchy in Head-Driven Phrase Structure Grammar (HPSG) (Sag, Wasow & Bender, 2003) and Sign-Based Construction Grammar (SBCG) (Sag, 2010).
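The multiple inheritance hierarchy can be mirrored with ordinary class inheritance. This is an illustrative sketch using the document's labels as class names, not the actual ACT-R chunk-type implementation:

```python
# Illustrative type lattice mirroring the hierarchy described above.
class GrammaticalType: pass
class WhNominal(GrammaticalType): pass
class Determiner(GrammaticalType): pass
class WhNominalDet(WhNominal, Determiner): pass   # composite part of speech

class ObjReferExpr(GrammaticalType): pass
class WhReferExpr(GrammaticalType): pass
class WhObjReferExpr(ObjReferExpr, WhReferExpr): pass

what = WhNominalDet()
```

A grammatical process that needs a wh-nominal (e.g. for the wh-focus) and one that needs a determiner (e.g. for combining with a noun head) can both recognize what through its supertypes.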

We prefer to use the terms grammatical type and grammatical category over grammatical form since the form of an expression of a given grammatical type can vary. This variability in form, combined with downplaying the importance of grammatical function, can lead to the treatment of words and expressions of radically different forms as though they were the same because the grammar appears to require it. As a simple example, in the running bull and the running of the bull, an approach based on syntactic form (derived on the basis of purely distributional information) might suggest that running is an adjective in the running bull and a noun in the running of the bull. This follows given the assumption that noun phrases (syntactic form) are necessarily headed by nouns (lexical form) and noun heads (syntactic form) are necessarily modified by adjectives (which precede the head). But this syntactic form based categorization is inconsistent with the morphological form of the word running which exhibits the progressive verb form ending -ing. In these examples, syntactic form and morphological form are incompatible. Both cannot be correct. When grammatical functions are considered, it is easy to see that running functions as the head in the running of the bull, and as a modifier of the head bull in the running bull. There is no need to suggest that running is other than a progressive verb in these examples. But this means giving up the deeply entrenched notion that noun phrases are syntactic forms that are headed by nouns, which are lexical forms. If the head of a noun phrase need not be a noun, then there is no universal basis for the syntactic form based category noun phrase. Looked at from a semantic point of view, noun phrases (better called nominals) are referring expressions (Lyons, 1977), whether or not they are headed by nouns. They are expressions that are intended to refer to objects. 
In the case of an expression like the running of the bull, the event of running by the bull is objectified — the overall expression refers to an object (i.e. a reified event) even though the head running is a verb. The meaning of running by itself, is not materially changed within this expression. There is no need to posit an entry in the mental lexicon for a noun version of running. Doing so would just add ambiguity that complicates processing. Nor is it desirable to have the word running in the mental lexicon without a part of speech specification (Borer, 2009, 2011). Having part of speech information is crucial to language analysis. For example, if the word the were not categorized as a determiner, then this information would not be available to help identify an object referring expression when the is processed. What the mental lexicon needs is information about the part of speech associated with the most common uses of a word. When a word is processed, the part of speech for the most common use will be retrieved, unless the linguistic context is sufficient to bias retrieval of the part of speech associated with a less common use. For a word like pilot which is genuinely ambiguous (i.e. it has multiple meanings), there will be two entries in the mental lexicon, one corresponding to the senses which are categorized as a noun, and another corresponding to the sense which is categorized as a verb. In the context of the as in the pilot, language analysis will be biased to retrieving the noun part of speech; in the context of to as in to pilot, language analysis will be biased to retrieving the verb part of speech. Having an adjective entry for pilot to handle inputs like the pilot light, where pilot modifies light, would only complicate processing. At the incremental processing of pilot, the locally best analysis is that it is a noun functioning as head. 
Only when light is subsequently processed can it be determined that pilot is functioning as a modifier, not the head. At this point, the context accommodation mechanism kicks in and shifts pilot, the noun, from the head to a modifier function (and adjusts the meaning) so that light can function as the head. Having an adjective entry for pilot would complicate rather than facilitate processing, assuming an incremental processor as in Double-R, by creating an additional ambiguity at the processing of pilot that is not associated with any difference in meaning. Unfortunately, syntactic form based approaches lead to a proliferation of entries, even when there is no meaning difference. In Double-R, we avoid introducing ambiguity based on purely structural considerations. From an incremental processing perspective, it is crucial to limit the amount of ambiguity in the mental lexicon. Many of the representational assumptions of Double-R reflect such processing considerations. In this respect, Double-R differs from most other grammatical formalisms, even grammatical formalisms like HPSG for which computational implementations are available.

The term construction is used as a synonym for grammatical type in Double-R. Words are represented as instantiated instances of part of speech constructions (i.e. lexical constructions). At the phrase, clause and sentence level, a construction is the specification of an ordered sequence of grammatical functions followed by an unordered set of grammatical features that are abstracted from specific linguistic inputs during language acquisition. When the grammatical functions and features in the grammatical construction are filled in by words and expressions (in the case of grammatical functions) and feature values (in the case of grammatical features) during language analysis, the result is a linguistic or grammatical representation — i.e. an instantiated instance of the grammatical construction. For example, an object referring expression is a grammatical construction that specifies the grammatical functions specifier (spec), modifier (mod), head and post-head modifier (post-mod), and the normal order in which they occur, along with the unordered grammatical features definiteness (def), distance, person, number, gender, animacy and case. The grammatical functions may be filled by a range of words and expressions of different grammatical categories. The grammatical features have a fixed set of values (e.g. the number feature has the values singular and plural) (Ball, revised 2013, 2012). The modifier (mod) and post-head modifier (post-mod) functions are always optional. At a minimum, either a specifier (spec) or a head must occur. If a specifier occurs (e.g. the in the books functions as a specifier), the construction is projected from the specifier (i.e. the processing of the leads to projection of an object referring expression in which the functions as the specifier). If no specifier occurs (e.g. books functions as the head in books), the construction is projected by the head and the specifier function is empty. 
Having an empty specifier is the Double-R equivalent of the MP (Minimalist Program) treatment of books as a DP (determiner phrase) with an empty functional head (cf. Carnie, 2011), except that Double-R rejects the functional head hypothesis (Abney, 1987), claiming instead that the functional head hypothesis is wrong-headed! This claim is based on a commitment to providing a semantic basis for the head function (Ball, 2007a). The head is the semantically most significant element of an expression. Functional heads do not satisfy this semantic requirement. For this reason, they are often optional, leading to the need to posit lots of headless constructions. For example, the nominals John, books, and rice all lack a determiner and must have an empty functional head if functional heads are assumed (cf. Carnie, 2011). On the other hand, *the and *a are ungrammatical despite having a "functional head" (scare quotes intended) since they lack sufficient semantic content to function as referring expressions. Even more telling, pronouns like he and she normally do not allow a determiner (e.g. *the he, *the she), making the claim that there is an empty determiner entirely suspect and forcing some linguists to conclude that pronouns are really pro-determiners (cf. Cooke & Newsome, 1997)! Comparing Double-R's instantiation of the specifier and head functions, the specifier is far more likely to be optional. This is because the specifier, which primarily functions to establish reference via the definiteness grammatical feature in the case of nominals, is not the only source of definiteness, which is often jointly encoded by the lexical head. Proper nouns, plural nouns, mass nouns and pronouns (not pro-determiners) functioning as heads are all capable of projecting the definiteness grammatical feature which is otherwise the primary contribution of the specifier. Interestingly, singular count nouns appear not to project a definiteness feature (e.g. *book is good) and require separate specification in normal English (e.g. the book is good) (Ball, revised 2013).

Returning to the example, the word what is categorized as a wh-nominal-det that functions as the head of a wh object referring expression. The wh object referring expression functions as the wh-focus of the wh-question construction. The wh-focus is a grammatical function that is specific to wh-questions. From an incremental processing perspective, it is important to retain access to the wh-focus beyond its initial processing to support the resolution of long-distance dependencies. In the example, the wh-focus what also functions as the implicit object of given (and the implicit subject and implicit object of to be eaten). In Double-R, access to the wh-focus is supported by retaining it in a special wh-focus buffer. At the incremental processing of given, the wh-focus is available in the wh-focus buffer to be bound by the implicit object of given (the object is predicted by the ditransitive verb construction associated with given). The use of buffers to provide long-distance access to grammatical functions like the wh-focus is the Double-R equivalent of movement (or copying) in generative grammar. When the implicit object of given is bound to the wh-focus, the wh-focus is copied into an object buffer to support subsequent processing. At this point in processing, the same linguistic element (the wh object referring expression headed by what) is accessible in two buffers which reflect its dual grammatical functions. It is the accessibility of what in the object buffer which supports the binding of the implied subject and object of to be eaten to it. The subject and object of to be eaten are predicted by the transitive verb construction associated with eaten. The binding of the implied object of to be eaten to the implied subject is supported by the passive construction which is cued by the -en form of eaten, which projects a passive voice feature. Ultimately, the wh-word what winds up filling four distinct grammatical functions:
  1. wh-focus of wh-question headed by given
  2. implicit object of given
  3. implicit subject of to be eaten
  4. implicit object of to be eaten
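The buffer mechanism behind these four functions can be sketched as follows. The buffer names follow the text; the sketch simplifies the actual ACT-R buffer machinery and is illustrative only.

```python
# Sketch of the wh-focus buffer mechanism described above: one linguistic
# element becomes accessible under several grammatical functions by being
# copied between buffers. Buffer and slot names are illustrative.
buffers = {}

# processing 'what' projects a wh object referring expression
wh_focus = {"head": "what", "type": "wh-obj-refer-expr", "bind-indx": 5}
buffers["wh-focus"] = wh_focus

# ditransitive 'given' predicts an object; its implicit (trace) object
# is bound to the element retained in the wh-focus buffer, which is
# then copied into the object buffer
buffers["object"] = buffers["wh-focus"]

# the accessibility of 'what' in the object buffer then supports binding
# the implicit subject and object of passive 'to be eaten' to it
buffers["embedded-subject"] = buffers["object"]
buffers["embedded-object"] = buffers["object"]

# the same element now fills four distinct grammatical functions
assert all(b is wh_focus for b in (
    buffers["wh-focus"], buffers["object"],
    buffers["embedded-subject"], buffers["embedded-object"]))
```

Note that copying here is copying of a reference to the same linguistic element, which is what makes "movement without movement" possible: nothing is displaced, but the element remains accessible where later constructions expect it.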

To support the processing of embedded clauses, Double-R maintains separate matrix and embedded clause buffers for the subject, object and indirect object (e.g. subject buffer vs. embedded subject buffer; object buffer vs. embedded object buffer).

There are three kinds of implicit argument in Double-R:
  1. PRO
  2. trace
  3. empty (e)
The terms PRO and trace are adapted from generative grammar (Chomsky, 1987) and empty (e) comes from the grammar of Huddleston and Pullum. PRO corresponds to the implicit subject of a tenseless clause. The implicit subject of an infinitive situation referring expression like to be eaten is a PRO. If there is another referring expression in the input that is co-referential with the implicit subject, then the implicit PRO subject is bound to that referring expression via the bind index (bind-indx). Note that the bind index of the wh-focus what and the PRO subject of to be eaten are the same (5), indicating co-reference. This binding is actually indirect in this example. The trace object of given is bound to the wh-focus what. The PRO subject of to be eaten is subsequently bound to the trace object of given. Trace corresponds to a displaced argument — i.e. a predicted argument of a relation that does not occur in canonical position. If there is another referring expression in the input that is co-referential with the implicit argument, then the trace is bound to that referring expression. The distinction between PRO and trace comes from the generative grammar view that a trace involves movement. The constituent that is moved leaves behind a trace. While there is no actual movement in Double-R, there are focus constructions which behave very much as though movement is involved (based on accessibility in buffers). A Wh-Question is a focus construction with a special wh-focus function that does not occur in a normal declarative construction. Whenever there is an expression filling a focus function in a construction, there will be a corresponding grammatical function without an explicit expression. This correspondence is typically one of co-reference. Empty (e) subject arguments occur in tensed imperative clauses. Unlike PRO and trace which typically refer via binding to an existing referring expression (i.e. 
via co-reference), empty (e) arguments refer deictically like the pronouns you and we.
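The three kinds of implicit argument and the role of the bind index can be sketched concretely. This is an illustrative encoding, not the model's actual chunk format; the slot names follow the text.

```python
# Sketch of co-reference via the bind index (bind-indx) for the three
# kinds of implicit argument described above. Illustrative only.

# the wh-focus 'what' carries bind index 5
wh_focus = {"form": "what", "kind": "expr", "bind-indx": 5}

# trace: displaced object of 'given', bound (indirectly here) to the wh-focus
trace_obj = {"kind": "trace", "bind-indx": wh_focus["bind-indx"]}

# PRO: implicit subject of tenseless 'to be eaten', bound to the trace object
pro_subj = {"kind": "PRO", "bind-indx": trace_obj["bind-indx"]}

# empty (e): implicit subject of an imperative; refers deictically via its
# own features (equivalent to 'you'), not via a bind index
empty_subj = {"kind": "e", "person": "second", "animacy": "human"}

# matching bind-indx values indicate co-reference
assert wh_focus["bind-indx"] == trace_obj["bind-indx"] == pro_subj["bind-indx"]
```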

The word could has the grammatical type of an auxiliary verb (auxiliary) (subcategorized as a modal auxiliary, but not shown in the diagram) and functions as an operator within the wh-question construction. The operator function is specific to a limited number of constructions including wh-questions and yes-no-questions. The operator function comes from the grammar of Quirk et al. (1972, 1985) and is adopted in Double-R. Could also has the grammatical features tense-1 = finite (fin), tense = present (pres), and modality = could. These features project to the clause.

The word he has the grammatical type of a personal pronoun (pers-pron), another subtype of pronoun, and functions as the head of a pronoun object referring expression (pron-obj-refer-expr). The grammatical function head was important in the analysis of Chomsky (1970) which introduced X-Bar Theory. Double-R adopts this grammatical function and aspects of X-Bar Theory (Ball, 2007a), but differs from Chomsky (1970) in explicitly representing the head function, rather than assuming a configurational identification. This pronoun object referring expression functions as the subject (subj) of the wh-question construction.

The multi-word unit have been has the grammatical type of a double word auxiliary verb (aux+aux) and functions as the specifier (spec) of the wh-question construction. The specifier function also originates in Chomsky (1970) where it was applied to auxiliary verbs and determiners. Double-R adopts this grammatical function and the analysis of specifiers in Chomsky (1970) — an analysis which has since changed — but differs from Chomsky (1970) in representing the specifier function explicitly, rather than configurationally. Have been also projects the features aspect = perfect (perf) (from have) and voice = inactive (inact) (from been) to the clause.

The word given has the grammatical type of a verb and functions as the head of a predicate ditransitive verb construction, which in turn functions as the head of the wh-question construction. Given also has the grammatical features voice = passive (pass) and aspect = perfect (perf). These features project to the clause. The passive voice feature of given overrides the inactive (inact) feature of been. The perfect aspect feature of have is also overridden by the perfect aspect feature of given. The result is that no grammatical features from have been survive. However, have been still has a grammatical function since given cannot occur after could, which must be followed by a base form verb (e.g. he could give vs. *he could given). Been is also needed since given following have is not passive, but active voice (e.g. I have given is active voice despite given). The active voice feature of have blocks the passive voice feature of given from projecting in I have given. As a special case, been has the effect of neutralizing the active voice feature of have, effectively making it inactive (only been can do this), so that the passive voice feature of given can project (i.e. the passive voice feature of given overrides the inactive voice feature of been). The projection of verb features is considered in detail in Ball (2012) and in the verbal features section of this document.
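The override behavior of the voice feature can be made explicit in a small sketch. This follows Ball (2012) only loosely and makes the simplifying assumption that each verb form contributes at most one voice feature as the verb group is processed left to right; the function name and encoding are our own.

```python
# Sketch of voice-feature projection in a verb group, as described above:
# later features generally override earlier ones, 'been' neutralizes the
# active voice of 'have', and active 'have' blocks passive projection.
# Simplified and illustrative only.
def project_voice(verbs):
    """verbs: list of (form, contributed_voice) pairs in surface order."""
    voice = None
    for form, contributed in verbs:
        if form == "been":
            voice = "inactive"        # only 'been' neutralizes active 'have'
        elif contributed == "passive" and voice == "active":
            pass                      # active 'have' blocks passive 'given'
        elif contributed is not None:
            voice = contributed       # otherwise the later feature overrides
    return voice

# 'could he have been given': passive overrides the inactive voice of 'been'
assert project_voice([("have", "active"), ("been", None),
                      ("given", "passive")]) == "passive"
# 'I have given': active 'have' blocks the passive feature of 'given'
assert project_voice([("have", "active"), ("given", "passive")]) == "active"
```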

The multi-word unit to be is the infinitive verb form of be. It functions as the specifier of the embedded clause headed by eaten. To be encodes the grammatical features tense-1 = non-finite (non-fin), tense = none (not shown in the diagram), and voice = inactive (inact) (also not shown since it is overridden).

The word eaten has the grammatical type of a verb and functions as the head of a predicate transitive verb construction, which functions as the head of an embedded clause, which functions as the clausal complement (comp) of a predicate ditransitive verb construction. Most predicates optionally take a clausal complement, typically an infinitive clause. In this case, the clausal complement is also a passive construction due to the passive form eaten which functions as the head and projects the feature voice = passive (pass). Eaten also projects the feature aspect = perfect (perf) to the clause.

Passive constructions differ from focus constructions in that although there is a trace expression, there is no specialized focus function. Instead, the subject grammatical function is used to realize the trace expression. To represent this, Double-R binds the trace expression, either the trace object or the trace indirect object, to the subject. Double-R also projects a voice grammatical feature with the value passive (pass) to indicate the passive construction. Although the passive voice feature is projected from past-participle verb forms (given and eaten), it is only displayed at the level of the clause in the diagrams.

In sum, trace expressions are licensed by constructions like wh-question and passive which suggest movement, especially when a focus function is involved. PRO expressions are licensed by constructions like infinitives which do not suggest movement, but where the subject function is implicit. Trace and PRO expressions get their grammatical features indirectly from the referring expressions to which they bind. There is a third category of implicit argument called empty (e). In Give me it!, the implicit subject of give is categorized as e, with the grammatical features person = second, number = plural and animacy = human, which identify the implicit subject as equivalent to the pronoun you. In Let's go!, the implicit subject is also categorized as e with the grammatical features person = first, number = plural and animacy = human, which identify the implicit subject as equivalent to the pronoun we.

Incremental Processing

In this subsection we step through the incremental processing of the example sentence What could he have been given to be eaten?


The processing of the word what leads to projection of a wh object referring expression (wh-obj-refer-expr) that is placed in the wh-focus buffer. The processing of what does not lead to projection of a wh-question construction because what alone does not distinguish between wh-questions and wh situation referring expressions (wh-sit-refer-expr) like what he wanted was a lollipop.

Buffers — Double-R uses language specific buffers to retain the products of language analysis to support subsequent processing. A description and full list of buffers is available in the buffers section of this document.

what could

The processing of could following what provides sufficient context to project a wh question situation referring expression (wh-quest-sit-refer-expr). In this construction, the wh-obj-refer-expr projected by what and stored in the wh-focus buffer is integrated as the wh-focus. An implicit PRO subject is created and bound to the wh-focus via the bind index (bind-indx) slot with matching value 5. The wh-focus is copied into the subject buffer, but also remains in the wh-focus buffer.


what could he

The processing of he in the context of what could causes the implicit PRO subject to be replaced by the pronoun object referring expression (pron-obj-refer-expr) projected by he. This pron-obj-refer-expr overrides the wh-obj-refer-expr that was in the subject buffer. The auxiliary could is moved from the specifier function to the operator function and placed in the operator buffer. The specifier buffer is emptied.


what could he have been

The processing of have been leads to its recognition as a multi-word auxiliary (aux+aux) that is integrated as the specifier of the wh-quest-sit-refer-expr construction. In addition, a trace pron-obj-refer-expr is created and bound to the wh-focus. This demonstrates the greedy nature of the mechanism for resolving long-distance dependencies involving focus elements. Note that What could he have been? is a complete question with what functioning as the implicit trace head. The wh-focus is copied into the predicate buffer where the head of the wh-quest-sit-refer-expr that is in the situation buffer is retained.


what could he have been given

The processing of given leads to projection of a predicate ditransitive verb (pred-ditrans-verb) construction that replaces the trace pron-obj-refer-expr as the head of the wh-question. The pred-ditrans-verb is placed in the predicate buffer. Because this is a passive construction, a trace pron-obj-refer-expr is created and integrated as the indirect object (iobj). The trace indirect object is bound to the subject. The subject is copied into the indirect object buffer. Because this is a wh-focus construction, a trace pron-obj-refer-expr is created and integrated as the object (obj). The trace object is bound to the wh-focus. The wh-focus is copied into the object buffer. The decision to bind the trace indirect object to the subject and the trace object to the wh-focus results from the human feature of the subject he, and the inanimate feature of the wh-focus what.
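The binding decision at this step, which trace binds to the subject and which to the wh-focus, can be sketched as a simple feature-based choice. In the model this decision is made by production rules; the function below is an illustrative simplification with names of our own.

```python
# Sketch of the binding decision described above for a passive
# ditransitive: the human argument is preferred as indirect object,
# the inanimate argument as object. Illustrative only.
def bind_traces(subject, wh_focus):
    if subject.get("animacy") == "human" and wh_focus.get("animacy") == "inanimate":
        return {"iobj": subject, "obj": wh_focus}
    # fallback preference when the features do not discriminate
    return {"iobj": wh_focus, "obj": subject}

he = {"head": "he", "animacy": "human"}
what = {"head": "what", "animacy": "inanimate"}

# 'what could he have been given': he -> trace iobj, what -> trace obj
bindings = bind_traces(he, what)
assert bindings["iobj"] is he and bindings["obj"] is what
```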


what could he have been given to be

The processing of to be leads to projection of an infinitive situation referring expression (inf-sit-refer-expr) with the auxiliary to be functioning as the specifier and an expectation for a head indicated by the slot value head-indx. In the context of a matrix predicate with an available (clausal) complement (comp) slot, the inf-sit-refer-expr is integrated as the complement. An implicit PRO subject is created and bound to the indirect object of the matrix predicate. This is the default preference since indirect objects are often human or animate. The pron-obj-refer-expr with head he (in the matrix subject buffer) is copied into the embedded subject buffer which is distinct from the matrix subject buffer. The inf-sit-refer-expr is placed in the embedded situation buffer which is also distinct from the matrix situation buffer. Since there is only one specifier buffer, this buffer must be reused to process the embedded clause.


what could he have been given to be eaten

The processing of eaten leads to projection of a predicate transitive verb construction which is integrated as the head of the inf-sit-refer-expr in the embedded situation buffer. Since this is also a wh-focus construction (i.e. the wh-focus crosses clause boundaries), a trace object is projected and bound to the wh-focus. The wh-focus is copied into the embedded object buffer. In addition, since this is a passive construction, the implied PRO subject is bound to the embedded object which is now the wh-focus. Finally, the embedded object is copied into the embedded subject buffer.


What could he have been given to be eaten?

The processing of ? confirms the earlier decision to project a wh-question construction. In Double-R, punctuation typically guides processing, but doesn't otherwise appear in representations. However, it is possible for sentence final punctuation to alter the discourse function of a sentence. When this happens, the matrix clause level discourse function (df) slot is provided to indicate this. However, discourse function is better represented at the level of discourse. Unfortunately, Double-R currently has only the beginnings of a capability to provide discourse level representations.

Incremental Processing Summary

We have now finished the incremental processing of the example sentence. At each step in processing, Double-R builds the best representation it can given the current input and preceding context. However, we have seen numerous places where the locally best choice had to be adjusted given the subsequent input. Such is the nature of language analysis in Double-R, and presumably in Human Language Processing, as well. Most of the examples in the rest of this document discuss the processing of simpler inputs, building up to the kind of integrated processing required to handle this extended example.


Chapter 4: Sub-Lexical Representation and Processing

Phonetic, Phonological and Syllabic Representation and Processing

The focus of the computational implementation of Double-R is on the processing of written text. For this reason, the representation and processing of phonetic, phonological and syllabic knowledge is absent. However, it is clear that phonetic, phonological and syllabic knowledge does contribute to the processing of written text. This is especially true in the processing of text chat, which makes extensive use of sound correspondences to shorten messages. A more complete system would incorporate phonetic, phonological and syllabic knowledge even for the analysis of written text.

At this point, the best defense that can be made of Double-R is that it is amenable to the addition of phonetic, phonological and syllabic knowledge. This stems from the interactive and incremental nature of the language analysis capabilities in Double-R. Once added, phonetic, phonological and syllabic knowledge will interact with knowledge at other levels of linguistic representation to improve the behavior of Double-R.


Orthographic Representation and Processing

As a computational compromise, Double-R only encodes knowledge about the letters and trigrams which make up the words in the mental lexicon. The letters and trigrams have been computed automatically. Other forms of orthographic representation that have been proposed on the basis of psycholinguistic analysis (e.g. syllables, roots, stems, BOSS) are not used because of the challenges of computing them automatically.

We would especially like to see a representation of syllables as sequences of letters in the mental lexicon and plan to include them once a suitable resource is identified.


Morphological Representation and Processing

Double-R currently has only rudimentary morphological representation and processing capabilities. The lack of morphological representation and processing capabilities is compensated for to a large extent by the inclusion of the morphological variants of words as separate entries in the mental lexicon. The reasons for including morphological variants in the mental lexicon are primarily due to processing and usage considerations. Processing is more efficient and less error prone when morphological variants are included. If morphological variants are not included in the mental lexicon, then there is no direct match from the physical form of the input to the representation in the mental lexicon. For example, consider the processing of the words is, reading and books, which are morphological variants that are derivable from the base forms be, read and book. If is, reading and books are not represented in the mental lexicon, then it will be necessary to first determine that the form is matches the base form be in the mental lexicon, reading matches the base form read in the mental lexicon, and books matches the base form book in the mental lexicon. Once this is done, and assuming it is successful given the mismatch in form (especially in the case of is and be which are completely different in form), the word is and the endings -ing and -s must be separately processed to determine which morphological variant is associated with the base form retrieved from the mental lexicon. In general, this extra processing is undesirable. Instead, if the morphological variants is, reading and books are represented in the mental lexicon, then a direct match between the form of the input and the form in the mental lexicon will be available and will facilitate processing.
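The contrast between the two lookup strategies can be sketched directly. The lexicon contents and suffix list below are illustrative toy data, not the model's actual lexicon.

```python
# Sketch of lexical lookup with morphological variants stored as entries
# (one direct retrieval) vs. lookup via base forms (suffix stripping plus
# separate interpretation, and outright failure for suppletive forms).
variant_lexicon = {
    "be": "verb-base", "is": "verb-3sg",
    "read": "verb-base", "reading": "verb-ing",
    "book": "noun-sg", "books": "noun-pl",
}
base_lexicon = frozenset({"be", "read", "book"})

def lookup(word):
    # direct match from physical form to mental lexicon entry
    return variant_lexicon.get(word)

def lookup_via_base(word):
    # without stored variants, endings must be stripped and the base
    # matched; suppletive 'is' has no surface relation to 'be' at all
    for suffix in ("ing", "s"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in base_lexicon:
            return stem, suffix   # still needs separate interpretation
    return None

assert lookup("is") == "verb-3sg"              # one-step direct match
assert lookup_via_base("books") == ("book", "s")
assert lookup_via_base("is") is None           # suppletion defeats stripping
```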

Since humans are exposed to morphological variants and not base forms in linguistic inputs, a usage based account suggests that they will be represented as such in the mental lexicon. Determining the base form is a morphological process that occurs on the basis of morphological knowledge given morphological variants. Once these morphological regularities are learned, they can be used to analyze previously unseen inputs. But for inputs that have been experienced, there is no good reason for them not to be encoded in the mental lexicon. To argue otherwise would be to commit what Langacker (1987) calls the Rule/List Fallacy. A grammar which excludes lists where rules are available might be considered to be more elegant, but from a processing perspective, it will be far less efficient — especially given the large capacity for storing knowledge in the human brain and the efficient retrieval mechanisms that are available for accessing that knowledge.

Of course, Double-R has the opposite problem. It largely lacks a representation of morphological rules which are needed to support the processing of novel words which aren't available in the mental lexicon. Nonce expressions like the newspaper boy porched the newspaper (Clark & Clark, 1979), where porched means threw the newspaper on the porch are a classic example. A morphological analysis capability that recognizes -ed as the past tense verb form would be very helpful for making sense of this novel use of porch.


Chapter 8: The Situation Model and the Mental Universe

Double-R Grammar representations identify the referring expressions in the linguistic input and the relationships between those referring expressions. These grammatical representations capture aspects of referential and relational meaning. However, they are composed of words and are not full representations of meaning since they are symbolic. For a fuller representation of meaning, the referring expressions need to be associated with the objects and situations to which they refer. We call this association one of grounding in the sense of Harnad (1999) and his symbol grounding hypothesis. The objects and situations to which referring expressions refer are themselves conceptual representations in the spirit of Jackendoff (2002). There is no direct reference to the external world. When we encounter the world, we build up a representation of that world which is referred to as a Situation Model (van Dijk & Kintsch, 1983). Grammatical representations tap into and can help create the situation model as does our mental universe which contains the knowledge of the world acquired over a lifetime of experience. The mental universe provides the context for creation of a situation model that goes beyond the current situation and linguistic input. Despite the availability of linguistic input, situation model representations are conceptual, not linguistic. Where we differ from Jackendoff is in our unease in using uppercase words as stand-ins for conceptual representations. We refer to the use of uppercase words for concepts as uppercase word syndrome. Whatever conceptual representations are, they are not uppercase words. However, lacking an adequate representational system for concepts, we currently do no better than Jackendoff. Our situation model representations contain word indices which generalize over morphological variants (e.g. book-indx for book and books) and we drop the uppercase pretense.

When we have a better theory of how the human experience of objects and situations results in conceptual representations of those objects and situations, we will be on a firmer footing for grounding Double-R representations and providing a fuller representation of meaning.

For now, there are a few assumptions we try to adhere to, including: We follow Hobbs (2003) in attempting to minimize the mapping from linguistic to conceptual representation. We also follow Hobbs (1985) in taking an ontologically promiscuous approach to semantic representation and eschew logical approaches like First Order Predicate Calculus which minimize the number of ontological categories. First order logic may be fine as a logical formalism, but it is inadequate to represent the range of meanings conveyed in natural language. If anything, conceptual representations will be more complex than grammatical representations.

Double-R grammatical representations lack semantic roles like agent and patient. We assume instead that these roles are captured in the mapping from grammatical to conceptual representation. The referring expression functioning as the subject of an active sentence with a verbal predicate typically maps to the object that functions as the agent of the action in the situation model representation. The referring expression functioning as the subject of a passive sentence with a transitive verbal predicate typically maps to the affected object in the situation model representation. While the details of this mapping are not yet worked out in Double-R, the alternative of assigning semantic roles to referring expressions in grammatical representations is not supported by the grammatical evidence. There is considerable grammatical evidence for functions like subject and object, but little grammatical evidence for the grammatical encoding of semantic roles like agent and patient.

Other than stating these assumptions, we will have little to say about situation model representations. To be fair to ourselves, it is not clear that a theory of conceptual representation belongs in a grammatical description. For the purposes of this document, the conceptual system is essentially a black box, as it is and has always been in Chomsky's theories. With respect to generative grammar, we view Double-R representations as similar to LF representations, except that quantification is represented grammatically, not logically, in Double-R, and Double-R representations are semantically motivated and not purely syntactic. With respect to Jackendoff's conceptual representations, we view them as comparable to Double-R grammatical representations consisting of word indices, rather than concepts.

Caveats aside, we have actually implemented a domain specific situation model in our synthetic teammate project (Rodgers et al., 2012), and the results of that effort will inform future research aimed at creating a more general situation modeling capability as well as a representation of the mental universe.

From Referring Expression to Referent


In Jackendoff’s Conceptual Semantics (Jackendoff, 1983, 1990, 2002, 2007), reference to places, directions, times, manners, and measures in addition to situations and objects is supported, but reference is limited to tokens or instances of these conceptual categories, adhering to the basic notion that reference is to individuals. We propose an extension of Jackendoff’s referential types along an orthogonal dimension of reference which is cognitively motivated in suggesting the possibility of referring to types, prototypes and exemplars in addition to instances. Reference to classes and collections of referential types and vacuous instances and collections is also considered. The primary motivation for expanding the ontology of referential types is to simplify the mapping from referring expressions to corresponding representations of referential meaning. Hobbs (2003) pursues a similar strategy in arguing for logical representations that are as close to English as possible. Jackendoff’s (1983, p. 13-14) grammatical constraint makes a related claim: Taking the grammatical constraint seriously, we assume that if a linguistic expression has the grammatical form of a referring expression, then it is a referring expression. For example, a nominal like a man which contains the referential marker a, indicates that the expression can be used to refer. Unless there is a very strong reason to assume that any use of this referring expression is non-referential, it is assumed to refer. Further, the referential marker a indicates reference to a single referent as does the head noun man (i.e. both are grammatically singular). This expression cannot be used to refer to multiple individuals under normal circumstances.

Where other approaches argue for the non-referential use of referring expressions or for a complicated mapping from referring expression to possible referents (see discussion below), it is argued instead that referring expressions may refer to something other than an individual, and that the notion of reference is complicated by a secondary relationship between the referents in a situation model and objects in the mental universe. By expanding the ontology of referential types to include types, prototypes and exemplars, and classes and collections of these, it is possible to retain a simplified mapping from referring expression to referent — one which is consistent with the grammatical features of the referring expression. By introducing a bi-partite relationship between a situation model and the mental universe, it is possible to explain apparent non-referential uses of referring expressions. The viability of this approach hinges on adoption of the mentalist semantics of Jackendoff. Reference is to mental encodings of external experience and these encodings can provide alternative construals of reality. There is no direct reference to actual objects in the external world.

Theoretical Background

Ball (2007) presents a linguistic theory of the grammatical encoding of referential and relational meaning which is implemented in a computational cognitive model of language comprehension (Ball, Heiberg & Silber, 2007; Ball et al., 2010) within the ACT-R cognitive architecture (Anderson, 2007). The basic structure and function of nominals and clauses is bi-polar with a specifier functioning as the locus of the referential pole and a head functioning as the locus of the relational pole — where relational pole encompasses objects (noun, proper noun, pronoun) and relations (verb, adjective, preposition, adverb). If the head of the relational pole is a relation, one or more complements or arguments may be associated with the relation. Modifiers may surround the specifier and head and may be preferentially attracted to one pole or the other. A specifier and head combine to form a referring expression. A determiner functioning as an object specifier combines with a head to form an object referring expression or nominal (ORE → Obj-Spec Obj-Head). A possessive nominal (e.g. John’s in John’s book) or possessive pronoun (e.g. his in his book) functioning as specifier and called a reference point by Taylor (2000) may also combine with a head to form an object referring expression. In this case the object referring expression contains two referring expressions:
  1. the reference point functioning as specifier
  2. the referring expression as a whole
Ball (2010; revised 2013) extends the theory of referential and relational meaning to a consideration of grammatical features like definiteness, number, animacy, gender and case in object referring expressions. These features provide important grammatical cues for determining the referents of object referring expressions. The referring expressions in a text instantiate and refer to objects, situations, locations, etc. in a situation model which is a representation of the evolving meaning of the text. The term situation model originates in the research of van Dijk & Kintsch (1983). Originally a situation model was viewed as a collection of propositions extracted from a text and elaborated with additional propositions introduced by schemas activated by the text and resulting from inference processes operating over the text. However, situation models have evolved away from being purely propositional (or relational) representations towards encoding referential, spatial, imaginal and even motor aspects of meaning (cf. Zwaan and Radvansky, 1998). We view the situation model as the cognitive locus of Jackendoff’s Conceptual Semantics. Jackendoff has adopted similar extensions in his recent work (Jackendoff, 2002, 2007).

A situation model is a mental scratchpad for maintaining information about the referents of the referring expressions in a text. However, referents can also be implicit in the text, inferred from background knowledge or encoded from the environment. The situation model is constructed in the context of a mental universe. The mental universe is the experience of the real world filtered through the perceptual and cognitive apparatus of an individual over the course of a lifetime. Like situation models, the mental universe may be full of counterfactual objects and situations. An individual may have a long history of experience of unicorns, both perceptual (e.g. from movies and picture books) and linguistic, despite the fact that unicorns only exist as figments of imagination in objective reality. The mental universe may also have well established and distinct referents for the morning star and the evening star, despite the fact that these referents map to the same planet in objective reality.

The combination of the mental universe and the situation model provides the basic sources for grounding the meaning of referring expressions. A referring expression may be bound to a referent in the situation model which may or may not be grounded in the mental universe. If the referent is grounded in the mental universe then the individual has personal experience of the referent. If the referent is not grounded in the mental universe, then the individual has only limited information about the referent and it may appear that the referring expression is non-referential. But as Lyons (1977) notes, allowing referring expressions to be non-referential is problematic for co-reference. Two expressions cannot have the same reference if one of them is not a referring expression at all (ibid., 191). In John’s murderer, whoever he is…, he co-refers with John’s murderer. The attributive use of a referring expression like John’s murderer is a type of reference which instantiates a referent into the situation model that is not grounded in the mental universe, but which supports co-reference.

The ontology of referential types presented in this document follows from basic principles of Cognitive Linguistics (cf. Langacker, 1987; Lakoff, 1987) and Cognitive Psychology (Rosch, 1975; Collins and Quillian, 1969). There is extensive empirical evidence supporting the existence of conceptual categories corresponding to types, prototypes and exemplars. We take the small step of suggesting that such conceptual categories can be referred to by linguistic expressions and explore the consequences.

The representation of referents in the situation model parallels the representation of referring expressions. Both are represented in ACT-R as chunks — i.e. frames with collections of slot-value pairs. Chunks are organized into an inheritance hierarchy which supports default inheritance and a distinction between chunk type and chunk instance. The value of a slot may be a chunk, supporting complex representations of structure needed for linguistic and Conceptual Semantic representation. With respect to object referring expressions which are the focus of this section, a chunk representing an object referring expression is bound to a corresponding referent via a matching value in an index slot. Depending on the object referring expression, situation model and mental universe, the referent may be an instance, type, prototype, exemplar, class or collection.
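The chunk-and-index scheme just described can be sketched in code. The following is an illustrative Python sketch, not actual ACT-R chunk notation; the slot names and index values are assumptions introduced for exposition.

```python
from dataclasses import dataclass, field

# Chunks as frames with collections of slot-value pairs. A referring-
# expression chunk is bound to its referent via a matching value in an
# index slot (illustrative sketch; not ACT-R syntax).

@dataclass
class Chunk:
    chunk_type: str
    slots: dict = field(default_factory=dict)

# An object referring expression for "the car" ...
ore = Chunk("obj-refer-expr", {"spec": "the", "head": "car",
                               "definite": True, "number": "singular",
                               "index": "ref1"})

# ... and its referent in the situation model, which may be an instance,
# type, prototype, exemplar, class or collection.
referent = Chunk("referent", {"index": "ref1",
                              "referent-type": "instance",
                              "object-type": "car"})

def bound(expr, ref):
    """A referring expression is bound to a referent when index slots match."""
    return expr.slots["index"] == ref.slots["index"]

print(bound(ore, referent))  # True
```

The value of a slot could itself be a Chunk, giving the complex recursive structure the text describes.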

An Expanded Ontology of Referential Types

First Order Predicate Calculus (FOPC) is typically grounded in a model theoretic semantics with an ontology limited to atomic individuals. The model consists of a domain and a set of individuals in that domain and nothing else. Typically these individuals are assumed to correspond to objects (or individuals) in the real world being modeled. In FOPC, a relation is modeled in terms of the set of individuals (for 1-ary relations or properties) or set of ordered sets of individuals (for n-ary relations, n > 1) for which the relation is true. A relation with its arguments bound to individuals in the domain is either true or false of those individuals and it is said that the reference of the proposition is one of the values true or false.
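The set-theoretic treatment of relations just described can be made concrete with a toy finite model. The domain and relation names below are invented for illustration.

```python
# A minimal finite model in the spirit of FOPC's set-theoretic semantics:
# a domain of atomic individuals, a 1-ary relation as a set of
# individuals, and an n-ary relation as a set of ordered tuples.

domain = {"fido", "rex", "felix", "john"}
dog = {"fido", "rex"}              # 1-ary relation (property)
kick = {("john", "fido")}          # 2-ary relation: (kicker, kickee)

def holds(relation, *args):
    """A relation with its arguments bound to individuals is true or false."""
    return (args[0] if len(args) == 1 else args) in relation

print(holds(dog, "fido"))          # True
print(holds(kick, "john", "rex"))  # False
```

The reference of a proposition under this treatment is simply one of the values returned by holds.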

Situation Semantics (Barwise and Perry, 1983) extends FOPC by allowing situations to be individuals. Not only are situations true or false of sets of individuals in the domain being modeled, but they are themselves individuals in the domain. We may say that situations have first-class status in situation semantics, whereas they are a second-order (or derived) notion in standard FOPC.

Situation Semantics is a step in the right direction. Whereas it might make reasonable sense to suggest that a predicate like dog denotes the set (or class) of individuals that are dogs (although psychologically humans cannot quantify over such a large set), it makes little sense to suggest that the predicate run denotes the set of all individuals who run, or that kick denotes the set of ordered sets of kickers and kickees, as is typical in FOPC treatments with a set-theoretic model limited to individuals that are essentially objects of various types (and sets of such individuals). (It is this sleight of hand in FOPC that collapses the distinction between nouns and verbs, treating both as predicates corresponding to sets of individuals.) It is much more reasonable to suggest that run denotes the set of all running events and that kick denotes the set of all kicking events. And if run denotes a set of running events and kick a set of kicking events, then allowing run to be used in an expression that refers to an instance of a running event, and allowing kick to be used in an expression that refers to an instance of a kicking event, follows quite naturally and is cognitively plausible. However, Situation Semantics stops short. What is needed is a referential ontology which supports a mapping from the types of referring expressions which are linguistically attested to the types of referents which are cognitively motivated.

With an ontology of referential types limited to individuals and sets of individuals, it is often assumed that a referring expression like a car in an expression like a car is a vehicle quantifies over the set of all individuals for which the predicate car is true (i.e. the set or class of objects of type car). In FOPC, this can be represented as ∀x (car(x) → vehicle(x)). However, from a grammatical perspective, a car is clearly singular, and from a cognitive perspective, quantifying over all individuals is cognitively implausible. The need to quantify over all individuals in the FOPC representation of the linguistic expression stems from the limited ontology available in FOPC for representing the meaning of indefinite referring expressions. Only the universal and existential quantifiers — which fail to capture the full range of quantification in natural language — are available. Similarly, one FOPC representation for the expression every man owns a car is given by ∀x (man(x) → ∃y (car(y) ∧ own(x, y))). However, in English every man is grammatically singular, and a mapping to the universal quantifier is problematic. Johnson-Laird (1983) introduced mental models as a way of overcoming the limitations of quantification in FOPC (among other things). He suggests that the expression a car in the sentence every man owns a car maps to some representative subset of cars. This representative subset of cars corresponds to the representative subset of individuals referred to by every man, plus a subset of cars that are not owned. He (1983, p. 421) represents this as a mental model in which each of a small set of man tokens is paired with a car token, together with additional car tokens that are not paired with any man. But if every man and a car are singular and not plural, then every man does not refer to multiple men and a car does not refer to multiple cars. Johnson-Laird’s treatment is cognitively plausible, but inconsistent with the grammatical form of the referring expressions.
From a perspective which assumes that the number feature of a referring expression corresponds closely to the number feature of the referent of the expression, there are several cognitively motivated referents for expressions like every man and a car which do not violate the singular status of the linguistic expressions: A car may refer to a type of object, namely the type of object that is a car. A car may also refer to a prototype that represents what is common to most cars, or it may refer to an exemplar which is an instance that is a representative car. Further, a car may refer to an indefinite instance with the determiner a marking the indefinite status of the referent of a car. Note that indefinite instance is used here as a referential type and not a type of referring expression. In all but a few cases, the type of the referring expression is an indefinite, singular object referring expression when grammatically marked by the determiner a and a singular head noun (the expression a few being a notable exception, in which a combines with a plural head noun). Given the occurrence of the indefinite, singular determiner a and the singular noun car in this expression, a car cannot be used to refer to a definite instance of a car, or to a class or collection, but all the other referential types are potential referents of indefinite, singular object referring expressions. Likewise, every man may refer to a representative but indefinite, singular instance of a man as is suggested by the singular status of every man.
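The mapping from grammatical form to candidate referential types described above can be sketched as a simple function. This is a deliberate simplification of the text's proposal (it collapses the definite/indefinite split for plurals, and the type names are assumptions), not the implemented mechanism.

```python
# Candidate referential types given the grammatical features of an
# object referring expression. An indefinite singular expression like
# "a car" may refer to an indefinite instance, type, prototype or
# exemplar, but not to a definite instance, class or collection.

def candidate_referents(definite, number):
    if number == "singular":
        if definite:
            return {"definite-instance", "type", "prototype", "exemplar"}
        return {"indefinite-instance", "type", "prototype", "exemplar"}
    # plural expressions refer to classes or collections (simplified)
    return {"class", "collection"}

print(candidate_referents(definite=False, number="singular"))
```

On this sketch, resolving the actual referent among the candidates is left to the situation model and mental universe, consistent with the discussion that follows.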

Reference to Definite and Indefinite Instances

The determiner the marks reference to definite instances. Consider the definite object referring expression the car. This definite expression indicates that there is already a referent in the situation model that is being referred to or that there is a salient car object in the mental universe that is being referred to and this object should be instantiated into the situation model. For a more complex example, consider: There is a car in the driveway. The car has its lights on. In the first sentence, the expression a car is indefinite and instantiates a new referent into the situation model — one that is not (known to be) ground in the mental universe. In the second sentence, the expression the car is definite and refers to the referent instantiated into the situation model by a car. Note that this referent is ungrounded in the sense that it has not been identified with any object in the mental universe, although it could be (e.g. Oh, it’s your car). It is the mental universe which ultimately grounds referents. In the first sentence, the expression the driveway is definite. In this case, the definiteness of the driveway indicates there is (or should be) a salient object in the mental universe that should be instantiated into the situation model. There are three primary types of definite reference:
  1. reference to an existing referent in the situation model which is grounded in the mental universe
  2. reference to an existing referent in the situation model which is ungrounded in the mental universe
  3. reference to an object in the mental universe which is not in the situation model, but is (or should be) salient
There are two primary types of indefinite reference:
  1. reference to an object which is being introduced and should be instantiated into the situation model — this object is not known to correspond to any object in the mental universe
  2. reference to a generic instance or type which exists in the mental universe and should be instantiated into the situation model
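The enumerated cases of definite and indefinite reference can be sketched procedurally. The data structures and lookup-by-description strategy below are illustrative assumptions, not the implemented Double-R mechanism, and the generic-reference case is noted but not modeled.

```python
# A procedural sketch of definite vs indefinite reference against a
# situation model and mental universe (both simple dicts keyed by the
# head noun; all names are illustrative).

def resolve(expr, situation_model, mental_universe):
    desc, definite = expr["head"], expr["definite"]
    if definite:
        if desc in situation_model:
            # definite cases 1 and 2: an existing referent in the
            # situation model, grounded or ungrounded
            return situation_model[desc]
        if desc in mental_universe:
            # definite case 3: a salient object in the mental universe,
            # instantiated into the situation model
            ref = {"head": desc, "ground": mental_universe[desc]}
            situation_model[desc] = ref
            return ref
        raise LookupError("no salient referent for definite " + desc)
    # indefinite case 1: instantiate a new, ungrounded referent; generic
    # reference (case 2) would additionally consult the mental universe
    ref = {"head": desc, "ground": None}
    situation_model[desc] = ref
    return ref

sm, mu = {}, {"driveway": "driveway-object"}
r1 = resolve({"head": "car", "definite": False}, sm, mu)  # a car
r2 = resolve({"head": "car", "definite": True}, sm, mu)   # the car
print(r1 is r2)  # True: "the car" refers back to the referent of "a car"
```

Note that the referent of a car remains ungrounded (ground is None) until identified with an object in the mental universe, as in Oh, it's your car.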

Reference to Types

Type hierarchies are common in systems of knowledge representation, and making types first-class objects allows expressions like a sedan is a (type of) car or a (type of) car I like is a sedan to be represented as relating two types, a sedan and a car. A sedan and a car refer to instances of a type. The suggested reference to a type rather than a class of instances is based on the singular status of these referring expressions (i.e. a sedan vs. all sedans). A type is a reified class. From a referential perspective, the type is atomic with no subparts and singular reference is appropriate. An instance is added to the situation model which is grounded in a type in the mental universe. From a relational perspective, is establishes a relationship of equality between the two arguments a sedan and a car. However, from a referential perspective, there are two basic possibilities:
  1. both a sedan and a car may refer to types of objects which are equated
  2. the occurrence of a car within the context of is suppresses the normal referential behavior of a car such that is a car — a predicate nominal — is treated as a non-referential expression which is ascribed to the subject a sedan
The typical treatment of predicate nominals suggests that they are non-referential (cf. Jackendoff, 2002). In a sentence like John is a fool, is a fool is treated as a predicate nominal that says something about the individual that John refers to and this sentence is often considered synonymous with John is foolish. From the perspective of the grammatical constraint, there is a problem with this treatment. Grammatically, a fool has the form of an indefinite, singular object referring expression and all object referring expressions are capable of referring, regardless of context. In the case of a predicate nominal, the referent of the embedded object referring expression, if it is identified, is the same as the referent of the subject — they are coreferential. The assumption that is a fool is non-referential rests on the availability of a referring expression, John, of whose referent the predicate nominal is a fool is predicated. In the absence of a separate referring expression, it is unclear how to treat the predicate nominal. For example, in I wonder who is a fool, if who is non-referential as Huddleston & Pullum (2002, p. 401) suggest, then what does is a fool get predicated of? An obvious suggestion is that who functions as an unbound variable (or variable bound via a lambda expression) which instantiates a referent whose grounding is yet to be determined, but which supports predication of is a fool and can be referred to subsequently as in the follow-up he better be careful. In fact, it may turn out that nobody is a fool since wonder is non-factive (i.e. doesn’t entail the existence of its complement). Or it may be the case that the hearer can provide the grounding as in It’s John. 
In general, Huddleston & Pullum discuss a range of non-referential object referring expressions (they prefer to use the term NP) in which there is no object in the real world to which the expressions refer, overlooking the possibility of a more flexible notion of reference within a situation model embedded in a mental universe.

In Jackendoff (2002), types are treated as lacking an indexical feature. While this treatment is attractive in providing a simple distinction between types and tokens (i.e. tokens have an indexical feature, types don’t), the lack of an indexical feature implies an inability to refer to types. Yet, Jackendoff acknowledges the existence of NPs which describe types. These NPs are necessarily non-referential. When an NP occurs as a predicate nominal and functions as a kind (or type) as in a professor in John is a professor, this approach coheres. There is an object in the situation model to which the expression refers. But what happens when an NP describing a type occurs as the subject or object as in A new kind of car is passing by or He wants a special kind of dog? If the object referring expressions don’t refer, then it is unclear how the situation model can represent the meaning of these expressions. At a minimum, Jackendoff needs to allow reference to generic instances and argue that apparent references to types are really generic instance references. However, since there is strong evidence that types exist as mental constructs (cf. Collins & Quillian, 1969), we see no good reason to preclude reference to them.

Reference to Generic Instances

The plural variant of the expression a sedan is a car is sedans are cars. This variant suggests a representation based on a collection of generic instances rather than a type.

The generic instance category generalizes over prototypes and exemplars. It is difficult to distinguish reference to prototypes from reference to exemplars since they have much in common. A prototype may be viewed as a washed-out exemplar (some cognitive approaches treat prototype and exemplar as essentially synonymous). It is a washed-out exemplar in that it is a generalization over the experience of particular instances of the type. In this respect, a prototype is more like a type than an instance, making the distinction between types and instances less clear cut than is typically assumed. The use of specific lexical items may help to make the distinction. Consider the sentence the prototypical car is a sedan. If the expression the prototypical car actually picks out a prototype for a referent, and the expression a sedan picks out a type, then equating a prototype with a type has the effect of defining the prototype to be of a particular type.

Allan (1986) discusses the semantics of generic NPs, noting that there is no marking for the generic within NP morphology and that generics have to be inferred from context. Grammatically, a singular object referring expression is either definite or indefinite. If the referent of the expression is a prototype or exemplar, then the reference is generic. In the expression the sedan is a car where there is no existing referent in the situation model for the sedan to refer to, the sedan presumably picks out a generic instance or type.

The motivation for distinguishing prototypes and exemplars is a cognitive one, although there is disagreement within the cognitive community as to whether or not both notions are needed. It may be sufficient to distinguish generic instances from types in the situation model without distinguishing prototypes and exemplars.

Reference to Classes, Collections and Masses

Classes, collections and masses complicate reference in interesting ways. Classes and types are two sides of the same coin. The type is atomic and has no subparts. However, the elements of a class are salient and a plural nominal is used to refer to classes as in all men. Collections are also referred to by plural nominals as in the men/all the men where the men/all the men refers to some salient collection of men, and not to the entire class. In these expressions, the noun head men denotes the type, and the specifier and plural grammatical feature determine the nature of the referring expression (i.e. class or collection). Masses differ from classes and collections in that the elements of a mass are not salient. Singular nominals are used to refer to masses in English.

Mass and plural nouns, but not singular count nouns, may function as referring expressions without separate specification. In rice is good for you, rice does not refer to any specific instance of rice and in books are fun to read, books does not refer to any specific collection of books. Both expressions are indefinite. They refer to something non-specific: a type or generic instance for rice and a generic collection for books. Reference to a specific mass or collection requires a definite determiner as in the rice is ready and the books are fun to read.
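The generalization that mass and plural nouns, but not singular count nouns, can function as referring expressions without separate specification can be sketched as a simple check. The noun-class lexicon here is an invented fragment for illustration.

```python
# Bare (specifier-less) nominals: mass and plural nouns can refer on
# their own, yielding indefinite, non-specific reference; singular
# count nouns cannot.

NOUN_CLASS = {"rice": "mass", "books": "plural", "book": "singular-count"}

def can_refer_bare(noun):
    """True if the noun can head a referring expression without a specifier."""
    return NOUN_CLASS[noun] in {"mass", "plural"}

print(can_refer_bare("rice"))   # True  ("rice is good for you")
print(can_refer_bare("books"))  # True  ("books are fun to read")
print(can_refer_bare("book"))   # False (*"book is fun to read")
```

Reference to a specific mass or collection then requires a definite determiner, as in the rice is ready.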

The use of a plural nominal to refer to a class or collection suggests that the members of the class or collection are cognitively salient and may be separately represented. This opens up the possibility of either referring to the class or collection as a whole or referring to the elements of the class or collection. However, for cognitive reasons having to do with the limited capacity of humans to attend to multiple chunks of information (e.g. Miller, 1956), it is assumed that any linguistic expression may only introduce a small number of referents into a situation model (cf. Johnson-Laird, 1983). In the sedans are cars example, the instantiation of a sedan collection and two generic instances of a sedan, and a car collection and two generic instances of a car is the minimal number consistent with the plurality of the object referring expressions. Given these referents, it is possible to refer to the collections as a whole, and it is also possible to pair the members of one collection with the members of the other collection. These alternatives correspond to the collective and distributive readings discussed in Lyons (1977). Lyons presents the example those books cost $5 which is ambiguous between a distributive — each book is $5 — and collective — all the books are $5 — reading. Distributive and collective readings involve inferential processes operating over collections and instances which are not part of the grammatically encoded meaning. However, addition of each to those books cost $5 each imposes a distributive reading.
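The collective/distributive contrast in those books cost $5 can be sketched over a small collection, in line with the limited-capacity assumption above. The readings themselves are inferential, not grammatically encoded; this merely shows the two interpretations diverging. Data and function names are illustrative.

```python
# Collective vs distributive readings of "those books cost $5",
# computed over a small, cognitively plausible collection.

books = ["book1", "book2", "book3"]

def cost(reading, amount, collection):
    if reading == "distributive":
        # each book is $5: the amount applies to each member
        return {b: amount for b in collection}
    if reading == "collective":
        # all the books are $5: the amount applies to the whole collection
        return {tuple(collection): amount}
    raise ValueError(reading)

print(cost("distributive", 5, books))  # each member costs 5
print(cost("collective", 5, books))    # the collection as a whole costs 5
```

Adding each, as in those books cost $5 each, would force the distributive branch.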

We can now see that Johnson-Laird’s representation of every man owns a car corresponds closely to a distributive reading (constrained to a small number of referents). We are also in a better position to consider the representation of every man. Although expressions with every are singular, suggesting selection of an arbitrary instance of a collection, in Everyone is leaving. They are going to eat., subsequent references are plural. Further, Everyone is leaving. He is going to eat is infelicitous. There are two implications of these examples: 1) every instantiates or references a collection in the situation model, and 2) the arbitrary referent of every is not salient for subsequent reference. Even referring expressions with singular a as in Everyone owns a car. They are indispensable. support subsequent plural reference, although in this case Everyone owns a car. It is indispensable. is also felicitous. This may result from the flipping of the type/class coin. Subsequent singular reference is to the type (or generic instance), subsequent plural reference is to the class.

Reference to Vacuous Instances and Collections

The empty set is a useful notion in set theory. The null symbol (or empty list) is a useful symbol in the Lisp programming language. In both set theory and Lisp, these are actual objects that can be referred to and manipulated. The grammatical and lexical structure of English strongly suggests the possibility of referring to a corresponding empty or vacuous object whose existence is taken for granted. Yet Martinich (1985, p. 3) argues that the existence of nothing is an absurd view which rests on a misunderstanding of how language works. However, not only does grammar suggest the existence of objects corresponding to nothing, but it suggests that nothingness comes in lots of different types and collections. Consider expressions like nothing, no man and no men. It is true that a logical representation for expressions like no man which requires quantifying over every individual in the model makes little practical sense, but this is taken to be a problem for the logical representation of the meaning of negative expressions, rather than as a criticism of negative referring expressions in language. Allowing negative object referring expressions to refer to empty or vacuous objects and collections in the situation model which do not map to any objects or collections in the mental universe is perhaps the clearest demonstration of how to simplify the mapping from referring expression to referent, relative to other approaches.
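A vacuous referent can be sketched in the same spirit as the empty set or Lisp's empty list: an actual object in the situation model with no members and no grounding in the mental universe. The data structures below are illustrative assumptions.

```python
# A negative expression like "no men" instantiates a vacuous collection
# into the situation model: a real, referable object whose membership is
# empty and which maps to nothing in the mental universe.

situation_model = {}

def instantiate_negative(head):
    ref = {"head": head, "members": [], "ground": None}  # vacuous collection
    situation_model[head] = ref
    return ref

r = instantiate_negative("men")
print(len(r["members"]))  # 0: an object that can be referred to and
                          # manipulated, though it has no members
```

Subsequent reference (e.g. to contrast no men with no man) is then reference to a distinct vacuous object, without any quantification over the individuals in a model.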

Summary and Conclusions

This section presents and supports an expanded ontology of referential types consistent with Jackendoff’s Conceptual Semantics, basic principles of cognitive linguistics and empirical evidence from cognitive psychology. By expanding the ontology of referential types and introducing a distinction between situation model and mental universe, it is possible to simplify the mapping from referring expression to referent, relative to approaches with a more limited ontology and single semantic space. We propose a bi-partite semantic space consisting of a situation model and mental universe that explains apparent non-referential uses of referring expressions, along with the existence of two partial orderings over referential types. The partial orderings are motivated by the linguistic form of referring expressions, cognitive theory and a computational interest in simplifying the mapping from referring expressions to corresponding objects and situations. The partial orderings are not definitive. They capture important aspects of the mapping from referring expressions to referents, but there are more dimensions of meaning involved in this mapping than these two orderings can accommodate.



I would like to acknowledge the support of the Warfighter Readiness Research Division and its parent organizations: the Human Effectiveness Directorate, the 711th Human Performance Wing and the Air Force Research Laboratory. This research would not have been possible without their extensive support. Additional support has been provided by the Office of Naval Research. A long-term collaboration with the Cognitive Engineering Research Institute has also contributed significantly to the research.


Abney, S. (1994). Parsing by Chunks.

Abney, S. (1987). The English Noun Phrase in its Sentential Aspect. Dissertation: MIT.

Aitchison, J. (2003). Words in the Mind: An Introduction to the Mental Lexicon, 3rd Ed. NY: Basil Blackwell.

Aitchison, J. (1998). The Articulate Mammal: An Introduction to Psycholinguistics, 4th Ed. NY: Routledge.

Alba, J. & Hasher, L. (1983). Is memory schematic? Psychological Bulletin, 93, 203-231.

Allan, K. (1986). Linguistic Meaning. London: Routledge & Kegan Paul.

Allen, J. (1995). Natural Language Understanding, 2nd Ed. Redwood City, CA: Benjamin/Cummings.

Altmann, G. (1998). Ambiguity in Sentence Processing. Trends in Cognitive Sciences 2(4).

Altmann, G. & Mirkovic, J. (2009). Incrementality and Prediction in Human Sentence Processing. Cognitive Science, 33, 583-609.

Altmann, G. & Steedman, M. (1988). Interaction with context during human sentence processing. Cognition, 30, 191-238.

Anderson, J. (2011). Development of the ACT-R Theory and System. Presentation at the ONR Cognitive Science Program Review. Arlington, VA.

Anderson, J. (2007). How Can the Human Mind Occur in the Physical Universe? NY: Oxford University Press.

Anderson, J. (2005). Comments on the release of ACT-R 6 at the ACT-R workshop.

Anderson, J. (1990). The Adaptive Character of Thought. Hillsdale, NJ: Erlbaum

Anderson, J. (1983a). A Spreading Activation Theory of Memory. Journal of Verbal Learning and Verbal Behavior, 22, 261-295.

Anderson, J. (1983b). The Architecture of Cognition. Cambridge, MA: Harvard University Press.

Anderson, J. (1980). Cognitive psychology and its implications. San Francisco: Freeman.

Anderson, J., Bothell, D., Byrne, M.D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of the mind. Psychological Review, 111, 1036-1060.

Anderson, J. & Lebiere, C. (2003). The Newell test for a theory of cognition. Behavioral and Brain Sciences, 26, 587-637.

Anderson, J. & Lebiere, C. (1998). The Atomic Components of Thought. Hillsdale, NJ: Erlbaum.

Andrews, S. (1996). Lexical retrieval and selection processes: Effects of transposed-letter confusability. Journal of Memory and Language, 35(6), 775-800.

Atkinson, R. & Shiffrin, R. (1968). Human memory: A proposed system and its control processes. In Spence, K. & Spence, J. (eds.), The psychology of learning and motivation, 89-195. NY: Academic Press.

Baddeley, A. (2000). The episodic buffer: a new component of working memory? Trends in Cognitive Sciences, 4(11), 417-423.

Baddeley, A. (2002). Is Working Memory Still Working? European Psychologist, 7(2), 85-97.

Baddeley, A. (2003). Working memory and language: an overview. Journal of Communication Disorders, 36, 189-208.

Ball, J. (2013a). The Advantages of ACT-R over Prolog for Natural Language Analysis. Proceedings of the 21st Annual Conference on Behavior Representation in Modeling and Simulation.

Ball, J. (2013b). Modeling the Binding of Implicit Arguments in Complement Clauses in ACT-R/Double-R. In R. West & T. Stewart (eds), Proceedings of the 12th International Conference on Cognitive Modeling, 84-88. Ottawa: Carleton University.

Ball, J. (revised 2013). Projecting Grammatical Features in Nominals: Cognitive Theory and Computational Model. Downloaded from

Ball, J. (2012a). Explorations in ACT-R Based Language Analysis – Memory Chunk Activation, Retrieval and Verification without Inhibition. In N. Russwinkel, U. Drewitz & H. van Rijn (eds), Proceedings of the 11th International Conference on Cognitive Modeling, 131-136. Berlin: Universitaetsverlag der TU Berlin.

Ball, J. (2012b). The Representation and Processing of Tense, Aspect & Voice across Verbal Elements in English. Proceedings of the 34th Annual Conference of the Cognitive Science Society. Sapporo, Japan: Cognitive Science Society.

Ball, J. (2011a). A Pseudo-Deterministic Model of Human Language Processing. In L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 495-500. Austin, TX: Cognitive Science Society.

Ball, J. (2011b). Explorations in ACT-R Based Cognitive Modeling – Chunks, Inheritance, Production Matching and Memory in Language Analysis. Proceedings of the AAAI Fall Symposium: Advances in Cognitive Systems, 10-17. Arlington, VA: AAAI.

Ball, J. (2010a). Projecting Grammatical Features in Nominals: Cognitive Processing Theory & Computational Implementation. Proceedings of the 19th Annual Conference on Behavior Representation in Modeling and Simulation. Charleston, SC: BRiMS.

Ball, J. (2010b). A Computational Model of Tense, Voice and Aspect. Downloaded from

Ball, J. (2010c). Simplifying the Mapping from Referring Expression to Referent in a Conceptual Semantics of Reference. In R. Catrambone & S. Ohlsson (eds.), Proceedings of the 32nd Annual Meeting of the Cognitive Science Society, 1577-1582. Boston, MA: Cognitive Science Society.

Ball, J. (2008). A Naturalistic, Functional Approach to Modeling Language Comprehension. Papers from the AAAI Fall 2008 Symposium, Naturally Inspired Artificial Intelligence. Menlo Park, CA: AAAI Press

Ball, J. (2007a). A Bi-Polar Theory of Nominal and Clause Structure and Function. Annual Review of Cognitive Linguistics, 27-54. Amsterdam: John Benjamins.

Ball, J. (2007b). Construction-Driven Language Processing. In S. Vosniadou, D. Kayser & A. Protopapas (Eds.) Proceedings of the 2nd European Cognitive Science Conference, 722-727. NY: LEA.

Ball, J. (2006). Can NLP Systems be a Cognitive Black Box? Papers from the AAAI Spring Symposium, Technical Report SS-06-02, 1-6. Menlo Park, CA: AAAI Press.

Ball, J. (2004). A Cognitively Plausible Model of Language Comprehension. Proceedings of the 13th Conference on Behavior Representation in Modeling and Simulation, 305-316. ISBN: 1-930638-35-3.

Ball, J. (2003). Beginnings of a Language Comprehension Module in ACT-R 5.0. Proceedings of the Fifth International Conference on Cognitive Modeling. Edited by F. Detje, D. Doerner and H. Schaub. Universitaets-Verlag Bamberg. ISBN 3-933463-15-7.

Ball, J. (1992). PM, Propositional Model, a Computational Psycholinguistic Model of Language Comprehension Based on a Relational Analysis of Written English. Ann Arbor, MI: UMI Dissertation Information Service.

Ball, J. (1985). A Consideration of Prolog. Report No. MCCS-85-171. Memoranda in Computer and Cognitive Science, Computing Research Laboratory, New Mexico State University, Las Cruces, NM 88003.

Ball, J., Heiberg, A. & Silber, R. (2007). Toward a Large-Scale Model of Language Comprehension in ACT-R 6. Proceedings of the 8th International Conference on Cognitive Modeling, 173-179. Edited by R. Lewis, T. Polk & J. Laird. NY: Psychology Press.

Ball, J., Freiman, M., Rodgers, S. & Myers, C. (2010). Toward a Functional Model of Human Language Processing. Proceedings of the 32nd Conference of the Cognitive Science Society.

Ball, J., Myers, C., Heiberg, A., Cooke, N., Matessa, M., Freiman, M. and Rodgers, S. (2010). The synthetic teammate project. Computational and Mathematical Organization Theory, 16(3), 271-299.

Balota, D., Pollatsek, A., & Rayner, K. (1985). The interaction of contextual constraints and parafoveal visual information in reading. Cognitive Psychology, 17, 364-390.

Barker, D. (2002). Microsoft Research Spawns a New Era in Speech Technology. PC AI, 16:6, 18-27.

Barsalou, L. (1999). Perceptual symbol systems. The Behavioral and Brain Sciences 22 (4): 577–660.

Barwise, J. & Perry, J. (1983). Situations and Attitudes. Cambridge, MA: The MIT Press.

Bates, E. & MacWhinney, B. (1987). Competition, variation and language learning. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Erlbaum, 157-194.

Bever, T. (1970). The cognitive basis for linguistic structures. In J. Hayes (ed.), Cognition and the development of language, 279-362. NY: Wiley.

Biber, D., Conrad, S. & Leech, G. (2002). Longman Student Grammar of Spoken and Written English. NY: Pearson ESL.

Biber, D., Johansson, S., Leech, G., Conrad, S. & Finegan, E. (1999, 2007). Longman Grammar of Spoken and Written English. Longman Group.

Binder, K., Pollatsek, A., & Rayner, K. (1999). Extraction of information to the left of the fixated word in reading. Journal of Experimental Psychology: Human Perception and Performance, 25, 1162-1172.

Boden, M. (2006). Mind as Machine: A History of Cognitive Science, 2 vols. Oxford: Oxford University Press.

Boden, M. (1992). Introduction. In The Philosophy of Artificial Intelligence. Edited by M. Boden. NY: Oxford University Press, 1-21.

Boland, J., Tanenhaus, M. & Garnsey, S. (1990). Evidence for the Immediate Use of Verb Control Information in Sentence Processing. Journal of Memory and Language, 29, 413-432.

Borer, H. (2011). Roots and categories. Keynote address at the Arizona Linguistics Circle 5 conference.

Borer, H. (2009). Roots and categories. Downloaded from

Borer, H. (2004). The grammar machine. In A. Alexiadou, E. Anagnostopoulou & M. Everaert (eds), The Unaccusativity Puzzle: Explorations of the Syntax-Lexicon Interface. Oxford: Oxford University Press.

Bothell, D. (2011). ACT-R 6.0 Reference Manual. Downloaded from

Bresnan, J. (1978). A Realistic Transformational Grammar. In Linguistic Theory and Psychological Reality. Edited by M. Halle, J. Bresnan & G. A. Miller. Cambridge, MA: The MIT Press.

Bybee, J. (2001). Phonology and Language Use. Cambridge University Press.

Byrne, M. (2007). Local Theories Versus Comprehensive Architectures, The Cognitive Science Jigsaw Puzzle. In Gray (ed), Integrated Models of Cognitive Systems. NY: Oxford University Press, 431-443.

Cann, R. (1999). Specifiers as secondary heads. In D. Adger, S. Pintzuk, B. Plunkett & G. Tsoulas (Eds.) Specifiers: Minimalist approaches 21-45. NY: Oxford.

Carver, R. (1973). Understanding, information processing and learning from prose materials. Journal of Educational Psychology, 64, 76-84.

Carver, R. (1973). Effect of increasing the rate of speech presentation upon comprehension. Journal of Educational Psychology, 65, 118-126.

Cassimatis, N., Bello, P. & Langley, P. (2008). Ability, breadth, and parsimony in computational models of higher-order cognition. Cognitive Science, 32, 1304-1322.

Chater, N. & Christiansen, M. (2007). Two views of simplicity in linguistic theory: which connects better with cognitive science? Trends in Cognitive Sciences, 11, 324-6.

Cheng, L. & Sybesma, R. (1998). Interview with James McCawley, University of Chicago. Glot International, 3(5).

Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press.

Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht-Holland: Foris.

Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.

Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.

Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory (2): 113–124.

Christianson, K., Hollingsworth, A., Halliwell, J. & Ferreira, F. (2001). Thematic roles assigned along the garden path linger. Cognitive Psychology, 42, 368-407.

Clark, E. & Clark, H. (1979). When Nouns Surface as Verbs. Language, 55(4), 767-811.

Clocksin, W. & Mellish, C. (1984). Programming in Prolog, 2nd Ed. NY: Springer-Verlag.

Collins, A. & Loftus, E. (1975). A spreading activation theory of semantic processing. Psychological Review, 82, 407-428.

Collins, A. & Quillian, M. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8, 240-248.

Collins, M. (2003). Head-Driven Statistical Models for Natural Language Parsing. Computational Linguistics, 29(4), 589-637.

Colmerauer, A. & Roussel, A. (1993). The birth of Prolog. ACM SIGPLAN Notices 28: 37.

Cook, V. J. & Newson, M. (1996). Chomsky’s Universal Grammar. Malden, MA: Blackwell Publishers.

Cooper, R. (2002). Modelling High-Level Cognitive Processes. Mahwah, NJ: LEA.

Copestake, A., Flickinger, D., Pollard, C. & Sag, I. (2006). Minimal Recursion Semantics: An Introduction. Research on Language and Computation, 3, 281-332.

Cowan, N. (2005). Working memory capacity. NY: Psychology Press.

Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87-185.

Crocker, M. (2005). Rational models of comprehension: addressing the performance paradox. In A. Cutler (Ed), Twenty-First Century Psycholinguistics: Four Cornerstones. Hillsdale, NJ: LEA.

Crocker, M. (1999). Mechanisms for Sentence Processing. In Garrod, S. & Pickering, M. (eds), Language Processing, London: Psychology Press.

Culicover, P. (2011). A Reconsideration of English Relative Clause Constructions. Constructions, 2.

Culicover, P. (2009). Natural Language Syntax. Oxford.

Culicover, P. & Jackendoff, R. (2005). Simpler Syntax. NY: Oxford University Press.

Daelemans, W., De Smedt, K. & Gazdar, G. (1992). Inheritance in Natural Language Processing. Computational Linguistics, 18(2), 205-218.

Dahlgren, K. & J. McDowell (1986). Using Commonsense Knowledge to Disambiguate Prepositional Phrase Modifiers. In Proceedings of AAAI 1986.

Dascal, M. (1987). Defending Literal Meaning. Cognitive Science, 11, pp. 259-281.

Davies, M. (2013).

De Vincenzi, M. (1991). Syntactic parsing strategies in Italian. Dordrecht: Kluwer Academic Publishers.

DenBuurman, R., Boersma, T., & Gerrissen, J. F. (1981). Eye movements and the perceptual span in reading. Reading Research Quarterly, 16, 227-235.

Dixon, R. (2005). A Semantic Approach to English Grammar. NY: Oxford University Press.

Douglass, S., Ball, J. & Rodgers, S. (2009). Large declarative memories in ACT-R. Proceedings of the 9th International Conference on Cognitive Modeling. Manchester, UK.

Eisenbach, A. & Eisenbach, M. (2006). phpSyntaxTree tool.

Ericsson, K. & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211-245.

Fass, D. (1988). Collative Semantics. Report No. MCCS-88-118. Memoranda in Computer and Cognitive Science, Computing Research Laboratory, New Mexico State University, Las Cruces, NM 88003.

Fauconnier, G. (1994). Mental Spaces: Aspects of Meaning Construction in Natural Language. NY: Cambridge University Press.

Ferreira, F. (2003). The misinterpretation of noncanonical sentences. Cognitive Psychology, 47, 164-203.

Ferreira, F. & Henderson, J. (1990). Use of Verb Information in Syntactic Parsing: Evidence from Eye Movements and Word-by-Word Self-Paced Reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 555-568.

Fillmore, C. (1988). The Mechanisms of Construction Grammar. BLS 14: 35-55.

Fodor, J. A. (1983). Modularity of Mind. Cambridge, MA: The MIT Press.

Fodor, J. A. (1981). Representations. Cambridge, MA: The MIT Press.

Fodor, J. A. (1975). The Language of Thought. NY: Crowell.

Fong, S. & Berwick, R. (2008). Treebank Parsing and Knowledge of Language: A Cognitive Perspective. In B. C. Love, K. McRae, & V. M. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society, 539-544. Austin, TX: Cognitive Science Society.

Ford, M. (1986). A Computational Model of Human Parsing Processes. In Advances in Cognitive Science 1. Edited by N. Sharkey. Chichester, England: Ellis Horwood Limited.

Ford, M., Bresnan, J. & Kaplan, R. (1982). A Competence-Based Theory of Syntactic Closure. In The Mental Representation of Grammatical Relations. Edited by J. Bresnan. Cambridge, MA: The MIT Press.

Forster, K. (1989). Basic Issues in Lexical Processing. In Lexical Representation and Process. Edited by W. Marslen-Wilson. Cambridge, MA: The MIT Press.

Forster, K. (1979). Levels of processing and the structure of the language processor. In Sentence Processing: Psycholinguistic Studies Presented to Merrill Garrett. Edited by W. Cooper & E. Walker. Hillsdale, NJ: LEA.

Frazier, L. (1990). Parsing modifiers: Special purpose routines in HSPM? In Comprehension Processes in Reading. Edited by D. Balota, G. Flores d'Arcais & K. Rayner. Hillsdale, NJ: LEA.

Frazier, L. (1989). Against Lexical Generation of Syntax. In Lexical Representation and Process. Edited by W. Marslen-Wilson. Cambridge, MA: The MIT Press.

Frazier, L. (1987). Sentence Processing: a Tutorial Review. In Attention and Performance XII. Edited by M. Coltheart. Hillsdale, NJ: LEA.

Frazier, L. & Clifton, C. (1996). Construal. Cambridge, MA: The MIT Press.

Frazier, L. & Fodor, J. D. (1978). The sausage machine: a new two-stage parsing model. Cognition, 6, pp. 291-328.

Frazier, L. & Rayner, K. (1982). Making and Correcting Errors during Sentence Comprehension: Eye Movements in the Analysis of Structurally Ambiguous Sentences. Cognitive Psychology, 14, pp. 178-210.

Freiman, M. & Ball, J. (2010). Improving the Reading Rate of Double-R-Language. In D. D. Salvucci & G. Gunzelmann (eds.), Proceedings of the 10th International Conference on Cognitive Modeling, 1-6. Philadelphia, PA: Drexel University.

Freiman, M., & Ball, J. (2008). Computational cognitive modeling of reading comprehension at the word level. Proceedings of the 38th Western Conference on Linguistics, 34-45. Davis, CA: University of California, Davis.

Freiman, M., Rodgers, S. & Ball, J. (in preparation). Building a Functional Mental Lexicon.

Fu, W.-T., Bothell, D., Douglass, S., Haimson, C., Sohn, M.-H. & Anderson, J. R. (2006). Toward a Real-Time Model-Based Training System. Interacting with Computers, 18(6), 1216-1230.

Gabbard, R., Marcus, M. & Kulick, S. (2006). Fully Parsing the Penn Treebank. In Proceedings of the HLT Conference of the NAACL, 184-191. NY: ACL.

Gal, A., Lapalme, G., Saint-Dizier, P. & Somers, H. (1991). Prolog for Natural Language Processing. Chichester, UK: Wiley.

Gazdar, G., Klein, E., Pullum, G. & Sag, I. (1985). Generalized Phrase Structure Grammar. Oxford: Basil Blackwell.

Gazdar, G. & Mellish, C. (1989). Natural Language Processing in Prolog. Addison-Wesley.

Gibbs, R. (1989). Understanding and Literal Meaning. Cognitive Science, 13, pp. 243-251.

Gibbs, R. (1986). What makes some indirect speech acts conventional. Journal of Memory and Language, 25, pp. 181-196.

Gibbs, R. (1984). Literal meaning and psychological theory. Cognitive Science, 8, pp. 275-304.

Gibbs, R. (1980). Spilling the beans on understanding and memory for idioms in conversation. Memory & Cognition, 8, pp. 149-156.

Gibson, E., & Pearlmutter, N. (1998). Constraints on sentence comprehension. Trends in Cognitive Sciences, 2(7), 262-268.

Glucksberg, S., Gildea, P. & Bookin, H. (1982). On Understanding Nonliteral Speech: Can People Ignore Metaphors? Journal of Verbal Learning and Verbal Behavior, 21, 85-98.

Goldberg, A. (2003). Constructions: a new theoretical approach to language. Trends in Cognitive Sciences. Vol. 7, No. 5, pp. 219-224.

Goldberg, A. (1995). A Construction Grammar Approach to Argument Structure. Chicago: The University of Chicago Press.

Gordon, P., & Hendrick, R.  (1998).  The representation and processing of coreference in discourse. Cognitive Science, 22, 389-424.

Grainger, J., O'Regan, J., Jacobs, A. & Segui, J. (1989). On the role of competing word units in visual word recognition: The neighborhood frequency effect. Perception and Psychophysics, 45, 189-195.

Gray, W. (2007). Integrated Models of Cognitive Systems. NY: Oxford University Press.

Gray, W. & Schoelles, M. (2003). The Nature and Timing of Interruptions in a Complex Cognitive Task: Empirical Data and Computational Cognitive Models. In Proceedings of the 25th Annual Meeting of the Cognitive Science Society, p. 37.

Grossberg, S. (1987). Competitive learning: From interactive activation to adaptive resonance. Cognitive Science, 11, 23-63.

Grosz, B., Joshi, A. & Weinstein, S. (1995). Centering: A Framework for Modeling the Local Coherence of Discourse. Computational Linguistics, 21(2), 203-225.

Guerrera, C. (2004). Flexibility and constraint in lexical access: Explorations in transposed-letter priming. Unpublished dissertation, Department of Psychology, University of Arizona.

Halliday, M. & Matthiessen, C. (2004). An Introduction to Functional Grammar. NY: Oxford University Press.

Hawkins, J. (2004). Efficiency and Complexity in Grammars. Oxford.

Heiberg, A., Harris, J., & Ball, J. (2007). Dynamic Visualization of ACT-R Declarative Memory Structure. Proceedings of the 8th International Conference on Cognitive Modeling, 233-234. Oxford, UK: Taylor & Francis/Psychology Press.

Henderson, J. & Ferreira, F. (eds.) (2004). The Interface of Language, Vision and Action: Eye Movements and the Visual World. NY: Psychology Press.

Hobbs, J. (2003). Discourse and inference. Retrieved from

Hobbs, J. (1985). Ontological promiscuity. Proceedings, 23rd Annual Meeting of the Association for Computational Linguistics, 61-69.

Holmes, V. (1987). Syntactic parsing: In search of the garden path. In Attention and Performance XII. Edited by M. Coltheart. Hillsdale, NJ: LEA.

Holmes, V., Stowe, L. & Cupples, L. (1989). Lexical Expectations in Parsing Complement-Verb Sentences. Journal of Memory and Language, 28, 668-689.

Huddleston, R. & Pullum, G. (2005). A Student’s Introduction to English Grammar. NY: Cambridge University Press.

Huddleston, R. & Pullum, G. (2002). The Cambridge Grammar of the English Language. Cambridge, UK: Cambridge University Press.

Jackendoff, R. (2007). Language, Consciousness, Culture, Essays on Mental Structure. Cambridge, MA: The MIT Press.

Jackendoff, R. (2002). Foundations of Language. Oxford.

Jackendoff, R. (1991). Semantic Structures. Cambridge, MA: The MIT Press.

Jackendoff, R. (1983). Semantics and Cognition. Cambridge, MA: The MIT Press.

Jackendoff, R. (1977) X-Syntax: A study of phrase structure. Linguistic Inquiry Monograph 2. Cambridge, MA: The MIT Press.

Jensen, K. & J-L. Binot (1986). Disambiguating Prepositional Phrase Attachment by Using On-Line Dictionary Definitions. Report RC 12148 (#54633), Computer Sciences Department, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598.

Johnson-Laird, P. (1983). Mental Models. Cambridge, MA: Harvard University Press.

Joshi, A. (1987). Introduction to Tree Adjoining Grammars. In A. Manaster-Ramer (ed). Mathematics of Language. Amsterdam: John Benjamins.

Joshi, A. (1985). How much context-sensitivity is necessary for characterizing structural descriptions? In D. Dowty, L. Karttunen & A. Zwicky (eds.) Natural Language Parsing: Theoretical, Computational and Psychological Perspectives. NY: Cambridge University Press, 206-250.

Joshi, A., Prasad, R. & Miltsakaki, E. (2005). Anaphora Resolution: A Centering Approach. Encyclopedia of Language and Linguistics, 2nd edition.

Just, M. & Carpenter, P. (1992). A Capacity Theory of Comprehension: Individual Differences in Working Memory. Psychological Review, 99 (1), 122-149.

Just, M. & Carpenter, P. (1987). The Psychology of Reading and Language Comprehension. Boston: Allyn and Bacon, Inc.

Kamp, H. & Reyle, U. (1993). From Discourse to Logic. Dordrecht: Kluwer.

Kaplan, J. (1989, 1995). English Grammar: Principles and Facts. Prentice Hall.

Kayne, R. (1994). The Antisymmetry of Syntax. Cambridge, MA: The MIT Press.

Kilgarriff, A. (1997). I don’t believe in word senses. Computers and the Humanities, 31(2), 91-113.

Kim, A., Srinivas, B. & Trueswell, J. (2002). The convergence of lexicalist perspectives in psycholinguistics and computational linguistics. In Merlo, P. & Stevenson, S. (eds), Sentence Processing and the Lexicon: Formal, Computational and Experimental Perspectives, 109-135. Philadelphia, PA: Benjamins Publishing Co.

Kintsch, W. (2008). Symbol systems and perceptual representations. In M. De Vega, A. Glenberg & A. Graesser (Eds), Symbols and Embodiment, 145-164. Oxford: Oxford University Press.

Kintsch, W. (2001). Predication. Cognitive Science, 25, 173-202.

Kintsch, W. (1998). Comprehension: A Paradigm for Cognition. NY: Cambridge University Press.

Kintsch, W. & Mangalath, P. (2011). The Construction of Meaning. Topics in Cognitive Science, 3(2), 346-370.

Kipper, K., Korhonen, A., Ryant, N. & Palmer, M. (2008). A Large-scale Classification of English Verbs. Language Resources and Evaluation, 42(1), 21-40.

Kornai, A. & Pullum, G. (1990). The X-Bar Theory of Phrase Structure. Language, 66, 24-50.

Kowalski, R. (1982). Logic as a computer language. In Logic Programming. Edited by K. Clark and S. Tarnlund. NY: Academic Press.

Kucera, H. & Francis, W. (1967). A computational analysis of present-day American English. Providence, RI: Brown University Press.

Lakoff, G. (1987). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: The University of Chicago Press.

Langacker, R. (1987, 1991). Foundations of Cognitive Grammar, Vols. 1 and 2. Stanford: Stanford University Press.

Levelt, W. (1989). Speaking: From Intention to Articulation. Cambridge, MA: The MIT Press.

Levelt, W. (1972). Some psychological aspects of linguistic data. Linguistische Berichte, 17, 18-30.

Levelt, W. (1970). A scaling approach to the study of syntactic relations. In G. B. Flores d’Arcais & W. Levelt (eds.), Advances in Psycholinguistics, 109-121. NY: Elsevier.

Levelt, W. & Kempen, G. (1975). Semantic and Syntactic Aspects of Remembering Sentences: A Review of Some Recent Continental Research. In A. Kennedy & W. Wilkes (Eds.), Studies in Long Term Memory, 201-216. New York: Wiley.

Levin, B. (1993). English Verb Classes and Alternations: A Preliminary Investigation. Chicago: University of Chicago Press.

Lewis, R. (1998). Leaping off the garden path: Reanalysis and limited repair parsing. In J. D. Fodor & F. Ferreira (Eds.), Reanalysis in Sentence Processing. Boston: Kluwer Academic.

Loosen, F. (1972). Cognitieve organisatie van zinnen in het geheugen [Cognitive organization of sentences in memory]. Dissertation, Leuven.

Lyons, J. (1977). Semantics, Volumes 1 & 2. Cambridge, England: Cambridge University Press

Manning, C. (2007). Machine Learning of Language from Distributional Evidence. Downloaded from on 22 Mar 09.

Marangolo, P., Nasti, M. & Zorzi, M. (2004). Selective impairment for reading numbers and number words: a single case study. Neuropsychologia, 42(8), 997-1006.

Marcus, M. (1980). A Theory of Syntactic Recognition for Natural Language. Cambridge, MA: The MIT Press.

Marcus, M., Santorini, B. & Marcinkiewicz, M. (1993). Building a Large Annotated Corpus of English: the Penn Treebank. Computational Linguistics, 19(2): 313-330.

Marr, D. (1992). Artificial Intelligence: A Personal View. In The Philosophy of Artificial Intelligence. Edited by M. Boden. NY: Oxford University Press, 133-146.

Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: Freeman.

Marslen-Wilson, W. & Tyler, L. (1987). Against Modularity. In Modularity in Knowledge Representation and Natural-Language Understanding. Edited by J. Garfield. Cambridge, MA: The MIT Press.

Martinich, A. (ed.) (1985). The Philosophy of Language. New York: Oxford University Press.

Matuszek, C., Cabral, J., Witbrock, M. & DeOliveira, J. (2006). An Introduction to the Syntax and Content of Cyc. Proceedings of the 2006 AAAI Spring Symposium on Formalizing and Compiling Background Knowledge and Its Applications to Knowledge Representation and Question Answering. Stanford, CA.

McClelland, J. (1987). The Case for Interaction in Language Processing. In Attention and Performance XII. Edited by M. Coltheart. Hillsdale, NJ: LEA.

McClelland, J. & Rumelhart, D. (1981). An interactive activation model of context effects in letter perception: I. An account of basic findings. Psychological Review, 88(5), 375-407.

McCloskey, M., Caramazza, A. & Basili, A. (1985). Cognitive mechanisms in number processing and calculation: Evidence from dyscalculia. Brain and Cognition, 4, 171-196.

McConkie, G. & Rayner, K. (1975). The span of the effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578-586.

McCusker, L., Gough, P. & Bias, R. (1981). Word recognition inside out and outside in. Journal of Experimental Psychology: Human Perception and Performance, 7(3), 538-551.

McShane, M. (2012). Resolving Elided Scopes of Modality in OntoAgent. Advances in Cognitive Systems, 2, pp. 95-112.

McShane, M. (2009). Advances in Difficult Aspects of Reference Resolution. Working Papers 01-09. Institute for Language and Information Technologies, University of Maryland Baltimore County.

McShane, M., Nirenburg, S. & Beale, S. (in press). Meaning-Centric Language Processing.

Meyer, D., & Schvaneveldt, R. (1971). Facilitation in recognizing pairs of words: evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, pp. 227-234.

Miller, G. (1956). The Magical Number Seven, Plus or Minus Two. Psychological Review, 63, 81-94.

Miller, G., Beckwith, R., Fellbaum, C., Gross, D. & Miller, K. (1990). WordNet: An online lexical database. International Journal of Lexicography, 3(4), 235-244.

Miltsakaki, E. (2003). The Syntax-Discourse Interface: Effects of the Main-Subordinate Distinction on Attention Structure. Unpublished doctoral dissertation: University of Pennsylvania.

Miltsakaki, E. (2002). Toward an Aposynthesis of Topic Continuity and Intrasentential Anaphora. Computational Linguistics, 28(3), 319-355.

Mitchell, D. (1987). Lexical Guidance in Human Parsing: Locus and Processing Characteristics. In Attention and Performance XII. Edited by M. Coltheart. Hillsdale, NJ: LEA.

Mitchell, D. & Holmes, V. (1985). The role of specific information about the verb in parsing sentences with local structural ambiguity. Journal of Memory and Language, 24, 542-559.

New, B., Ferrand, L., Pallier, C., and Brysbaert, M. (2006). Reexamining the word length effect in visual word recognition: New evidence from the English Lexicon Project, Psychonomic Bulletin & Review 13 (1), 45-52.

Newell, A. (1973). You can't play 20 questions with nature and win: Projective comments on the papers of this symposium. In W. G. Chase (ed.), Visual Information Processing. New York: Academic Press.

Newell, A. (1990). Unified theories of cognition. Harvard University Press.

O'Grady, W., Archibald, J., Aronoff, M. & Rees-Miller, J. (2001). Contemporary Linguistics, An Introduction. Bedford/St. Martin's.

O'Regan, J. & Levy-Schoen, A. (1987). Eye-movement Strategy and Tactics in Word Recognition and Reading. In Attention and Performance XII. Edited by M. Coltheart. Hillsdale, NJ: LEA.

Ortony, A., Schallert, D., Reynolds, R. & Antos, S. (1978). Interpreting Metaphors and Idioms: Some Effects of Context on Comprehension. Journal of Verbal Learning and Verbal Behavior, 17, 465-477.

Paap, K., Newsome, S., McDonald, J., & Schvaneveldt, R. (1982). An Activation-Verification Model of Letter and Word Recognition: The Word-Superiority Effect. Psychological Review, 89, 573-594.

Palmer, M., Kingsbury, P., Gildea, D. (2005). The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics 31 (1): 71–106.

Perea, M., & Lupker, S. (2004). Can CANISO activate CASINO? Transposed-letter similarity effects with nonadjacent letter positions, Journal of Memory and Language, 51, 231–246.

Perea, M., & Lupker, S. (2003). Does jugde activate COURT? Transposed-letter similarity effects in masked associative priming. Memory and Cognition, 31, 829- 841.

Pereira, F. & Warren, D. (1980). Definite clause grammars for language analysis – A Survey of the Formalism and a Comparison with Augmented Transition Networks. Artificial Intelligence, 13, 231-278.

Polk, T. & Farah, M. (1995). Late Experience Alters Vision, Nature, 376, 6542, 648-649.

Pollatsek, A. & Rayner, K. (1990). Eye movements and lexical access in reading. In Comprehension Processes in Reading. Edited by D. Balota, G. Flores d'Arcais & K. Rayner. Hillsdale, NJ: LEA.

Prabhakaran, V., Narayanan, K., Zhao, Z. & Gabrielli, J. (2000). Integration of diverse information in working memory in the frontal lobe. Nature Neuroscience, 3, 85-90.

Pullum, G. (1991). English nominal gerund phrases as noun phrases with verb-phrase heads. Linguistics, 29, 763-799.

Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. (1985). A Comprehensive Grammar of the English Language. Essex, England: Longman Group.

Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J (1972). A Grammar of Contemporary English. Longman Group.

Rawlinson, G. E. (1976) The significance of letter position in word recognition. Unpublished PhD Thesis, Psychology Department, University of Nottingham, Nottingham UK.

Rayner, K., Carlson, M. & Frazier, L. (1983). The interaction of syntax and semantics during sentence processing: Eye movements in the analysis of semantically biased sentences. Journal of Verbal Learning and Verbal Behavior, 22, 358-374.

Rayner, K. (1975). The perceptual span and peripheral cues in reading. Cognitive Psychology, 7, 65-81.

Rayner, K. (1986). Eye movements and the perceptual span in beginning and skilled readers. Journal of Experimental Child Psychology, 41, 211-236.

Reichle, E., Rayner, K. & Pollatsek, A. (2003). The E-Z Reader model of eye movement control in reading: Comparison to other models. Behavioral and Brain Sciences, 26, 445-476.

Reichle, E., Warren, T. & McConnell, K. (2009). Using E-Z Reader to model the effects of higher level language processing on eye movements during reading. Psychonomic Bulletin & Review, 16(1), 1-21.

Richardson, M. & Domingos, P. (2006). Markov logic networks. Machine Learning, 62, 107-136.

Robinson, J. (1965). A machine-oriented logic based on the resolution principle. Journal of the ACM, 12(1), 23-41.

Roelofs, A. (2005). From Popper to Lakatos: A case for cumulative computational modeling. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones, 313-330. Hillsdale, NJ: LEA.

Rodgers, S., Myers, C., Ball, J. & Freiman, M. (2012). Toward a Situation Model in a Cognitive Architecture. Computational and Mathematical Organization Theory.

Rosch, E. (1975). Cognitive Representations of Semantic Categories. Journal of Experimental Psychology: General, 104, 192-233.

Sag, I. (2010). Sign-Based Construction Grammar: An informal synopsis. In H. Boas & I. Sag (Eds.), Sign-Based Construction Grammar. Stanford: CSLI.

Sag, I. (2009). Feature Geometry and Predictions of Locality. In A. Kibort & G. Corbett (Eds.), Features: Perspectives on a Key Notion in Linguistics. Oxford: Clarendon Press.

Sag, I. (2007). Remarks on Locality. Proceedings of the HPSG07 Conference.

Sag, I., Baldwin, T., Bond, F., Copestake, A., & Flickinger, D. (2002). Multiword expressions: A pain in the neck for NLP. Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics.

Sag, I., Wasow, T., & Bender, E. (2003). Syntactic Theory, a Formal Introduction, Second Edition. Stanford: CSLI Publications.

Schubert, L. (1984). On parsing preferences. In Proceedings of COLING 1984.

Schubert, L. (1986). Are there preference trade-offs in attachment decisions. In Proceedings of AAAI 1986.

Schvaneveldt, R. & Meyer, D. (1973). Retrieval and Comparison Processes in Semantic Memory. In Attention and Performance IV. Edited by S. Kornblum. NY: Academic Press.

Searle, J. (1979). Metaphor. In Metaphor and Thought pp. 92-123. Edited by A. Ortony. Cambridge, England: Cambridge University Press.

Seidenberg, M. & McClelland, J. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96(4), 523-568.

Shen, L. (2006). Statistical LTAG Parsing. Unpublished dissertation, University of Pennsylvania.

Shen, L. & Joshi, A. (2005). Incremental LTAG Parsing. In Proceedings of the Conference on Human Language Technology and Empirical Methods in NLP, 811-818.

Sidner, C. (1979). Toward a computational theory of definite anaphora comprehension in English (Technical Report AI-TR-537). Cambridge, MA: MIT Artificial Intelligence Laboratory.

Somers, H. (1990). Brief note on Syntax and Semantics 21: Thematic Relations. Edited by W. Wilkins. Computational Linguistics, 16, 124-125.

Steedman, M. (1989). Grammar, Interpretation, and Processing from the Lexicon. In Lexical Representation and Process. Edited by W. Marslen-Wilson. Cambridge, MA: The MIT Press.

Taatgen, N. & Anderson, J. (2008). ACT-R. In R. Sun (ed.), Constraints in Cognitive Architectures. Cambridge University Press, pp 170-185.

Talmy, L. (2003). Toward a Cognitive Semantics. Cambridge, MA: The MIT Press

Taraban, R. & J. McClelland (1990). Parsing and Comprehension: A Multiple- Constraint View. In Comprehension Processes in Readin. Edited by D. Balota, G. F. d'Arcais & K. Rayner. Hillsdale, NJ: LEA.

Taraban, R. & J. McClelland (1988). Constituent Attachment and Thematic Role Assignment in Sentence Processing: Influences of Content-Based Expectations. Journal of Memory and Language, 27, pp. 597-632.

Tanenhaus, M. & Carlson, G. (1989). Lexical Structure and Language Comprehension. In Lexical Representation and Process. Edited by W. Marslen-Wilson. Cambridge, MA: The MIT Press.

Tanenhaus, M., Chambers, C. & Hanna, J. (2004). Referential Domains in Spoken Language Comprehension. In Henderson & Ferreira (eds.), The Interface of Language, Vision and Action. NY: Psychology Press.

Tanenhaus, M., Spivey-Knowlton, M., Eberhard, K., & Sedivy, J. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632-1634.

Tanenhaus, M., Stowe, K. & Carlson, G. (1985). Lexical expectations and pragmatics in parsing filler-gap constructions. In Proceedings of the Seventh Annual Meeting of the Cognitive Science Society.

Taylor, J. (2000). Possessives in English. Oxford: Oxford University Press.

Taylor, S. (1965). Eye movements while reading: Facts and fallacies. American Educational Research Journal, 2, 187-202.

Tenenbaum, J. (2007). Explorations in Language Learnability Using Probabilistic Grammars of Child Directed Speech. Downloaded 22 Mar 08.

Townsend, D. & Bever, T. (2001). Sentence Comprehension, the Integration of Habits and Rules. Cambridge, MA: The MIT Press.

Tyler, L. (1989). The Role of Lexical Representation in Language Comprehension. In Lexical Representation and Process. Edited by W. Marslen-Wilson. Cambridge, MA: The MIT Press.

Van Dijk, T. and Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.

Van Eynde, F. (2006). NP-internal agreement and the structure of the noun phrase. Journal of Linguistics 42, 139-186.

Vosse, T. & Kempen, G. (2000). Syntactic structure assembly in human parsing: a computational model based on competitive inhibition and a lexicalist grammar. Cognition, 75, 105-143.

Wilkins, W. (ed.) (1988). Syntax and Semantics 21: Thematic Relations. NY: Academic Press.

Wilks, Y. (1975). A preferential pattern-seeking semantics for natural language inference. Artificial Intelligence, 6(1), pp. 53-74.

Wilks, Y. (1975a). Preference Semantics. In Formal Semantics. Edited by E. Keenan. NY: Cambridge University Press.

Wilks, Y. (1975b). An Intelligent Analyzer and Understander of English. Communications of the ACM, 18, pp. 264-274.

Wilks, Y. (1972). Grammar, Meaning and the Machine Analysis of Language. London: Routledge & Kegan Paul.

Wilks, Y. & Gomez, R. (1988). New Mexico State University’s Computing Research Laboratory. AI Magazine, 9 (1), 79-94.

Wilks, Y., Huang, X. & Fass, D. (1985). Syntax, Preference and Right Attachment. In Proceedings of the 9th International Joint Conference on Artificial Intelligence (IJCAI-85). Los Angeles, CA, pp. 779-784.

Young, R. (2003). Cognitive architectures need compliancy, not universality. Commentary on Anderson & Lebiere (2003).

Zipf, G. (1932). Selected Studies of the Principle of Relative Frequency in Language. Cambridge, MA: Harvard University Press.

Zipf, G. (1949). Human behavior and the principle of least effort: An introduction to human ecology. Cambridge, MA: Addison-Wesley.

Zwaan, R. & Radvansky, G. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123, 162-185.


Appendix A: Implementation Details

Basic Terminology

Referring Expression

Every referring expression has a bind-indx to support binding of co-referential expressions.

Clause Level Grammatical Function

Predicate (Head of Clause) Level Grammatical Function

Nominal Level Grammatical Function

Clause Level Grammatical Feature

Nominal Level Grammatical Feature


Lexical and Grammatical Hierarchy (Chunk Types)

The Full Ontology

Parts of Speech

Referring Expressions

Object Referring Expressions

Situation Referring Expressions

Oblique Referring Expressions

Predicate Verb Types

The Goal Hierarchy

Double-R includes a hierarchy of goal types that is used to control processing. The use of a goal hierarchy is uncommon for ACT-R models; it is more common to use a state slot to control the execution of productions. However, this practice is unfortunate. The use of a state slot effectively transforms a production system architecture into a procedural programming language. The real power of a production system is that any production is eligible to fire at any time, so long as it matches the context. The introduction of a state slot effectively reduces the number of productions which match the context to just one (or a small number). Whereas the use of a state slot requires an exact match for a production to be eligible to fire, the use of a goal hierarchy allows productions to match the context at different levels of abstraction.
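The contrast can be sketched in Python. This is a hedged illustration, not the model's actual productions; the goal-type names and the isa helper are invented for the example, and only the matching principle comes from the text: a state slot demands exact equality, while a goal hierarchy lets an abstract production condition match any subtype of its goal type.

```python
# Illustrative sketch only: matching a goal against production conditions
# at different levels of abstraction vs. an exact-match state slot.
# The goal-type names below are hypothetical.

# A small goal-type hierarchy: child -> parent
GOAL_HIERARCHY = {
    "integrate-word": "comprehend-language",
    "project-construction": "comprehend-language",
    "comprehend-language": "goal",
}

def isa(goal_type, condition_type):
    """True if goal_type is condition_type or a descendant of it."""
    while goal_type is not None:
        if goal_type == condition_type:
            return True
        goal_type = GOAL_HIERARCHY.get(goal_type)
    return False

# A state slot requires an exact match for a production to fire...
state_slot_match = ("integrate-word" == "comprehend-language")   # False

# ...whereas a goal hierarchy lets a production stated at an abstract
# level (comprehend-language) match a more specific current goal.
hierarchy_match = isa("integrate-word", "comprehend-language")   # True
```

Under this scheme a single abstract production can remain eligible across many specific goals, preserving the open matching that gives a production system its power.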


Linguistic Representation Details (Chunks)

This section discusses the chunk representations that are the actual representations used in Double-R, not the abstracted tree representations that are displayed elsewhere in this document. These chunk representations rely on the representational capabilities of the declarative memory system of the ACT-R Cognitive Architecture.

Representing Referring Expressions in Double-R

In the main body of this document, we discuss the representation of referring expressions with diagrammatic representations generated from the output of Double-R using a tool called phpSyntaxTree (Eisenbach & Eisenbach, 2006; Heiberg, Harris & Ball, 2007a). These representations are simplified in various respects. In this section we present the full details of the representations using the representational framework of ACT-R.

ACT-R provides a frame-based notation for representing declarative knowledge. Frames in ACT-R are called chunks (a psychological term used to describe declarative memory elements). Chunks are organized into a hierarchy of chunk-types. The definition of a chunk-type specifies the type and its location within the type hierarchy. The chunk-type inherits slots and default values from its ancestors and can specify additional slots and default values. When a chunk of a given chunk-type is created, the chunk is given a unique name and the default values can be overridden.

Referring expressions (refer-expr) inherit from a construction-with-head chunk-type. The construction-with-head chunk-type includes a head slot with the default value head-indx which indicates the expectation for a head. The referring-expression chunk-type adds a specifier slot with the default value none and a bind-indx slot with the same default value. When a referring expression chunk is created via lexical projection, the bind-indx slot will be assigned a value which will either be a unique index or a previously assigned index (indicating co-reference). If the referring expression is projected by a lexical item that functions as a specifier (or operator), this lexical item will be integrated into the specifier (or operator) slot. If the referring expression is projected by a lexical item that functions as a head, this lexical item will be integrated into the head slot. Depending on the context, additional slots may be assigned values when the referring expression is projected. For situation referring expressions, this often includes the subject as well as the specifier (or operator) or head.
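The chunk-type inheritance just described can be sketched in Python. This is a hedged illustration: the class machinery and the make_chunk helper are inventions for expository purposes; only the slot names and default values (head-indx, none, bind-indx) come from the text.

```python
# A minimal sketch of ACT-R style chunk-types (not actual ACT-R code).

class ConstructionWithHead:
    # the default value head-indx indicates the expectation for a head
    defaults = {"head": "head-indx"}

class ReferExpr(ConstructionWithHead):
    # refer-expr adds specifier and bind-indx slots, both defaulting to none
    defaults = {**ConstructionWithHead.defaults,
                "specifier": "none", "bind-indx": "none"}

def make_chunk(chunk_type, name, overrides):
    """Create a named chunk, inheriting defaults and overriding as given."""
    slots = dict(chunk_type.defaults)
    slots.update(overrides)
    return {"name": name, "isa": chunk_type.__name__, **slots}

# Lexical projection of an object referring expression from a specifier:
# a fresh bind index is assigned and the specifier slot is filled, while
# the inherited head slot keeps its default, signalling the expectation.
chunk = make_chunk(ReferExpr, "obj-refer-expr-1",
                   {"specifier": "the-determiner-wf-pos",
                    "bind-indx": "indx-1"})
```

The unfilled head slot retaining its head-indx default is what drives the expectation for a head during subsequent processing.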

Referring expression chunks encode an ordered sequence of grammatical functions (reflecting normal surface order), followed by an unordered sequence of grammatical features. Each grammatical function or feature in the referring expression chunk is represented as a slot name-slot value pair. The slot name provides the name of the function/feature and the slot value is either the name of a chunk that represents the value for the function/feature, or a literal (string or number). Using the name of a chunk as the value of a slot introduces a level of indirection into ACT-R based representations. The contents of the chunk are not directly accessible from the chunk name. The chunk is a flat, two-level structure. This convention aligns with approaches like Minimal Recursion Semantics (Copestake et al., 2006) which use indices to support the encoding of complex, nested semantic representations in a flat logical notation. It contrasts with linguistic approaches like HPSG (Sag, Wasow & Bender, 2003) and SBCG (Sag, 2009; Sag, 2010) in which the value of a slot is not the name of a nested structure, but the structure itself. Despite this important distinction (cf. Ball, 2011b), the diagrammatic representations omit this level of indirection, displaying the referring expressions themselves (or other constructions) as the value of a slot. For example, in

the altitude restrictions

the object referring expression (obj-refer-expr) chunk contains a specifier (spec) function slot. The specifier function is filled by the name of a determiner chunk. Display of the determiner chunk name, the-determiner-wf-pos, is suppressed in the diagram. The determiner chunk has a word slot that has the value the-word. The-word is a special chunk that has no slots. Chunks without slots function like literals (i.e. they are the leaf nodes in the representations), except that they can spread activation to other chunks in declarative memory whereas literals cannot (cf. Ball, 2012a). To further simplify the diagrams, display of the word slot is also suppressed.
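The flat, indexed representation can be sketched as a table of named chunks. This is an illustration under assumptions: the chunk names and the modifier slot are stand-ins, not the model's actual output. The point it shows is the level of indirection: slot values are chunk names, so every chunk stays a flat, two-level structure, as in Minimal Recursion Semantics, rather than a nested structure as in HPSG.

```python
# Hedged sketch: declarative memory as a dict of flat chunks whose slot
# values are the *names* of other chunks (chunk names are illustrative).

memory = {
    "the-word":                 {},  # slotless chunk: a literal-like leaf
    "the-determiner-wf-pos":    {"word": "the-word"},
    "altitude-noun-wf-pos":     {"word": "altitude-word"},
    "restrictions-noun-wf-pos": {"word": "restrictions-word"},
    "obj-refer-expr-1": {
        "spec":      "the-determiner-wf-pos",  # a name, not a nested chunk
        "modifier":  "altitude-noun-wf-pos",
        "head":      "restrictions-noun-wf-pos",
        "bind-indx": "indx-1",
    },
}

def resolve(chunk_name, slot):
    """Follow one level of indirection: fetch the chunk a slot names."""
    return memory[memory[chunk_name][slot]]

# The contents of the specifier chunk are not directly accessible from
# the obj-refer-expr chunk; they require a lookup by name.
spec_chunk = resolve("obj-refer-expr-1", "spec")
```

In ACT-R terms each such lookup corresponds to a declarative memory retrieval, which is why the indirection matters for processing, not just for notation.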

Shown below are the ACT-R chunks created or retrieved during the processing of the expression the altitude restrictions. This corresponds to the simplified diagrammatic representation shown above with the entries in that diagram highlighted (and colored) for cross-reference.
ACT-R chunks for the altitude restrictions

As can be seen, the ACT-R chunks encode more information than is shown in the diagram. We will not attempt to motivate all the slots in this document other than to say that they are the result of a grammatical and lexical analysis interacting with the representational capabilities of ACT-R. Many of the slots in the lexical chunks (those ending in wf-pos which stands for word form specific part of speech) including letter-1 to letter-11, trigram-1 to trigram-11 and p-trigram-1 to p-trigram-6 (i.e. peripheral trigram), are there to support ACT-R's spreading activation mechanism for DM retrieval. Since ACT-R chunks cannot have dynamically added slots, each chunk has the maximum number of these slots.

Although the ACT-R chunks are definitive, they are difficult to examine, and diagrammatic representations have been used in this document for expository purposes. However, it is important to make a few points: In addition to referring expressions like the altitude restrictions which refer to objects, there are referring expressions which refer to situations. For example, the expression the book is on the table refers to a situation in which a book is on a table. This situation referring expression expresses a relationship of being on that exists between two objects which are themselves expressed as object referring expressions. In this example, the auxiliary verb is functions as a specifier for a situation referring expression, just like the functions as a specifier for an object referring expression. Whereas the indicates definite reference to an object with respect to a situation (and context), is indicates definite reference to a situation with respect to the temporal context of the situation as indicated by the present tense of is. The terms clause or sentence are often used to describe linguistic expressions which refer to situations. These terms are less problematic than form-based terms like NP, VP, IP, CP, and they have been used in this document for expository convenience. Sentences also prove important for the handling of intra-sentential vs. inter-sentential co-reference. We use the term argument as a generic cover term for referring expressions that participate in a situation referring expression. Arguments may be categorized in terms of their grammatical function as subject, object, indirect object and complement (i.e. an argument other than subject, object or indirect object).
In syntactic treatments, the term complement is often used much as we use the term argument. However, once functional heads (which we reject) are introduced, the syntactic use of complement loses its semantic motivation and its correspondence with our use of argument.

Situation referring expressions are typically headed by words which express properties of objects (e.g. (predicate) adjectives), actions involving objects (e.g. intransitive verbs) or relations between objects (e.g. transitive verbs, (predicate) prepositions). Situations headed by predicate nominals differ in that the head is not a relation or property. Although predicate nominals function like properties, they are referring expressions; however, they are co-referential with the subject (i.e. they share the same bind index as the subject). A possible exception is bare nouns like president in expressions like Obama is president. In this case, the noun functions as the head of the situation referring expression (very much like a predicate adjective) and need not project an object referring expression that is co-referential with the subject.

Double-R projects the minimal structure necessary to represent a linguistic expression. In the case of predicate adjectives and predicate intransitive verbs which do not combine with an object, the adjective or intransitive verb is directly integrated into the representation as in he is happy:

he is happy

However, Double-R also handles the case where more structure is needed as in he is very happy:

he is very happy

This is achieved in Double-R by projecting a predicate-adjective or predicate-intransitive-verb construction in parallel with the integration of the adjective or intransitive verb. When the more complex construction is needed, it replaces the simpler structure using the context accommodation mechanism. It is also possible to project the more complex structure only when it is needed, but this results in slower processing (Ball, 2011b).


Language Specific Buffers

ACT-R comes with a collection of buffers (e.g. retrieval buffer, imaginal buffer, goal buffer) that constitute (at least part of) its working memory (Ball, 2012). These buffers have proved inadequate to support language analysis, especially with respect to modeling binding and co-reference, and we have added a collection of language (and grammatical function) specific buffers which include subject, object, indirect object, wh-focus, relative-focus and locative-focus. The existence of these buffers is motivated on functional grounds. They are needed to support binding and co-reference, and language processing more generally. Although Taatgen & Anderson (2008) argue on theoretical grounds for limiting functionality and keeping ACT-R tightly constrained, functional considerations are important in the creation of complex cognitive models and may have theoretical implications as well (Ball, 2011b, 2012). Whereas a model which focused on a particular aspect of binding or co-reference might make do with the existing buffers, a broad coverage model like Double-R that is intended to be functional as well as cognitively plausible simply does not have the needed architectural resources. Fortunately, ACT-R 6 supports the addition of buffers (and modules) as a mechanism for extending ACT-R and we have taken advantage of this capability in our research.

The language specific buffers that have been added to ACT-R give the language analysis capabilities of Double-R the flavor of a language module. However, we do not claim that these buffers are fully encapsulated within a language module, and the language analysis productions in the procedural module which access these buffers are interleaved with productions which perform other cognitive functions, and which may also access these buffers. We also do not claim that these buffers are innate. For example, in languages like Chinese (unlike English), wh-words occur in normal argument position and a wh-focus buffer may not be needed. We do claim that humans are capable of learning how to buffer information that may be subsequently needed — a form of expertise. Binding and co-reference in language analysis provide concrete examples of this need.

To see how these language specific buffers are needed to support Double-R’s binding mechanism, consider the processing of
  1. I want it
  2. I want to eat
  3. I want you to eat
  4. I want the cookie to eat
  5. What do you want me to eat
Want is a transitive verb that can optionally take an infinitival complement in addition to or in place of the object. When the infinitival complement occurs, the subject of the infinitival complement is not expressed and must be inferred from the matrix clause. There are two possibilities: 1) the subject of the infinitival complement corresponds to the object of the matrix clause (if there is one), or 2) the subject of the infinitival complement corresponds to the subject of the matrix clause. To handle these alternatives, both the subject and object of the matrix clause must be available to support binding by the subject of to eat. In addition, if the infinitival complement is headed by a transitive verb (e.g. eat), the object of the transitive verb may also be unexpressed. In this case, the object may also be inferred from the matrix clause.
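The two binding possibilities can be sketched as a default-plus-fallback decision over the buffer contents. This is a hedged simplification: the animacy values and buffer layout are assumptions for illustration, and in the model the alternatives are realized as competing productions rather than an if-statement.

```python
# Sketch of the default preference for binding the implicit (PRO)
# subject of an infinitival complement (feature values illustrative).

ANIMACY = {"I": "human", "you": "human", "him": "human",
           "it": "animate", "the cookie": "inanimate"}

def bind_infinitive_subject(buffers):
    """Return the bind index of the matrix argument that controls the
    implicit subject of the infinitive clause."""
    obj = buffers.get("object")
    # default: bind to the matrix object, if its animacy is compatible
    # with the (animate/human) subject expectation of the infinitive
    if obj is not None and ANIMACY[obj["head"]] in ("animate", "human"):
        return obj["bind-indx"]
    # otherwise fall back to the matrix subject
    return buffers["subject"]["bind-indx"]

# "I want you to eat": both buffers are filled when "to eat" is processed
buffers = {"subject": {"head": "I", "bind-indx": "indx-1"},
           "object":  {"head": "you", "bind-indx": "indx-2"}}
pro_you = bind_infinitive_subject(buffers)     # PRO bound to "you"

# "I want the cookie to eat": the object is inanimate, so bind to "I"
buffers["object"] = {"head": "the cookie", "bind-indx": "indx-3"}
pro_cookie = bind_infinitive_subject(buffers)  # PRO bound to "I"
```

The sketch makes concrete why both the subject and object buffers must be populated before to eat is processed: either one may supply the controller.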

First, consider the processing of I want it. When I is processed, a nominal corresponding to I is retrieved from memory (or projected from I), its referent is determined and the nominal is placed in the subject buffer. (Note that the processing of I does not lead to projection of a clause. Language is often used to point out objects in the environment and projecting a clause on the basis of a nominal is not well motivated.) When want is processed, a declarative clause is projected based on want being a tensed verb and a subject being available in the subject buffer. The nominal in the subject buffer is integrated as the subject of the clause. When it is processed, a nominal is retrieved (or projected). The nominal is integrated as the object of the predicate-transitive-verb construction projected from want and it is also placed in the object buffer. The resulting representation is shown below:

I want it

This example demonstrates the need for a subject buffer to support integration of the subject into the situation referring expression projected by the main verb under the assumption that the subject does not project a situation referring expression by itself (for the reasons discussed above).

The reason the object referring expression is placed in a special subject buffer, and not in a subject slot of a chunk in a core ACT-R buffer (e.g. goal, imaginal), is that the grammatical features of the object referring expression need to be accessible to support binding and co-reference. If the object referring expression chunk were placed in a slot of a chunk in a buffer, its grammatical feature slots would not be accessible. The reason the object referring expression is not placed directly in a core ACT-R buffer, where the grammatical feature slots would be accessible, is that there are too few core buffers to hold all the referring expressions in the matrix clause. The alternative of storing the object referring expression in declarative memory and retrieving it when needed has proved functionally unmanageable for handling multiple and chained long-distance dependencies. For example, in What do you want the boy on the chair by the table next to the girl to eat, binding the subject of to eat to the boy and the object of to eat to what leads to severe interference without an object and wh-focus buffer to facilitate this binding.

The processing of I want you to eat proceeds similarly to I want it, up to the processing of to eat. At this point, the object referring expression retrieved from I is in the subject buffer and the object referring expression retrieved from you is in the object buffer. The expression to eat is processed as a multi-word unit which projects an infinitive clause. The subject of an infinitive clause must be recovered from the linguistic context. In this example, there are two object referring expressions available: the subject and the object of the matrix clause. The grammatical default is to prefer to bind the subject of the infinitive clause to the object of the matrix clause. This default applies so long as the grammatical features of the object are compatible with the subject of the infinitive clause. In particular, the subject of to eat is presumed to be animate or human. The pronoun you projects the animacy feature human, so the default applies and the subject of the infinitive clause is bound to you. To support binding, the subject of to eat is represented by an implied object referring expression with head PRO (the term PRO is borrowed from generative grammar and indicates an implicit subject). The bind index of PRO is set to match the bind index of you. Although PRO represents the binding from the subject of the infinitive to the matrix object, the matrix object itself (not PRO) is placed in an embedded subject buffer, to support further processing. This example demonstrates the need to retain the object of the matrix clause for binding. The resulting representation is shown below.

I want you to eat

In the processing of I want the cookie to eat, the animacy feature of the object is not compatible with the subject of the infinitive clause. In this case, the alternative of binding to the subject of the matrix clause is considered. (Actually, these alternatives are considered in parallel based on ACT-R’s production matching capability combined with production utility, with the highest utility production which matches the input and context determining the outcome.) Since the animacy of I is compatible, the implicit subject of the infinitive clause is bound to the matrix subject. In addition, the object of the matrix clause is available to be bound by the implicit object of the predicate-transitive-verb construction projected by to eat, and the grammatical features are compatible with that binding. The implicit object of to eat is represented as an object referring expression with head trace (the term trace is also borrowed from generative grammar and indicates a displaced object). The bind index of this trace is set to match the bind index of the matrix object, and the matrix object is placed in an embedded object buffer which is distinct from the object buffer. This example demonstrates the need to retain both the subject and object of the matrix clause to support binding. The resulting representation is shown below.

I want the cookie to eat

The grammatical features that get projected from lexical items to referring expressions (Ball, 2010a) are crucial for determining binding, as are the argument preferences of the verbs want and eat which are transitive — indicating the expectation for an object. However, grammatical features are not always definitive. Consider the expression I want it to eat. The pronoun it can be used to refer to either animate (e.g. dog) or inanimate (e.g. cookie) objects (and even humans when their sex is unknown as is often the case with babies). In the case of it, binding and co-reference depend on the actual referent. If the referent of it is a cookie, then binding the object to it is preferred; if the referent is an animal or human, then binding the subject to it is preferred. In the absence of an identified referent, the binding is ambiguous. By default, Double-R treats it as animate and binds the subject. Double-R doesn’t currently have the capability to use the referent of a referring expression to determine binding in ambiguous cases.

To motivate the need for retaining the indirect object in a buffer, consider the processing of the expression I gave him the cookie to eat. In this example, the processing of him leads to retrieval of an object referring expression which is integrated as the indirect object of the predicate-ditransitive-verb construction projected by gave. This object referring expression is also placed in the indirect object buffer. At the processing of to eat, the default preference is to bind the implied subject of the infinitive clause to the indirect object which is normally animate or human. It is also preferred to bind the (direct) object to the implied object of the transitive verb eat. The resulting representation is shown below.

I gave him the cookie to eat

The processing of intra-sentential infinitival complements provides strong motivation for retaining object referring expressions which function as arguments in the matrix clause in buffers to support binding by implied arguments in the subordinate clause. An earlier version of Double-R made use of a fixed size stack of object referring expressions, but lacked grammatical function specific buffers. This architecture proved to be functionally inadequate. In the previous example, it is possible for the object referring expressions for I, him and the cookie to be stacked such that the object the cookie is on top, the indirect object him is next and the subject I is on the bottom. While a stack will handle this example, it does not generalize to more complex examples. Consider I gave him the book on the table in the kitchen to read. If all object referring expressions (e.g. I, him, the book, the table, the kitchen) are stacked, then it is not possible to determine the grammatical function of the object referring expressions based on position in the stack. Further, if the stack is fixed in size (an unbounded stack is cognitively implausible), it is always possible to generate a linguistic expression which will cause the stack to overflow leading to the loss of a referring expression that is needed for subsequent binding. Of course, this would be OK if it matched empirical findings, but it doesn’t appear to. On the other hand, the stack of object referring expressions is still needed to support the integration of post-head modifiers. In the example, in the kitchen modifies the table, and on the table modifies the book. A fixed size stack on the order of 3 or 4 object referring expressions seems a cognitively reasonable mechanism for handling post-head modifiers which typically modify the preceding object referring expression, but may also modify earlier expressions (e.g. 
in I saw the man on the hill with the binoculars, with the binoculars may modify saw, the man or the hill, although modifying the hill is semantically dispreferred). The current model combines grammatical function specific buffers with a stack of the most recent object referring expressions. Besides being functionally motivated, this architecture is compatible with empirical evidence of primacy and recency effects. The grammatical function specific buffers retain the outermost object referring expression in a deeply modified expression—supporting primacy effects, while the fixed stack retains the 3 most recent object referring expressions—supporting recency effects. It is important to note that in this architecture, an object referring expression may constitute the contents of more than one buffer. In the example above, the cookie fills the object buffer as well as the most recent object referring expression buffer in the stack. In a sense, the buffers provide pointers to object referring expressions, except that the contents of the object referring expression are directly accessible in the buffer without a retrieval from declarative memory.
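The combined architecture can be sketched as follows. This is an illustration under assumptions: the capacity of 3 and the tracking function are stand-ins. The point is that the grammatical function specific buffers retain the outermost matrix arguments (primacy) even after those nominals have fallen off the fixed-size recency stack used for post-head modifier attachment.

```python
# Sketch: grammatical-function buffers plus a fixed-size stack of the
# most recent object referring expressions (capacity illustrative).

STACK_SIZE = 3

def track(nominals_with_functions):
    buffers, stack = {}, []
    for head, function in nominals_with_functions:
        if function is not None:
            # matrix arguments are retained in dedicated buffers (primacy)
            buffers[function] = head
        stack.append(head)
        if len(stack) > STACK_SIZE:
            stack.pop(0)  # the oldest nominal is lost from the stack
    return buffers, stack

# "I gave him the book on the table in the kitchen to read"
buffers, stack = track([
    ("I", "subject"), ("him", "indirect-object"), ("the book", "object"),
    ("the table", None), ("the kitchen", None),
])
# The buffers still hold "I" and "him" for binding by "to read", even
# though both have been displaced from the 3-element recency stack.
```

A nominal like the cookie in the earlier example would appear both in the object buffer and at the top of the stack, illustrating that one referring expression can fill more than one buffer.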

Wh Questions

The processing of wh-questions demonstrates the need for a wh-focus buffer to support binding. Consider the expression What do you want me to eat? The processing of what leads to projection of a wh object referring expression that is put in the wh-focus buffer. Note that the processing of what does not lead to projection of a wh-question. (There are wh-constructions like what he said…is true that are not wh-questions.) The processing of the auxiliary verb do in the context of a wh object referring expression in the wh-focus buffer leads to projection of a wh-question with a wh-focus function that is filled by the referring expression in the wh-focus buffer and an operator function that is filled by do. The processing of you following do results in retrieval of an object referring expression that is integrated as the subject of the wh-question; this object referring expression is also placed in the subject buffer. The processing of want leads to projection of a predicate transitive verb construction that is integrated as the head of the wh-question. In addition, an implied trace object of want is created and bound to the wh object referring expression in the wh-focus buffer. The binding of the trace object to the wh-focus reflects Double-R’s greedy mechanism for modeling long distance dependencies involving fronted wh words. Note that if the entire input were What do you want?, the binding of the implied object of want to the wh-focus is expected.

What do you want

The processing of me leads to retrieval of an object referring expression. This referring expression is integrated as the object of want displacing the implied trace that was bound to the wh-focus. This displacement is an example of the context accommodation mechanism at work. The processing of to eat leads to projection of an infinitive situation referring expression. An implicit PRO object referring expression is projected and bound to the matrix object me. In addition a trace object referring expression is projected and integrated as the object of to eat. This trace expression is bound to the wh-focus.

What do you want me to eat
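The greedy binding and subsequent displacement can be sketched as below. This is a hedged illustration: the parse_wh function and its string encodings (trace:, PRO:) are invented for the example; productions for do, you and want are not sketched. What it shows is the sequence the text describes: the trace is bound to the wh-focus immediately at want, displaced by the overt object me via context accommodation, and then re-projected as the object of the infinitive.

```python
# Sketch of greedy wh-trace binding with context accommodation
# (function name and encodings are hypothetical).

def parse_wh(words, wh_focus="what"):
    clause = {"wh-focus": wh_focus, "object": None}
    embedded = None
    for w in words:
        if w == "want":
            # greedy: the implied object of want is bound to the wh-focus
            clause["object"] = f"trace:{wh_focus}"
        elif w == "me":
            # context accommodation: the overt object displaces the trace
            clause["object"] = "me"
        elif w == "to eat":
            # infinitive clause: PRO subject bound to the matrix object;
            # the trace object is re-bound to the wh-focus
            embedded = {"subject": f"PRO:{clause['object']}",
                        "object": f"trace:{wh_focus}"}
        # other words ("do", "you") are handled by productions not sketched
    clause["comp"] = embedded
    return clause

analysis = parse_wh(["do", "you", "want", "me", "to eat"])
```

Note that if the input had ended at want, the greedily bound trace would simply stand, which is why the greedy strategy is expected to succeed in the common case.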


The examples above focus on the importance of representing grammatical features and verb argument preferences for determining the binding of implicit arguments in complement clauses associated with the main verb want. There is an additional contrast between the behavior of verbs like want (object control verbs, or better, object-to-subject control) and verbs like promise (subject control verbs, or better, subject-to-subject control) which affects the binding of implicit arguments. Control is a central topic in modern linguistic theory (cf. Culicover, 2009). Consider the following classic examples from Chomsky (1981): he persuaded him to go vs. he promised him to go. Persuade is an object control verb: the object of persuade determines the binding of the implicit subject of the infinitival clause to go. This is the default behavior discussed above for want. Promise is a subject control verb: the subject of promise determines the binding of the implicit subject. Subject control is the exception for verbs; only a few verbs exhibit this preference. Control is not limited to verbs. Adjectives functioning as predicates also exhibit control. Consider he is eager to please vs. he is easy to please (also from Chomsky, 1981). Subject-to-subject control is the default for (predicate) adjectives. The subject of eager determines the implicit subject of to please. Easy is, exceptionally, a subject-to-object control adjective: the subject of easy determines the implicit object of to please, and the implicit subject of to please is unbound (e.g. he is easy for someone to please). The examples with adjectives also demonstrate the possibility of adding an optional clausal complement to clauses containing a predicate adjective, despite the fact that adjectives do not normally expect a complement. He is eager and he is easy are both grammatical without the infinitival complement.
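One way to encode these preferences lexically is as a small table mapping each predicate to its control type. This is a sketch under assumptions: the attribute names and table layout are invented for illustration; the classification of the predicates follows the text.

```python
# Sketch of lexically encoded control preferences (names illustrative).

CONTROL = {
    "want":     "object-to-subject",   # the default for verbs
    "persuade": "object-to-subject",
    "promise":  "subject-to-subject",  # the exceptional case for verbs
    "eager":    "subject-to-subject",  # the default for adjectives
    "easy":     "subject-to-object",   # exceptional: controls the object
}

def controller(predicate, matrix):
    """Return the matrix argument that controls the implicit argument
    of the predicate's infinitival complement."""
    pref = CONTROL[predicate]
    if pref.startswith("object"):
        return matrix["object"]
    return matrix["subject"]

matrix = {"subject": "he", "object": "him"}
c_persuade = controller("persuade", matrix)  # "him": he persuaded him to go
c_promise = controller("promise", matrix)    # "he": he promised him to go
```

For easy, the controlled position is the implicit object rather than the implicit subject; the table records that in the second half of the control label, which a fuller sketch would also consult.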


This section motivates the introduction of grammatical function specific buffers (subject, object), the representation of grammatical features (number, animacy, gender), and the encoding of verb preferences (transitive vs. intransitive; subject control vs. object control) in order to model the binding of implicit arguments of complement clauses within Double-R Grammar.

List of Language Specific Buffers and Corresponding Chunk Types

The buffers are highlighted in blue.