
Markus Guhe (2003) Incremental conceptualisation for language production. PhD thesis, Department of Informatics, University of Hamburg.


Conceptualisation is the cognitive task that takes nonlinguistic knowledge and generates preverbal messages (semantic structures), which are linguistically encodable. It can, therefore, be seen as a mediator between perception and other cognitive faculties on the one hand and language on the other. I present the first computational model of the conceptualiser.

Conceptualisation is never directly observable; it can be observed only via another modality, above all language. To overcome this difficulty I investigate how descriptions of events can be generated in an online setting. This means, firstly, that I consider verbal descriptions of events (mostly motion events) and, secondly, that I use a setting in which the verbalisations are produced while the events take place. The temporal interleaving of processing perceptual input and generating verbal output makes it possible to correlate input and output and is, therefore, a means to overcome the difficulty. Additionally, the online setting reduces the complexity of conceptualisation to a degree where it is possible to account for the open-ended issue of conceptualisation with a computational model. In this setting the conceptualisation task is subdivided into four subtasks: construction of a hierarchical conceptual representation, selection of the events that are described verbally, linearisation of the selected events, and generation of preverbal messages describing these events.

The computational model is inC, the incremental conceptualiser. It uses an incremental mode of operation to cope with the dynamics of the online setting, because incremental processing considers only the changes in the input. The characteristic behaviour of incremental models is to produce output before all input that may be relevant for the correct and complete computation of that output is available. This behaviour can be achieved by the parallel (cascaded) processing of a sequential information stream: as soon as a piece of information (an increment) has been processed at one stage, it is passed on to the next stage. In its strong form I call this property Extended Wundt’s Principle: input is processed and output is generated as soon as it is available.
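The cascaded mode of operation described above can be illustrated with a small Python sketch. It is not inC's implementation: the stage names, the transformations, and the trace format are hypothetical, and Python generators only simulate the cascade sequentially rather than running the stages in parallel. What the sketch does show is the characteristic behaviour, namely that output for the first increment is produced before the second increment has been read.

```python
from typing import Callable, Iterable, Iterator, List

Trace = List[str]


def stage(name: str, transform: Callable[[str], str],
          source: Iterable[str], trace: Trace) -> Iterator[str]:
    """One (hypothetical) stage: process an increment, pass it on at once."""
    for increment in source:
        result = transform(increment)
        trace.append(f"{name}:{increment}")
        yield result  # handed to the next stage before more input is read


def run_cascade(increments: Iterable[str]) -> Trace:
    """Run a two-stage cascade and record the order in which things happen."""
    trace: Trace = []

    def perceive() -> Iterator[str]:
        # Input arrives one increment at a time.
        for inc in increments:
            trace.append(f"input:{inc}")
            yield inc

    first = stage("stage1", str.upper, perceive(), trace)
    second = stage("stage2", lambda s: s + "!", first, trace)
    for out in second:
        trace.append(f"output:{out}")
    return trace


trace = run_cascade(["a", "b"])
for entry in trace:
    print(entry)
```

In the printed trace, the entry `output:A!` appears before `input:b`, so the first increment has passed through the whole cascade before the next one is consumed.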

Based on different kinds of incrementality proposed in the literature, I provide a general definition of incrementality and discuss the dimensions along which it can vary. Apart from a cascade of incremental processes, incremental models comprise a shared representation of the model knowledge on which the processes operate. As a blueprint for developing incremental models based on this definition, I provide a formalisation in the specification language Z.

Cascaded architectures have a unidirectional information flow with no feedback, which keeps the model efficient and simple. Since the lack of feedback is also a source of errors, I propose a relaxation that allows indirect feedback without sacrificing the unidirectionality of the information flow. In this kind of feedback no explicit information is passed back, but the effects of computations influence earlier components in the cascade.
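One possible reading of indirect feedback is sketched below in Python. This is an assumption made for illustration, not the mechanism specified in the thesis: the class `SharedRepresentation`, the functions `select` and `generate`, and the "verbalised" marker are all hypothetical. The downstream step never sends a message back to the upstream step; it only leaves an effect in the shared representation, which the upstream step reads the next time it runs.

```python
from typing import Dict, List


class SharedRepresentation:
    """Hypothetical shared store that all stages read from and write to."""

    def __init__(self) -> None:
        self.status: Dict[str, str] = {}


def select(events: List[str], shared: SharedRepresentation) -> List[str]:
    """Upstream step: pass on only events not yet marked as verbalised."""
    return [e for e in events if shared.status.get(e) != "verbalised"]


def generate(selected: List[str], shared: SharedRepresentation) -> List[str]:
    """Downstream step: produce output and record the effect in shared state."""
    messages = []
    for event in selected:
        messages.append(f"message({event})")
        shared.status[event] = "verbalised"  # effect only, no message sent back
    return messages


shared = SharedRepresentation()
first = generate(select(["move", "turn"], shared), shared)
second = generate(select(["turn", "stop"], shared), shared)
print(first)
print(second)
```

In the second pass the event `turn` is skipped by `select`, although `generate` never called back into it. Information still flows in one direction only, from `select` to `generate`.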

inC’s behaviour is adapted by assigning different values to its parameters, so that different preverbal messages can be generated for the same input. The input to the simulations is identical to that used in verbalisation studies in which participants perform the same task. The output of the simulations is compared to the observed verbalisations, which shows that inC is a realistic, i.e. cognitively adequate, model of the human conceptualiser.