4 Altercentric perception of the speaker: ego's virtual participation in alter's act

Moving with the other's goal-oriented movements
Let me begin by drawing your attention to two daily-life episodes which I hope you may recognize as fairly typical of the way in which we more or less unwittingly engage in the activities and movements of others:
Episode 1. -- When you are standing in the sports field eagerly watching a high-jumper making her last attempt to get over the bar, just as she is in the process of jumping you may unwittingly behave as if you are making the effort with her, eliciting some firing of your own leg muscle cells, perhaps even resulting in one of your legs being slightly lifted as she makes the jump.

Episode 2. -- When spoon-feeding a baby, we sometimes unwittingly open our own mouth as the baby opens his mouth to the spoonful of food. And as the baby reciprocates by offering food to the feeder's mouth (as I have demonstrated by video records of infants about 11½ months old), we may again notice by the baby's mouth movements how the baby appears to participate in the other's reception of the food.

In order for the baby to learn to reciprocate the spoon-feeding, the baby must not only have experienced being spoon-fed, but must somehow have felt to take part in the feeder's activity from the feeder's stance, as if the baby were co-authoring the feeding, even though the feeder is the actual author. Moreover, when the two sit face-to-face, the infant's re-enactment of the activity of the feeder as a model depends on sensori-motoric reversal, i.e. mirror reversal of the model's movements, in order for the infant to feel to be virtually moving with the model's movements.
Such mirror reversal is quite a feat, and difficult to accomplish if the learner is restricted to egocentric observation, compelling the model to hand-guide or face the same direction as the learner. In contradistinction, the face-to-face exposure in deferred imitation experiments with infants evokes, I have proposed, sensori-motor perception in a participatory mirror-reversed sense, enabling the infant to feel to be virtually moving with the model's movements from the model's direction and to re-enact from the e-motional memory (in the double sense of 'out of motion' and 'emotion') of having thus felt to be co-enacting the model's activity. This I have termed learning by altercentric participation, and I shall address the issue of whether it may apply to early speech perception and language learning.
In a series of seminal papers on the pragmatic and cultural learning bases of language acquisition, Tomasello and his co-workers have highlighted some pertinent turning points, such as the advent of joint attention, and social-cognitive abilities such as simple perspective-taking revealed in nine-month-olds and simulation in older children (Tomasello 1993; Tomasello 1999; Tomasello et al. 1993; Akhtar & Tomasello 1998). Where I think we disagree is in my emphasis on the innate embodied social-emotive basis for such learning prerequisites. I believe there is an ontogenetic path from social learning by bodily simulating others' movements to mental simulation of other minds, a path which is grounded in an innate capacity for altercentric participation and which depends on sensitizing opportunities in face-to-face interactions in order to become operative. I have elsewhere attempted to specify the innate basis and sensitized operations in infant learners at nine months or older (Bråten 1998ab). I regard the question as still open, however, whether altercentric perception or participation may be attributed even to younger infants, for example in early speech perception by preverbal infants. To this I shall now turn.

Speech perception by young infants and temporal contours of feelings
In her studies of early speech perception, Patricia Kuhl has demonstrated by a number of comparative experiments how infants by 6 months have begun to "turn a deaf ear", so to speak, to sound distinctions that make no sense in the ambient language, for example in responses to the phonetic units /r/ and /l/, which are not distinguished in Japanese, and to /I/ in English and /y/ in Swedish. She points out that as they listen to speech they appear to include more than the auditory characteristics; "the infants store 'polymodal' aspects of speech -- the auditory and visual speech they experience, and the motor patterns they themselves produce." (Kuhl 1998: 300). She suggests infants acquire a life-long native-language accent inter alia by virtue of an innate link between perception and action, extending the influence of linguistic experience beyond perception to the motor patterns acquired in speech.
If that is the case, then this may perhaps invite a specification in terms of learning by altercentric perception of ambient speakers and virtual participation in their speech production. Once established, Kuhl points out, the perceptual and perceptual-motor system underlying speech is difficult to alter. Adult speech patterns are coloured by a native-language accent which encompasses the pronunciation pattern, timing, stress, and intonation typical of the rhythm and melody, as it were, of the specific language (Kuhl 1998: 306). Now, learning the musical score of such rhythmic and melodic patterns, including the adequate pronunciation pattern, timing, stress, and intonation, would somehow entail that the learner, perhaps even the pre-verbal learner, be capable of altercentric perception of and participation in the sound-producing movements of the ambient speech performers.
The capacity revealed by vocal imitation in infants may be precursory to that.
If this kind of learning applies, however, it can hardly be accounted for in terms of perspective-taking in a social-cognitive sense, but rather in an e-motive and participatory sense of more primitive subjective experience (Erlebnis) evoking temporal feeling flow patterns that are shared by the speaker and the learner. Stern (1999: 70) writes about the "continual shifts in arousal, activation, and hedonics occurring split-second-by-split-second that are evoked by events taking place in the body and mind of the self and others and which are integrated into temporally contoured feelings". He terms these vitality contours, and applies the term to activities that have a characteristic intensity time-course, not reflecting the categorical content of an act but rather the manner in which the act is performed and the feeling that directs the act. Altercentric participation invites such terms of feeling flow patterns of shifting intensity over time (like a musical phrase that cannot be captured or even imagined by taking a single note or a "sound-photograph", as it were). A smile or a laugh, for example, as acted or perceived, growing steadily or slowly, or exploding, or progressing slowly and suddenly bursting open, would entail or evoke such specific contours of patterned flow of feelings over the present time shared by the participants. As Stern points out, it is not only the external behaviours of the infant's social partners that are repeated and varied but also the vitality contours that their behaviour elicits in the infant. The parents create the optimal interactive opportunities for the infant best to identify, discriminate, and represent those vitality contours that are characteristic of his social experience as acted, perceived, and sensed (Stern 1999:73).
I would expect this to apply to the preverbal learner in speech perception and precursory accent learning by altercentric participation. Acquisition of the musical score of the specific rhythmic and melodic patterns may rest on the ability to experience by altercentric participation in the model, co-experiencing the specific temporal contour of feelings that split-second-by-split-second are evoked and shared with the model, leaving the learner with "e-motional memory" of the vitality contours as a basis for circular re-enactments and generation of events that evoke similar feeling flow.

Altercentric participation and simulation of processes in the conversation partner
Let me now turn to verbal conversation. While the previous examples were pre- and extra-verbal episodes, here is a conversational episode which I expect would be familiar to most of you:
Episode 3. -- You are listening to a conversational partner in the process of making a verbal utterance who, before the utterance is completed, appears to hesitate or to be at a loss for the right words, and without hesitation you supply the words, completing the utterance of the speaker, who silently nods or confirms with just a "yes" your verbal completion of the half-made utterance.

When you more or less unwittingly complete covertly or overtly what you experience that the other is about to say, you do so by virtue of altercentric participation in the other's speech act. Even though the other is the initial author of the incomplete sentence, your virtual co-author participation in what has already been said enables you to overtly join in the co-authorship.
Sometimes, of course, the other may react differently, refusing to accept your completion of the sentence. A misunderstanding is entailed. Three decades ago I posed this question: What kind of coding regulation is evoked in the normal adult during conversation -- in particular when suspecting misunderstanding or being misunderstood by the other? Drawing upon my computer simulation studies, expanding upon Mead's notion of anticipatory response and referring inter alia to Liberman's (1957) report on some results on speech perception, I proposed a model of co-actor coding simulation circuits evoked in actors engaged in symbolic interaction. Upon experienced breakdown of intersubjective understanding, I posited, the actor will resort to simulations of the co-actor's perspective and processing; coding can be regulated on the basis of simulating the reverse coding processes in the co-actor. Ego's encoding of what is (to be) said is regulated by predictory simulation of Alter's decoding. Ego's decoding of what has been heard is regulated by postdictory simulation of Alter's encoding (Bråten 1973, 1974).
I presented this model in the last chapter of a treatise in Norwegian, with the empirical situation-oriented semantics developed by Arne Naess (1953, 1961) as a point of departure, and consistent with Rommetveit's (1972:178-79) emphasis on the speaker's anticipation of decoding and the listener's reconstruction of the encoded message.

A model of internal coding regulation by simulation of processing in the partner
Here follows a translated extract from my chapter on coding simulation ("Elements in a socio-semantic theory"):

"Symbol programmes. - ... The perspective of programmes or "plans" as a bridge between cognitive content and execution of action was efficiently introduced by Miller, Galanter and Pribram (1960). Their TOTE-concept (for Test-Operate-Test-Exit) for circuits of part processes integrated in hierarchical processing structures is adopted by Rommetveit (1968) in his model of word-generation. The concept of 'symbol programme' is used by the author in two computer models for simulating communication and symbolic interaction (Bråten 1968, 1971a). Here the actors' interaction is modelled in terms of choice among, and execution of, sender- and seeker- or receiver-programmes which are building blocks of exchange programmes. The programmes consist of sequences of processes that may be activated and passivated, and oriented according to local operations and tests.
In this way production and processing of utterances are conceived as the execution of symbol programmes across various fields of a system of two co-actors. This precludes an analytical resolution into language signs and content relations. Rather than being analysed as isolated elements, they are seen as local, inter-related entities in a larger whole: Situational definition and sign- and content-processes are interactive parts of symbol programmes, and symbol programmes are building blocks of more comprehensive interplay.
Thereby it becomes impossible to maintain the analytical distinction between, on the one hand, relations between language sign and meaning (the semantic relation) and, on the other hand, relation between language signs and social usage situation (the pragmatic relation), introduced by Morris (1964). The relata of these relations invite a systems perspective that views them as integrated elements of a common whole...

Coding regulation by internal feedback circuits. - Some of the conditions for establishing practical understanding in language interplay may be highlighted by a model of coding as [integrated] regulative circuits. It concerns the questions about the kind of mechanisms that enable an actor, first, to be understood, and second, to understand a co-actor during language interplay.
G. H. Mead's concept of anticipation of the co-actor's response can be used in an explanatory reply to the first question: The actor regulates his utterance by calling up in himself the co-actor's response to the utterance he is in the process of making. To the extent that the co-actor's role is adequately taken, such regulative adjustment increases the likelihood that the actor will be understood. In his frame, Rommetveit (1968:65) points to encoding and decoding being complementary processes, and that encoding entails anticipatory decoding.
This explanatory principle is retained in the first of the two following propositions about coding mechanisms:

(H1) During execution of a sender programme, the successive encoding of activated content into sign complex is regulated by internal feedback from implicit successive decoding of the sign complex ... (informed by the deviation between the activated content and the output of the implicit decoding).
(H2) During execution of a receiver programme, the successive decoding of a given sign complex into activated content is regulated by internal feedback from implicit successive encoding of the content ... (informed by the deviation between the given sign complex and the output of the implicit encoding)....

The first statement (H1) expresses the assumption that satisfactory successive encoding depends on simulated decoding. Only if the sign complex being produced is also implicitly being processed can the sender programme be adequately executed. The second statement (H2) expresses a similar assumption regarding the recipient programme; it can be completed if and only if form is given to the content output of the sign processing. Adequate decoding presupposes simulated encoding...
In the execution of a sender programme, the goal state may be considered satisfied when the actor's expectation of being understood exceeds a certain threshold, such that he may answer 'Yes' to the following question: Would I in the other's shoes have understood what I am now in the process of saying? Upon satisfaction, the regulation by simulated decoding according to H1 discontinues. Neither this test, nor the regulative [internal] feedback, is implied to be conscious.
In the execution of a receiver programme, the goal state may be considered satisfied when the actor's evaluation of his understanding the co-actor exceeds a certain threshold, checked for by the following implicit test: Would I in the other's shoes have expressed in the way he did it what I have now understood him to say? As long as the reply is 'No' to such a question, the simulative regulation of his decoding according to H2 continues until his subjective evaluation exceeds the threshold value.
The two propositions H1 and H2 may be reduced to the following postulate:

(P1) By virtue of implicit simulation of the co-actor's [reverse] coding processes, the actor regulates his own coding processes from criteria that serve his attunement to the co-actor with respect to understanding and being understood.

It should be noted that P1 concerns internal feedback and regulation from subjective expectations and evaluations. An actor may successfully execute symbol programmes if and only if he has resources for implicit encoding and decoding. These resources are employed differently in sender programmes and receiver programmes. The fact that external feedback serves to regulate the programme execution is beyond the domain of P1, H1 and H2.

Analogous to sensory-motor circuits. - ... Understanding of the circular coherency of regulation within the sensory-motor and the motor-sensory systems is basic to the understanding of the principle expressed by H1 and H2. Sensory-motor processes are local to the same global system, entailing that motor programmes cannot be adequately executed without involving sensory operations, and that sensory programme execution requires motor operations as programme parts (cf. Gibson 1966:320).
In conjunction, H1 and H2 express the postulate that the coding system, at a grosser level of resolution, entails circularity analogous to the circularity of the sensory-motor system....
The functional "explanations" of the circularity of these types of system differ, however: Sensory regulation of motor processes and motor regulation of sensory processes serve the organism's adaptation to the environment through optimization of input and output signals. Regulation of encoding through decoding (H1) and regulation of decoding through encoding (H2) according to (P1) serve the actor's adjustment to the co-actor with respect to understanding and being understood in situations of symbolic interaction...
The sensory-motor circularity may be given a psychophysiological explanation. Statements about circular coding processes evoke a social psychological explanation. By virtue of functioning as a simulated recipient of the utterance which the actor is making qua sender, he may adjust the encoding to the co-actor system (a person, a group, a population) being the target of the utterance. By virtue of functioning as a simulated sender of the utterance which the actor processes qua recipient, he may adjust his decoding to the co-actor system producing the utterance.
In the same manner in which a simulated prediction of the recipient's decoding serves to regulate the sender's encoding, a simulated postdiction of the sender's encoding serves to complete the recipient's decoding process...." (Bråten 1973a:71-72, 75-80)
Figure 4.1 specifies the model of such internal simulation2 circuits in a simplified form.

Figure 4.1 How the speaker and listener may virtually participate in one another's act. (Top) Indications of how the acts of making and processing an utterance Y are regulated by virtual participation in the reverse (complementary) act of the conversation partner. (Below) A computational model in terms of internal regulation by predictory and postdictory simulation circuits (adapted from Bråten 1973a, p.98; Bråten 1974, Fig.1, p.351). (Left) The speaker regulates her expression process by anticipating the listener's processing: if her utterance candidate Y' yields a simulated listener comprehension, Xcosim, which does not deviate from X, then that candidate is admitted as her actual expression Y. (Right) The listener regulates his comprehension by postdictory simulation of the speaker's expression process: if his comprehension candidate X' yields a simulated speaker expression, Ysim, which does not deviate from Y, then that candidate is admitted as the content Xco he takes to have been expressed by Y. These internal feedback circuits operate without external feedback correction signals from the other about deviation between intended content (X) and comprehended content (Xco).

As specified in the diagram (Fig. 4.1) the speaker regulates the utterance she is in the process of making by simulating the listener's comprehension process: if her utterance candidate Y' yields a simulated listener comprehension, Xcosim, which resembles her intended content X, then that candidate is admitted as her actual expression Y. In a complementary manner, the listener monitors and regulates his comprehension by postdictory simulation of the speaker's expression process: if his comprehension candidate X' yields a simulated speaker expression, Ysim, which does not deviate from Y, then that candidate is admitted as the content Xco he takes to have been expressed by Y. These internal regulations are not necessarily conscious and do not depend on external feedback correction signals from the other about deviation between intended content X and comprehended content (Xco).
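
The two circuits just described may be illustrated by a minimal sketch in Python. It is not the 1973 computer model, only a toy rendering under stated assumptions: content and utterances are plain strings, candidates are tried in a given order, deviation is a simple match-or-mismatch test, and all names (`regulate_encoding`, `regulate_decoding`, `mismatch`, `tolerance`, and the two lookup tables) are hypothetical.

```python
from typing import Callable, Iterable, Optional


def mismatch(a: str, b: str) -> float:
    """Toy deviation measure (an assumption): 0.0 if identical, else 1.0."""
    return 0.0 if a == b else 1.0


def regulate_encoding(x: str,
                      candidates: Iterable[str],
                      simulate_listener: Callable[[str], str],
                      tolerance: float = 0.0) -> Optional[str]:
    """Predictory circuit (cf. H1): admit the first candidate Y' whose
    simulated listener comprehension Xcosim does not deviate from X."""
    for y_candidate in candidates:
        x_cosim = simulate_listener(y_candidate)
        if mismatch(x, x_cosim) <= tolerance:
            return y_candidate  # admitted as the actual expression Y
    return None  # no candidate passed the internal test


def regulate_decoding(y: str,
                      candidates: Iterable[str],
                      simulate_speaker: Callable[[str], str],
                      tolerance: float = 0.0) -> Optional[str]:
    """Postdictory circuit (cf. H2): admit the first candidate X' whose
    simulated speaker expression Ysim does not deviate from Y."""
    for x_candidate in candidates:
        y_sim = simulate_speaker(x_candidate)
        if mismatch(y, y_sim) <= tolerance:
            return x_candidate  # admitted as the comprehended content Xco
    return None


# Hypothetical internal models each partner holds of the other.
speaker_model_of_listener = {"the bank": "money bank",
                             "the river bank": "river bank"}
listener_model_of_speaker = {"money bank": "the bank",
                             "river bank": "the river bank"}

# Speaker: intended content X is "river bank"; two utterance candidates.
y = regulate_encoding("river bank",
                      ["the bank", "the river bank"],
                      lambda u: speaker_model_of_listener.get(u, ""))

# Listener: heard utterance Y; two comprehension candidates.
x_co = regulate_decoding(y,
                         ["money bank", "river bank"],
                         lambda c: listener_model_of_speaker.get(c, ""))
```

In this toy run the ambiguous candidate "the bank" is rejected by the speaker's simulated decoding, and neither function receives any correction signal from the other party, which is the point of the internal circuits. Raising `tolerance`, together with a graded deviation measure, would correspond to the threshold on subjective expectation mentioned in the extract.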

In current theory-of-mind terms and in terms of altercentric participation
In today's theory-of-mind terms, partly consistent with Tomasello's (1999:308) point that "we more or less simulate other persons' behaviour and psychological functioning on analogy to our own", my 1973 model amounts to stating that upon breakdown of intersubjective understanding the actor will use his theory or simulation model of the other's mind to simulate the other's coding processes. It implies that there would have to be close links between perceptual and motoric processes in the act of speech and in the act of listening: While listening, implicit production of that which was said by the other would be evoked in the listener; while speaking, implicit processing of that which would be heard by the other would be evoked in the speaker.3.
Today, I would regard these proposed mechanisms to be backed up by the preverbal capacity for what I have termed altercentric participation, and to invite description, albeit at a higher-order verbal level, in almost the same terms used to specify the phenomena examined in my chapter on "Infant learning by altercentric participation -- the reverse of egocentric observation in autism" (Bråten (ed.) 1998:105-124): Unlike echolalia in autism, which I attribute to egocentric observation of the speaker from the outside, as it were, the ordinary listener may be seen to be virtually co-authoring the speaker's talking as soon as the listener realizes the end-point towards which the utterance is headed. Likewise, the speaker may be seen to be virtually co-enacting the listener's processing of the utterance and thus may come to arrest and modify herself in the midst of the utterance before any feedback is afforded by the listener to correct for the misunderstanding which the speaker by her altercentric participation feels to be building up in the listener.

The question of neural support
It now appears that the above model of the actor's internal mirroring of complementary coding processes in the co-actor, or more specifically, the point made about the coding systems entailing circularity analogous to the circularity of the sensory-motoric system, evoking altercentric perception and participation, may find some potential support in current brain research. In the article on 'Language within our grasp' in last year's May issue of Trends in Neurosciences, Rizzolatti and Arbib (1998: 188-194) quote a recent status report on speech research by Liberman (1993) where he inter alia makes the point that the processes of production and perception must somehow be linked in the sender and receiver of communication. With this as a preface the authors spell out the relevance of what they term 'mirror neurons', found in monkeys to discharge both when another is observed grasping a piece of food and when the monkey is preparing for grasping the piece by itself. Such mirror neurons appear to subserve a system that matches perceived enactment by the other to semblant, internally generated enactment in the perceiver. On the basis of other experimental results,4 Rizzolatti and Arbib find evidence to suggest that such a mirror system exists also in humans, probably subserved by Broca's area, which not only serves speech but appears to become active during execution and imagery of hand movement and tasks involving hand-mental rotation.
Actually, in a Centre for Advanced Study lecture at the Norwegian Academy of Science (7 March 1997) I had ventured such a prediction, with reference inter alia to the discovery of allocentric neurones in animals5:

"[if] by way of experimental procedures the neural basis for supporting egocentric perception and the neural basis sensitized to support allocentric perception are uncovered in humans, then I would expect that neural systems, perhaps even neurons, sensitized to realize altercentric perception will be uncovered in experiments designed to test and disconfirm this expectation." (Bråten 1997; 1998:123)6.

The specific neurons to which Rizzolatti and Arbib (1998) refer enable a mirror system to match perceived enactment to semblant, internally generated enactment in the observer of that enactment. Examining experimental results suggesting that a mirror system may be operative in human ontogeny and phylogeny, Rizzolatti and Arbib suggest that usage of such 'mirror neurons' may mark the beginning of intentional communication:

"The actor will recognize an intention in the observer, and the observer will notice that its involuntary response affects the behaviour of the actor. The development of the capacity of the observer to control his or her mirror system is crucial in order to emit (voluntarily) a signal. When this occurs, a primitive dialogue between observer and actor is established." (Rizzolatti & Arbib 1998:191)

With reference to Donald's (1991) assumption about mimesis as precursor to language, they speculate on the sequence of events that might have led from gestural communication to speech. It is likely, they state, that the human capacity to communicate beyond that of other primates depended on the progressive evolution of the mirror system in its globality. Rizzolatti and Arbib see this as the basis for the necessary link between participants in human communication. Should the occurrence of such a mirror system in humans be confirmed, then they will have found the neurosocial basis of altercentric participation.

Above I have attempted to indicate how altercentric participation may be operative in speech perception by the learner and in discourse by the conversational participants, virtually taking part in one another's production and processing. May such a process of virtual participation in the other's move even apply to imagined others, such as the protagonist in a story listened to? On the basis of experimental story recall studies, including their own studies at Oxford of 3- and 4-year-old children's recall of 'Cinderella' and 'Little Red Riding Hood', Jaime Rall and Paul Harris make this suggestion. Finding that recall is more accurate for verbs, such as 'come' and 'bring', 'go' and 'take', if they are used in a way that is spatially consistent with the point of view of the main protagonist, they state:

"[it] would be plausible to conclude that listeners engage in what we might call 'altercentric participation' (Bråten, 1998). This would allow us to make sense of the fact that listeners not only encode movements and location in relation to the protagonist, they also anticipate the emotional implications of impending events " (Rall & Harris (in press)).

In conjunction with his other studies of fictional absorption in children, Harris (1998) opens a window here for exploring further whether even the simulation of a situated imagined other may evoke feelings of virtually moving with the movements of such a re-presented other.

1 Read (with the exception of the insertion) at The First Oslo Workshop on Early Attention, Interaction and Communication, Institute of Psychology, University of Oslo, 8 November 1999, organized by Stephen von Tetzchner in connection with an invited talk by Michael Tomasello on the pragmatic basis of language acquisition. In addition, a translated extract from the last chapter of a treatise on symbol and meaning processing (published by Universitetsforlaget/Scandinavian University Books 1973:71-72;75-80) is inserted, together with a simplified version of the model proposed in that treatise.
2 The idea of such internal simulation made so little sense in those days that the publishers (Universitetsforlaget/Scandinavian University Books), without checking with me, corrected what they thought an obvious spelling error, replacing my term "simulation" by "stimulation" in the key diagram!
3 This would imply that it would be difficult to find pure cases of aphasia, and some refutable implications were discussed in light of reports on encoding and decoding impairments in aphasia. I also explored this proposal by computer and empirical experimentation with paired students in a perturbed task communication situation (Blakar's map design). Facing one another, they are each given maps believed to be identical, except for a route marked on one of the maps. The task is to communicate that route to the other. Since the maps actually are different, communication is bound to break down.
4 For example, in one of these studies with human subjects, Grafton, Arbib, Fadiga & Rizzolatti (1996) used positron emission imaging of cerebral flow to localize brain areas involved in the representation of hand grasping movements when another hand-grasping individual was observed and when the subjects imagined themselves grasping the object without actually moving their hand.
5 O'Keefe (1985) and others have discovered allocentric 'place cells' in freely moving rats, firing when the animal is in a particular part of a familiar environment regardless of the direction the animal is facing, and which thus differ from egocentric 'view cells' dependent upon own body coordinates.
6 See essay no. 16, this volume. This prediction (repeated in Bråten (ed.) 1998:122-123) is consistent with a theorem proposed in 1994 (Bråten (ed.) 1994:15-16), implied by a collateral about the chiral-like (handedness) relation between the bodily ego and the postulated virtual alter (see Prologue and essay no. 15, this volume).