In the treatment of stuttering we can distinguish between psychological approaches and speech modification approaches. In clinical practice, some combination of the two seems to be used most frequently.
Procedures which focus on direct modification of speech in order to improve fluency have to be based on one of three strategies: (1) acquisition and use of a new way of speaking which is fluent 'by construction', (2) acquisition and use of certain motor patterns, so-called fluency skills, which make the occurrence of stuttering less likely, and (3) acquisition and use of more appropriate responses to the stimulus of the expectancy and/or experience of a stuttering event.
In all three strategies the operant paradigm has been applied both as a guideline in teaching and as a framework for explaining the behavior of the patient. The latter in particular has always been difficult, as we do not know what the reinforcements and punishments that supposedly control the patient's behavior really are.
It is a common problem that the patient very often refuses to use an 'artificial' way of speaking, even when he speaks perfectly fluently when using it in the clinical environment. For the therapist this may be hard to understand. The patient seems to reject a solution to the very problem that he sought to solve when therapy began. The solution of acquiring and using a way of 'artificial speech' may not be perfect (the perfect solution would be normal speech, of course), but in the judgment of the therapist the benefit of not stuttering seems to outweigh by far the cost of sounding artificial.
Maybe there are other costs to consider. This contribution is not an attempt to provide an easy answer to the question of why stutterers reject artificial speech, but an attempt to enrich the discussion by introducing a few concepts from the theory of the pragmatics of human communication.
In 1967 the book 'Pragmatics of Human Communication' by Paul Watzlawick, Janet Beavin and Don Jackson of the Mental Research Institute in Palo Alto, California, was published. In this book the authors attempt to use the axiomatic method of mathematics to lay the foundation of a theory of human communication. Of their five pragmatic axioms, two are of special relevance to our present problem:
Axiom 2: Every communication has a content and a relationship aspect such that the latter determines the former and is therefore a metacommunication.
Axiom 4: Human beings communicate both digitally and analogically. Digital language has highly complex and powerful logical syntax but lacks adequate semantics in the field of relationship, while analogic language possesses this semantic potential but does not possess the logical syntax which is necessary for unambiguous communication. (Translated from the German version.)
The other axioms concern the impossibility of not communicating (Axiom 1), the punctuation of the sequence of events (Axiom 3), and symmetrical and complementary interaction (Axiom 5).
For the purpose of our discussion I want to explain the difference between the digital and the analogic channel of spoken communication. When a person speaks he constantly sends messages via these two channels. That part of the message which can be translated into discrete symbols like script is transmitted over the digital channel, while the other part of the message which is coded using variables which can vary on a continuous scale is transmitted over the analogic channel.
The statement that the content aspect of a message is coded digitally and the relationship aspect of a message is coded analogically is generally true. However, there are some interesting exceptions. As an example our authors use the sentence "Customers who think our waiters are rude should see our manager", which is ambiguous in content; this ambiguity can be overcome by analogic coding, in this case by the use of stress. Stress the words 'rude' and 'manager' in one case and 'waiters' and 'manager' in another, and you will see the difference. Exceptions in the other direction (digital coding of the relationship aspect) are more common, but many of them are ultimately not convincing. At the very least, the question of whether words really matter when defining a relationship is not a trivial one.
It is very easy, however, to find examples to support the above statement. One example by Watzlawick, Beavin and Jackson goes as follows: "If Mrs. A. points to Mrs. B.'s necklace and asks 'Are these pearls real?' the content of her question is a request for information about an object. At the same time she also (and there is no way not to do this) defines her relationship with Mrs. B. The way of her asking (the tone of her voice, her facial expression, the context, etc.) will express either kind friendliness, envy, admiration or some other attitude towards Mrs. B." (Translated from section 2.31 of the German version.)
Another aspect to consider is context. A message is always decoded by the listener against a background of information about the speaker, the listener himself and the situation in which the speech act happens. Even if a speech act is exactly duplicated by another speaker or by the same speaker to another listener or in another situation the meaning may be completely different.
Oral communication is the transmission of a message from the speaker to the listener by means of the emission and reception of sounds and noises. We may say that communication succeeds when the listener can understand what the speaker means on the basis of what he hears, or more simply put, on the basis of what the speaker says. Because of the many ambiguities of our language the listener has to use as much as possible what he can infer from the analogic code and from the context.
There seem to be very stringent requirements concerning the agreement between what is said (content aspect) and how it is said (relationship aspect). To make matters more complicated, this agreement can only be evaluated in the light of the context, that is, the information available prior to the speech act about the speaker, the listener, and the situation.
As Watzlawick and his co-workers explain, analogic modalities of communication comprise at least two classes, kinesics (body movements) and prosody. (They also consider contextual cues to be part of analogic communication, which I do not find very practical.) In the text, tone of voice, voice inflections and rhythm are mentioned when referring to prosodic variables as analogic modalities of speech. For the present discussion only prosodic coding will be considered, that is, the use of intonation, loudness, speed, phrase length, pauses and the complex variable of stress.
It is fairly easy to construct any number of examples to show how inappropriate prosodic coding can defeat a speaker's communicative intention.
It is completely impossible to use a slow monotone way of speaking when expressing interest in a topic ("Oh, that's very interesting!"). However, it is quite effective to use the same kind of speech when expressing boredom ("Oh gee, this bores me to death!").
It is completely impossible to use a staccato monostressed way of speaking when revealing romantic feelings for the first time ("I think about you all the time..."). However, it is quite effective to use the same kind of speech when threatening somebody ("If you don't stop this you'll stay home next time!").
Another dimension of distortion is the inappropriateness of prosody with reference to the context. The situation becomes especially difficult when the speaker is unknown to the listener. The latter will make judgments about the speaker based on what he sees (appearance of the speaker, facial expression, gestures, other movements) and what he hears (in terms of language content and prosody). The listener will immediately make a negative judgment when any disagreement among content, prosody and context is detected. On the other hand, the effective use of prosody appears to be a very powerful tool for the speaker to make clear what kind of person he is, or as what kind of person he wants to be seen.
A speaker who uses artificial speech is restricted in his use of prosody; this may serve as a definition of 'artificial speech'. He constantly faces the risk that his listeners do not perceive messages as they were intended, that misunderstandings occur in the definition of the relationship between speaker and listener, and that the listener cannot conclude from the speaker's behavior what kind of person he is or as what kind of person he wants to be seen.
This is what I would like to call the 'message incompatibility conflict' which a stutterer faces when using 'artificial speech'. His message cannot be decoded by the listener as it was intended because the listener receives conflicting information.
It is very interesting that using artificial speech is much easier when the speaker informs the listener about the purpose of his speaking style at the beginning of a conversation, saying e.g. "I am in speech therapy now and I have to use this way of talking in order to ... (whatever the therapist told the patient the goal is)." In doing so the speaker instructs the listener to discard all prosodic information. Communicative failure because of conflicting signals becomes less likely, although communicative failure because of missing information may still occur.
But isn't artificial speech (AS), if it is stutter-free, better than stuttered speech (SS)? SS is, like AS, limited in the use of prosodic variables. But there are differences which may make SS more attractive than AS for the stutterer.
What are some possible consequences of the message incompatibility concept?
Stromsta, C. (1986). Elements of Stuttering. Oshtemo, Michigan: Atsmorts Publishing.
Watzlawick, P., J.H. Beavin, and D.D. Jackson (1967). Pragmatics of Human Communication: A Study of Interactional Patterns, Pathologies and Paradoxes. New York, N.Y.: W.W. Norton.
Watzlawick, P., J.H. Beavin, and D.D. Jackson (1969). Menschliche Kommunikation: Formen, Störungen, Paradoxien. Bern: Verlag Hans Huber.