Vio(len)ce Part 5: The Voice of Objection

To begin understanding the psychological implications of the violence and trauma of acquiring voice, it is necessary to relate each physiological change in the infant to their subjectivity. I state this for two reasons. Firstly, the genesis of voice ought to be framed in relation to contingent psychological implications. Secondly, I propose that to render the centrality of voice to contemporary control regimes, to understand why voice maintains itself as a locus of violence, an appreciation of the journey of wedding voice to subjectivity must be garnered.

It is crucial to begin at the precise emergence of voice: the pause after the distress cry, the moment when the infant learns of the presence of the caregiver and enters into socially formatted turn taking. As previously detailed, this is the emergence of voice. As soon as the reflexive distress cry is re-formatted as socialized, intentional turn taking and calling, the animalistic cry becomes voice. Appreciation of such an instance’s harmony with the Aristotelian definition of voice (as “a kind of sound with meaning, and not, like a cough, just of the in breathed air” (Aristotle, 1986, p.179)) is the platform from where a question of subjectivity emerges. The infant must, on some level, have a concept of it being a subject; a subject surrounded by objects that provide care and nourishment and alleviate distress. The turn taking, the call, the intent for the call and the wait, the listening all point towards subjectivity. On closer scrutiny the moment when a cry becomes a subject’s call, becomes a voice, is defined by silence, listening and absence. The silent wait that follows a brief distress cry marks the cry as a call from a being of awareness and subjectivity. It is at this juncture where the biological necessity becoming curiosity, the cry becoming voice, the animalistic sound becoming a human subject’s calling, folds back into the psychoanalytic.

The Lacanian concept of the inaudible object voice is key here. Dolar states how:

“As soon as the object, both as gaze and as voice, appears at the pivotal point of narcissistic self-apprehension, it introduces a rupture at the core of self presence. It is something that cannot itself be present, although the whole notion of presence is constructed around it and can be established only by its elision. So the subject, far from being constituted by self apprehension in the clarity of its presence to itself, emerges only in the impossible relation to that bit which cannot be present. (…) The voice may well be the key to the presence of the present and to an unalloyed interiority, but it conceals in its bosom that inaudible object voice which disrupts both.” (Dolar, 2006, p.42).

This passage illustrates the subject in relation to the object voice. The subject emerges bathed in the glow of an absence – this is the moment of ceasing to cry and commencing the silent act of listening that constitutes the cry as becoming voice. The self apprehension required for a coming subjectivity does not occur during the cry, but precisely when the silence of listening ruptures the sonic affect, the cry. Likewise, the unalloyed interiority of the crying being and its present presence are utterly defined by and dependent upon the rupture brought about by the object voice, the wait and deafening silence of listening. Auto-affection, the genesis of subjectivity, happens in the silence of the object voice in conjunction with listening and waiting. This follows Derrida’s concept of auto-affection; precisely in how thinking oneself is immediately a differ-encing action of thinking oneself through the other: “hark… is care coming to me?” In Derridean terms this is auto-affection as always already hetero-affection. In Lacanian terms it is a rupture that refers to a void. It is precisely in this void (to reconcile Derridean concepts with Lacanian concepts I suggest the void between auto-affection and hetero-affection) that the pause of the newborn’s cry creates voice(s) by bi-poiesis, retroactively imbuing the cry with intent and turning it into voice and also flooding the silence of listening with the “inaudible and unbearable object voice” (Dolar, 1996, p.16). It is in this sense that the silent waiting and listening of the infant is the performative act par excellence of how the object voice “embodies the very impossibility of attaining auto-affection; it introduces a scission, a rupture in the middle of full presence, and refers it to a void – but a void which is not simply a lack, an empty space; it is a void in which the voice comes to resonate.” (Dolar, 2006, p.42, my emphasis).

Thus, just as Derrida states that “the ‘voice of being’ (…) is silent, mute, insonorous, wordless, originarily a-phonic” (Derrida, 1998, p.22), and as Zizek states how “the object voice par excellence, of course, is silence. (…) and what effectively reverberates is the void: resonance always takes place in a vacuum”, the consistency across both thinkers’ conceptions of subjectivity in relation to voice is precisely the silence of subjectivity, the object voice echoing in the void between auto- and hetero-affection.

The pause of the infant that signals both the seed of subjectivity and the coming of voice is violence, because this is the first of many instances of socialization and subjectivity ordering the reflexive impulses of the body. To cease the sound of distress requires strength, control and discipline – as anyone who has caught their shin in a hushed cinema will attest. The infant’s pause is a violent act of socializing and subject forming, but unlike potty training, the educational system and later modes of control and conformity it is a locus of self-flagellation. The simultaneous coming of voice and subjectivity is the first act of self-discipline. After the auto-violence of voice and subjectivity come further acts, tracking analogously with the fall of the larynx. The learning of language, the breaking of the voice in puberty and so on all follow from a precedent set by the trauma of the infantile object voice. It is from this basis, of voice as a locus and genesis of violence and marks of trauma, that I wish to approach the contemporary context of voice in relation to technology and late capitalism.

Spectrogram of an infant's cry.


I would like to consider the timing of the laryngeal descent in infants in relation to developmental and psychoanalytic understandings of subjectivity. To appreciate the later application of the violence of voice/voi(len)ce, its developmental and psychoanalytical origins must be explored. The implications of when and how the larynx descends in the infant must be scrutinized closely.

To begin I will outline what happens before the larynx descends. Making a sound is not a taught or learnt affect. Making a sound happens automatically for newborns. Making sounds for this or that (voice), using it for communicative purposes (be it to signal distress or request nourishment) is another matter. To be clear, the initial making of a sound is instinctive and reflexive. In fact, making a sound, like hearing, resides in the pre-ocularcentric intra-uterine realm. Even before birth “(t)he human foetus may be practicing the use of its voice already in utero.” (Karpf, 2007, p.95) One such phenomenon is vagitus uterinus. For example, “(i)n 1923, an American physician, George Ryder, heard the sound of a baby crying after he had applied traction with forceps. Listening via stethoscope, his assistant and nurses said the sounds were ‘high and squealing, much like the sound of a kitten’.” (Karpf, 2007, p.94-95) The first cry is not voice; it is merely a necessary physical reflex. The purpose of the first cry is to clear the airways of mucus so the first breath can be taken. As with a hiccup, there is no intentional affect in the first cry; it’s just a physical necessity that happens to make a sound.

However, shortly after such a coincidentally sonic physical action is embarked upon, a voice begins to be born. Subsequent cries yield nourishment, comfort and care. But just because the cry yields nourishment, comfort or care does not mean the cry asks for these things. Infant cries do not change for hunger or pain; initially the cry is analogous to distress. Many things may cause distress but the causes of distress do not elicit different types of cry. Such simplicity of the infant’s distress cry does not detract from its crucial validity with regard to the genesis of cultivating a voice, because at each cry of distress an interaction follows. Simply by the connection of a sonic reflexive reaction to distress yielding the care that is required to survive, a voice is born:

“Newborns’ cries are desperate because they haven’t yet realized that they’ll produce a response. But already, six to eight weeks later, they sometimes go quiet after a bout of crying, to listen for their parent’s footsteps. If they don’t hear them, they then resume crying. The cry is no longer simply a reflex – now it’s begun to be one side of a conversation. Within a couple of months of being born, babies leave a space for a response to their cry: they’ve learned the art of turn taking.” (Karpf, 2007, p.99)

Thus, in the very transition from reflexive cries of distress to signalling distress cries embedded within a turn taking matrix an important shift happens; an animalistic cry turns to voice. The new cry has an expectation; indeed it is a call, not merely a sonic emission of the body, but an intent, a meaningful cry: a voice. It is precisely at this juncture where the Aristotelian distinction between voice and mere animalistic sound resonates. Aristotle states how “it is not every sound of an animal that is voice (…) as voice is a kind of sound with meaning, and not, like a cough, just of the in breathed air” (Aristotle, 1986, p.179). As soon as the reflexive distress cry changes into a signalling, turn taking call it takes on meaning, intent and soul. It is this transition that is the birth of voice. It is a birth that transforms a natural instinct into a social and communicative tool. But this is just the beginning of the violence of voice that leads to an ongoing re-enactment of the evolutionary trauma.

Recall the evolutionary descent of the larynx, and how such a change is mirrored, in a microcosm, in the change from newborn laryngeal form to dropped adult laryngeal form. After the newborn’s transition from animalistic distress cry to turn taking signalling, the laryngeal changes track analogously to the acquisition of language. A “child’s growing mastery of the sounds of speech is partly the result of the gradual descent of its larynx and the root of its tongue during the first two to six years of its life.” (Karpf, 2007, p.53) Think of young family members, hearing and attempting to say words and phrases of increasing complexity. Oftentimes they may sound like their tongue is clumsy and not under their full control. Yellow becomes a frustrating single-word tongue twister; it may be pronounced as “lellulow” or “yayolo” for instance. This is because the infant’s tongue and larynx are still descending into positions that would afford the tongue the space for the serpent-like dexterity required to master the whole of their native language. Such a co-ordination of vocalization skills continues to mirror the evolutionary emergence of voice and language. Babies cannot speak on all fours, but by the time they are teetering around under out-of-reach vases and ornaments their larynxes are dropping and they begin gaining speech. The larynx continues dropping till the end of puberty. Just like our ancestors who slowly stood upright to full height, our larynx does not reach its final place, conducive to speech and choking, until we reach our full height and have struggled through years of subject forming experiences. In gaining our adult voice we re-enact the evolutionary trauma of changing from mammal to walking and talking adult human.

Violence is always an effective way to control and order persons. But an even better mode of control and conformity is to ask a person to control, discipline and punish himself or herself. It is in this sense that the role of the voice, speech, is an intensive locus of societal control. “Stand up straight, don’t breathe, don’t eat - speak! Tell us what we want to hear!” Assigning speech to voice is akin to assigning oedipalized tyrannies to neuroses. For the young speaker, speech is a “form of systemization, transferred to them as criteria to human aspiration.” (Adorno and Horkheimer, 2012, p.88. My emphasis) Every aspiration, from our lungs through our damaged choke hazard and up to our two-tube vocal tract, is required to be sonically imbued with the signature phonic resonances of the evolutionary head smash; “quantal vowels such as /i/, /a/ and /u/” (Fitch, 2000, from abstract). And further, one must perform absurd labial/dental acrobatics: consonants must be conjured up by biting, lisping, clucking and gibbering our exhalations into strings of signs the authorities demand. Indeed when it comes to automatic disciplining and violence the voice is an inherently personal and widespread example.

The prevalence of songs and chants in theological histories (the major control mechanics before capitalism) and the role they play in educational and military organisations are all exemplary histories that elucidate how voice is a locus of societal control and organization. Further to this, as I outlined previously, voice is also a contemporary locus of the control mechanisms implicit in technologically mediated existence under late capitalism. For the moment I will not linger on the prevalence and particular dynamics of voice, speech, song and chant in these areas of societal organization, be it under theological or capitalist regimes, but I will draw attention to one particular aspect. The role of voice in western organised religions, as well as educational and military structures, illustrates precisely how voice is a dominant manifestation of how people are required to discipline and punish themselves when asked. The disciplining of oneself into voice and speaking is my focus here. But the crucial aspect of such disciplining is when it starts. The premiere of auto-disciplining via vocal violence does not happen in school, the military, the church or the office; this phonic self-flagellation begins at birth. The core of the violence of voice is cultivated at birth. It is later that more phonocentrically complex, elaborate and systematized collective mechanisms of violent control emerge, but these are nonetheless derivatives from the infantile genesis.

Firstly, air is inhaled; upon exhalation, the vocal cords within the larynx are activated and vibrate, imbuing the exhaled air with sound. This sound then resonates and echoes through the remaining parts of the body that fall under the name ‘the vocal tract’. The tongue, palate, teeth, lips, nasal cavities all fall under the territory of the vocal tract. It is quite peculiar how so many different body parts are involved in the production of voice, as Cavarero notes positively: “lips, mouth, palate, tongue, teeth, (…) larynx, nasal cavities, lungs, diaphragm – come together for acoustic purposes.” (Cavarero, 2005, p.65) and Chion, negatively: “it paradoxically appears that the human body does not have a specific organ for phonation” (Chion, 1999, p.127). The voice is a result of many parts and yet reducible to none; neither the sum of each part, nor the remainder after all parts. The very corporeal violence of speech is uncovered precisely at the moment when one contemplates each part’s involvement and what role it serves within the body. The concept is simple; every body part that contributes to speech has a better, more vital, more important role to do, because, as Tomatis notes, “we were given a digestive apparatus and a respiratory apparatus, but no specific oral-language apparatus.” (Tomatis, 1996, p.59). The lungs ought to take in oxygen and expel the used air with higher levels of carbon dioxide; we may recall the apoplexy of the straining rock singer as a consequence of this need being neglected. Such apoplectic strains are the result of just how much the voice needs our vital lungs, and just how much it uses them at the expense of our breathing. “Syntax overrides carbon dioxide: we suppress the delicately tuned feedback loop that controls our breathing rate to regulate oxygen intake, and instead we time our exhalations to the length of the phrase or sentence we intend to utter.” (Pinker, 1994, p.164).
Similarly vital organs, the teeth, tongue and palate, are important for nourishment; we would die if we couldn’t eat, but in speech they are requisitioned for the sonic acrobatics of forming consonants and modulating vowels. The larynx itself is an important guard against inhaling material into the lungs: “its chief function has nothing to do with the voice: it’s to act as a sphincter to prevent anything but air entering the lungs.” (Karpf, 2007, p.23) What with all the eating and drinking our bodies require, the role of the larynx is crucial for the prevention of choking, yet we refer to it as our “voice box”. Voice is often regarded as a sign of life, of presence; indeed hearing voices behind a door or under rubble is often understood to be a sign of life. But the voice is also a sound of violence against the various body parts that afford life. “All this talking is killing me!” Just such a statement is not as hyperbolic as it sounds.

Here an evolutionary detour is required, the focus of which is the larynx. The larynx has, in phylogenetic terms, undergone a strange slip in humans. The position of the larynx in adult humans rests in a place that grants communicative ability to the detriment of nourishing efficacy and survivalist practicality. Our larynx is so low that choking is more likely in adult humans than in other mammals. The adult larynx rests in a lethal but vocally practical position. Infants’ larynxes are quite different: a newborn can breathe air and swallow fluids simultaneously, but by the time the human has mastered language the larynx lies in a dangerously low position, a position that risks choking. Many a dinner party conversation has been interrupted by the potentially lethal consequences of just such a change.

Fitch outlines the differences succinctly:

“(A)natomical studies comparing the vocal tract morphology of humans with non-human mammals suggest that the human vocal tract is fundamentally different from that of all other mammals. In particular, the resting location of the standard mammal larynx is high in the throat, and typically engaged in the nasopharynx, allowing animals to swallow fluids and breathe simultaneously. This position, and ability, also typifies human newborns. In contrast, the resting position of the larynx in adult humans is much lower in the throat. While this makes it impossible for us to engage the larynx in the nasopharynx, and thus to breathe and swallow simultaneously, it does appear to make possible a wider variety of vocal tract shapes, and thus speech patterns, than would otherwise be unattainable.  In particular, the "descent of the larynx" that occurs in human ontogeny, gives adults a vocal tract with a horizontal oral tube and a vertical pharyngeal one. This two-tube vocal tract allows the production of quantal vowels such as /i/, /a/ and /u/, that feature in the vowel systems of most human languages. ” (Fitch, 2000, from abstract)

Thus, we can talk better and choke better than our barking, mewing and neighing mammalian brethren. Karpf too quips, albeit casually, on such a dangerous peculiarity: “this is the trade off: talking makes us breathe and eat less efficiently, and splutter when drunk. Though we can speak, we can also easily choke on our food. Apes do neither, which is why they don’t need to learn the Heimlich manoeuvre.” (Karpf, 2007, p.53)

As Fitch states, the human larynx does not begin in such a position; newborns begin life with a laryngeal formation akin to most other mammals. However, the human larynx descends after the first 9 months of an infant’s life and continues for years, accelerating its descent once more upon puberty. Such a transformation reflects, in a microcosm, the laryngeal change undergone by our species generally. It is perhaps no coincidence that the descent of the larynx occurs during the change from utterly dependent amorphous mammal to an interacting, social, communicating and moving mammal. I will expand on this allusion to psychoanalysis and infant development later; now I will outline the evolutionary history of our dropping larynxes.

Australopithecus, a type of hominid that bled into the Homo genus, was one of the last hominid types not to display a descent of the larynx. Australopithecus became extinct around two million years ago. Yet there is evidence to suggest that the laryngeal descent existed in other ancestors of Homo sapiens around the same period, for example “in skulls of Homo ergaster, from nearly 2 million years ago.” (McGill). It is important to remember that the larynx did not drop as an evolutionary response to the survivalist applications of clearer or more complex vocal communications, but rather as an odd glitch resulting from our ascent to bipedal verticality. Our ancestors lunged and teetered around for some time with an increased choke risk before venturing to wax eloquent over supper. For example, a “skull of Homo heidelbergensis found in Ethiopia shows that the larynx had almost reached its current position 600,000 years ago. These findings lead to the conclusion that a vocal apparatus capable of articulate language probably existed nearly half a million years before people began to speak.” (McGill) The cause of the laryngeal descent was not adaptive evolution but more of a corporeal re-organization brought on by standing up: “As early humans gradually adopted an erect posture, it gradually brought the position of their head back and up so that it tipped back at the base of the skull, thus causing the neck to emerge and the larynx to descend.” (Fitch, 2000, from abstract) Talking is the result of an evolutionarily injured mammal eventually finding a use for a laryngeal maladaptation. The dropped larynx of Homo sapiens was, to begin with, a major hindrance to survival. Just like in the comic books where unexpected incidents leave ‘average Joes’ with superhuman powers after a period of pain and suffering, the fallout of the evolutionary head smash was speech. Voice is a product of painful and nonsensical contortion, a trauma.

Professor Barker and William S. Burroughs both understood voice as a vocalization from a site of corporeal damage. The former summarized this as Palate-Tectonics. Barker describes precisely and vividly how our current vocal apparatus is “a crash-site, in which thoracic impulses collide with the roof of the mouth. The bipedal head becomes a virtual speech-impediment, a sub-cranial pneumatic pile-up, discharged as linguo-gestural development and cephalization take-off.” (Land, 2011, p.502) Burroughs refers to the symptoms of the crash site as a talk sickness: “sick apes spitting blood bubbling throats torn with the talk sickness. (…) we waded into the warm mud-water. hair and ape flesh off in screaming strips. (…) when we came out of the mud we had names.” (Burroughs, 2010, p.127) Both Barker and Burroughs stress the traumatic and violent history of language and speech acquisition. Speech is something that exists because of its working against the body; it is not natural and good, but is a form of evolutionary injury. Names and signifiers become the viral proliferation of the trauma of such a phylogenetic head smash.

To recap briefly: after a fleeting outline of the division between the signified and signifier, I focused on the corporeal violence inherent in voice. My focus has been on two types of violence and trauma: immediately corporeal and historically evolutionary. Firstly, I detailed how voice is a form of immediate physical violence against the speaker: the voice is made by vital body parts that are responsible for respiration and consumption being used instead for the less vital purpose of vocalization. Secondly, I detailed how the corporeal violence of voice can be understood in evolutionary, macro-conceptual terms: the voice is a product of the repositioning of the larynx as a consequence of the maladaptive contortion of the bipedal upright posture - the head smash is the violence, the talking sickness the symptom. But it is important to hold these two aspects of violence and trauma within voice alongside the opening concept of the violence of signification. It is precisely here where three registers of violence within voice can be recalled: Metaphysical (the violence of the signifier), Physical (the local violence of voice requisitioning vital organs as one speaks) and Evolutionary (the evolutionary and phylogenetic contortion that the dropped larynx is a product of).

“How do you feel about that?”
“It’s hard to put into words.”

The scene is familiar: a distraught or upset analysand and the stoic and authoritative analyst. The latter (the letter) demands that vague, unrealized feelings be put to words; this is the impossible challenge set to the analysand. Words are signs, differ-ing symbols. Things get lost in translation, not between the speaker and the listener, but between the speaker’s feelings and thoughts and the mission of utterance itself. This demand of language is a tyranny: a violent, controlling and insidious tyranny that operates through Structuralist Psychoanalysis, technology and Capitalism. Indeed, the empire of the word has never been so strong or wide reaching. In order to address this I will look at the primary form that language takes, speech, but more precisely the voice that speech requires. The following text will examine the genesis of voice as a form of violence and outline the different manifestations and different registers of ‘Voi(len)ce’. These manifestations are metaphysical, evolutionary, developmental, corporeal and psychoanalytical. The task is simple: to take the moment of speech, the use of a signifier, and work backwards, accounting for how voice was acquired.

The tyranny of the word is not merely a case of the increasing use of text as a communicative means. Nor is it the ‘wrong peg’ symptom of having to make sure one’s primary mode of affect is ‘understandable’. It is not merely the violence of asking an organic or emotional thing to become a set signifier, a symbol. The tyranny of the word is not just a static and cumbersome grid we force ourselves to adhere to; it does more than this. It modulates our insides, our dreams, desires and fears without us even knowing. It even reaches into our hearts. But to understand this it is vital to outline the simpler violence of the word first.

Take a concept that is difficult to define, a concept that is a feeling and a relation: love. Ever since the concept of love has been around it has been grappled with through text. Poets and writers have seldom tired of embarking upon the Sisyphean task of communicating, putting to words and describing the magical and wondrous feelings and thoughts of love. However today, under contemporary capitalism and the ubiquity of attention atomizing technological social media, it is quite easy to settle for a brief array of trite symbols. “Luv u xxx.” That’ll do.

But spending more time on language to create ever more subtle, complex or artistic strings of signifiers is not what needs to be done, because all the love sonnets ever written have fallen short. They may reach closer, touch more or resonate more profoundly than “Luv u xxx”, but we are simply dealing with degrees of separation under a shared failing: an inevitable failing for the internal (be it emotion, soul, belief or thought) to be completely turned into something external and symbolic. To be brutally clear, both Shakespeare’s Sonnets and “Luv u xxx” fall short of the task of communication prescribed to them. It is here where we come upon the chasm between signified and signifier. The sign that stands for something is, by its very definition, a stand in: a referring, differ-ing, almost as good as but never the real thing, sign.

Of course, this is not an original observation. In western metaphysical histories since Aristotle the relationship of, say, Soul to Word has been negotiated and re-negotiated many times. Structuralist Psychoanalysis and Deconstruction have both focused large strands of enquiry on the psychological and metaphysical consequences of the history of the Signifier’s relationship to the Signified, the Word to the Soul. Both schools of thought, to summarise crudely, elucidate a dominance of Logocentrism (dominance of the letter, the sign) over a concealed agency of Phonocentrism (dominance or necessity of voice apart from language). It is here where we come back to the opening pretext: the voice as the locus of violence, the violence of attempting to externalize and socialize internal forces.

I use the term ‘violence’ for a reason. The signifier is a dead symbol, it is petrified; it is something made into stone (‘petrified’ is from the Latin petrificare, petra, rock + facere, to make). The signifier lingers around in this world longer than the soul or concept it refers to; like a mossy gravestone or stone statue, it is hardy and permanent but not alive, whereas passions, the soul, concepts and thoughts are fleeting but living. Thus, in order to be externalized they must jettison their vital essence - their core. Like a tiger skin displayed amongst tasteless faux colonial décor the signifier of the signified is a dead artifact that can be preserved. The vital and living thing, the thing with energy, life and power is always already long since gone. Of course, the tiger skin is barely even a shadow or simulacrum of the tiger, just as the word Love is nothing compared to the thing it signifies. But again, such poetic metaphor merely serves a point that others have investigated – the difference between life and death, between signified and signifier. My question is precisely: what happens in the conversion? Just as the tiger is subjected to being hunted, killed, mutilated and skinned, what happens to the internal thing on its way to becoming externalized, to becoming Word?

What happens is violence, and voice is the locus of such violence. Phonology is just one example; it is:

“the total reduction of the voice as the substance of language. Phonology, true to its apocryphal etymology, was after killing the voice – its name is, of course, derived from the Greek phone, voice but in it one can also quite appropriately hear phonos, murder. Phonology stabs the voice with the signifying dagger; it does away with its living presence, with its flesh and blood.” (Dolar, 2006, p.19)

Turning an internal thing, be it a soul, an emotion, a body or a thought, into language, into a sign, is not a pretty task; it is violence. It is a violence that kills and loses something. This is what phonology does to voice. Yet turning a vocal affect into a system, a matrix of differ-ing signifiers, is not the only form of violence in voice. Violence is also manifested in speech on a corporeal register. To understand this aspect of violence it is necessary to outline what it is to vocalize.