Journal of Otology & Rhinology, ISSN: 2324-8785


Research Article, J Otol Rhinol Vol: 5 Issue: 4

The Pharyngoesophageal Segment in Laryngectomees with Non-Functional Voice: Is It All about Spasm?

Arenaz Bua B1,2,4*, Rydell R1,2,4, Westin U2 and Olsson R3,4
1Division of Logopedics, Phoniatrics and Audiology, Lund University, Skane University Hospital, Sweden
2Division of Ear, Nose and Throat Diseases, Head and Neck Surgery, Lund University, Skane University Hospital, Sweden
3Department of Clinical Sciences, Lund University, Skane University Hospital, Sweden
4Diagnostic Centre of Imaging and Functional Medicine, Lund University, Skane University Hospital, Sweden
Corresponding author : Arenaz Bua B, MD
Division of Logopedics, Phoniatrics and Audiology and Division of Ear, Nose and Throat Diseases, Head and Neck Surgery, Department of Clinical Sciences, Lund University Skane University Hospital, Jan Waldenstromsgata 18, SE 205 02 Malmo, Sweden
Tel: +46 40 332324, +46-762230949
Fax: +46 46 171758
E-mail: [email protected]
Received: April 18, 2016 Accepted: June 25, 2016 Published: July 02, 2016
Citation: Arenaz Bua B, Rydell R, Westin U, Olsson R (2016) The Pharyngoesophageal Segment in Laryngectomees with Non-Functional Voice: Is It All about Spasm? J Otol Rhinol 5:4. doi:10.4172/2324-8785.1000287


Objective: The aim of the present study was to characterize the pharyngoesophageal segment in non-functional tracheoesophageal speakers and to confirm whether the patients responded to treatment with decreased pressures, better voice and increased neoglottis vibration.

Methods: Voice perceptual assessment, high-resolution videomanometry of swallowing and phonation and high-speed camera recording during phonation provided information about anatomy and function of the pharyngoesophageal segment before and 1 month after treatment with balloon dilatation and/or botulinum toxin.

Results: High resolution videomanometry revealed 12 patients with phonation pressure higher than 20 mmHg before treatment: 2 patients with pressure between 20-45 mmHg, 5 patients with pressure between 45-66 mmHg and 5 patients with pressure higher than 66 mmHg. Eight of twelve patients reported clinical improvement after treatment. Their phonation index (defined as the ratio between phonation pressure at the pharyngoesophageal segment and at the distal oesophagus), phonation pressure and residual pressure at the pharyngoesophageal segment decreased after treatment. There was no significant difference between voice variable values before and after treatment. High-speed camera recordings revealed a wide variation in the anatomical and functional characteristics of the neoglottis.

Conclusions: Normal pressure of the PES during phonation is an important factor for successful sound emission in TE speakers. Other aspects, such as fibrosis at the pharyngoesophageal segment and oesophageal peristalsis, should also be considered.

Keywords: Balloon dilation; Fibrosis; Tracheoesophageal




BT: Botulinum Toxin; HRVM: High Resolution Videomanometry; HSC: High-Speed Camera; mmHg: Millimetres of Mercury; PES: Pharyngoesophageal Segment; SLP: Speech and Language Pathologist; TL: Total Laryngectomy; TE: Tracheoesophageal; TEP: Tracheoesophageal Prosthesis


Total laryngectomy (TL) is the treatment of choice for advanced laryngeal cancer. Speech therapy and prosthetic voice rehabilitation are considered the gold standard for restoration of voice production after TL. The acquisition of tracheoesophageal (TE) speech requires anatomic and physiologic conditions that allow the passage of air from the lungs through the tracheoesophageal prosthesis (TEP) and the pharyngoesophageal segment (PES). Additionally, the lining mucosa of the PES should be capable of vibrating sufficiently to produce voice [1,2].
Normal pressure of the PES during phonation is believed to be important for successful sound emission [3]. In general, the muscle activity of the PES is a protective reflex against reflux, but in patients with a TEP it constitutes a significant impediment to voice production and rehabilitation. PES hypertonicity is the most common cause of failure in oesophageal (38%) and TE (35%) voice [4]. Muscle spasm at the PES interrupts the airflow from the oesophagus to the pharynx during phonation. This disrupts the vibration of the mucosa and prevents voice production [5]. Singer and Blom studied 129 laryngectomized patients using the air insufflation test in combination with videofluoroscopy and concluded that spasm at the PES needs to be treated in order to improve TE speech [6]. Several potential treatments for PES spasm have been described: myotomy of the middle and inferior constrictor muscles of the pharynx and the cricopharyngeal muscle, partial neurectomy of the pharyngeal plexus and chemical denervation of the PES with botulinum toxin (BT) [6]. BT injection in the PES is a simple, quick and relatively cheap in-office procedure with effects lasting beyond two years in some cases [7-11]. Therefore, BT injection appears to be a reasonable and less invasive alternative. However, it is important to assess the sagittal diameter at the PES, because BT may negatively affect swallowing function. We established 5 mm as the limit value of the sagittal diameter at the PES; patients in whom the diameter of the PES is smaller than 5 mm should be treated with balloon dilation (BD) before BT injection [12].

Aims of the Study

The present study is a characterization of the PES in TE speakers, who rated themselves as having a non-functional TE voice. The aims of the study were:
• To use voice perceptual assessment, high resolution videomanometry (HRVM) and high-speed camera (HSC) recordings to characterize non-functional TE speakers.
• To confirm if the patients responded to treatment with decreased pressures, better voice and increased neoglottis vibration.
• To investigate if the phonation index, which is the ratio between phonation pressure at PES and phonation pressure at the distal oesophagus, changes after treatment and if this ratio could explain the difference between functional and non-functional TE speakers.

Material and Methods

We recruited 13 patients who reported themselves as non-functional TE speakers (no voice, not able to talk on the telephone, phonasthenia). There were 9 men and 4 women; 5 had no voice. All except 2 patients reported dysphagia (Table 1). They were all former smokers with a mean age of 73 years (range: 61-82 years). All but two received radiotherapy (Table 1). Mean time after surgery was 30 months, median 12 months (range: 6-156 months) (Table 1). All were diagnosed with squamous cell carcinoma, except two who had a chondrosarcoma. Cricopharyngeal myotomy and insertion of the TEP, Provox®, were performed in the same session as the laryngectomy in all patients. They finished their medical/surgical treatment at least 3 months before being included in the study and presented no evidence of recurrent disease. Their stoma was covered with a heat and moisture exchanger valve. All participants signed informed consent and underwent clinical evaluation by an otolaryngologist. Those patients with a PES anteroposterior diameter smaller than 5 mm and dysphagia were treated with BD prior to the BT injection. High resolution videomanometry, voice perceptual assessment and visual assessment of digital high-speed recordings of the neoglottis were made before and 1 month after BD and/or BT injection.
Table 1: Patient data on dysphagia, treatment, radiotherapy and postoperative time.
The study was approved by the local ethical committee, dnr 2013/70 (Table 2).
Table 2: Intralistener and interlistener reliability of the voice perceptual assessment.
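The treatment-selection rule described above (BD before BT when the anteroposterior PES diameter is below 5 mm and the patient reports dysphagia) can be sketched in a few lines of Python. This is only an illustrative sketch: the `Patient` class, its field names and the function name are hypothetical and were not part of the study protocol.

```python
from dataclasses import dataclass

# Limit value taken from the text; every name below is hypothetical.
PES_DIAMETER_LIMIT_MM = 5.0


@dataclass
class Patient:
    pes_diameter_mm: float  # anteroposterior diameter at the PES
    dysphagia: bool         # self-reported dysphagia


def dilation_before_bt(patient: Patient) -> bool:
    """Return True when balloon dilatation should precede BT injection."""
    return patient.pes_diameter_mm < PES_DIAMETER_LIMIT_MM and patient.dysphagia
```

A patient with a 4 mm diameter and dysphagia would be flagged for dilatation first, whereas a patient with a 6 mm diameter would not.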
Perceptual assessment
Three experienced speech and language pathologists (SLPs) made the voice perceptual assessment, three times per patient, before and 1 month after treatment.
In order to complete the voice perceptual assessment, the SLPs registered six variables. For the first two variables, quality and intelligibility, three options were available: good (=1), reasonable (=2), poor (=3). Three further variables are commonly used for perceptual assessment of all kinds of voice patients and voice disorders (hyper functional/tense, breathy, rough), and the sixth variable, gurgle, is used in descriptions of laryngectomees' voices [8]. For variables 3 to 6 a visual analogue scale (VAS) was used. Voice variable definitions based on the Stockholm Voice Evaluation Approach [13] were modified according to the anatomy of the laryngectomees:
Rough: Low-frequency aperiodicity, presumably related to some kind of irregular neoglottis vibration.
Breathy: The neoglottis is vibrating, but somewhat abducted, which creates an audible turbulent noise related to the insufficient closure (Table 3).
Table 3: Pressure values in functional versus non functional speakers before and after treatment.
Hyper functional/tense: Voice sounds strained, due to compression/constriction of the neoglottis during phonation.
Gurgly: Wet hoarseness/liquid voice quality
High resolution videomanometry was performed with the patient seated, using a high-resolution solid-state transducer system (ManoScan-360, Sierra Scientific Instruments, Los Angeles, CA, USA). The catheter, 4.2 mm in diameter, has 36 sensors spaced 1 cm apart, and every sensor contains 12 measuring points. The catheter was introduced through the nose after applying Xylocain gel 2% (Astra Zeneca) in order to reduce patient discomfort. All participants were instructed to swallow 10 ml of non-water-soluble contrast (barium contrast medium, 240 mg/ml, 60% weight/volume) three times. Examination time was less than 10 min, total fluoroscopy time was less than 100 s and the radiation dose was 0.2 mSv. All pressures are given in millimetres of mercury (mmHg).
Variables analysed during swallowing:
• Resting PES pressure.
• Residual pressure during PES opening.
• Pharynx contraction pressure 3 cm cranial to PES (= pressure at the level of the pharyngeal constrictor).
• Oesophageal peristaltic contraction pressure (= mean value at 3 and 7 cm cranial to the lower oesophageal sphincter).
Variables analysed during phonation:
• Pressure at PES, pharynx (3 cm cranial to PES), proximal oesophagus (3 cm caudal to PES) and distal oesophagus (7 cm cranial to the lower oesophageal sphincter).
• Phonation index (=phonation pressure at PES/phonation pressure at the distal oesophagus)
• Craniocaudal length of the PES.
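As a minimal sketch of how the phonation index defined above is computed, the following Python function divides the phonation pressure at the PES by the phonation pressure at the distal oesophagus. The function name, the guard against non-positive denominators and the example values are assumptions made for illustration; the numbers are not patient data.

```python
def phonation_index(pes_pressure_mmhg: float,
                    distal_oesophagus_pressure_mmhg: float) -> float:
    """Ratio of phonation pressure at the PES to that at the distal oesophagus."""
    if distal_oesophagus_pressure_mmhg <= 0:
        # Assumption: a non-positive reference pressure is treated as invalid input.
        raise ValueError("distal oesophageal pressure must be positive")
    return pes_pressure_mmhg / distal_oesophagus_pressure_mmhg


# Hypothetical values in mmHg, chosen only to illustrate the calculation.
index_before = phonation_index(60.0, 30.0)
index_after = phonation_index(30.0, 30.0)
```

With these illustrative inputs the index falls when the PES pressure drops while the distal oesophageal pressure stays the same.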
High-Speed camera examination
The system consisted of a computer and a camera head used in combination with a 70° rigid endoscope (HRES Endocam, model 5562.9 colour, R.Wolf, Knittlingen, Germany) and a 300 W cold light source. This system records images at a rate of 2000 or 4000 frames per second. In this study 2000 frames per second were used. Patients were asked to stick out their tongues to reveal the opening of the PES and to produce a sustained /ae/ or /e/ sound. Local anaesthesia was not routinely used. The variables used for visual assessment of digital HSC recordings of the PES have been described by Van et al. [14]:
• Saliva: Amount of saliva present at the neoglottis that could impair the visibility. Graded as: None, little, moderate, much, obstructing.
• Neoglottis visibility: The origin of the neoglottis was judged as visible when the starting point of the vibration could be identified, and as not visible when only the final part of the travelling vibration could be seen or when the origin of the neoglottis could not be identified. Described as: Visible, not visible.
• Neoglottis shape: Contour of the lumen during the open phase of vibration: Circular, triangular, split side-to-side, anterior-posterior split, irregular, non assessable.
• Vibration location: Predominant site of vibration. Posterior, anterior, lateral, circular, non assessable.
• Mucosal wave: Differentiation of a mucosal wave from the vibration of the neoglottic wall, in analogy to the travelling wave on vocal folds. Described as: regular, irregular, non assessable.
• Vibration regularity: Visual impression of the regularity of the vibration. Graded as: Regular, irregular, non assessable.
• Closure phase: Duration of the open or closed phase of the neoglottis in relation to the complete cycle of vibration. Open, equal, close, non assessable.
The assessment of the recordings was made by two experienced specialists in Phoniatrics and Laryngology, who evaluated the recordings in three different sessions and rated them after reaching consensus.
Botulinum toxin injection
Topical anaesthetic with vasoconstrictor (Lidocain-Nafazolin APL 34 mg/ml + 0.17 mg/ml) was applied in the nostril and the patient swallowed lidocaine (Xylocain viscous 20 mg/ml; Astra Zeneca, Södertälje, Sweden) to anaesthetize the pharynx. We used an injection needle (Posi-Stop from Hobbs Medical Inc.) through the channel of a fibrelaryngoscope to inject the BT at three points (two lateral and one posterior) in the visible cranial part of the PES. We used freshly reconstituted, purified botulinum toxin type A (Botox, Allergan Inc, Irvine, California) at a concentration of 2.5 mouse units (MU)/0.1 mL and a total dose of 30-50 units. All patients were discharged directly.
Balloon dilatation
BDs were performed in the outpatient clinic. Topical anaesthetic with vasoconstrictor (Lidocain-Nafazolin APL 34 mg/ml + 0.17 mg/ml) was applied in the nostril and lidocaine (Xylocain 10 mg/ml; Astra Zeneca, Södertälje, Sweden) was sprayed into the throat to anaesthetize the pharynx. Dilatations were performed with controlled radial expansion balloons with a diameter of 8-14 mm, for 2-2.5 min, through the channel of a fibrebronchoscope. The procedure was performed twice in all patients, with a 6-week interval between dilatations. All patients were discharged directly.
Intraclass correlation coefficients were calculated to assess intra- and inter-rater reliability. The Wilcoxon signed rank test was used to compare results before and after treatment. P values ≤ 0.05 (two-tailed) were regarded as significant [15]. All data were analysed using the Statistical Package for the Social Sciences (SPSS), version 23 for Mac.
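For readers who wish to see how a paired pre/post comparison of this kind works, the following is a minimal pure-Python sketch of the Wilcoxon signed rank test using the normal approximation with a tie correction and no continuity correction. It is not the SPSS procedure used in the study; the function name and the example pressures are hypothetical, and with samples this small an exact test may be preferable.

```python
import math


def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus, z, two_tailed_p) via the normal approximation."""
    # Paired differences; zero differences are discarded.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p value
    return w_plus, w_minus, z, p


# Hypothetical pre/post pressures in mmHg, NOT data from this study.
pre = [60, 70, 50, 80, 65, 55]
post = [40, 45, 48, 50, 30, 56]
w_plus, w_minus, z, p = wilcoxon_signed_rank(pre, post)
```

A negative z here reflects that the smaller rank sum lies below its expected value under the null hypothesis of no change.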


We recruited 13 patients, but one died prior to treatment due to complications related to liver cirrhosis. Six received BT in doses between 25-45 IU. Four had an anteroposterior diameter at the PES smaller than 5 mm and reported dysphagia; they were treated with BD twice, experienced clinical improvement and were therefore not treated with BT (Table 4). Two were treated with BD and BT in doses between 25-45 IU. Eight patients reported clinical improvement in voice and dysphagia after treatment; four reported no clinical improvement, and two of these left the study.
Table 4: Response to treatment.
Voice perceptual assessment
Results regarding intra- and inter-listener reliability are presented in Table 2. Five subjects had no voice before the treatment. The others were rated by the SLPs: 1 had good, 4 had reasonable and 3 had poor voice quality; 2 had good, 5 had reasonable and 1 had poor voice intelligibility. Voice quality results are presented in Table 4. The Wilcoxon test showed no difference between voice variables before and after treatment. Hypertonicity was the variable with the highest values. For a comparison of the variables rough, breathy, hypertonic and gurgly before and after treatment, see Figure 1.
Figure 1: Voice variables before and after treatment.
High-Speed camera examination
Results regarding intra-rater reliability are presented in Table 5. Recordings were obtained in seven patients before treatment and in ten patients after treatment. The Wilcoxon signed rank test showed no difference between HSC recordings before and after treatment. Results regarding neoglottis mucosal wave and vibration regularity before and after treatment are included in Table 4.
Table 5: Intra-rater reliability as intraclass correlation coefficients.
In the high resolution videomanometry, the Wilcoxon signed rank test revealed significant differences before and after treatment in phonation index PES/oesophagus, Z = -2.8 (p = 0.005), phonation pressure at PES, Z = -2.6 (p = 0.009) and residual pressure at PES, Z = -2.2 (p = 0.03). Group pressure values during swallowing and phonation are presented in Table 3. For individual phonation index, phonation pressure at PES and residual pressure values see Table 4. Six of the patients had an anteroposterior diameter less than 5 mm and required BD.


Success rates of TE voice can be as high as 90% (Op de Coul et al. [16]). Due to the impact that the voice has on quality of life, voice rehabilitation after TL may be a major challenge [17]. The patients included in this study reported themselves as non-functional TE speakers and required treatment. A multidimensional evaluation of the PES using voice perceptual assessment, HRVM and HSC recording was made in order to understand the mechanism of their voice impairment and their response to treatment.
Perceptual assessment of the voice is key to managing voice rehabilitation in laryngectomees. Variables based on the modified Stockholm Voice Evaluation Approach were chosen for the assessment in our study [13]. Voice perceptual assessment made by the SLPs showed high intra-listener reliability when compared with other studies [18]. Before treatment, four patients were considered reasonable speakers and one was rated as a good speaker according to the SLPs' voice perceptual assessment, but these patients rated themselves as non-functional TE speakers (Table 4). This shows how the perception and the expectations of patients and health professionals may differ [19]. The inter-rater reliability was low for the variables rough (0.55, p = 0.079) and breathy (0.29, p = 0.025) (Table 2), which indicates that perceptual voice assessment after TL may be difficult. Acoustic voice assessment may help to detect differences in voice before and after treatment, since the spectrographic trace and type of signal (I, II, III or IV) may predict the contact between the anterior wall and the prominence of the PES during phonation [20-22].
HSC is the only possible method to record vibrations in the neoglottis after TL, since it is not dependent on the fundamental frequency of the phonation [23]. Two of the patients could not produce any sound, which is necessary in order to make a recording. Three of the patients did not tolerate the telelaryngoscope. The intra-rater reliability was high for all the variables, which might be expected in professionals with experience in using this examination method. HSC recordings revealed a wide variation in the anatomical characteristics of the neoglottis, in accordance with other studies [14,18]. Those patients who reported clinical improvement after treatment showed a trend towards more regular neoglottis vibrations and stronger mucosal waves after treatment (Table 4), although these results were not statistically significant.
In non-laryngectomized subjects, phonation threshold pressure represents the minimum amount of subglottic pressure needed to initiate oscillation of the vocal folds [24]. The subglottic pressure may be estimated either indirectly, by recording air pressure and oral pressure using a mask firmly fitted over the mouth and nose [25], or directly, using a percutaneous catheter into the trachea, a translaryngeal catheter through the nose into the trachea or an intraoesophageal catheter [26,27]. During phonation the catheter is partly surrounded by air and partly squeezed by the PES and the oesophageal walls. We therefore used HRVM to measure the phonation pressure and considered that phonation pressure in TE voice is a combination of contact and intraluminal pressure. Decreasing phonation pressure from the distal oesophagus to the pharynx was found after treatment (Table 3). These results agreed with those reported by Takeshita et al. [28] regarding functional TE speakers. Morgan et al. [3] reported phonation pressure at the PES between 15-20 mmHg for functional TE speakers and, in a sample of 13 persons, found four different groups: a hypotonic group with a pressure of 11.3 mmHg, a tonic group with a pressure of 18.3 mmHg, a hypertonic group with a pressure of 45 mmHg and a spasmodic group with a pressure of 66.2 mmHg. Before treatment, four different groups can be seen in our sample (Table 4): 1 patient with a pressure of 20 mmHg, 2 patients with pressures between 20-45 mmHg, 5 patients with pressures between 45-66 mmHg and 5 patients with pressures higher than 66 mmHg. The spasmodic group showed higher pressures in our study; this difference might be explained by the different methods used to measure the pressure. TL causes oesophageal motility impairment characterized by low contraction amplitudes and non-peristaltic contractions [29-31].
Thus, in order to improve TE speech we should consider not only the PES pressure, but also the pressure in the distal oesophagus. If the pressure in the oesophagus is too low, it will be difficult to produce TE voice. We hypothesized that there might be a phonation index, defined as the ratio between the phonation pressure at the PES and at the distal oesophagus, which might explain the difference between a functional and a non-functional TE speaker. We aimed to reduce this phonation index by treating the PES of our patients with BT and/or BD. Patients who reported improvement showed a decrease in their phonation index, Table 4.
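The four pressure bands used to describe our sample before treatment (20 mmHg, 20-45 mmHg, 45-66 mmHg and above 66 mmHg) can be expressed as a simple binning function. This is an illustrative sketch only: the function name and labels are hypothetical, and assigning a value that falls exactly on a cut-off to the lower band is an assumption, since the text does not specify how boundary values were handled.

```python
def pressure_group(pes_phonation_pressure_mmhg: float) -> str:
    """Bin a PES phonation pressure using the 20, 45 and 66 mmHg cut-offs."""
    p = pes_phonation_pressure_mmhg
    # Assumption: values exactly on a cut-off fall into the lower band.
    if p <= 20:
        return "<=20 mmHg"
    if p <= 45:
        return "20-45 mmHg"
    if p <= 66:
        return "45-66 mmHg"
    return ">66 mmHg"
```

For example, a hypothetical pressure of 50 mmHg would be placed in the 45-66 mmHg band.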
Postoperative and post-radiotherapy changes cause fibrosis and PES stenosis, which may impair the acquisition of TE voice. This may explain why some patients in our study responded to BD but not to BT injection. Thus, PES hypertonicity is not the only component in TE speech failure; fibrosis at the PES and impaired oesophageal motility also need to be considered. After TL, patients must also cope with dysphagia. Stenosis at the PES occurs in 20% of patients after TL [32]; in our study, six patients had stenosis and required BD. This and the disturbance of oesophageal peristalsis may account for the high incidence of self-reported dysphagia following TL, which is up to 85% in our study and 72% in the literature [33].


This is the first study that combines voice perceptual assessment, HSC recording and HRVM to assess non-functional TE speakers. It included a small and heterogeneous group of patients who require individualized assessment. PES hypertonicity is not the only component in TE speech failure; fibrosis at the PES and oesophageal pressure also need to be considered. All except one of the patients included in the study had phonation pressures at the PES higher than 20 mmHg. After treatment with BD and/or BT, the phonation index PES/oesophagus, the phonation pressure at the PES and the residual pressure at the PES decreased, and those patients reported clinical improvement.


We thank Helene Jakobsson for providing statistical support, the SLPs Helene Carlsson, Christina Askman and Lotta Brovall for making the voice perceptual assessment, and the ENT specialists and phoniatricians Malin Josefsson and Henrik Widegren for their assessments of the HSC recordings.

