Correlation of nasal symptoms with objective findings and surgical outcome measurement
Thesis submitted for the degree of Master of Surgery, University of London, 1993.
Published (excluding Chapter 9) 1996.
Recompiled HTML format June 2007
© 1993 – 2014 JW Fairley
Mr James W Fairley BSc MBBS FRCS MS
Consultant ENT Surgeon
Reliability and validity of a nasal symptom questionnaire for use as an outcome measure in clinical research and audit
Part of this chapter previously published
Fairley JW, Yardley MPJ, Durham LH, Parker AJ. (1993)
Reliability and validity of a nasal symptom questionnaire for use as an outcome measure in clinical research and audit of functional endoscopic sinus surgery.
Clinical Otolaryngology 18; 436-437
Chapter 4 Contents
- Statistical methods
- Cronbach’s alpha
- Reliability analysis
- Ease of application
- Effects of age and sex
- Face validity
- Content validity
- Criterion related validity
- Construct validity
- Uses for the questionnaire
- Patient satisfaction – a complicated construct
In rhinosinusitis, subjective and objective findings correlate poorly, yet there is no generally accepted validated instrument to quantify nasal symptom severity.
A self-administered 12 item questionnaire was developed, based on Lund’s (1988) study of inferior meatal antrostomies.
The questionnaire was tested on 411 general ENT outpatients, aged 14 to 92 years, 206 women and 205 men.
The coefficient of reliability was high (Cronbach’s alpha = 0.776). Removal of any item decreased alpha.
Validity was demonstrated
- at “face value”
- by construct analysis
- and by showing expected differences between diagnostic groups in their responses to the questionnaire.
A canonical discriminant function, based on all symptom scores and their interactions, correctly classified 80% of cases as nasal (n=180) or non-nasal (n=231).
In a further study, 23 patients successfully treated for rhinosinusitis were compared with a subgroup of 111 rhinosinusitis cases from the main study. Overall, 87% of patients, and 96% of the successfully treated group, were correctly classified by the canonical discriminant function.
The nasal symptom questionnaire is a convenient, reliable and valid method for assessing nasal symptom severity. Its principal use is to act as an outcome measure in rhinosinusitis, by comparing scores before and after treatment. There is a tendency for women to score higher than men, and young people to score higher than old, and this should be taken into account in the design of studies using the questionnaire.
The difference between pre-operative and post-operative scores on the questionnaire is now providing the main outcome measure for a randomized controlled trial of functional endoscopic sinus surgery versus inferior meatal antrostomy. We commend it to other investigators to facilitate meta-analysis. It is also simple and quick enough for clinical audit.
Symptom measurement is of obvious importance in studies of a chronic medical condition like rhinosinusitis. The subjective opinion of the patient is sometimes difficult to reconcile with objective findings, but the patient remains the best judge of outcome. Only he or she really knows what it feels like. Patients come to us because of symptoms. They hope and expect to have them relieved. If an operation fails to give relief, that treatment has failed, regardless of any technical success in achieving surgical aims.
Earlier investigators of middle meatal antrostomy concentrated on technical success rate, i.e. patency of the surgical opening, stating that
“little is achieved by quoting figures and statistics, as the results depend to a great extent on subjective response of a patient…”
(Lavelle and Harrison, 1971).
More recently, many different questions and scales have been used to investigate and quantify nasal symptoms (Eccles et al, 1988; Lund, 1988; Hardcastle et al, 1988; Hosemann et al, 1989; Toffel et al, 1989; Larsen and Kristensen, 1990; Cooke and Hadley, 1991; Qvarnberg et al, 1992). Most of these scales are simply made up for the purpose of the individual study.
Hoffman et al (1989) specifically studied symptom improvement following sinus surgery. They note the increasing importance of patients’ subjective perception of the efficacy of treatment as a measure of quality assurance. They state that their questionnaire, which covers five symptoms, had built-in checks on internal consistency and reliability, but do not give further details.
Juniper and Guyatt (1991) in Ontario have formally developed and validated a questionnaire to measure the quality of life in hay fever, and this has also been translated and validated in Germany (Neumann et al, 1992) but there is at present no generally accepted validated instrument for measuring subjective nasal symptoms in rhinosinusitis.
The situation is different in other areas of medicine, particularly psychiatry, where there are few objective clinical or laboratory investigations.
In psychiatry the diagnosis itself, as well as the definition of outcome, may depend on measurement of subjective symptoms. Various methods for determining the reliability and validity of symptom scoring have been developed (Powell, 1989) and these are well established in psychiatry (Davidson, Smith and Kudler, 1989; Zimmerman, Black and Coryell, 1989). Attempts have also been made in other fields to validate symptom scoring, including chronic pain (Vlaeyen et al, 1990), asthma (Richards et al, 1988) arthritis (Fries et al, 1982; Clarke, 1990) irritable bowel syndrome (Maxton et al, 1989) and general perception of ill-health (Hughson et al, 1988).
I needed to quantify nasal symptom severity to provide outcome measures for a planned randomized controlled trial of functional endoscopic sinus surgery versus conventional inferior meatal intranasal antrostomy (See Chapter 9).
An estimate of the expected reduction in symptom severity in the control group was required to conduct a power analysis, to decide the number of patients needed for the trial. Therefore it was logical to select a set of questions which had been used before on patients undergoing the control operation, inferior meatal antrostomy. Lund’s questions (1988) were chosen because detailed results were available, albeit in a small subsample of her study (19 patients).
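The kind of power calculation referred to here can be sketched as follows. This is a minimal illustration using the textbook normal approximation for comparing two group means, not the calculation actually performed for the trial; the function name `patients_per_group` and the input values (a 3-point difference, a standard deviation of 5 points) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate number of patients per arm for comparing two means.

    delta: smallest difference in mean score reduction worth detecting
    sd:    assumed common standard deviation of the score reduction
    Normal approximation: n = 2 * ((z_a + z_b) * sd / delta) ** 2
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_b = NormalDist().inv_cdf(power)          # desired power
    return ceil(2 * ((z_a + z_b) * sd / delta) ** 2)


# Hypothetical inputs, chosen only to show the arithmetic.
n_per_group = patients_per_group(delta=3.0, sd=5.0)
print(n_per_group)
```

The required number rises with the square of the ratio sd / delta, which is why a prior estimate of the expected reduction in the control group matters so much to the design.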
Rather than simply using the questions in the trial, a study of the clinical applicability, reliability, and validity of the scale was performed, using some of the techniques described by Powell (1989).
As well as validating the outcome measure for the specific purpose of the randomized controlled trial, I wanted to see whether the questionnaire could have general application in clinical research and audit of the treatment of rhinosinusitis.
The series of 12 questions used by Lund (1988) in her studies of the outcome of surgical treatment of rhinosinusitis by inferior meatal intranasal antrostomy was formed into a self-administered questionnaire (Figure 4.1).
As in Lund’s work, each question had four possible responses on an ordinal scale:
- 0 = none / normal
- 1 = mild
- 2 = moderate
- 3 = severe
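As an illustration of how such responses can be turned into a summed score, the sketch below checks that a completed form has 12 answers, each coded 0 to 3, and adds them up to give a total between 0 and 36. The names `RESPONSES`, `N_ITEMS` and `total_symptom_score` are hypothetical and the example answers are invented.

```python
# Ordinal response codes, as described in the text.
RESPONSES = {0: "none / normal", 1: "mild", 2: "moderate", 3: "severe"}
N_ITEMS = 12  # number of questions on the form


def total_symptom_score(answers):
    """Return the summed score (0 to 36) for one completed questionnaire."""
    if len(answers) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} answers, got {len(answers)}")
    if any(a not in RESPONSES for a in answers):
        raise ValueError("each answer must be coded 0, 1, 2 or 3")
    return sum(answers)


# Invented example: one patient's 12 answers.
example = [3, 2, 2, 0, 1, 2, 1, 1, 2, 0, 1, 0]
print(total_symptom_score(example))
```

Rejecting incomplete forms rather than treating missing answers as zero mirrors the exclusion of questionnaires with missing answers from the analysis.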
To evaluate the applicability, reliability and validity of the nasal symptom questionnaire in the clinical setting, unselected new patients attending general ENT clinics were asked to complete the questionnaire over a two-month period (November and December 1989).
The study took place simultaneously in clinics at University College Hospital, London, Mount Vernon Hospital, Middlesex, and the Royal Hallamshire Hospital, Sheffield.
Patients were given the form by the clinic nurse and asked to fill it in while waiting to be seen.
Patients were instructed to answer the questions on an overall basis, of how much of a problem that symptom was to them, not necessarily how bad it was on that particular day. The nurse was available to help if needed.
On some clinics, the time taken to fill in the questionnaire was recorded by the nurse, to the nearest minute.
The diagnosis was recorded by whichever doctor saw the patient, not necessarily myself.
I later classified the diagnoses as either nasal or non-nasal (Other).
Nasal cases were subdivided into a rhino-sinusitis group, a nasal trauma/deviated septum group, and a miscellaneous group.
Other cases were subdivided into otological, pharyngeal/laryngeal and miscellaneous.
I collected a further group of selected cases of successfully treated rhinosinusitis patients during the same period. Most of these patients had undergone functional endoscopic sinus surgery, some had undergone conventional nasal surgery, and some had medical treatment consisting of intranasal steroid drops. Both patient and investigator agreed that the treatment had been successful.
The SPSS-PC program, Version 3.1, was used to compute statistics. The reliability and internal consistency of the scale were tested by calculating Cronbach’s alpha (Norusis, 1988b), with associated descriptive statistics. The contribution of each item on the questionnaire to the reliability was tested by removing each item one at a time and recalculating alpha.
Cronbach’s standardized item alpha coefficient is a generalised measure of reliability, of which almost all other tests can be regarded as special cases. Alpha is based on internal consistency of the scale. It is calculated from the average inter-item correlation and the number of items in the scale, according to the formula:
$$\alpha = \frac{k\,\bar{r}}{1 + (k - 1)\,\bar{r}}$$
where
k = number of items in scale
r̄ = mean inter-item correlation coefficient
The formula for the standardized item alpha requires the items to be standardized to a standard deviation of 1. If the items have widely differing variances, it is necessary to use the mean covariance between items, divided by the mean variance of the items, instead of the mean correlation in the equation. I calculated alpha using both formulae, and the answer is almost identical – 0.7767 (covariance/variance method) vs 0.7755 (correlation method). Therefore the simpler and more intuitive correlation method is described in detail.
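The two calculations described above can be sketched in a few lines. This is an illustrative reimplementation, not the SPSS routine used in the study: `standardized_alpha` applies the standardized item formula alpha = k r̄ / (1 + (k - 1) r̄), and `covariance_alpha` substitutes the ratio of mean covariance to mean variance for r̄. Both function names and the helper `covariance` are hypothetical; the inputs k = 12 and r̄ = 0.2235 are the values reported in this chapter.

```python
from itertools import combinations
from statistics import mean, variance


def standardized_alpha(k, mean_r):
    """Standardized item alpha from item count and mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def covariance(x, y):
    """Sample covariance between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def covariance_alpha(items):
    """Alpha by the covariance/variance method.

    items: list of k columns, each holding every patient's score on one question.
    """
    k = len(items)
    mean_cov = mean(covariance(x, y) for x, y in combinations(items, 2))
    mean_var = mean(variance(x) for x in items)
    return standardized_alpha(k, mean_cov / mean_var)


print(round(standardized_alpha(12, 0.2235), 4))  # 0.7755
```

When all items have equal variance the two methods give the same answer, which is consistent with the near-identical results found here.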
Alpha behaves as a squared correlation coefficient and ranges from 0 (none) to 1 (perfect). As can be seen from the formula, if the number of items in the scale is large, the inter-item correlations do not have to be so high to obtain high reliability scores. Reliability in this context means the extent to which the total symptom score is likely to give the same result as another similar measurement of nasal symptom severity. If each item on the questionnaire is measuring some part of a related concept (overall nasal symptom severity) then individual items should be correlated with one another to the extent that they are measuring the common entity. The result can be interpreted as the extent to which the scale tested would be expected to correlate with all other possible k-item scales, constructed from a hypothetical universe of questions on the subject of interest. Another interpretation is that α × 100% of the variability in a hypothetical test, composed of all possible questions on the subject of the questionnaire, would be accounted for by the results of the k-item test used.
Following preliminary checks of the effects of age and sex on diagnostic groups and symptom scores, validity was tested by comparing the results of the questionnaire between diagnostic groups.
Each individual symptom score was compared, together with the total symptom score, and all interactions between them, by
- analysis of variance
- two-tailed unpaired t-tests, and
- discriminant function analysis.
The responses to all 12 questions were entered into a model using standardized canonical discriminant functions to predict membership of the diagnostic groups (known in each case).
Two tests were done on the 411 cases in the main study, first using all six diagnostic subgroups, then simplifying to nasal versus non-nasal cases.
A further test was done to compare the rhinosinusitis subgroup from the main study with the selected group of 23 patients who had been successfully treated for rhinosinusitis.
The relative importance of each symptom, and the overall effectiveness of the discriminant functions in correctly predicting diagnostic group membership were tested.
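The idea behind a two-group discriminant function can be sketched in miniature. The example below fits Fisher's linear discriminant to just two symptom scores per patient, so that the pooled covariance matrix can be inverted by hand; the study itself entered all 12 responses and used SPSS, so this is an illustration of the technique rather than a reproduction of the analysis. The function names and the toy data are hypothetical.

```python
from statistics import mean


def fisher_discriminant(group0, group1):
    """Fit a two-group linear discriminant on cases with two features each.

    Returns (weights, threshold); a case scoring above the threshold is
    assigned to group1. Equal prior probabilities are assumed.
    """
    def centroid(g):
        return [mean(c[0] for c in g), mean(c[1] for c in g)]

    def scatter(g, m):
        sxx = sum((c[0] - m[0]) ** 2 for c in g)
        sxy = sum((c[0] - m[0]) * (c[1] - m[1]) for c in g)
        syy = sum((c[1] - m[1]) ** 2 for c in g)
        return sxx, sxy, syy

    m0, m1 = centroid(group0), centroid(group1)
    dof = len(group0) + len(group1) - 2
    # Pooled within-group covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = [(a + b) / dof
                     for a, b in zip(scatter(group0, m0), scatter(group1, m1))]
    det = sxx * syy - sxy ** 2
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # weights = inverse(pooled covariance) * (difference of centroids)
    w = [(syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det]
    # Cut-off halfway between the two projected centroids.
    threshold = (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1])) / 2
    return w, threshold


def classify(case, w, threshold):
    return 1 if w[0] * case[0] + w[1] * case[1] > threshold else 0


# Invented scores (two symptoms, each coded 0 to 3) for two small groups.
non_nasal = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2)]
nasal = [(2, 3), (3, 2), (3, 3), (2, 2), (3, 1)]

w, cut = fisher_discriminant(non_nasal, nasal)
hits = sum(classify(c, w, cut) == 0 for c in non_nasal)
hits += sum(classify(c, w, cut) == 1 for c in nasal)
accuracy = hits / (len(non_nasal) + len(nasal))
print(accuracy)
```

The proportion of cases correctly classified, computed in the last lines, is the same kind of summary figure as the classification rates reported in this chapter. Note that it is measured on the same cases used to fit the function, so it tends to be optimistic.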
448 questionnaires were returned for the main part of the study, of which 411 were analyzed (92%). Twenty-two were excluded because they had been completed by or on behalf of children under 14 years old. This small proportion was inadequate to assess the applicability of the questionnaire in children, but could have contaminated the results for adults. Eleven were excluded because of missing diagnoses and/or missing answers to the questions, and four because of missing data on age or sex. A further 23 cases of successfully treated nasal patients were analyzed separately.
The time taken to fill in the questionnaire was recorded in 47 patients (Table 4.1). No patient took more than 5 minutes, even where an interpreter was required. 85% were completed within 2 minutes.
Cronbach’s standardized item alpha for the 12-item scale was 0.7755, i.e. the summed score is quite reliable as a measure of nasal symptom severity.
Reliability in this context means the extent to which the total symptom score is likely to give the same result as another similar measurement of nasal symptom severity.
Table 4.2 shows that the standard deviations of the individual items on the questionnaire did not differ markedly, therefore the standardized item alpha is applicable to the data. As a further check, alpha was calculated for the scale using the variance-covariance method, and the result was almost identical at 0.7767.
The average inter-item correlation was 0.2235.
The correlation matrix (Table 4.3) shows that the strongest correlation was between nasal obstruction and reduced sense of smell (r = 0.5159) and the least between epistaxis and sore throat (r = 0.0159).
There were no negative correlations.
Groups of more closely related questions can be discerned from the table, such as the linking of facial pain with headaches (r = 0.4778) and feeling generally unwell (r = 0.4558). Postnasal drip and rhinorrhoea tend to go together (r = 0.4298) but postnasal drip is not particularly strongly associated with cough (r = 0.1685).
The item-total analysis (Table 4.4) shows that no item on the scale detracts from its reliability, since alpha is not increased by the removal of any question.
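An item-total check of this kind can be sketched as follows: alpha is recalculated with each question left out in turn, and any item whose removal raises alpha is a candidate for deletion. The sketch uses the raw (variance-based) form of Cronbach's alpha rather than the standardized form, and the three-item toy data are invented so that the third item is deliberately unrelated to the other two. Function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Raw Cronbach's alpha.

    items: list of k columns, each holding every patient's score on one question.
    """
    k = len(items)
    totals = [sum(row) for row in zip(*items)]  # each patient's summed score
    item_var = sum(variance(x) for x in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha recalculated with each item removed in turn (needs k >= 3)."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Invented data for four patients: items A and B agree, item C does not.
toy = [
    [0, 1, 2, 3],  # item A
    [0, 1, 2, 3],  # item B
    [3, 0, 3, 0],  # item C, unrelated to A and B
]
full = cronbach_alpha(toy)
without = alpha_if_item_deleted(toy)
print(full, without)
```

In this toy example removing item C raises alpha sharply, which is the pattern the questionnaire did not show for any of its 12 questions.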
Analysis of variance (Table 4.5) confirms the expected significant differences between patients and between items on the scale. It also shows a highly significant component due to interaction between the questionnaire items (p < 0.001).
The total symptom score did not differ between men and women (Table 4.6), but there were significant differences in the individual items. Men had higher scores for nasal obstruction and poor sense of smell, while women had higher scores for headaches and feeling generally unwell. These effects cancelled one another out so that the total scores were the same.
The differences were not necessarily due to gender per se, since there were significant differences in diagnostic groups between the sexes (Table 4.7). The major sources of these differences are men in the nasal trauma group (“Saturday night fractures”) and women in the pharyngeal / laryngeal group (mainly tonsillitis; full details in Data Appendix 4).
A scatter plot of the 411 cases showed a slight trend toward a decreasing total symptom score with increasing age (Figure 4.2). This was statistically significant, but explains only around 3% of the variation of symptom score. There was also a tendency for age to be negatively associated with being a nasal case. Table 4.9 shows statistically significant differences in diagnostic subgroups according to age (p = .0002). As would be expected, there is an excess of young people in the nasal trauma group, and an excess of older people in the otological group due to presbyacusis.
When age is considered in the rhinosinusitis subgroup (Figure 4.3) the tendency for older patients to have a lower total symptom score is more pronounced.
The highest total symptom scores were in the rhinosinusitis subgroup (Table 4.10). All nasal groups scored higher than non-nasal. The lowest scores – even lower than the non-nasal cases – were in the group of successfully treated rhinosinusitis patients.
Comparing individual symptoms between nasal and non-nasal cases (Table 4.11), scores for blockage, reduced sense of smell, rhinorrhoea, epistaxis, sneezing and postnasal drip were higher in the nasal group. Cough, feeling generally unwell, facial pains, sore throat, headaches and toothache did not differ significantly. The total score was higher in the nasal group.
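A two-group comparison of a single symptom score can be sketched with an unpaired t statistic. The version below assumes the classical pooled-variance (Student) form; the chapter does not state which variant was used, so that choice is an assumption, as are the function name `unpaired_t` and the invented scores. The two-tailed p value would then be read from t tables or a statistics package for the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Invented blockage scores (0 to 3) for two small groups of patients.
nasal_scores = [3, 2, 3, 2, 3, 1]
other_scores = [0, 1, 0, 2, 1, 0]
t, df = unpaired_t(nasal_scores, other_scores)
print(round(t, 2), df)
```

Because the responses are ordinal with only four levels, a t-test treats them as if they were interval data; with samples as large as those in this study that approximation is commonly accepted, but it is worth bearing in mind for small groups.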
All symptom scores except the sense of smell were significantly higher in rhinosinusitis compared with the successfully treated group (Table 4.12).
Discriminant function analysis showed that the questionnaire was able to distinguish nasal from non-nasal cases. An optimized canonical discriminant function based on the observed relationships between symptom scores and group membership correctly classified 80% of cases (Figure 4.4 and table 4.13). When all six diagnostic groups were used, the set of five canonical discriminant functions correctly classified 50% of cases (detailed results not shown).
The questionnaire performed best distinguishing rhinosinusitis from treated rhinosinusitis cases (Figure 4.5 and table 4.14). Overall, 87% of cases, and 96% of successfully treated cases of rhinosinusitis, were correctly classified by a standardized canonical discriminant function based on the relationship between the results of the questionnaire and membership of these two diagnostic groups.
Ease of application
The results show that the nasal symptom questionnaire is easily applied in the clinical setting, most patients being able to fill it in with no or minimal help within 2 minutes. Of the 426 completed questionnaires eligible for analysis (excluding the 22 children’s cases) 411 were completed correctly and in full (96%), most of the errors and omissions being by the doctors rather than the patients!
The summed score is quite reliable as a measure of nasal symptom severity. According to Jenkinson et al (1993) an alpha result of over 0.5 is acceptable, and over 0.8 ideal. Nunnally (1978) recommends that alpha should exceed 0.7. Richards et al (1988) regarded their alpha results on a series of scales for asthma, ranging from 0.46 to 0.90, as acceptable to good.
The result obtained in this study of 0.78 means that 78% of the variability in a hypothetical test, composed of all possible questions on nasal symptom severity, would be accounted for by the results of this 12-item test.
- The item-total analysis (Table 4.4) shows that no item on the scale detracts from its reliability, since alpha is not increased by the removal of any question.
- The relatively low individual inter-item correlations (mean r = 0.2235, range 0.0159 to 0.5159) show that each question is measuring a different though related aspect of the total.
- There are no negative correlations, again showing that no item on the questionnaire is detracting from its reliability.
- If one wished to improve the reliability of this questionnaire, it would be necessary to add more items to it.
- Overall there is a reasonable compromise between ease of administration and reliability.
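The trade-off between length and reliability can be made concrete with the standardized alpha formula, alpha = k r̄ / (1 + (k - 1) r̄). The sketch below asks how many items would be needed to reach a target alpha, on the assumption (which may not hold in practice) that any added questions correlate with the existing ones to the same average degree, r̄ = 0.2235. The function names are hypothetical.

```python
def standardized_alpha(k, mean_r):
    """Standardized item alpha from item count and mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def items_needed(target, mean_r):
    """Smallest number of items giving at least the target alpha."""
    if not (0 < mean_r < 1 and 0 < target < 1):
        raise ValueError("target and mean_r must lie strictly between 0 and 1")
    k = 2
    while standardized_alpha(k, mean_r) < target:
        k += 1
    return k


print(items_needed(0.8, 0.2235))  # 14
```

On that assumption, two further questions would lift alpha from about 0.78 to just over 0.80, a small gain to set against a longer form.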
Effects of age and sex
Overall, total symptom score did not differ between men and women, but there were significant differences in the individual items (Table 4.6).
- Men scored higher on nasal obstruction and poor sense of smell.
- Women scored higher on headaches and feeling generally unwell.
- These effects cancelled out so that the total scores were the same.
The differences were not necessarily due to sex per se, since there were significant differences in diagnostic groups between the sexes (Table 4.7).
However, within the diagnostic subgroup of most interest, i.e. rhinosinusitis, women did have a significantly higher total symptom score (Table 4.8). This was principally due to their higher scores on
- feeling generally unwell
- rhinorrhoea and
It is therefore possible that the questionnaire may perform differently in men and women suffering from rhinosinusitis, and sex should be examined as a possible confounding variable in any studies using the questionnaire.
A scatter plot of the 411 cases showed a slight trend toward a decreasing total symptom score with increasing age (Figure 4.2). Pearson’s correlation coefficient r between age and total symptom score was minus 0.171, r squared 0.029, p = 0.0005.
Although this is, statistically, a highly significant result because of the large number of data points, the relationship with age explains only around 3% of the variation of symptom score. There is also a tendency for age to be negatively associated with being a nasal case, which could explain most of this weak relationship.
Table 4.9 shows statistically significant differences in diagnostic subgroups according to age (p = .0002). As would be expected, there is an excess of young people in the nasal trauma group, and excess older people in the otological group due to presbyacusis. However, when age is considered in the rhinosinusitis subgroup (Figure 4.3) the tendency for older patients to have a lower total symptom score is more pronounced (r = minus 0.261, r squared = 0.0679, p = 0.0057).
The effect of age accounts for around 7% of the variation of total symptom scores in rhinosinusitis, therefore age should also be considered a possible confounding factor in studies using the questionnaire.
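The link between r, the proportion of variation explained, and statistical significance can be checked with a short calculation. The sketch below squares each reported correlation and converts it to a t statistic on n - 2 degrees of freedom, using t = r √(n - 2) / √(1 - r²). The sample sizes are taken from this chapter (411 cases overall, 111 in the rhinosinusitis subgroup); the function name is hypothetical, and the p values themselves would be read from t tables.

```python
from math import sqrt


def r_squared_and_t(r, n):
    """Proportion of variance explained and t statistic for a Pearson r."""
    r2 = r * r
    t = r * sqrt(n - 2) / sqrt(1 - r2)
    return r2, t


for label, r, n in [("all cases", -0.171, 411), ("rhinosinusitis", -0.261, 111)]:
    r2, t = r_squared_and_t(r, n)
    print(f"{label}: r squared = {r2:.3f}, t = {t:.2f} on {n - 2} df")
```

Squaring the rounded value r = -0.261 gives 0.068 rather than the reported 0.0679, a discrepancy small enough to be explained by rounding of r. The calculation also shows why a weak correlation can still be highly significant: the t statistic grows with the square root of the sample size.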
If, in a comparative study, one group contained an excess of young women and another an excess of old men, the group of young women would be expected to score higher on the questionnaire.
I have shown so far that the questionnaire is reliable, subject to the caveats on age and sex matching. Validity is a different matter (Powell, 1989). The questionnaire is valid only if it really measures what it is supposed to measure, i.e. nasal symptom severity.
The first test of validity is simply to look at the questions and consider them at face value, to see whether they make sense. The answer is yes. All 12 symptoms can occur in patients with nasal and sinus conditions, and the ordinal scale (None, Mild, Moderate, Severe) seems reasonable. This questionnaire is obviously about nasal symptom severity. In most ENT studies involving questionnaires, this “face validity” is the only kind of validation that takes place.
The next test of validity is to consider whether the questions cover all aspects of the concept being measured. Here it could be argued that more questions should be asked. We are looking at nasal symptoms and their severity. Since we have asked about a related symptom, sore throat, why not earache or deafness, eustachian tube dysfunction being common in rhinosinusitis (Knight et al, 1992)?
Should more detail be asked? Why not separate questions for the right and left side on nasal obstruction, for instance? On the severity axis, it could be argued that more points should be allowed (Very Severe, Intolerable), or that the whole set of questions should be repeated for “Average” and “Worst Ever” answers, perhaps even repeated three times to include “Right Now”.
These points may or may not be important depending on the use to which the scale is to be put. It must be borne in mind that a more complex and time consuming questionnaire is less likely to be of general use.
There is no statistical test for “Content Validity”; it is a question of informed opinion. In my opinion, these questions do cover the important areas of symptom severity likely to be encountered in rhinosinusitis.
A more formal method of establishing content validity (Fries et al, 1982) is to start out with a very large number of questions, culled from other studies of the problem, expert opinion, and unstructured interviews with patients. These are tested on patients, and by techniques such as cluster analysis (Norusis, 1988b) and repeated application of the reliability tests used in this study, independent dimensions are discerned and redundant questions can be eliminated progressively.
I have not done such an exercise because the prime purpose of this questionnaire was to act as an outcome measure for the randomized controlled trial. Since a power analysis was essential to plan the trial, I needed a ready-made set of questions, and chose those which had already been used by Lund (1988) in a study of the operation which the control group would be receiving. I then proceeded to test these questions to check their reliability and validity for the purpose.
The next test of validity recommended by Powell (1989) is “Criterion-related validity”. This means testing the questionnaire against a measure already known to be valid. Unfortunately no such “gold standard” is available.
The final and most difficult test of validity is “Construct validity”.
- Nasal symptom severity is not a simple physical property.
- It is an artificial “construct”, made up of numerous individual symptoms and their interactions with the subjective opinion of the patient.
- Does the questionnaire result provide a genuine representation of this construct?
- Is this scale really measuring nasal symptom severity?
Again, there is no simple test to establish construct validity. To some extent it only becomes established over time, with repeated use of the scale in many different studies. There is some good evidence for construct validity available from this study. This includes:
- Content validity (the content of the questionnaire pertains to the construct)
- Demonstrated internal consistency of the scale (if it could be broken down into two or more unrelated groups of items, it could not really be measuring a single construct)
- Diagnostic group differences.
If the nasal symptom scores really are measuring nasal symptoms, it would be reasonable to expect higher scores in patients suffering from nasal conditions.
Table 4.10 shows that the highest total symptom scores are in the rhinosinusitis subgroup. The lowest scores – even lower than the non-nasal cases – are seen in the group of successfully treated rhinosinusitis patients. Therefore the total symptom score does appear to be a valid indicator of severity of symptoms in rhinosinusitis.
Looking at the 12 individual items in the questionnaire (Table 4.11), some are more specific than others in discriminating nasal from other diagnostic groups. The principal symptoms which are higher in nasal cases are
- blockage
- reduced sense of smell
- rhinorrhoea
- epistaxis
- sneezing
- postnasal drip
The other symptoms – all of which could be considered “secondary” – include
- cough
- sore throat
- toothache
- facial pains and headaches
- feeling generally unwell
These do not differ significantly between nasal and non-nasal cases.
Since I have only looked at patients attending ENT clinics, it might be expected that feeling “generally unwell” could be elevated in almost any group, although interestingly patients with rhinosinusitis scored highest on this question. Headaches are unlikely to be confined to patients with sinusitis; they are more commonly caused by other conditions, and indeed headache is a poor discriminant between nasal and non-nasal cases (Table 4.11).
However, when rhinosinusitis patients are compared with patients successfully treated for rhinosinusitis – which will be the principal use for the questionnaire – the opposite effect is seen (Table 4.12). All symptoms are significantly improved except for sense of smell, and the “secondary” symptoms of toothache, sore throat, cough, headache and facial pain are reduced to a greater extent than most of the primary ones.
It is therefore valid to include questions on these symptoms in a questionnaire designed to measure reduction in symptom severity following treatment for rhinosinusitis, even though they may not be very good in a diagnostic sense.
There could be certain patterns of score which categorise certain diagnostic groups. The discriminant analysis, which makes full use of all information including associations between questionnaire items, shows that the questionnaire can distinguish nasal from non-nasal cases.
If no information is known other than the 12 symptom scores, an optimized discriminant function based on the observed relationships between symptom scores and group membership will correctly classify 80% of cases (Figure 4.4 and table 4.13).
If all six diagnostic groups are used, the set of five canonical discriminant functions will still correctly classify 50% of cases.
This is quite impressive, considering that the questionnaire is not primarily designed as a diagnostic categorising tool, but a simple numerical indicator of severity of symptoms.
- Its principal use is to act as a basis for comparison before and after treatment.
- When applied to the two groups between which it is designed to discriminate, i.e. those suffering from rhinosinusitis and those successfully treated for rhinosinusitis, it performs best of all (Figure 4.5 and table 4.14).
- Overall, 87% of cases, and 96% of successfully treated cases of rhinosinusitis, were correctly classified by a discriminant function based on the results of the questionnaire.
Much of the variability in these results arises between patients (see analysis of variance, Table 4.5).
- In the context of clinical research and audit, we will not usually be making direct comparisons between patients.
- An individual who rates his symptom “severe” is not necessarily worse than another who rates it “moderate”.
- What we will be comparing is before and after scores in the same patient, thereby eliminating or at least reducing that part of the variation due to individual differences in self-rating of symptom severity.
- We will want to see whether this difference – the improvement – is affected by any treatment factors.
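The within-patient comparison described in the points above can be sketched as a simple difference of total scores. Each patient acts as his or her own control, so only the change is carried forward to the analysis. The function name `improvements` and the before and after totals are invented for illustration.

```python
from statistics import mean


def improvements(before, after):
    """Per-patient reduction in total symptom score (positive = better)."""
    if len(before) != len(after):
        raise ValueError("before and after must list the same patients")
    return [b - a for b, a in zip(before, after)]


# Invented total scores (0 to 36) for four patients, in the same order.
before = [18, 22, 15, 27]
after = [6, 10, 12, 9]

gain = improvements(before, after)
print(gain, mean(gain))
```

The list of per-patient gains, rather than the raw scores, is then what would be compared between treatment groups.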
Uses for the questionnaire
This questionnaire is primarily designed as an outcome measure, to be applied before and after treatment of rhinosinusitis.
It could be argued that the questionnaire is unnecessarily complicated for this purpose. Why not simply ask the patients if they are satisfied with their treatment? This is easy to do, but difficult to interpret, because patient satisfaction is a complicated construct.
Fitzpatrick (1991) identified multiple dimensions of patient satisfaction including
- overall quality
- attention to psychosocial problems
Patients may hold distinct and independent views on each of these aspects of their care. The four major axes are
- the doctor’s conduct
- availability of care
- continuity and convenience
- financial accessibility
These issues are obviously important, especially in the context of audit and quality improvement, but tend to muddy the waters when assessing clinical outcome.
A doctor with a good “bedside manner” can have many satisfied patients even if the treatment given is of no medical value. A less socially gifted colleague prescribing better treatment may fare worse on patient satisfaction scores. The “three A’s” needed for successful private practice are said to be – in order of importance –
We are only interested in measuring the effectiveness of one treatment compared with another, and do not want to confuse the picture by including all these other variables.
- A specific validated self-administered questionnaire, provided it is carefully developed and well piloted, is the best option for clinical research use.
- Good audit studies should also focus on the aims of treatment and whether they are being achieved.
- This questionnaire is very good at identifying the principal symptoms, and therefore helps focus attention onto these.
- I have found in practice that it helps set realistic aims and also helps identify unrealistic aspirations, which might not otherwise be brought into the open.
- The questionnaire is simple and quick enough for practical use in routine clinical work. It can actually speed up the consultation, since the patient fills it in while waiting to see the doctor.
- By identifying the principal symptoms beforehand, these can be gone into in greater detail immediately, and the consultation can be more productive.
The nasal symptom questionnaire is a convenient, reliable and valid method for assessing nasal symptom severity.
Its principal use is to act as an outcome measure in rhinosinusitis, by comparing scores before and after treatment.
There is a tendency for women to score higher than men, and young people to score higher than old, and this should be taken into account in the design of studies using the questionnaire.