Methods & Results
The validity of the MAAS-MI General was examined by correlating MAAS measurements of five types of medical interviewing skills with seven distinct measures of medical competence.
We examined the validity of the MAAS History-Taking and Advice Checklist by correlating the MAAS scores with other measurements of medical competence.
We assessed physicians’ medical knowledge, interpersonal skills, care and concern for the patient and problem-solving skills as well as their medical interviewing skills.
We found that measurements of medical interviewing skills converge with measurements of interpersonal skills, care and concern, and problem-solving skills insofar as these concern information exchange.
In other words, measures of medical competency support the validity of the MAAS-Medical Interview.
Crijnen, A. A. M., Post, G. J., Kraan, H. F., van der Vleuten, C., & Zuidweg, J. (1987). Interviewing skills and medical competence. In H. F. Kraan & A. A. M. Crijnen (Eds.), The Maastricht History-taking and Advice Checklist – studies of instrumental utility (pp. 233–248). Lunbeck, Amsterdam.
The validity of the Maastricht History-taking and Advice Checklist – General is examined by correlating MAAS scores with other measurements of medical competence. Physicians’ medical knowledge, interpersonal skills, care and concern for the patient and problem-solving skills are assessed in addition to their medical interviewing skills. Measurements of medical interviewing skills are confirmed by measurements of interpersonal skills, by measurements of care and concern and by measurements of problem-solving skills insofar as these concern information exchange.
Validity coefficients of medical competency support the validity of MAAS-Medical Interview
The study furthermore reveals that medical interviewing skills are clearly distinguished from medical knowledge and from the formulation of a treatment plan. The validity coefficients therefore support the validity of the Maastricht History-taking and Advice Checklist.
A plea has been made recently for the study of the validity of measurements of medical competence (Katz, 1982; Gonella, 1985). Some have said that greater attention should be given to the valid assessment of physicians’ actual job performance, whereas others have observed that medical competency cannot be conceptualized as a single variable, since knowledge, problem-solving skills, interviewing skills and attitudes should be distinguished from each other.
In medical competency, knowledge, problem-solving skills, interviewing skills, attitudes and examination skills can be distinguished
Validity studies of methods of measurement of medical competency should be designed according to generally accepted scientific criteria. On reviewing the literature, the following criteria were identified (Cronbach & Meehl, 1955; Thorndike, 1982).
The first criterion requires a clear definition of and theory for the competency under study, in order to ascertain the match between the conceptualization of the competence and its empirical measures and to distinguish the competence under study from other, distinct competencies: e.g., history-taking skills should be defined clearly in order to achieve agreement among researchers about the content of a test designed to measure history-taking skills. Moreover, the definition of History-taking Skills should sharpen the distinction from other competences, such as Interpersonal Skills or Medical Knowledge.
The second criterion requires that validity is empirically underscored by different methods of measurement of the same competence. Evidence from different sources gathered in different ways should all point to the same competence domain: e.g., different methods of measurement of a physician’s interpersonal skills are expected to correlate higher with each other than with measurements of his other competencies.
The third criterion requires that methods of measurement of the competence under study are differentiated empirically from measurements of other competences (Cronbach & Meehl, 1955; Kerlinger, 1973). Measurements ought to show the difference between theoretically discernible medical competences: e.g., the skills to Explore the Reasons for Encounter are expected to correlate weakly with Medical Knowledge or with the quality of the physician’s Treatment Plan.
These three criteria constitute a scientific model of how the validity of measurements of medical competency ought to be examined. Three validity studies of measurements of medical interviewing skills were scrutinized by applying the criteria (Stillman et al, 1977; Brockway, 1978; Swanson et al, 1981).
In the first study, Stillman et al (1977) examined the validity of the Arizona Clinical Interview Rating Scale (ACIR); the 16 interviewing skills measured by the ACIR scale are grouped into six major subsections which are ultimately treated as if they all contribute to a single competence domain. Since the items pertain to different skills, the correctness of this procedure can be questioned. No methods intended to measure the same competence domain were taken into account. However, the scores on the ACIR scale were correlated with the Medical College Admissions Test, which is supposed to measure a different competence. This study therefore fulfils only one of the three aforementioned criteria.
The second study under consideration is by Brockway (1978), whose analysis of the validity of her interview rating scale shows some shortcomings: the content of the two subscales, Relationship Skills and Problem-solving Skills respectively, is rather heterogeneous and can easily be interchanged, and the differences between the subscales are not well defined. No measurements indicating a similar competence domain were taken into account. The study, however, fulfilled one of the three criteria, because the rating scale was compared with two different measurements of medical competence, namely data collection and problem identification.
The third study under consideration is by Swanson et al (1981), who compared the validity of three measurements of medical interviewing skills, the ACIR-scale, an Interaction Analysis-rating, and a History and Physical Exam checklist. The heterogeneity of the measures hindered the researchers in adequately interpreting the matrix of correlations between the methods. They concluded that, in essence, no evidence for construct validity through comparison between measurements had been obtained.
By reinterpreting the correlation matrix, it is possible to arrive at a different conclusion. When all measurements are classified as measurements of interpersonal skills, communicative skills or physical examination skills, each of the three measurements of interpersonal skills appears to correlate significantly with the others: this finding supports validity. Moreover, measurements of interpersonal skills can be distinguished from communicative skills and physical examination skills. The same is not true for measurements of communicative skills. The study fulfilled two of the three criteria, because measurements with the same and with a distinct meaning were applied simultaneously.
The validity of the Maastricht History-taking and Advice Checklist – General is examined here by taking into account the three criteria mentioned previously.
This preliminary study was conducted before the reliability and scalability of the MAAS-G were examined thoroughly by means of the procedures described in Why the Medical Interview? All items are thus taken into account, as is the scale Basic Interviewing Skills, which had not yet been divided into Interpersonal Skills and Communicative Skills.
The validity of the MAAS-MI General was examined by correlating MAAS measurements of five types of medical interviewing skills with seven distinct measures of medical competence: medical knowledge, interpersonal skills, care and concern during the physical examination, and four measures of medical problem-solving (subjective information, objective information, assessment of diagnosis and treatment plan).
The delineation of constituents of medical competence was derived largely from the work of Fabb and Marshall (1983).
Simulated Patients
Simulated patients, lay-people staging a medical complaint, were used because they provide the opportunity for assessing medical interviewing skills as well as physical examination skills when real patients cannot be used (Stillman, 1983). In this study, the simulated patients were chosen from a large group of simulated patients trained by the Skills Laboratory at Maastricht Medical School to simulate complaints in the undergraduate medical curriculum.
The two simulated patients presented, respectively, complaints of fatigue and dyspnoea on exertion, and low back pain without radiation. The correlations between all measurements are expected to support either the similar meaning or the distinct character of the MAAS-G measurements of medical interviewing skills and, as a secondary goal, to highlight the interrelations between interviewing skills and other domains of medical competence.
All 45 physicians who graduated in the summer of 1982 from Maastricht Medical School were asked to participate in this study: 28 decided to take part. The participating physicians did not differ significantly from the total group of graduated physicians in terms of age, sex distribution or scores on a medical knowledge test. The mean age of these 10 women and 18 men was 26. All subjects had gone through the 6 year problem-based medical curriculum, part of which is a continuous teaching program of medical interviewing skills with use of simulated patients under expert supervision. The study was carried out 2-3 months after graduation with the original goal of following up on physicians’ competence after medical school (Post et al, 1985). We used this opportunity to validate the MAAS-G.
The MAAS-MI General is a 68-item observation instrument for the assessment of medical interviewing skills. Expert observers view videotapes of (simulated) medical interviews and rate these interviews on the items. Items are grouped into five scales measuring distinct types of interviewing skills.
The first scale, Exploring Reasons for Encounter, measures the physician’s ability to clarify the patient’s complaint, to explore the motives and expectations in the pre-patient phase leading to the visit and to obtain information about the patient’s causal attributions. It measures the patient-centered part of the medical interview.
The second scale, History-taking, measures skills which enable the physician to generate hypotheses about the nature of the patient’s complaint, to test these hypotheses and to describe the complaint in medical explanatory terms. It measures the collection of present and past medical data.
The third scale, Presenting Solutions, measures the quality of information exchange on diagnosis, aetiology, prognosis, treatment and the negotiation between physician and patient about the treatment plan.
The fourth scale, Structuring, measures the physician’s skill in opening, closing and phasing the interview.
The fifth scale, Basic Interviewing Skills, measures the ability to enhance effective information exchange and to establish rapport with the patient.
The items in the MAAS-G refer either to the content or to the process of medical interviewing during initial consultations. They are described behaviorally and have to be scored by skilled observers. Items and criteria for scoring are defined in the MAAS-G Items & Manual, available in Dutch and English (Crijnen et al, 1987). Items are described in behavioral terms to enhance both the reliability and the practical application of the MAAS-G in educational situations. The items in the first four scales are scored on a two-point scale (behavior is present or absent), whereas items in the fifth scale are scored on a three-point rating scale (positive, indifferent, negative). In this study, items were rated by two skilled observers.
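As a minimal sketch of how such item ratings could be aggregated into scale scores, the snippet below averages the two observers' ratings and summarizes a scale as the mean over its items; the averaging rule, item values and names are illustrative assumptions, not the scoring procedure prescribed in the MAAS-G manual.

```python
# Hypothetical sketch (not the MAAS-G manual's prescribed procedure):
# items in the first four scales are scored 0/1 (absent/present), items in
# Basic Interviewing Skills -1/0/+1; two observers rate every item.
from statistics import mean

def scale_score(ratings_obs1, ratings_obs2):
    """Average the two observers' ratings per item and summarize the scale
    as the mean over its items, so scales with different numbers of items
    stay comparable."""
    per_item = [(a + b) / 2 for a, b in zip(ratings_obs1, ratings_obs2)]
    return mean(per_item)

# Example with five made-up dichotomous items from one scale.
print(scale_score([1, 0, 1, 1, 0], [1, 1, 1, 0, 0]))  # 0.6
```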
Medical knowledge was assessed by means of the Medical Knowledge Progress Test (Verwijnen et al, 1982). The knowledge test was part of the examination system at Maastricht Medical School and consisted of approximately 250 true-false statements pertaining to medical knowledge. The knowledge score was obtained by counting the number of correct responses. The knowledge test was administered to all students at Maastricht Medical School four times a year. Since the present study was conducted only 2-3 months after graduation, the total correct scores of the four tests administered during the physicians’ final year in medical school were included in this study as the measure of medical knowledge.
The quality of the physician’s interpersonal skills was rated by expert observers by means of a 10-item instrument pertaining to the physician’s attention to the patient and warmth in the communication. The items were scored by general practitioners on Likert-type, 5-point scales after observing the videotapes of physicians interviewing simulated patients. Each interview was observed by three general practitioners, randomly chosen for each subject out of a pool of 10 general practitioners who served as expert raters in this study. The interpersonal skills score was obtained by averaging the scores of these three general practitioners for each interview.
Care and concern during the physical examination was measured by means of a four-item instrument, focusing upon the physician’s care in reducing the patient’s anxiety and his efficiency during the physical examination. The items were rated on Likert-type, 5-point scales by the same experts who rated the interpersonal skills. The care and concern score was obtained by averaging the scores of these three general practitioners for each physical examination.
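A small illustrative sketch of this aggregation, for both the interpersonal-skills and the care-and-concern scores: each physician's score is simply the mean of the Likert ratings given by the three randomly chosen general practitioners. The numbers and function name below are made up.

```python
# Illustrative sketch: each physician's interpersonal-skills (10 items) and
# care-and-concern (4 items) scores are the mean of the Likert ratings given
# by the three randomly chosen general practitioners. Numbers are invented.
from statistics import mean

def expert_score(ratings_per_rater):
    """ratings_per_rater: one list of 5-point item ratings per rater."""
    return mean(mean(items) for items in ratings_per_rater)

care_and_concern = expert_score([
    [4, 5, 3, 4],   # rater 1
    [3, 4, 4, 4],   # rater 2
    [4, 4, 5, 4],   # rater 3
])
print(round(care_and_concern, 2))  # 4.0
```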
Medical problem-solving skills were assessed by a paper-and-pencil test, called Summative Evaluation of Initial Medical Problem-solving (SIMP) (de Graaff et al, in press). The SIMP requires physicians to read a short case-vignette and to write down narrative answers to four open-ended questions which reflect the process of medical problem-solving: the subjective information and the objective information to be collected, the assessment of the diagnosis, and the treatment plan.
Physicians’ responses to each question were compared to criterion-answers obtained from a group of experienced general practitioners who answered the questions for each case-vignette themselves and attained agreement about the correct answers. For example, the Subjective Information-score was the number of matches between the physician’s subjective information narrative and the experts’ preset criteria.
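The matching step can be pictured with the following sketch, in which a narrative answer is assumed to have been reduced to discrete information items that are compared against the experts' criterion list; the item wordings and the function name are hypothetical, not taken from the SIMP.

```python
# Hypothetical sketch of the matching step: the physician's narrative answer
# is reduced to discrete information items and the score is the number of
# matches with the experts' preset criterion answers. Item wordings invented.
def simp_score(physician_items, criterion_items):
    """Count how many criterion items the physician mentioned."""
    mentioned = {item.strip().lower() for item in physician_items}
    criteria = {item.strip().lower() for item in criterion_items}
    return len(mentioned & criteria)

criteria = ["duration of fatigue", "dyspnoea on exertion", "weight change", "sleep pattern"]
answer = ["Dyspnoea on exertion", "duration of fatigue", "appetite"]
print(simp_score(answer, criteria))  # 2
```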
All 28 physicians filled in six SIMPs within one hour. They were then randomly assigned to one of the two simulated patients, yielding two subgroups of 14 physicians. Physicians were asked to behave as if they had taken charge of a colleague’s practice. They were expected to interview the patient, to conduct a physical examination and to present a treatment plan to the patient. The available time was not to exceed 45 minutes. Interview and physical examination were videotaped.
These videotapes were observed independently by a general practitioner and a fourth-year medical student, who both rated each interview on all the items of the MAAS-G. Three general practitioners, randomly chosen for each subject, independently viewed the videotapes and rated them on the interpersonal-skills variable and the care-and-concern variable.
Finally, the physicians’ scores on four medical knowledge progress tests administered during their final year of medical school were included in this study.
Table 1 shows the reliability of the instruments used in this study.
Since subjects were assigned to one of the two simulated patients, two subgroups were formed. Mean scores on none of the criterion measures differed significantly between the groups. However, scores on three scales of the MAAS-MI G, namely Exploring Reasons for Encounter, Presenting Solutions and Structuring, showed significant differences between the two subgroups (t = -2.66, p ≤ .05; t = -3.67, p ≤ .01; t = -1.75, p ≤ .05; df = 26; two-tailed), which reflects an influence of the cases on interviewing skills.
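For illustration, the subgroup comparison amounts to an independent two-sample t-test with 26 degrees of freedom (two groups of 14 physicians); the sketch below uses made-up scores, not the study data, and assumes SciPy is available.

```python
# Illustrative only: the subgroup comparison is an independent two-sample
# t-test (df = 14 + 14 - 2 = 26, two-tailed) on a MAAS-MI G scale score.
# The scores below are randomly generated, not the study data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
case1 = rng.normal(loc=0.55, scale=0.10, size=14)   # e.g. Presenting Solutions, case 1
case2 = rng.normal(loc=0.68, scale=0.10, size=14)   # same scale, case 2

t, p = stats.ttest_ind(case1, case2)                # two-sided by default
print(f"t = {t:.2f}, p = {p:.3f}, df = {len(case1) + len(case2) - 2}")
```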
Because of the significant influence of the cases on the MAAS-MI G measurements, variance due to cases was partialled out of the correlations between the five MAAS-MI G measurements and the seven other measurements of medical competence. Since two physicians exceeded the available time to answer the six SIMPs, their scores were omitted from the analyses. The number of complete cases on which the correlation matrix is based is thus limited to 26. Moreover, the correlation coefficients have been corrected for attenuation in the criterion variables by taking the internal consistency as an indication of reliability. Thus, validity is expressed as if the coefficients were based on completely reliable criterion measures.
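A minimal sketch of the two adjustments, assuming the case is coded as a single control variable whose variance is partialled out and that the attenuation correction divides each coefficient by the square root of the criterion's internal consistency; all numbers are invented for illustration.

```python
# Minimal sketch of the two adjustments, with invented numbers:
# (1) partial out case variance by controlling for a variable z coding the
#     simulated patient, and (2) correct for attenuation in the criterion by
#     dividing by the square root of its internal consistency.
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def disattenuate(r, criterion_reliability):
    """Correct r for unreliability of the criterion measure only."""
    return r / math.sqrt(criterion_reliability)

r_xy, r_xz, r_yz = 0.45, 0.30, 0.10   # hypothetical zero-order correlations
alpha_criterion = 0.80                # hypothetical internal consistency

r_p = partial_r(r_xy, r_xz, r_yz)
print(round(r_p, 2), round(disattenuate(r_p, alpha_criterion), 2))  # 0.44 0.49
```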
Validity coefficients between the five types of medical interviewing skills and the seven other measurements of medical competency are shown in Table 2.
The scale Exploring Reasons for Encounter is strongly correlated with the measurement of interpersonal skills, subjective information and assessment of diagnosis. The scale is moderately correlated with care and concern and objective information.
The scale History-taking correlates strongly with subjective information and objective information, and is moderately correlated with interpersonal skills and assessment of diagnosis.
The scale Presenting Solutions correlates moderately with interpersonal skills, care and concern, and subjective information.
The scale Structuring correlates strongly with care and concern and moderately with interpersonal skills and subjective information.
The scale Basic Interviewing Skills is strongly correlated with measurements of interpersonal skills and subjective information. It correlates moderately with care and concern, objective information and assessment of diagnosis.
In general, the validity coefficients between the MAAS-MI measurements of medical interviewing skills and the seven other measurements of medical competence support the validity of the MAAS-G, although some unexpected dissonances were found.
Measures of medical competency support the validity of MAAS-MI
The scale Exploring Reasons for Encounter converges, as expected, with ratings of interpersonal skills and subjective information. The low correlations with medical knowledge and treatment plan, which have also been found in other studies (Stillman et al, 1977; Brockway, 1978), underscore the distinct character of these competence domains. The unexpectedly high validity coefficient with assessment of diagnosis may be explained by the necessity to ask patient-centered questions in order to include in the diagnosis issues defined by the patient as a problem.
Patient-centered information needs to be included in diagnosis and treatment plan
This is especially true for cases with a combination of somatic and psychological problems, in which the patient’s concerns and real-life circumstances have to be included in the assessment of the diagnosis (Mishler, 1982). The moderate correlation with the care-and-concern variable supports the validity of the scale Exploring Reasons for Encounter, because an attitude of caring and reassurance during the physical examination reflects a human dimension in the physician’s approach to the patient, which is also reflected in the skills to explore the reasons for encounter.
The scale History-taking, which measures the collection of present and past medical data, is highly correlated with measurements of subjective and objective information during medical problem-solving. Medical problem-solving is seen as an interplay of several competences, of which the search for additional data by means of history-taking is considered to be the communicative aspect (Neufeld et al, 1981). Since history-taking skills are driven by the process of medical problem-solving, the strong correlations underscore the validity of the MAAS-MI G scales.
Medical problem-solving drives the process of History-taking
The low correlations of history-taking with medical knowledge and treatment plan support the distinct character of these measures of medical competence. The moderate correlation between History-taking and assessment of diagnosis suggests that physicians who collect more data have a better chance of establishing an accurate diagnosis. Although this finding is supported by Brockway (1978), other studies reveal that the number of questions asked during history-taking is not unequivocally related to the quality of diagnosis (Kassirer et al, 1978). Specialists collect less information and mention the correct diagnosis earlier in comparison with non-specialists who often revert to a general review of organ systems.
The scale Presenting Solutions does not correlate strongly with any of the other competences. The validity coefficients with measurements of interpersonal skills, care and concern as well as with subjective information are moderate. The strength of these correlations indicates the continuous nature of information-exchange and the human factor of patient-centeredness and reassurance, which are also constituents of the scale Presenting Solutions. The low correlation between this scale and treatment plan was expected because the MAAS-MI G-scale deals only with the process of information-exchange and negotiation and not with the content of the exchanged information, the treatment plan itself. This argument also holds for the low validity coefficients with medical knowledge, objective information and assessment of diagnosis, which underscores once more the distinct character of the scale Presenting Solutions.
The scale Structuring is strongly correlated with the care-and-concern variable, which measures the physician’s care in reducing the patient’s anxiety during the physical examination. Items in this MAAS-MI G scale measure the quality with which the physician structures the interview into natural segments, enabling the patient to voice his concerns and to understand the goal of certain interview behavior.
Structuring the interview is an important tool to show care and concern and reduce anxiety in your patient
Although not yet studied, it is conceivable that interviewing skills which structure the interview will reduce the patient’s anxiety. By analogy with the reassuring effects of introducing and explaining the procedures during the physical examination, similar behavior by the physician during the interview may have similar effects. The low correlations with medical knowledge, objective information and treatment plan indicate a divergence from knowledge and problem-solving skills, as expected.
The scale Basic Interviewing Skills converges with measurements of interpersonal skills and of subjective information during medical problem-solving. This agrees with the underlying aim of the scale, which is to measure a physician’s ability to establish optimal rapport with the patient and to induce an effective exchange of information. The same holds for the moderate correlation with care and concern. The moderate correlations with objective information and assessment of diagnosis might be explained by the statement of DiMatteo and DiNicola (1982) that a physician’s competence is likely to involve scientific and technical ability translated into practice through both interpersonal skills and the art of medicine. The low validity coefficients with medical knowledge and treatment plan underscore the distinct character of the pertinent competences.
Methodologically, the present study posed several problems.
Firstly, the reliability indices of the MAAS-MI G were unexpectedly low to moderate: in prior studies, higher inter-observer reliability had been obtained. Further exploration by means of generalizability studies revealed that the coefficients differed markedly between the two patients (.62 and .37, respectively). This difference in reliability was attributed at least partly to the controlling communication style of one of the simulated patients. Since the MAAS-MI G items are mainly directed at the physician’s interviewing skills, observers easily make errors when the patient takes the initiative so intrusively; the item definitions and criteria for scoring cannot deal appropriately with this situation.
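As a simplified illustration (not the generalizability analysis actually used in the study), inter-observer agreement can be inspected separately for the two simulated patients by correlating the two observers' scale scores over the physicians in each subgroup; the scores below are invented.

```python
# Simplified illustration (not the generalizability analysis used in the
# study): inspect inter-observer agreement separately per simulated patient
# by correlating the two observers' scale scores over the physicians in that
# subgroup. All scores below are invented.
import numpy as np

def interobserver_r(obs1_scores, obs2_scores):
    return np.corrcoef(obs1_scores, obs2_scores)[0, 1]

case1_obs1 = [0.61, 0.55, 0.70, 0.48, 0.66, 0.59, 0.73, 0.52]
case1_obs2 = [0.58, 0.57, 0.68, 0.50, 0.60, 0.62, 0.70, 0.55]
print(round(interobserver_r(case1_obs1, case1_obs2), 2))
```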
Secondly, the strong impact of the cases on physicians’ interviewing skills influenced the validity study to some extent. The influence of the different cases may be attributed to two factors: the presentation of the case by the simulated patients and the characteristics of the medical problem. Prior studies with the MAAS-MI G provided evidence that the case as medical problem did not influence the physician’s interview behavior significantly (Kraan et al, 1986), whereas others have pointed to a considerable impact of the characteristics of the medical problem on the physician’s interview behavior (Norman et al, 1981). Interpretation of the present study suggests that the case influences resulted mainly from the case presentation by the simulated patients.
This conclusion is corroborated by the finding that history-taking skills were influenced least by the difference in cases whereas, at first sight, the greatest influence of the case as medical problem was expected on this type of interviewing skills.
The validity coefficients confirm that the MAAS-MI validly measures five distinct types of medical interviewing skills.
The study supports the model of distinct medical competences as delineated by Fabb and Marshall (1983) and underscores that the evaluation of students’ or physicians’ medical competency can no longer be based solely on the assessment of their medical knowledge as is the case in most examinations. Medical interviewing skills should be taken into account.
Selected Reading
DiMatteo MR, DiNicola DD. Achieving patient compliance: the psychology of the medical practitioner’s role. Pergamon Press, New York, 1982.
Fabb WE, Marshall JR. The assessment of clinical competence. Lancaster, England, MTP Press Limited, 1983.
Thorndike RL. Applied psychometrics. Houghton Mifflin Company, Boston, 1982.
Katz FM. Trends in assessment (editorial). Medical Education, 1982; 16: 61-62.
Swanson DB, Mayewski RJ, Norsen L, Baran G, Mushlin AI. A psychometric study of measurement of medical interviewing skills. In: Proceedings of the 20th Annual Conference of Research in Medical Education, Washington DC, 1981.
All References
Brockway BS. Evaluating physician competency: what difference does it make? Evaluation and Program Planning, 1978; 1: 211-220.
Cronbach LJ, Meehl PE. Construct validity in psychological tests. Psychological Bulletin, 1955; 52: 281-302.
DiMatteo MR, DiNicola DD. Achieving patient compliance: the psychology of the medical practitioner’s role. Pergamon Press, New York, 1982.
Fabb WE, Marshall JR. The assessment of clinical competence. Lancaster, England, MTP Press Limited, 1983.
Gonella JS. Evaluation of clinical competence (editorial). Journal of Medical Education, 1985; 60: 70-71.
Graaff E de, Post GJ, Drop MJ. Validation of a new measurement of clinical problem-solving. Medical Education, (accepted for publication).
Kassirer JP, Gorry GA. Clinical problem solving: a behavioral analysis. Annals of Internal Medicine, 1978; 89: 245-255.
Katz FM. Trends in assessment (editorial). Medical Education, 1982; 16: 61-62.
Kerlinger FN. Foundations of behavioral research. Holt, Rinehart and Winston Inc., New York, 1973.
Kraan HF, Crijnen AAM, DeVries MW, Zuidweg J, Imbos T, Vleuten C van der. Are medical interviewing skills teachable? Perspectief, 1986; 4: 29-51.
Kraan HF, Crijnen AAM, Zuidweg J. The Maastricht History-taking and Advising Checklist: an observation instrument for the measurement of physicians’ interviewing skills in initial medical consultations in primary care, manual for scoring. Department of Social Psychiatry, University of Limburg, Maastricht, 1986.
Mishler EG. The discourse of medicine: dialectics of medical interviews. Ablex Publishing Corporation, Norwood, New Jersey, 1982.
Neufeld VR, Norman GR, Feightner JW, Barrows HS. Clinical problem-solving by medical students: a cross-sectional and longitudinal analysis. Medical Education, 1981; 15: 315-322.
Norman GR, Feightner JW. A comparison of behavior on simulated patients and patient management problems. Medical Education, 1981; 15: 26-32.
Post GJ, Hellemons-Boode BSP, Heyden PFA van der, Graaff E de, Drop MJ. Medische competentie: een vergelijking tussen verschillende meetinstrumenten (Medical competence: comparing different measurement instruments). Rijksuniversiteit Limburg, Maastricht, 1985.
Stillman PL, Brown DR, Redfield DL, Sabers DL. Construct validation of the Arizona Clinical Interview Rating Scale. Educational and Psychological Measurement, 1977; 37: 1031-1038.
Stillman PL, Burpeau-Di Gregorio MY, Nicholson GI, Sabers DL, Stillman AE. Six years of experience using patient instructors to teach interviewing skills. Journal of Medical Education, 1983; 58: 941-946.
Swanson DB, Mayewski RJ, Norsen L, Baran G, Mushlin AI. A psychometric study of measurement of medical interviewing skills. In: Proceedings of the 20th Annual Conference of Research in Medical Education, Washington DC, 1981.
Thorndike RL. Applied psychometrics. Houghton Mifflin Company, Boston, 1982.
Verwijnen GM, Imbos T, Snellen A, Stalenhoef B, Pollemans M, Luyk S van, Sprooten SM, Leeuwen Y van, Vleuten C van der. The evaluation system at the Medical School of Maastricht. Assessment and Evaluation in Higher Education, 1982; 3: 225-244.