Survey (including pre-test probabilities)

Yousuf Alweshani, Dwight Harley, David Cook.
Student's perception of the characteristics of effective bedside teachers.
Medical Teacher
2007;29:204-209

Submitted by: Simon Carley - Consultant in Emergency Medicine
Institution: Manchester Royal Infirmary
Date submitted: 15th January 2008

Before CA, I rated this paper: 5/10

1 Objectives and hypotheses

1.1 Are the objectives of the study clearly stated?

They are not clearly stated at any point. There is no statement of aim and no clear objective. However, the study is not especially complex and it is fairly easy to see what the authors were trying to achieve.

2 Design

2.1 Is the study design suitable for the objectives?

Hmmm, it is one way of doing it! The study uses a fairly fixed questionnaire with questions predetermined by the faculty. This is restrictive in that it only allows the students to answer in domains 'chosen' by faculty. It can only give a fairly superficial insight within these domains, as the responses are collated as 5-point Likert scales on a 25-item questionnaire. I am concerned that this approach, although quick and easy, may not be able to sample the depth of perceptions that surely exist. However, I can see this paper as a springboard for further enquiry using more insightful qualitative methods.

2.2 Who / what was studied?

A single university in Oman. 84 final year medical students were sampled.

2.3 Was this the right sample to answer the objectives?

Final year students have the advantage of plenty of experience, but they may value teachers differently compared to more junior students. There is every reason to suspect that different characteristics may be more valued at an earlier stage of the course. It is a limited sample with small numbers. The single site and Middle Eastern setting limit the generalisability of this sample.

2.4 Did the subjects represent the full spectrum of the population of interest?

See above. Probably not. I believe that they should have conducted this study using a sample of students from several years of study.

2.5 Is the study large enough to achieve its objectives? Have sample size estimates been performed?

No sample size estimate has been performed. It is a small sample. It could have been bigger (by using more students), and it is of a size where purposeful comparative statistical analysis is likely to result in type 2 errors.
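To illustrate the type 2 error concern, here is a minimal sketch of a power calculation using a normal approximation for a two-sided comparison of two group means. The subgroup size of 31 and the standardised effect sizes are hypothetical assumptions chosen for illustration; they are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses the normal approximation; effect_size is the standardised
    difference between the two group means (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of rejecting the null in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical scenario: two subgroups of 31 respondents each.
for d in (0.2, 0.5, 0.8):
    power = approx_power(d, 31)
    print(f"d = {d}: power ~ {power:.2f}, type 2 error risk ~ {1 - power:.2f}")
```

Under these assumptions, only large differences between subgroups would be detected reliably, which is consistent with the concern above.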

2.6 Were all subjects accounted for?

Yes, but the response rate was only 74%.

2.7 Were all appropriate outcomes considered?

No. The limited questionnaire domains were restrictive. The authors did go to some effort to try to reduce this bias through expert review, literature analysis and pilots. However, limited scope is a feature of all questionnaires, and perhaps more so when investigating innate perceptions.

2.8 Has ethical approval been obtained if appropriate?

Yes it was needed; yes it was obtained.

2.9 What measures were made to contact non-responders?

No mention is made of this. It seems it was a one-shot attempt. No comparison of responder and non-responder demographics was made.

2.10 What was the response rate?

74%. Ideally this should be over 80% for a survey (Cochrane recommendations).

3 Measurement and observation

3.1 Is it clear what was measured, how it was measured and what the outcomes were?

The questionnaire had 25 items marked on a Likert scale (1-5). Respondents expressed their agreement with statements devised by the researchers. So they measured agreement, using a Likert scale, against predetermined domains.

3.2 Are the measurements valid?

Yes, this is a reasonable method of measuring agreement with a statement.

3.3 Are the measurements reliable?

Not known, as reliability has not been tested, nor have the results been compared with other data collection methods.

3.4 Are the measurements reproducible?

Not known as not tested.

4 Presentation of results

4.1 Are the basic data adequately described?

Yes... and no. They report their findings very well in terms of means and standard deviations. I would argue that this analysis is wrong, as they are applying continuous data analysis to categorical (arguably ordinal) but clearly non-parametric data: for many items the mean plus or minus the reported SD passes the extremes of the response scale.
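The problem with means and SDs on a bounded 1-5 scale can be sketched as follows. The responses are hypothetical (not the paper's data) and are deliberately skewed; the median and quartiles are shown as one possible alternative summary for ordinal data.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical responses to one Likert item (1 = strongly disagree,
# 5 = strongly agree); invented for illustration only.
responses = [5] * 40 + [4] * 10 + [1] * 12

m = mean(responses)
sd = stdev(responses)
print(f"mean = {m:.2f}, SD = {sd:.2f}")
print(f"mean + SD = {m + sd:.2f} (the scale stops at 5)")

# An ordinal-friendly summary stays within the scale by construction.
q1, _, q3 = quantiles(responses, n=4)
print(f"median = {median(responses)}, quartiles = {q1} to {q3}")
```

With skewed data like this, the interval mean plus one SD falls outside the range of possible answers, whereas the median and quartiles cannot.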

4.2 Are the results presented clearly, objectively and in sufficient detail to enable readers to make their own judgement?

Again, I must criticise their reductionist presentation as means and SDs. I personally think that any disagreements are potentially as interesting as agreements, and those data are hidden. For example, 'be a female' scores 3.15 (mean), yet this may hide some respondents who are fiercely pro or anti female teachers. Such a finding would be very interesting and worthy of comment, but this method of presentation hides it.
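A minimal sketch of how a mean can hide polarised opinion: the two sets of responses below are hypothetical, constructed only so that both have a mean of 3.15. They are not the paper's data. A simple frequency table per response category reveals the difference that the mean conceals.

```python
from collections import Counter
from statistics import mean

# Two hypothetical groups of 20 respondents with the same mean.
neutral = [3] * 17 + [4] * 3               # almost everyone indifferent
polarised = [1] * 9 + [4] * 1 + [5] * 10   # split between the extremes

for name, data in (("neutral", neutral), ("polarised", polarised)):
    counts = Counter(data)
    table = ", ".join(f"{score}: {counts[score]}" for score in range(1, 6))
    print(f"{name:9s} mean = {mean(data):.2f} | counts by score -> {table}")
```

Reporting the counts (or percentages) for each response category alongside the summary statistic would let readers see this kind of split.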

4.3 Are the results internally consistent, i.e. do the numbers add up properly?

They appear to.

5 Analysis

5.1 Are the data suitable for analysis?

Yes, but the paper offers arguably more description than analysis.

5.2 Are the methods appropriate to the data?

There are many tests used. See previous sections for comments on type and presentation of data. I believe all the continuous data analysis to be wrong (e.g. Student's t-test is not appropriate here). The analysis of correlation is self-fulfilling in that scores with higher means have higher correlations, and the data have been adjusted (corrected item-scale correlations) in a manner that is unclear to me.
The factor analysis methods used by the authors are well known but extremely difficult to conduct on small sample sizes. The method quoted by Tabachnick and Fidell would describe this sample size as 'very poor'. They quote Kaiser (1970) in defence, but having read that fascinating article (well worth a look for the style of writing more than anything else) I do not find a justification in the quoted work.
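As a rough illustration of the sample size concern, the sketch below applies the rule-of-thumb bands for factor analysis that Tabachnick and Fidell are commonly cited as quoting (attributed to Comrey and Lee). The band cut-offs are reproduced here on the assumption that they are recalled correctly. The figure of 84 is the number of students sampled as reported in this appraisal; with a 74% response rate, the number actually analysed may be smaller.

```python
# Rule-of-thumb bands for factor analysis sample size, as commonly
# attributed to Comrey and Lee (assumed cut-offs, checked high to low).
GUIDE = [
    (1000, "excellent"),
    (500, "very good"),
    (300, "good"),
    (200, "fair"),
    (100, "poor"),
    (50, "very poor"),
]


def rate_sample_size(n):
    """Return the label of the highest band that n reaches."""
    for threshold, label in GUIDE:
        if n >= threshold:
            return label
    return "below the lowest band"


n_sampled = 84  # students sampled, per this appraisal
n_items = 25    # questionnaire items, per this appraisal

print(f"n = {n_sampled}: {rate_sample_size(n_sampled)}")
print(f"subjects per item = {n_sampled / n_items:.2f}")
```

Even taking the full 84 students, the sample falls in the lowest named band, with only a little over three subjects per questionnaire item.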

5.3 Are any statistics correctly performed and interpreted?

See above. There are concerns. In addition, the actual method of computation is not described. The statistical tests used are very complex and not widely used. More information should be given, as at the moment it appears as though tests have been applied in order to find something positive to report. I firmly believe that statistical analysis should help, not hinder, the reader.

6 Discussion

6.1 Are the results discussed in relation to existing knowledge on the subject and study objectives?

Yes. The findings are placed well in context and argued through differences in setting and methodologies.

6.2 Is the discussion biased?

No. It is well argued and reflective. There are clearly elements where the authors have found results which they like (language use), but it is not apparent that these are overly stressed.

6.3 Can the results be generalised?

Perhaps not, as this is a culturally different environment from that in my practice. It may be generalisable within the geographical and/or cultural locale.

7 Interpretation

7.1 Are the authors' conclusions justified by the data?

Yes. The students appear to value several factors in good teaching. Interestingly, not all of these (e.g. seniority) are things that can be adopted by all.

7.2 What level of evidence has this paper presented? (using CEBM levels)

Level 4 or 5 (poor quality)

7.3 Does this paper help me answer my problem?

To some extent. Despite the weaknesses there is always something to take away, and in this paper it is that many factors are considered important. Personally I find little to argue against and find some findings reassuring (critical thinking being valued). The setting can make interpretation and generalisation difficult, but the factors identified in this paper may be worth adopting in my own teaching.

After CA, I rated this paper: 5/10

8 Implementation

8.1 Can any necessary change be implemented in practice?

Yes, as described above.

8.2 What aids to implementation exist?

Personal effort. Observed teaching (peer review) to guide and promote.

8.3 What barriers to implementation exist?

Time and personal motivation.