The impact of a data-monitoring program implemented at study midpoint on PANSS score consistency
Background: The impact of rater training is known to degrade over time. This drift can cause problems for both reliability and validity in clinical trials. In this study, we sought to determine how a coordinated data-monitoring program, implemented 12 months after the initial training period, would affect score consistency on the PANSS (Positive and Negative Syndrome Scale). Many studies (e.g., Müller & Szegedi, 2002) indicate that reliability can suffer if training and calibration are not conducted regularly. However, regular retraining can be resource intensive. Data-monitoring provides ongoing, targeted feedback to the raters who require it, yet the impact of such systems has not been studied systematically.
Methods: Raters were trained at the investigator meeting using a standardized method consisting of didactic and applied techniques. No further training took place until concerns about data integrity led to the introduction of a data-monitoring program at the study mid-point. A retrospective analysis of existing data was conducted at mid-point using computer-based algorithms designed to detect logical inconsistencies within the PANSS instrument. The results were used to provide targeted remediation. This system was then left in place for the remainder of the study.
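The kind of computer-based consistency check described in the Methods can be illustrated with a minimal sketch. The abstract does not list the actual rules used, so everything below is an assumption made for illustration: the function name `find_inconsistencies`, the missing-value and range checks (based only on the PANSS having 30 items scored 1 to 7), and the single cross-item rule, which is hypothetical and not taken from the study.

```python
# Illustrative sketch only: the study's actual algorithms are not described
# in the abstract. All rules below are assumptions.

# The PANSS has 30 items: 7 positive (P), 7 negative (N), 16 general (G),
# each scored from 1 (absent) to 7 (extreme).
ITEM_IDS = (
    [f"P{i}" for i in range(1, 8)]
    + [f"N{i}" for i in range(1, 8)]
    + [f"G{i}" for i in range(1, 17)]
)

# Hypothetical cross-item rules, each (trigger, trigger_min, related, related_min):
# if the trigger item is rated >= trigger_min, the related item is expected
# to be rated >= related_min. This example rule is an assumption.
ILLUSTRATIVE_RULES = [
    ("P6", 6, "P1", 3),
]


def find_inconsistencies(scores):
    """Return a list of human-readable flags for one PANSS assessment.

    `scores` maps item IDs (e.g. "P1") to integer ratings.
    """
    flags = []

    # Basic integrity checks: every item present and within the 1-7 range.
    for item in ITEM_IDS:
        value = scores.get(item)
        if value is None:
            flags.append(f"{item}: missing")
        elif not 1 <= value <= 7:
            flags.append(f"{item}: out of range ({value})")

    # Cross-item logical checks driven by the (hypothetical) rule table.
    for trig, trig_min, rel, rel_min in ILLUSTRATIVE_RULES:
        t, r = scores.get(trig), scores.get(rel)
        if t is not None and r is not None and t >= trig_min and r < rel_min:
            flags.append(f"{trig}>={trig_min} but {rel}<{rel_min}")

    return flags
```

Keeping the cross-item rules in a table rather than in code would let a clinical team review and extend them without touching the checking logic, and the returned flags could feed the targeted remediation described above.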
Results: The proportion of inconsistent scores was significantly lower after data-monitoring was implemented than before. A Pearson chi-square test indicated that the difference in proportions was significant, χ²(1, n = 2415) = 30.977, p < 0.001.
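For readers who want to see how a test of this form is computed, the following is a minimal sketch of a Pearson chi-square test on a 2x2 table (period by consistent/inconsistent) with one degree of freedom. The abstract does not report the underlying cell counts, so the function names and the cell layout are assumptions, and no study data are reproduced here. Only the reported statistic (30.977) is taken from the Results.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

                      inconsistent   consistent
        before             a             b
        after              c             d

    The cell layout is an assumption made for illustration.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / denom


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    With 1 df the statistic is the square of a standard normal variable,
    so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


# Check that the reported statistic is consistent with the reported p-value.
reported_statistic = 30.977
print(p_value_df1(reported_statistic) < 0.001)
```

With the actual before and after counts, `pearson_chi_square_2x2` would be expected to reproduce the reported statistic, provided no continuity correction was applied in the original analysis.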
Conclusions: The implementation of a data-monitoring system 12 months after the initial training uncovered a range of rater behaviors inconsistent with the generation of reliable scores. In the analysis of data obtained both before and after data-monitoring was implemented, we observed a relationship between raters who generated very high numbers of inconsistencies and patients' subsequent failure to respond to treatment.
Müller MJ, Szegedi A: Effects of Interrater Reliability of Psychopathologic Assessment on Power and Sample Size Calculations in Clinical Trials. J Clin Psychopharmacol 2002; 22: 318-325.