Introduction
Cardiac arrhythmias are experienced by more than 1 million people a year in the UK. Arrhythmia-related symptoms include palpitations, breathlessness, chest pain, dizziness, and fatigue [
1], which can have profoundly negative effects on patients’ quality of life (QoL) [
2]. The cost to the UK National Health Service (NHS) of treating people with atrial fibrillation (AF), the most common of these arrhythmias, is large; estimates put the direct cost at 0.9–2.4 % of overall healthcare expenditure [
3]. The intended benefit of percutaneous radiofrequency cardiac ablation is to improve patient QoL and eliminate or reduce arrhythmia-related symptoms.
Patient-reported outcome measures (PROMs) aim to determine a patient’s own view of their symptoms, functional status, and health-related QoL before and after an intervention. PROMs offer a particularly useful platform to evaluate the effect of cardiac ablation on arrhythmia-related symptoms because arguably the biggest impact of a successful ablation treatment is an alleviation of anxiety and physical symptoms [
4].
PROMs should include both a generic and a disease-specific tool; both can be used simultaneously to build a comprehensive picture of patients’ status [5]. Whilst generic QoL instruments such as SF-36 and EQ-5D-5L have been validated and extensively used in a variety of populations, formal validation of disease-specific tools is not always performed. Many QoL questionnaires have been developed specifically for use in patients with AF [4, 6]. The questionnaires validated in this study combine generic and disease-specific measures with measures of patient expectations and experiences of the ablation procedure in a single questionnaire.
The key elements of validation involve evaluating a PROM tool for its reliability (test–retest, internal consistency), validity (content and construct), sensitivity (to differences between groups), and responsiveness (to change in patients’ condition) [
7]. These steps are essential to enable data derived from a tool to be useful and interpretable. The aim of this study is to formally validate a PROM tool for UK patients with arrhythmias treated with catheter ablation. This builds on a previous feasibility study [
1] and the initial stage of this study which was to establish content validity through patients’ interviews [
8].
Methods
This multicentre, prospective, observational cohort study was designed to formally develop, evaluate, and validate a new UK PROM tool for patients with cardiac arrhythmias treated with cardiac ablation procedures (UK Clinical Research Network Study Portfolio reference 13148).
Ethics
The study protocol was reviewed and approved by the Nottingham 1 Research Ethics Proportionate Review Sub-Committee (reference 12/EM/0164) and conducted in accordance with Good Clinical Practice and the Declaration of Helsinki. Informed consent was obtained from all patients who took part in this study.
Development of the initial C-CAP questionnaires
Initial item generation, face validity, and content validity of the cardiac ablation PROM were evaluated as described previously [
1,
8]. The pre-validation C-CAP questionnaires used in this study consist of a 17-item pre-ablation questionnaire (C-CAP1) and a 19-item post-ablation questionnaire (C-CAP2) and are described in full by Withers et al. [
8]. Questionnaires which incorporate the amendments identified in the current validation study are termed “final C-CAP1 and final C-CAP2” for clarity and have been provided as an online resource.
Patients
Patients under the care of physicians at three clinical sites in the UK (University Hospital Wales, Cardiff; Queen Elizabeth Hospital, Birmingham; Freeman Hospital, Newcastle-Upon-Tyne) were eligible for inclusion in this study. Patients were enrolled only if they were aged 18 or over, had a diagnosis of symptomatic cardiac arrhythmia, had consented to and were awaiting a cardiac ablation procedure, and were able to read, write, and understand English or Welsh.
Questionnaire procedures
Patients across three sites were approached consecutively to take part in the study and provided with a participant information sheet, consent form, the pre-validation, pre-ablation questionnaire (C-CAP1), and a stamped addressed envelope at the time of their appointment or with their appointment letter. Patients were given time to consider their involvement and completed the C-CAP1 questionnaire if and when they wished to do so (some completed the questionnaire on the day of their ablation). Patients from whom a signed consent form was received were considered to be enrolled. Returned C-CAP1 questionnaires were excluded from analysis if the patient had received their ablation procedure prior to completion of the questionnaire.
No change in treatment or clinical assessment was carried out on patients as a result of their participation in this study. Patients underwent percutaneous radiofrequency or cryotherapy ablation using conscious sedation or a general anaesthetic.
At 8–16 weeks following their ablation, patients were sent a pre-validation, post-ablation questionnaire (C-CAP2) to their home with a freepost envelope. Non-responders were sent a reminder letter with a replacement questionnaire approximately 2–3 weeks after the initial mailing. C-CAP2 questionnaires were excluded from analysis if they were completed more than 20 weeks after the ablation.
Identical retest questionnaires (C-CAP1R and C-CAP2R) were sent to a random subset of patients (no patients were sent retest questionnaires for both). Pre-ablation retest questionnaires (C-CAP1R) were sent 1 week after completion of C-CAP1. The following exclusions were applied to select for patients who were assumed to have a stable condition between completing test and retest questionnaires: (1) an ablation was carried out in between completion of C-CAP1 and C-CAP1R; (2) >30 days elapsed between the patient completing C-CAP1 and C-CAP1R. Post-ablation retest questionnaires (C-CAP2R) were sent to patients 1 week after completion of C-CAP2. Returned C-CAP2R questionnaires were excluded if >30 days passed between completion of C-CAP2 and C-CAP2R.
Further follow-up is currently being conducted at 1 and 5 years post-procedure (data not included in this publication).
Sample size
No formal sample size calculation was conducted for this questionnaire validation study. A minimum sample size of 150 patients from each centre has been suggested in previous studies to allow meaningful comparisons to be made [9]. A target of 450 enrolled patients was set to ensure that smaller sub-groups were adequately represented within the sample and to provide representation from various arrhythmia types.
Description of C-CAP questionnaires
Pre-validation C-CAP1 was split into four domains (Table 1) and comprised 17 questions related to patient expectations; condition and symptoms; activity and healthcare visits; and medication and general health. The condition and symptoms domain contained three multi-item scales: (1) symptom severity (15 sub-items); (2) frequency and duration of symptoms (2 sub-items); (3) impact on life (10 sub-items). Pre-validation C-CAP2 comprised five domains (Table 1), three of which were replicated from C-CAP1 and allowed comparison before and after the procedure (condition and symptoms; activity and healthcare visits; and medication and general health). In addition, C-CAP2 covered change in symptoms and procedure-related complications. Both questionnaires also included the generic EuroQol EQ-5D-5L [10] questionnaire and visual analogue scale (EQ-VAS). The EQ-5D-5L, used since the beginning of this project, was chosen over the EQ-5D-3L (used in other NHS PROM tools) because of its improved discriminatory power, which we felt was important during development and testing of these questionnaires [11].
Table 1
Description of domains within the pre-validation C-CAP questionnaires used in this study
Domain | Question numbers | Description |
Pre-validation, pre-ablation questionnaire (C-CAP1) |
Pre-ablation patient expectations | 1–5 | Contains a 4 item Likert scale (Q1–3b) with five response options (each item scored 0–4); each explored patients’ treatment expectations prior to the procedure. The “treatment expectations” multi-item scale had a minimum score of 0 and a maximum of 16 (the 4 items in the scale were given equal weight, and each had a minimum score of 0 and a maximum of 4). This domain also asked whether this is the patient’s first ablation (Q4) and for the number of previous ablations received (Q5) |
Condition and symptoms | 6, 7, 8, 13 | This domain was a modified version of the disease-specific Patient Perception of Arrhythmia Questionnaire (PPAQ) originally developed by Wood et al. [21]. Following adaptations for use in a UK population with specialist, lay and patient input, this updated tool included elements which were divided into three multi-item scales where a high score reflects a worse health state: (1) Symptom severity (Q6a–o): 15-item symptom scale, each symptom/item had 4 response options (scored 0–3). The minimum score was 0 and the maximum was 45 (equal weight was given to each item in the scale and all subsequent scales); (2) Frequency and duration of symptoms (Q7–8): two-item scale, each item had 5 response options (scored 0–4). The minimum score was 0 and the maximum was 8; (3) Impact on life (Q13a–j): 10-item scale, each item had 4 response options (scored 0–3). The minimum score was 0 and the maximum score was 30 |
Restricted activity days and healthcare visits | 9a–12b | This domain was modified from the PPAQ [21] and aimed to count how many days (either work/school/college, social activities, or normal daily activities) in the last 30 had been affected by arrhythmia symptoms. The number of visits to a GP or hospital in the last 30 days was also recorded |
Medication and general health | 14–17 | Q14 asked whether the respondent normally takes medication (yes/no); Q15 asked for the name and dose of medication (free text); Q16 asked how important a reduction in medication is for the respondent (Likert scale with 4 response categories scored from 0 to 3); Q17 asked whether the respondent had been diagnosed with any one of a list of 12 common conditions (with a “tick all that apply” instruction) |
EQ-5D-5L | Not numbered | This comprised the widely used global health questionnaire which provides a simple descriptive profile and a single index value for health status [ 10]. The EQ-5D-5L questionnaire consists of five questions related to mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Each question can be answered on five different levels. The EQ-5D-5L also includes a visual analogue scale (question 19) from 0 (worst health imaginable) to 100 (best health imaginable). Therefore, a higher score is related to a better outcome, in contrast to the other scoring systems used elsewhere in this paper |
Pre-validation post-ablation questionnaire (C-CAP2) |
Post-ablation change in symptoms | 1–3b, 7 | This domain consists of 4 items each with 4 available responses relating to changes in patients’ arrhythmia-related symptoms since receiving a procedure to treat their condition (scored from 1 to 4 for each scale). Therefore, the change in symptoms multi-item scale has a minimum score of 0 and a maximum of 16. This domain also asks whether the outcome of the procedure met the patients’ expectations |
Procedure-related complications | 4–6 | This domain comprised a binary question relating to whether patients experienced any ablation-related complications and two tables asking patients whether they were warned of or experienced any of a list of complications |
Condition and symptoms | 8, 9, 10, 15 | As described for C-CAP1 |
Restricted activity days and healthcare visits | 11a–13b | As described for C-CAP1 |
Medication and general health | 16–19 | As described for C-CAP1 |
EQ-5D-5L | Not numbered | As described for C-CAP1 |
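The equal-weight scoring described in Table 1 for the C-CAP1 multi-item scales can be sketched as follows. This is a minimal illustration only: the dictionary keys and the function name are hypothetical, and leaving a scale unscored when any item is missing is an assumption (the study states only that missing data were not imputed).

```python
from typing import Optional, Sequence

# (number of items, maximum score per item) for each C-CAP1 scale in Table 1.
# The key names are hypothetical labels chosen for this sketch.
SCALES = {
    "treatment_expectations": (4, 4),   # Q1-3b, range 0-16
    "symptom_severity": (15, 3),        # Q6a-o, range 0-45
    "frequency_duration": (2, 4),       # Q7-8, range 0-8
    "impact_on_life": (10, 3),          # Q13a-j, range 0-30
}


def score_scale(name: str, items: Sequence[Optional[int]]) -> Optional[int]:
    """Return the unweighted sum of item scores for one multi-item scale."""
    n_items, item_max = SCALES[name]
    if len(items) != n_items:
        raise ValueError(f"{name} expects {n_items} items, got {len(items)}")
    # Assumption: an incomplete scale is left unscored rather than imputed.
    if any(value is None for value in items):
        return None
    if any(not 0 <= value <= item_max for value in items):
        raise ValueError(f"{name} items must lie between 0 and {item_max}")
    return sum(items)
```

For all disease-specific scales a higher total reflects a worse health state, which is the opposite direction to the EQ-VAS.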
Data management and statistics
Patients completed the C-CAP questionnaires by hand in their own time, and responses were entered by researchers at Cedar Healthcare Technology Research Centre within the UK NHS (Cardiff and Vale University Health Board) into the National Audit of Cardiac Rhythm Management (NACRM) database administered by National Institute for Cardiovascular Outcomes Research (NICOR) at University College London. All data entered onto the database were checked for accuracy by a second researcher. Missing data were not imputed. Data were exported from NACRM and were analysed using IBM SPSS Statistics version 21. All statistical tests were two-sided, and p values less than 0.05 were considered statistically significant.
Validation of C-CAP instrument
Feasibility and acceptability
Feasibility was assessed as the proportion of patients who, following enrolment, returned questionnaires within the required timeframe. Acceptability of individual items and multi-item scales was assessed by patient response rate and missing data. Floor and ceiling effects were evaluated as the proportion of patients who responded with the minimum and maximum scores, respectively, for each dimension.
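As an illustration of the floor and ceiling calculation, the sketch below returns the proportion of respondents at each extreme of a scale. The function name is hypothetical, and excluding unscored (missing) responses from the denominator is an assumption rather than a documented step of the analysis.

```python
from typing import Optional, Sequence, Tuple


def floor_ceiling(scores: Sequence[Optional[int]],
                  min_score: int,
                  max_score: int) -> Tuple[float, float]:
    """Return (floor, ceiling) proportions for one scale.

    floor   = share of respondents with the minimum possible score
    ceiling = share of respondents with the maximum possible score
    """
    # Assumption: respondents without a scale score are excluded.
    valid = [s for s in scores if s is not None]
    if not valid:
        raise ValueError("no valid scores supplied")
    n = len(valid)
    floor = sum(s == min_score for s in valid) / n
    ceiling = sum(s == max_score for s in valid) / n
    return floor, ceiling
```

For example, `floor_ceiling(scores, 0, 8)` would apply to the two-item “frequency and duration of symptoms” scale.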
Reliability
Cronbach’s alpha was used to assess internal consistency for disease-specific multi-item scales in C-CAP1 and C-CAP2 (those with ≤2 items were excluded). Coefficients above 0.7 were considered acceptable, above 0.8 good, and above 0.9 excellent [12].
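For readers unfamiliar with the statistic, Cronbach’s alpha can be computed from a respondents-by-items matrix using the standard formula, as in the sketch below. The analysis in this study was run in SPSS; the Python function here is a hypothetical illustration, not the code used, and it assumes complete responses for every item.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete item responses.

    rows: one sequence of item scores per respondent.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    if k < 3:
        # Mirrors the study's exclusion of scales with two or fewer items.
        raise ValueError("alpha is only computed for scales with > 2 items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer every item")
    item_columns = list(zip(*rows))
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)
```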
Intraclass correlation coefficient (ICC) was used to evaluate test–retest reliability. Scales with an ICC of ≥0.7 were considered to have good reliability. Bland–Altman plots [13] were produced for multi-item scales. For binary items, repeatability was assessed using the kappa coefficient (κ) [7].
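Two of the test–retest statistics named above can be sketched briefly: the bias and 95 % limits of agreement that underlie a Bland–Altman plot, and Cohen’s kappa for a binary item. The function names are hypothetical and the sketch assumes paired, complete test and retest responses. The ICC is omitted because the specific ICC model used is not stated here.

```python
from statistics import mean, stdev
from typing import Hashable, Sequence, Tuple


def bland_altman_limits(test: Sequence[float],
                        retest: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement.

    bias is the mean test-minus-retest difference; the 95 % limits are
    bias +/- 1.96 standard deviations of the differences.
    """
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohen_kappa(test: Sequence[Hashable], retest: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two ratings."""
    first, second = list(test), list(retest)
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    categories = set(first) | set(second)
    expected = sum((first.count(c) / n) * (second.count(c) / n)
                   for c in categories)
    if expected == 1:
        raise ValueError("kappa is undefined when all answers are identical")
    return (observed - expected) / (1 - expected)
```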
Validity
Content validity was evaluated using one-to-one cognitive interviewing of patients as described by Withers et al. [8]. Convergent validity was tested by comparing the multi-item scales in the condition and symptoms domain (symptom severity; frequency and duration of symptoms; impact on life) to validated global health scores (EQ-5D-5L index and EQ-VAS scale). Spearman’s rho (ρ) correlation coefficients of 0.4–0.7 were considered moderate. We expected that correlations between disease-specific multi-item scales within C-CAP questionnaires would be higher than the correlation between C-CAP scales and global health scores. Discriminant validity was tested by comparing scales in C-CAP questionnaires relating to symptoms and impact against a multi-item scale relating to patient expectations of the results of the procedure. It was assumed that these domains measure different concepts and therefore low correlations (<0.4) were expected.
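The correlation thresholds used for convergent and discriminant validity can be illustrated with a small rank-correlation sketch. Spearman’s rho is computed here as the Pearson correlation of average ranks (so ties are handled); the function names are hypothetical, and the label “high” for coefficients above 0.7 is an assumption, since only the low (<0.4) and moderate (0.4–0.7) bands are defined in the text.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - my) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


def correlation_strength(rho: float) -> str:
    """Classify |rho|; the 'high' label above 0.7 is an assumption."""
    size = abs(rho)
    if size < 0.4:
        return "low"
    if size <= 0.7:
        return "moderate"
    return "high"
```

Because higher C-CAP scores reflect a worse state while higher EQ-VAS scores reflect a better one, a negative rho between them is the expected direction; the classification therefore uses the absolute value.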
Responsiveness and minimal clinically important difference (MCID)
Several distribution-based methods [effect size (ES), standardised response mean (SRM), relative efficiency (RE)] were used to evaluate changes in C-CAP scores following ablation. For both ES and SRM, values of 0.20, 0.50, and 0.80 represent the limits of small, moderate, and large change, respectively [
14].
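The two most commonly reported of these indices can be sketched as follows, using their conventional definitions: ES divides the mean change by the standard deviation of baseline scores, and SRM divides it by the standard deviation of the change scores. The function names are hypothetical; change is taken as pre minus post so that an improvement on the disease-specific scales (where higher is worse) is positive, and the “negligible” label below 0.20 is an assumption.

```python
from statistics import mean, stdev
from typing import Sequence


def effect_size(pre: Sequence[float], post: Sequence[float]) -> float:
    """ES = mean paired change / standard deviation of baseline scores."""
    changes = [a - b for a, b in zip(pre, post)]
    return mean(changes) / stdev(pre)


def standardised_response_mean(pre: Sequence[float],
                               post: Sequence[float]) -> float:
    """SRM = mean paired change / standard deviation of the changes."""
    changes = [a - b for a, b in zip(pre, post)]
    return mean(changes) / stdev(changes)


def magnitude(value: float) -> str:
    """Apply the 0.20 / 0.50 / 0.80 limits to |ES| or |SRM|."""
    size = abs(value)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "negligible"  # assumption: label for values below 0.20
```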
Standard error of measurement (SEM) is a measure of the precision of the instrument. The smallest detectable change (SDC) was calculated from the SEM; it reflects the smallest within-person change in score (p < 0.05) that can be interpreted as a real change above measurement error [15]: SDC = 1.96 * √2 * SEM.
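As a worked illustration of the SDC formula, the sketch below derives an SEM and converts it to an SDC. Estimating the SEM as SD × √(1 − ICC) is a common convention adopted here as an assumption; the method used to obtain the SEM is not specified in this section, and the function names and input values are hypothetical.

```python
import math


def sem_from_icc(baseline_sd: float, icc: float) -> float:
    """Assumed SEM estimate: SD * sqrt(1 - ICC)."""
    if not 0 <= icc <= 1:
        raise ValueError("ICC must lie between 0 and 1")
    return baseline_sd * math.sqrt(1 - icc)


def smallest_detectable_change(sem: float) -> float:
    """SDC = 1.96 * sqrt(2) * SEM, as defined in the text."""
    return 1.96 * math.sqrt(2) * sem


# Hypothetical inputs: a scale with baseline SD 10 and test-retest ICC 0.75.
example_sem = sem_from_icc(10.0, 0.75)
example_sdc = smallest_detectable_change(example_sem)
```

The √2 term reflects that a change score combines the measurement error of two administrations of the questionnaire.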
Minimal clinically important difference (MCID) is defined as the smallest difference in score in the domain of interest which patients perceive as beneficial [16]. Four anchor questions were used to estimate the MCID for C-CAP. Patients were considered to show a minimal important change if they reported that their symptoms became less frequent, that the duration of their arrhythmia episodes became shorter, that their expectations were met, or that their global health score improved by 20 points.
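One common way to turn an anchor question into an MCID estimate is to take the mean change in scale score among patients who meet the anchor. The sketch below assumes that approach; the exact estimator is not spelled out in this section, and the function name and example values are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def anchor_based_mcid(changes: Sequence[Optional[float]],
                      anchor_met: Sequence[bool]) -> float:
    """Mean change in scale score among patients who met the anchor.

    changes: pre-minus-post change per patient (None if unscored).
    anchor_met: True where the patient satisfied the anchor question,
    for example reporting that symptoms became less frequent.
    """
    selected = [c for c, met in zip(changes, anchor_met)
                if met and c is not None]
    if not selected:
        raise ValueError("no patients with a change score met the anchor")
    return mean(selected)
```

With hypothetical changes of 6, 8, 0, and 7 points, of which the first, second, and fourth patients met the anchor, the estimate would be 7 points.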
Discussion
Following incorporation of amendments proposed in this manuscript through the validation process, the final C-CAP questionnaires are valid, reliable, and responsive tools for measuring symptom change in patients undergoing ablation for cardiac arrhythmias (final C-CAP questionnaires are available as an online resource). The final validated C-CAP questionnaires (C-CAP1 and C-CAP2) combine generic global health measures with disease-specific domains to provide a comprehensive picture of the effect that arrhythmias have on patients’ lives. This large validation study builds on previous pilot and content validity work by our research group [
1,
8]. The study has demonstrated that C-CAP questionnaires can be used on patients with a range of arrhythmia types and are not limited to those with AF.
The following amendments have been made to produce the final versions of the C-CAP questionnaires (online resource):
- Removal of “passing out/fainting/blackouts” from the “symptom severity” scale and “my palpitations have had a financial impact” from the “impact on life” scale in C-CAP1 (Q13) and C-CAP2 (Q15)
- Removal of the free text section for medication taken by patients in C-CAP1 (Q15) and C-CAP2 (Q17)
- Amendment of the N/A option for “days you have missed at work/school/college” to read “I do not attend work/school/college (✓)” in C-CAP1 (Q9) and C-CAP2 (Q11)
- Removal of the N/A option from “days you have had to cut down on your social activities” and “days you have been unable to carry out normal daily activities” questions in C-CAP1 (Q10–11) and C-CAP2 (Q12–13).
We chose to use a classical test theory approach in our psychometric analysis, mainly because of our linear model (pre-/post-testing), our focus on test level scoring, and our relatively small sample size (< 500 subjects) at each measurement point. Further work is being undertaken to compare pre-ablation C-CAP measures with those collected post-ablation, at 1 and 5 years. Incorporation of proposed revisions to the C-CAP questionnaires will be considered for the 5-year follow-up (1-year follow-up uses the pre-validation questionnaire as the validation work was not complete at this study point).
The importance of PROMs as tools to drive improvements in service provision is well recognised [
5]. Through future research and use in routine practice, the C-CAP questionnaires provide a tool for UK clinicians and commissioners to collect evidence on whether provision of a cardiac ablation service is having a positive impact on patients which may be difficult to demonstrate through other means. C-CAP has the advantage of enabling comparison of outcomes across different arrhythmia groups, and inclusion of the generic EQ-5D-5L tool allows for wider cross-speciality comparisons [
10]. With appropriate further translation and validation work, the C-CAP tool could be used outside of the UK.
The influence of patient expectations on their treatment and recovery has been widely demonstrated [
17]. A novel section has been included in C-CAP1 enabling clinicians to explore and manage patient expectations. Appropriate expectation management may improve overall satisfaction with the service. Future analysis will provide an insight into how patient expectations influence the perception of procedural success.
High response rates for C-CAP1 and C-CAP2 indicate that patients find the questionnaires acceptable and that they are not overly burdensome. We identified issues with high numbers of responses to the not applicable option for questions of “number of days impacted”. We suspect that some are valid responses but that a proportion may be unreliable. Also the free text format questions relating to medication intake were difficult to validate in any meaningful way. The original purpose was to use a medication dose question to explore changes following ablation; however, a lack of consistency in patients’ responses coupled with the challenges of extrapolating changes in medication regimes as better or worse led us to conclude that this question provided limited value.
Ceiling/floor effects are considered to be an issue if more than 15 % of patients report the maximum or minimum score [12, 18], and were observed in the “frequency and duration of symptoms” scale; this may be due in part to patients experiencing a true alleviation of symptoms and may also be a function of the smaller number of items within the scale. Test–retest reliability was impressive across individual questions and scales, and high internal consistency measures were observed. Removal of some items improved the Cronbach’s alpha values.
Disease-specific multi-item C-CAP scales (shared across C-CAP1 and C-CAP2) “symptom severity” and “impact on life” were more responsive to changes following ablation than the global health measures of EQ-5D-5L and EQ-VAS, shown by much larger effect sizes. We have presented distribution-based and anchor-based estimates of MCID alongside SDC values to aid interpretability of quantitative scores. Anchor-based MCIDs are variable because the MCID depends on the definition of “important difference” in the global measure [
19]. Several threshold values for SEM have been suggested to estimate MCID; some assert that 1 SEM is roughly equivalent to the minimal important difference determined using anchor-based methods, and others prefer 1.96*SEM [
14]. Our results demonstrated that 1.96*SEM came close to the anchor-based method of MCID estimation.
The disadvantage of anchor-based methods is that they do not take into account the measurement precision of the instrument, and alone cannot tell us whether the MCID lies within the measurement error [14]. This study indicated that the SDC is higher than the MCID across three disease-specific C-CAP multi-item scales (symptom severity; frequency and duration of symptoms; impact on life). The study by Lin et al. [20] states that in some instances the MCID scores do not exceed the SDC scores but still convey information about whether a patient group experienced a clinically important change. A 9-point change on the “symptom severity” scale indicates a true and reliable improvement, but a 6–7-point change may be considered clinically meaningful to the patient. The MCID cannot be used to define an important deterioration because we only analysed improved patients, and caution should be applied with low baseline scores.
As well as the observation that SDC is higher than the MCID in the disease-specific scales, there were several methodological limitations in this study. Anchor questions were not prospectively designed and included to calculate MCID, although those questions provided adequate proxies for the definition of minimal improvement. Known-groups validity could have been explored in patients for whom their arrhythmia symptoms are adequately controlled by medication. Convergent validity would have been better evaluated by testing another validated AF questionnaire [
6] which we would assume measures a similar construct. Future research should test the structure of the C-CAP questionnaires using confirmatory analysis.
The results of this validation study show that the final C-CAP questionnaires (online resource) can be used reliably to measure changes in arrhythmia-related symptom severity, symptom frequency and duration, and impact on life before and after percutaneous cardiac ablation. Additional domains of patient expectations and complications can also be reliably explored using C-CAP1 and C-CAP2. We encourage researchers and clinicians to use C-CAP questionnaires in research and routine clinical settings to measure the impact of ablation services on patients’ QoL (final questionnaires are provided as an Online Resource, copyright Cedar).
Acknowledgments
This work was supported by the National Institute for Health and Care Excellence (NICE). Cedar, Cardiff and Vale University Health Board (institution of JW, KW, AW, GCR) is funded by NICE to act as an external assessment centre. This study was conducted as part of Cedar’s contract to provide services to NICE. The authors thank Peter O’Callaghan (Cardiology Dept., Cardiff and Vale UHB), Stephen Murray (Cardiology Dept., Newcastle Upon Tyne Hospitals NHS Foundation Trust), and Racheal James (Cardiology Dept., Cardiff and Vale UHB) for patient enrolment, Ann James (Cedar, Cardiff and Vale UHB) for administrative support and data entry.