How fragile the positive results of Chinese herbal medicine randomized controlled trials on irritable bowel syndrome are?

Abstract

Objective

The fragility index (FI), which is the minimum number of changes in status from “event” to “non-event” resulting in a loss of statistical significance, serves as a significant supplementary indicator for clinical physicians in interpreting clinical trial results and aids in understanding the outcomes of randomized controlled trials (RCTs). In this systematic literature survey, we evaluated the FI for RCTs evaluating Chinese herbal medicine (CHM) for irritable bowel syndrome (IBS), and explored potential associations between study characteristics and the robustness of RCTs.

Methods

A comprehensive search was conducted in four Chinese-language and four English-language databases from their inception to January 1, 2023. RCTs that allocated participants in a 1:1 ratio to two parallel arms and reported at least one statistically significant binary outcome were included. FI was calculated by iteratively converting one target outcome event in the treatment group into a non-target event, keeping the size of that group constant, until positive significance (defined as P < 0.05 by Fisher’s exact test) was lost. The lower the FI (minimum 1) of a trial outcome, the more fragile the positive result of the outcome was. Linear regression models were used to explore factors influencing the value of FI.

Results

A total of 30 trials from 24 118 potentially relevant citations were finally included. The median FI of the included trials was 1.5 (interquartile range [IQR], 1–5), and half of the trials (n = 15) had an FI equal to 1. In 12 trials (40%), the total number of participants lost to follow-up surpassed the respective FI. Increased FI was significantly associated with the absence of TCM syndrome differentiation in the patient inclusion criteria, larger total sample size, low risk of bias, and larger numbers of events.

Conclusions

The majority of CHM IBS RCTs with positive results were found to be fragile. Ensuring adequate sample size, scientifically rigorous study design, proper control of confounding factors, and a quality control calibration for consistency of TCM diagnostic results among clinicians should be addressed to increase the robustness of the RCTs. We recommend reporting the FI as one of the components of sensitivity analysis in future RCTs to facilitate the assessment of the fragility of trials.

Introduction

Hypothesis testing is fundamental in statistical analysis, aiding in discerning significant differences between experimental samples or populations [1, 2]. In randomized controlled trials (RCTs), hypothesis testing based on the p-value is an indispensable and widely used tool: a result is declared statistically significant when the p-value is smaller than the predefined significance level “α” [3,4,5,6,7].

The fragility index (FI) measures the minimum number of patients whose status would have to change from event to non-event for an outcome to shift from statistically significant to nonsignificant [8]. A lower FI value indicates a more fragile result, and suggests that the statistical significance of the outcome is sensitive to small changes. The FI can also flag fragile results in small-scale studies or trials with few clinical events [9], whose results may be misleading.

Introduced in 2014 by Professor Walsh and his team [8], the FI has been investigated across various medical fields, such as spine surgery [10], hand surgery [11], critical care [12], and anti-cancer drugs [13]. FI holds paramount importance in assessing the robustness of clinical trials. Its primary objective is to provide fragility insights to patients, clinicians, and policymakers, enabling a comprehensive understanding of clinical trial results and facilitating well-informed clinical decision-making [14].

Traditional Chinese medicine (TCM) faces unique challenges when evaluating trial result fragility. It adopts individualized and complex treatments in response to the TCM syndrome [15, 16], which is a combination of multiple symptoms and signs subjectively observed and organized [17,18,19]. The subjectivity of TCM syndrome differentiation leads to inconsistency among practitioners in practice, and may thus put the results of TCM clinical trials at higher risk of being fragile.

Our objective is to assess the FI in RCTs comparing Chinese herbal medicine (CHM) treatments for irritable bowel syndrome (IBS). We aim to explore potential associations between study characteristics and the robustness of the RCTs, illustrating the fragility of positive results and underscoring the importance of reporting FI for a more accurate and scientific understanding of positive CHM trial results.

Methods

Literature search

Medline (Ovid), Embase, Cochrane Library, Web of Science, China National Knowledge Infrastructure (CNKI), SinoMed, China Science and Technology Journal database (VIP), and Wan-Fang database were searched from their inception until January 1, 2023 to identify potentially eligible studies. The search strategy was structured around “Chinese herbal medicine”, “irritable bowel syndrome”, and “randomized controlled trial”, and no language restrictions were imposed (Supplementary Table S1). We included only published RCTs of CHM for the treatment of IBS. We checked the reference lists of all included studies and relevant systematic reviews for further reports, and contacted trial authors where necessary. The study was reported in conformance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [20] (Supplementary Table S2), and all supporting data have been provided in the article and the supplementary data.

Study selection

Types of trials

We included RCTs that used a 1:1 allocation ratio and reported statistically significant binary outcomes. To maintain the integrity of the analysis, trials that employed inappropriate methods of randomization were excluded. Randomization was considered appropriate when the method used to generate the random sequence was described in adequate detail. It was considered inappropriate when it was described only by the word “random”, or when patients were assigned to groups by time of admission, date of birth, hospital medical record number, or similar.

Types of participants

Trials were conducted among patients with IBS. No restrictions were placed on the subtype, which could be diarrhea-predominant, constipation-predominant, mixed, or other.

Types of interventions

The included interventions encompassed three main categories: (1) single herb; (2) Chinese proprietary herbal medicine, typically administered as granules, decoction, oral liquid, capsule, or pills; (3) herbal compound decoction prescribed by TCM doctors and differentiated according to the specific symptoms and conditions exhibited by the patients. The study did not impose any restrictions on the formulation or combination of different herbal medicines. However, studies involving non-oral administration of herbal medicines were excluded from the analysis.

Comparison group

The control group in the included studies consisted of various interventions, including: (1) placebo; (2) standard treatment, interventions recommended by clinical guidelines; (3) treatment as usual, the routine medical care provided to patients in the study group; (4) another alternative oral CHM; (5) integrative medicine, interventions combining CHM with standard treatment, routine treatment or an alternative oral CHM; and (6) no treatment.

Outcomes

Trials included in this study reported at least one statistically significant binary outcome in their abstracts. We included effective rate [21] and the response rate [22] of the IBS Symptom Severity Scale (IBS-SSS), adequate relief (IBS-AR) [23], response rate of abdominal pain (visual analogue scale, VAS) [24], and response rate on the Bristol Stool Scale [25]. Based on the IBS-SSS scale, there were 4 graded outcomes: remission (less than 75 points), mild (76–175 points), moderate (176–300 points), and severe (over 300 points). Remission was considered cured, an improvement of 2 grades markedly improved, an improvement of 1 grade improved, and no improvement or a worsened condition ineffective. Effective rate [21] = (cured + markedly improved + improved) / total cases × 100%. Additionally, the response rate [22] on the IBS-SSS scale was defined as the proportion of patients who had a ≥ 50% reduction in the total score compared with pre-treatment. IBS-AR [23] was defined as a binary answer (Yes/No) to the question “In the past 7 days, have you had adequate relief of your IBS symptoms?” Response rate to abdominal pain [24] was defined as the proportion of patients whose worst abdominal pain score (score range, 0–10, with 0 indicating no pain and 10 indicating unbearable severe pain) decreased by at least 30%, and response rate on the Bristol Stool Scale was defined as the proportion of patients whose type 6 or 7 stool days decreased by 50% or greater [25].
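The effective rate and the IBS-SSS response definition above reduce to simple arithmetic. The following is a minimal sketch in Python; the function names and the patient counts are hypothetical and do not come from any included trial.

```python
def effective_rate(cured: int, markedly_improved: int, improved: int, total: int) -> float:
    """Effective rate (%) = (cured + markedly improved + improved) / total cases x 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (cured + markedly_improved + improved) / total * 100


def is_sss_responder(baseline_score: float, post_score: float) -> bool:
    """True when the IBS-SSS total score fell by at least 50% from baseline."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline_score - post_score) / baseline_score >= 0.5


# Hypothetical arm of 40 patients: 5 cured, 10 markedly improved, 15 improved.
print(effective_rate(5, 10, 15, 40))   # 75.0
# Hypothetical patient whose score fell from 300 to 140 (a 53% reduction).
print(is_sss_responder(300, 140))      # True
```

Both quantities yield a per-arm count of events and non-events, which is the input required for the FI calculation.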

Data collection process

To pilot the electronic data extraction form, one review author (Li YL) extracted data from 2 included trials into a drafted Excel-based extraction form, and a second review author (Wang YQ) checked the data entry in full. Four reviewers (Li YL, Wang YQ, Huang JH, and Liu ZH) then independently extracted data from the eligible studies into the final form. Discrepancies were resolved through discussion or, if needed, through consultation with another review author (Luo MJ). To ensure consistency in data extraction, we conducted a training exercise using one of the included RCT reports prior to the formal extraction. The information extracted for each eligible RCT included the data related to the target outcome, such as the number of events and non-events in each group. Furthermore, the following data were also extracted: the basic characteristics of patients (gender, mean age), the pathological type of IBS, duration of the condition, the TCM syndrome differentiation and typing of IBS (yes or unclear), the flexibility of interventions (whether the interventions were tailored to the patient’s condition, yes or unclear), the type of comparison intervention (placebo, standard treatment or treatment as usual), total sample size, number of patients lost to follow-up, year of publication, funding status, adequacy of allocation concealment (recorded as adequate or unclear), patient and investigator blinding, and statistical analysis principle. When more than one significant binary outcome was reported in the abstract of a trial, only the primary outcome was included in the assessment of the FI.

Risk of bias assessment

Two reviewers who were well calibrated in the use of the Cochrane Collaboration’s risk of bias tool 2.0 (RoB 2.0) [26] independently assessed the risk of bias for each included RCT. RoB 2.0 consists of five domains: randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, and selection of the reported result. Each domain contains several signaling questions, each answered with one of five potential responses: ‘Yes’, ‘Probably Yes’, ‘Probably No’, ‘No’ and ‘No Information’. In all cases, a judgement of ‘Yes’ indicated a low risk of bias, a judgement of ‘Probably Yes’ or ‘Probably No’ indicated some concerns, and a judgement of ‘No’ indicated a high risk of bias. If insufficient detail was reported, our judgement was ‘No Information’. We resolved disagreements that arose at any stage by discussion between the review authors or, when necessary, with a third reviewer. We assessed the domains above using the answers to the signaling questions and generated a risk of bias assessment table for each study, with overall judgments derived from the tool.

Statistical analysis

For the included binary outcome, the FI was calculated from a two-by-two contingency table by changing 1 patient at a time from an “event” status to a “non-event” status in the treatment group [8], so that the total number of participants in that group stayed constant. After each change, the Fisher exact test was recalculated, and the resulting two-sided P value was recorded. This iterative process continued until the P value reached ≥ 0.05, indicating that the result was no longer statistically significant (see Fig. 1 for a calculation example). The minimum number of target outcome events that had to be switched was taken as the FI for that RCT outcome. The index reflects how sensitive the statistical significance is to small changes in the number of events, and therefore how robust the result is.
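The iterative procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the two-sided Fisher exact P value is computed with the common convention of summing the probabilities of all tables no more likely than the observed one, and the example counts are invented.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def weight(x: int) -> int:
        # Unnormalised probability of the table with x events in row 1.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return total / denom


def fragility_index(events_trt, n_trt, events_ctl, n_ctl, alpha=0.05):
    """Number of event -> non-event switches in the treatment arm needed
    for P to reach alpha or more. Returns None when the baseline result is
    not significant or when significance is never lost."""
    non_events_ctl = n_ctl - events_ctl
    baseline = fisher_exact_two_sided(
        events_trt, n_trt - events_trt, events_ctl, non_events_ctl
    )
    if baseline >= alpha:
        return None
    for switched in range(1, events_trt + 1):
        a = events_trt - switched  # the arm size n_trt stays constant
        if fisher_exact_two_sided(a, n_trt - a, events_ctl, non_events_ctl) >= alpha:
            return switched
    return None


# Hypothetical trial: 10/10 responders with treatment vs 4/10 with control.
print(fragility_index(10, 10, 4, 10))  # 1
```

In this invented example the baseline P value is 420/38760 (about 0.011). After a single responder in the treatment arm is switched to a non-responder it becomes 4440/77520 (about 0.057), so the FI is 1.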

We summarized the FI for the included studies using descriptive statistics and described the distribution of FI frequency using a histogram. Linear regression models were used to explore potential factors that might influence the fragility index. As the FI was highly skewed, it was log transformed, and the categorical variables were converted into dummy variables prior to the regression analysis. The assumptions underlying linear regression were examined and confirmed [27]. Based on previous studies exploring the fragility index [8, 28, 29], the potential factors considered were total sample size, the type of comparison intervention (placebo, standard treatment, or treatment as usual), the proportion of patients lost to follow-up, the total number of events in the trial, the risk of bias assessment, and funding status. In addition, we paid special attention to the characteristics of TCM interventions. According to TCM theory, most TCM experts implement syndrome-based treatments using various forms of TCM formulas, appropriately combined with individual herbs that differ in nature and medical value. However, the subjectivity of diagnosing patients’ syndromes and the inherent flexibility of the interventions may affect the robustness of clinical trial results. We reported the regression coefficients and their 95% confidence intervals (CIs), obtained using 1000 bootstrap samples.
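A univariable regression of log-transformed FI with a bootstrap confidence interval can be sketched as follows. The source does not state the software, the logarithm base, or the type of bootstrap interval, so the natural logarithm and the percentile interval used here are assumptions, as are the function names and the data, which are invented for illustration.

```python
import math
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (one predictor)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope_ci(x, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for the slope, resampling trials with replacement."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope is undefined when every resampled x is identical
        slopes.append(ols_slope(xs, [y[i] for i in idx]))
    slopes.sort()
    lower = slopes[int((1 - level) / 2 * n_boot)]
    upper = slopes[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical data: total sample size and FI for ten trials.
sample_size = [50, 60, 60, 72, 80, 100, 120, 200, 240, 360]
fi = [1, 1, 2, 1, 2, 3, 5, 8, 11, 14]
log_fi = [math.log(v) for v in fi]  # FI is skewed, so it is log transformed

beta = ols_slope(sample_size, log_fi)
ci_low, ci_high = bootstrap_slope_ci(sample_size, log_fi)
print(f"beta = {beta:.4f}, 95% CI {ci_low:.4f} to {ci_high:.4f}")
```

A categorical characteristic such as funding status would enter the same function as a 0/1 dummy variable in place of `sample_size`.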

Fig. 1 Example of fragility index calculation for the trial titled “Shugan Liqi Zhixie Tang in Treatment of 68 Patients with Diarrhea-predominant irritable bowel syndrome”

Results

The initial search yielded a total of 24 118 potentially relevant citations. Thirty RCTs met the inclusion criteria and were eligible for analysis (Fig. 2).

Fig. 2 Details of the literature search

Characteristics of trials and outcomes

Table 1 shows the comprehensive data on the specific outcomes extracted from each study. The median sample size of the included studies was 72 patients (range, 50–360), with a median of 4 patients (range, 1–14) who were lost to follow-up. Regarding the reported outcomes, 18 studies (60%) focused on the effective rate, 4 studies (13.33%) examined the response rate on the IBS-SSS Scale, 3 studies (10%) assessed the response rate to abdominal pain, 2 studies (6.67%) reported data on adequate relief, and another 3 studies (10%) measured the response rate on the Bristol Stool Scale. In terms of the use of TCM in the trials, 70% of the eligible trials incorporated patients’ TCM syndromes as inclusion criteria. However, the interventions were flexibly tailored to the patient’s condition based on TCM theory in only one in six trials (16.67%). In the 18 trials that clearly reported loss to follow-up, the median number of participants lost to follow-up was 6 (IQR, 1–13). Twelve trials (40%) did not report the number lost to follow-up; in 6 trials (20%) the number lost was less than 10% of the total sample size, in 10 trials (33.3%) it was 10–20%, and in 2 trials it exceeded 20%. According to the quality assessment of the included RCTs, 9 trials were at low risk of bias, 2 were at high risk, and the remaining 19 raised some concerns. A summary of the RoB 2.0 results is shown in Table 2.

Table 1 Trial characteristics
Table 2 Review of author’s judgements about each risk of bias item presented as percentages across all included studies

Fragility index

The overall median FI was 1.5 (IQR, 1–5). Half of the trials (n = 15, 50%) had an FI of 1, meaning that changing a single patient in the treatment group from an event to a non-event would remove the statistical significance of the finding, which raises concerns about the robustness and reliability of these results. In only 3 trials (10%) did the FI exceed 10, suggesting that their results were more robust to such alterations. Notably, the total number lost to follow-up exceeded the FI in 12 trials (40% of all 30 trials, or 66.7% of the 18 trials that reported loss to follow-up), which might have a significant impact on the robustness of their outcomes (Fig. 3).
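The comparison between loss to follow-up and FI depends on which denominator is used, because some trials did not report losses. A minimal Python sketch, with a hypothetical function name and invented trial records, shows both shares:

```python
def losses_exceeding_fi(trials):
    """Count trials whose reported loss to follow-up exceeds their FI.

    `trials` is a list of (fi, lost) pairs; `lost` is None when unreported.
    Returns (count, share of all trials, share of trials reporting losses).
    """
    reported = [(fi, lost) for fi, lost in trials if lost is not None]
    if not trials or not reported:
        raise ValueError("need at least one trial that reports loss to follow-up")
    exceeded = sum(1 for fi, lost in reported if lost > fi)
    return exceeded, exceeded / len(trials), exceeded / len(reported)


# Hypothetical records: (FI, number lost to follow-up or None if unreported).
example = [(1, 4), (2, None), (5, 3), (1, 6), (12, None)]
count, share_all, share_reported = losses_exceeding_fi(example)
print(count, round(share_all * 100, 1), round(share_reported * 100, 1))  # 2 40.0 66.7

# The counts reported in this survey: 12 trials out of 30, or out of 18 reporting.
print(round(12 / 30 * 100, 1), round(12 / 18 * 100, 1))  # 40.0 66.7
```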

Fig. 3 Distribution of fragility index for all trials

Associations between the fragility index and study characteristics

Table 3 shows the results of the linear regression analysis. Larger total sample size (β, 0.27; 95% CI 0.09, 0.46; P = 0.005), low risk of overall bias (β, -2.15; 95% CI -3.96, -0.34; P = 0.022), and larger numbers of events (β, 0.22; 95% CI 0.02, 0.45; P = 0.043) were associated with more robust results. Furthermore, across the domains of the Cochrane RoB 2.0 tool, a low-risk assessment for the randomization process (β, 2.78; 95% CI 0.77, 5.07; P = 0.010), deviations from intended interventions (β, 2.92; 95% CI 0.86, 4.90; P = 0.007), and measurement of the outcome (β, 3.26; 95% CI 1.19, 5.21; P = 0.003) was positively associated with FI. Of interest, trials that did not use TCM syndrome differentiation in the patient inclusion criteria were associated with more robust results (β, 2.32; 95% CI 0.06, 4.57; P = 0.044). However, the linear regression did not identify any significant differences in the FI with respect to funding support, the flexibility of the intervention tailored to the patient’s condition, the pathological type of IBS, the proportion lost to follow-up, or the type of control intervention.

Table 3 Association between trial characteristics and the fragility index using linear regression

Discussion

Summary of findings

The median FI was 1.5 (IQR, 1–5) among the included RCTs of CHM treatment for IBS that reported a statistically significant result, indicating that changing just one or two patients in the treatment arm from a positive target outcome event to a negative one could lead to a loss of statistical significance. Furthermore, in 40% of the trials, the number of patients lost to follow-up exceeded the respective FI. Increased FI was significantly associated with the absence of TCM syndrome differentiation in the patient inclusion criteria, larger total sample size, low risk of bias, and larger numbers of events, but was not associated with funding support, the pathological type of IBS, the type of control intervention, the proportion lost to follow-up, or whether treatments were individualized to patients’ conditions.

Strengths and limitations

The present study is the first to report the FI of RCTs comparing different formulations of CHM for treating IBS. The FI provides a straightforward measure, expressed as a number of individual patients, that can assist clinical practitioners, patients, and policymakers in assessing the strength of research conclusions. We conducted a comprehensive search of the literature and explored the association between the FI and the sample size, the number of events, funding support, the pathological type of IBS, the type of comparison intervention, the proportion lost to follow-up, and the result of the risk of bias assessment. We also considered the unique characteristics of RCTs in TCM, such as whether trials used TCM syndrome differentiation in the patient inclusion criteria and whether treatments were individualized according to patients’ symptoms. As a limitation, the regression analyses were univariable. The small number of included trials and a high degree of multicollinearity, which can cause loss of statistical significance and the exclusion of important variables, precluded multivariable analyses. Examples of collinear sets are sample size and number of events (a larger sample size leads to a larger number of events) and the risks of bias in the randomization process, deviations from intended interventions, and measurement of the outcome (all domains of one tool). Furthermore, the eligibility criteria for outcomes were limited to internationally recognized measures. This excluded some outcome measures typically associated with TCM syndromes, such as the effective rate measured by self-made TCM symptom scales, because these measurements lacked demonstrated reliability and validity; the exclusion may affect the overall assessment of trial robustness.

Relationship with previous studies

Our findings align with previously reported FI scores in various medical and surgical fields, such as peri-operative care (median, 2; IQR, 1–3) [60], otolaryngology (median, 3; IQR, 1–7.5) [61], anesthesia and critical care (median, 2; IQR, 1–3.5) [12], and emergency medicine (median, 4; IQR, 2–10) [62]. In contrast, common solid tumor trials had a higher median FI of 28 (range, 2–322) [13]. This discrepancy in trial robustness may be attributed to the number of events in the trials. Among the 30 trials included in our study, the number of events was notably smaller than in the FI review of anti-cancer drugs for solid tumors (median number of events, 4 versus 336.5). A larger number of events tends to be positively correlated with a higher FI score, a trend observed consistently in both our study and previous research. Apart from being influenced by the number of events, the FI is inherently linked to the sample size. Our findings are consistent with previous results, most of which have shown a positive association between the FI and sample size [10, 11].

Few of the published studies on FI have analyzed the results from the perspective of study quality evaluation. Only one study on hand surgery [11] evaluated the risk of bias of the included trials, but since it included only five eligible studies, the authors did not analyze its relationship with FI. Consistent with our expectation, a low risk of bias, especially in the domains of randomization process, deviations from intended interventions, and measurement of the outcome, was positively associated with more robust results. For RCTs with a higher risk of bias, the probability of generating misleading results exceeds 50%, even if the results are statistically significant. This highlights the importance of rigorous randomization, of avoiding bias through stringent monitoring, and of blinding outcome assessors as far as possible in order to detect meaningful effects. Additionally, the replication of findings in independent studies can further strengthen the validity and reliability of research outcomes.

In more than half of the RCTs that clearly reported loss to follow-up, more participants were lost to follow-up than would be required to make the result nonsignificant based on the corresponding trial’s FI. The number lost to follow-up was expected to influence the robustness of the result: the more patients are lost, the more outcomes are missing or biased by incomplete treatment, and if the outcomes of the patients lost to follow-up were negative, the FI would be smaller. Had all participants been accounted for, the statistical significance of the results might have been reversed, raising concerns about the robustness and reliability of the findings. A previous study that reviewed the FI of 399 RCTs found that not reporting the number lost to follow-up was significantly associated with a larger FI [28]. However, the linear regression in this study did not identify any significant association between the FI and the proportion lost to follow-up. Given the small number of included studies, and that over one third of the RCTs did not report the number lost to follow-up, the nonsignificant association in this study may reflect inadequate reporting.

Implications

Treatment decisions often start with the decision of whether a treatment effect is believed to exist. The FI serves as a significant supplementary indicator to P values that may assist clinicians in determining the confidence they should have in a result [63]. A fragile result for an outcome that matters to patients should lead clinicians to place low confidence in the effect of a specific CHM treatment for IBS. Presenting both the P value and the FI for a trial outcome can help clinicians understand not only the statistical significance of the outcome but also its fragility, that is, the sensitivity of the positive outcome to small changes in the number of positive events and to missing data. The significance of fragility is emphasized by the proportion of RCTs that initially presented statistically significant results but were subsequently found to be either unsuccessful (16%) or to have effects considerably lower than previously reported (16%) [64]. While a low FI indicates a fragile trial result, the robustness of trial results may be increased by an adequate sample size based on proper calculation, scientifically rigorous study design, and proper control of confounding factors. When fragile results are derived from low-quality RCTs, other clinical trial designs, such as single-arm trials based on objective performance criteria, are worth exploring.

Interestingly, this study found a significant relationship between the FI and whether TCM syndrome differentiation was part of the patient inclusion criteria: studies that did not use TCM syndrome differentiation as an inclusion criterion tended to have more robust results. Although the included trials had clear criteria for syndrome diagnosis, none mentioned rigorous training and quality control for the diagnosis of the individualized syndrome, and it was unclear whether the accuracy and consistency of syndrome identification were examined uniformly across researchers before the start of the trial. A previous study [65] has shown a low degree of consistency in TCM diagnoses and individualized prescriptions among TCM doctors, even those with the same qualifications (diagnosis: kappa = 31.7%, prescription: kappa = 35.0%). Inconsistent “TCM diagnosis” results may make trials more fragile. A detailed standard of practice for “TCM diagnosis” and a quality control calibration for consistency of TCM diagnostic results among clinicians before the trial is implemented could be a potential strategy to address this issue.

TCM interventions need to adapt to dynamic syndromes [66]. In clinical trials, such individualized treatment brings higher heterogeneity to the interventions [67, 68], which may increase the difficulty of trial management. However, our findings showed no difference in the FI according to whether prescriptions were modified for each patient’s specific symptoms. Not only TCM but also western conventional medicine applies individualized treatment to some extent because of medical needs, even in an RCT with rigorous requirements for interventions. Our result provides some evidence to relieve concern about a potential adverse impact of individualized treatment on the robustness of trial results. However, the relatively small number of trials with individualized interventions (n = 5, 16.67%) may lead to false-negative results. In addition, the lack of association between funding support and FI indicated that the robustness of the trials was not affected by commercial conflicts of interest, and highlighted the importance of the role that a rigorous protocol design plays in the robustness of an RCT.

Conclusion

The majority of CHM IBS RCTs with positive results were found to be fragile. Ensuring adequate sample size and low risk of bias should be addressed to increase the robustness of the RCTs. We recommend reporting the FI as one of the components of sensitivity analysis in future RCTs to facilitate the assessment of the fragility of trials.

Data availability

The data that support the findings of this study are available from the corresponding author, (Yutong Fei [E-mail: feiyt@bucm.edu.cn]), upon reasonable request.

References

  1. Expósito-Ruiz M, Pérez-Vicente S, Rivas-Ruiz F. Statistical inference: hypothesis testing. Allergol Immunopathol (Madr). 2010;38(5):266–77.

  2. Sarmukaddam SB. Interpreting statistical hypothesis testing results in clinical research. J Ayurveda Integr Med. 2012;3(2):65–9.

  3. Lew MJ. Principles: when there should be no difference–how to fail to reject the null hypothesis. Trends Pharmacol Sci. 2006;27(5):274–8.

  4. Lieberman JA. Hypothesis and hypothesis testing in the clinical trial. J Clin Psychiatry. 2001;62:5–8. discussion 9–10.

  5. Luo J. Primary question and hypothesis testing in Randomized Controlled clinical trials. Shanghai Arch Psychiatry. 2016;28(3):177–80.

  6. Graves N, Barnett AG, Burn E, Cook D. Smaller clinical trials for decision making; a case study to show p-values are costly. F1000Res. 2018;7:1176.

  7. Tsushima E. Interpreting results from statistical hypothesis testing: understanding the appropriate P-value. Phys Ther Res. 2022;25(2):49–55.

  8. Walsh M, Srinathan SK, McAuley DF, et al. The statistical significance of randomized controlled trial results is frequently fragile: a case for a Fragility Index. J Clin Epidemiol. 2014;67(6):622–8.

  9. Devereaux PJ, Yusuf S. The evolution of the randomized controlled trial and its role in evidence-based decision making. J Intern Med. 2003;254(2):105–13.

  10. Evaniew N, Files C, Smith C, et al. The fragility of statistically significant findings from randomized trials in spine surgery: a systematic survey. Spine J. 2015;15(10):2188–97.

  11. Ruzbarsky JJ, Khormaee S, Daluiski A. The Fragility Index in Hand Surgery Randomized Controlled Trials. J Hand Surg Am. 2019;44(8):698.e691-698.e697.

  12. Ridgeon EE, Young PJ, Bellomo R, Mucchetti M, Lembo R, Landoni G. The Fragility Index in Multicenter Randomized controlled critical care trials. Crit Care Med. 2016;44(7):1278–84.


  13. Desnoyers A, Wilson BE, Nadler MB, Amir E. Fragility index of trials supporting approval of anti-cancer drugs in common solid tumours. Cancer Treat Rev. 2021;94:102167.


  14. Walter SD, Thabane L, Briel M. The fragility of trial results involves more than statistical significance alone. J Clin Epidemiol. 2020;124:34–41.


  15. Ma Y, Zhou K, Fan J, Sun S. Traditional Chinese medicine: potential approaches from modern dynamical complexity theories. Front Med. 2016;10(1):28–32.


  16. Lu AP, Jia HW, Xiao C, Lu QP. Theory of traditional Chinese medicine and therapeutic method of diseases. World J Gastroenterol. 2004;10(13):1854–6.


  17. Li Z, Xu C. The fundamental theory of traditional Chinese medicine and the consideration in its research strategy. Front Med. 2011;5(2):208–11.


  18. Liu S, Zhu JJ, Li JC. The interpretation of human body in traditional Chinese medicine and its influence on the characteristics of TCM theory. Anat Rec (Hoboken). 2021;304(11):2559–65.


  19. Ma Y, Sun S, Peng CK. Applications of dynamical complexity theory in traditional Chinese medicine. Front Med. 2014;8(3):279–84.


  20. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ. 2009;339:b2700.


  21. Drossman DA, Dumitrascu DL. Rome III: new standard for functional gastrointestinal disorders. J Gastrointestin Liver Dis. 2006;15(3):237–41.


  22. Alt F, Chong PW, Teng E, Uebelhack R. Evaluation of Benefit and Tolerability of IQP-CL-101 (Xanthofen) in the symptomatic improvement of irritable bowel syndrome: a Double-Blinded, randomised, placebo-controlled clinical trial. Phytother Res. 2017;31(7):1056–62.


  23. Camilleri M. Editorial: is adequate relief fatally flawed or adequate as an end point in irritable bowel syndrome? Am J Gastroenterol. 2009;104(4):920–2.


  24. Portincasa P, Bonfrate L, Scribano ML, et al. Curcumin and fennel essential oil improve symptoms and quality of life in patients with irritable bowel syndrome. J Gastrointestin Liver Dis. 2016;25(2):151–7.


  25. Drossman DA, Camilleri M, Mayer EA, Whitehead WE. AGA technical review on irritable bowel syndrome. Gastroenterology. 2002;123(6):2108–31.


  26. Sterne JAC, Savović J, Page MJ, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ. 2019;366:l4898.


  27. Lunt M. Introduction to statistical modelling: linear regression. Rheumatology (Oxford). 2015;54(7):1137–40.


  28. Khan MS, Ochani RK, Shaikh A, et al. Fragility Index in Cardiovascular Randomized controlled trials. Circ Cardiovasc Qual Outcomes. 2019;12(12):e005755.


  29. Maldonado DR, Go CC, Huang BH, Domb BG. The Fragility Index of Hip Arthroscopy Randomized controlled trials: a systematic survey. Arthroscopy. 2021;37(6):1983–9.


  30. Wang G, Li TQ, Wang L, et al. Tong-xie-ning, a Chinese herbal formula, in treatment of diarrhea-predominant irritable bowel syndrome: a prospective, randomized, double-blind, placebo-controlled trial. Chin Med J (Engl). 2006;119(24):2114–9.

  31. Cappello G, Spezzaferro M, Grossi L, et al. Peppermint oil (Mintoil®) in the treatment of irritable bowel syndrome: a prospective double-blind placebo-controlled randomized trial. Dig Liver Dis. 2007;39(6):530–6.

  32. Zhang SS, Xu WJ, Chen Z, et al. Short-term and medium-term clinical effect of liver dispersing with spleen strengthening on irritable bowel syndrome dominated by diarrhea. J Cap Med Uni. 2009;30(4):436–40.

  33. Saito YA, Rey E, Almazar-Elder AE, et al. A randomized, double-blind, placebo-controlled trial of St John’s Wort for treating irritable bowel syndrome. Am J Gastroenterol. 2010;105(1):170–7.

  34. Li YM, Zhang YN, Cai G, et al. A randomized, double-blinded and placebo-controlled trial of Chang Ji Tai Granule in treating diarrhea-predominant diarrhea. ShangHai J Tradit Chin Med. 2010;44(12):33–6.

  35. Zhang SS, Wang HB, Li ZH, et al. A multi-center randomized controlled study on syndrome differentiation of traditional Chinese medicine in the treatment of diarrhea-predominant irritable bowel syndrome. Chin J Integr Med. 2010;30(1):9–12.

  36. Tang XD, Li ZH, Li BS, et al. A randomized, double-blind, placebo-controlled clinical study of Chang’an prescription 1 in the treatment of diarrhea-predominant irritable bowel syndrome [dissertation]. Beijing Uni Chin Med. 2011.

  37. Zhang W, Zhang ZL, Li L. Clinical Research Irritable Bowel Syndrome with the Treatment of Liver-Softening and Spleen-Nourishing Decoction. Hubei J Trad Chin Med. 2013;35(05):12–4.

  38. Fu Q, Xu DS, Jiang SQ. Shugan Liqi Zhixie Tang in Treatment of 68 Patients with Diarrhea-predominant Irritable Bowel Syndrome. Chin J Exp Trad Med Form. 2013;19(16):301–4.

  39. Li XL, Su ZZ, Gu XD. Observation on the efficacy of Changyiqing in the treatment of irritable bowel syndrome with spleen deficiency and dampness syndrome. Chin J Inte Trad Chin West Med Dig. 2014;22(8):475–7.

  40. Portincasa P, Bonfrate L, Scribano ML, et al. Curcumin and fennel essential oil improve symptoms and quality of life in patients with irritable bowel syndrome. J Gast Liv Dis. 2016;25(2):151–7.

  41. Liu ZW, Niu LJ, Su Q, et al. 100 cases of diarrhea-predominant irritable bowel syndrome treated with Anchangzhitong prescription combined with trimebutine. Henan Trad Chin Med. 2014;34(10):2015–16.

  42. Hu QP. Self Beneficial Intestinal Side Spleen Dampness Type of Adjuvant Therapy Clinical Observation Irritable Bowel. Mod Diag Treat. 2015(1):42–3.

  43. Cheng YY. Clinical study on the treatment of irritable bowel syndrome with liver stagnation and spleen deficiency with Chaishao Tiaogan Decoction [dissertation]. Hebei Med Univ. 2015.

  44. Alt F, Chong P-W, Teng E, et al. Evaluation of benefit and tolerability of IQP-CL-101 (Xanthofen) in the symptomatic improvement of irritable bowel syndrome: a double-blinded, randomised, placebo-controlled clinical trial. Phyto Res: PTR. 2017;31(7):1056–62.

  45. Chen M, Tang TC, Wang Y, et al. Randomised clinical trial: Tong-Xie-Yao-Fang granules versus placebo for patients with diarrhoea-predominant irritable bowel syndrome. Ali Phar Thera. 2018;48(2):160–8.

  46. Tang XD, Li B, Li ZH, et al. Therapeutic effect of Chang’an I recipe on irritable bowel syndrome with diarrhea: a multicenter randomized double-blind placebo-controlled clinical trial. Chin J Int Med. 2018;24(09):645–52.

  47. Weng MW, Chen YB, Cao J. Clinical Observation on 35 Cases of Treating Diarrheal Irritable Bowel Syndrome with Liver Stagnation and Spleen Deficiency with Changning Prescription. Journal of Traditional Chinese Medicine. 2019;60(19):1663–7.

  48. Zeng P. Clinical study on the treatment of diarrhea-type irritable bowel syndrome with Shaoyang taiyin cold and heat benefit with Chaihu Guizhi Ganjiang Decoction [dissertation]. Fujian Univ Trad Chin Med. 2019.

  49. Chen X, Dai XT, Wu YL, et al. Clinical efficacy of An Chang Zhi Xie decoction in the treatment of diarrhea-type irritable bowel syndrome. Inn Mongo J Trad Chin Med. 2020;39(9):8–9.

  50. Zhao XY. Clinical study on the treatment of diarrhea-type irritable bowel syndrome by adding flavors to Wei Guan Decoction [dissertation]. Yunn Univ Trad Chin Med. 2020.

  51. Zheng HP, Zhang ZB, Wei XP, et al. Effect of Addition and Subtraction Therapy of Xiaoyaosan Combined with Simotang to Gut-brain Axis of Patients with Irritable Bowel Syndrome with Predominant Constipation and Syndrome of Stagnation of Liver Qi. Chin J Exp Trad Med Form. 2020;26(22):53–8.

  52. Bordbar G, Miri MB, Omidi M, et al. Efficacy and safety of a novel herbal medicine in the treatment of irritable bowel syndrome: a randomized double-blinded clinical trial. Gastro Res Prac. 2020.

  53. Yu AP, Zhang HJ. Clinical study on the treatment of constipation-predominant irritable bowel syndrome with "yang micro-knot" by Chaishao Zhizhu Decoction [dissertation]. Fujian Univ Trad Chin Med. 2020.

  54. Li YX. Clinical analysis of Liuwei Nengxiao capsule in the treatment of constipation-predominant irritable bowel syndrome. J Prac Trad Chin Med. 2021;37(12):2010–1.

  55. Kong WQ. Evaluation of the clinical efficacy of Changkang prescription in the treatment of constipation-predominant irritable bowel syndrome and research on its mechanism of action [dissertation]. Nanj Univ Chin Med. 2021.

  56. Jin YL. Clinical study on the treatment of diarrhea-predominant irritable bowel syndrome with Xingpi Huashi Decoction [dissertation]. Zhejiang Univ Trad Chin Med. 2021.

  57. Guo LX, Qin TT, Gao J, et al. Study on the regulatory effect of Shenbei Guchang Capsule on the brain-gut bacteria axis in the treatment of diarrhea-predominant irritable bowel syndrome. J Chin Med Mat. 2022;45(10):2502–6.

  58. Zou JY, Tan HC, Wu C, et al. Analysis of the influencing factors of irritable bowel syndrome and the therapeutic effect of spleen-invigorating and astringent therapy in treating diarrhea of spleen-kidney yang deficiency syndrome. World J Inte Trad West Med. 2022;17(1):192–5.

  59. Wang YR, Fu WB, Sun YQ, et al. Clinical efficacy of tongxie yaofang on diarrhea-predominant irritable bowel syndrome (IBS-D) patients with liver depression and spleen deficiency. Chin J Exp Trad Med Form. 2022;28(9):97–102.

  60. Bertaggia L, Baiardo Redaelli M, Lembo R, et al. The Fragility Index in peri-operative randomised trials that reported significant mortality effects in adults. Anaesthesia. 2019;74(8):1057–60.


  61. Skinner M, Tritz D, Farahani C, Ross A, Hamilton T, Vassar M. The fragility of statistically significant results in otolaryngology randomized trials. Am J Otolaryngol. 2019;40(1):61–6.


  62. Brown J, Lane A, Cooper C, Vassar M. The results of randomized controlled trials in Emergency Medicine are frequently fragile. Ann Emerg Med. 2019;73(6):565–76.


  63. Chaitoff A, Zheutlin A, Niforatos JD. The Fragility Index and Trial significance. JAMA Intern Med. 2020;180(11):1554.


  64. Ioannidis JP. Contradicted and initially stronger effects in highly cited clinical research. JAMA. 2005;294(2):218–28.


  65. Zhang GG, Lee W, Bausell B, Lao L, Handwerger B, Berman B. Variability in the traditional Chinese medicine (TCM) diagnoses and herbal prescriptions provided by three TCM practitioners for 40 patients with rheumatoid arthritis. J Altern Complement Med. 2005;11(3):415–21.


  66. de Almeida Andrade F, Schlechta Portella CF. Research methods in complementary and alternative medicine: an integrative review. J Integr Med. 2018;16(1):6–13.


  67. Sidani S. Rethinking the research-practice gap: relevance of the RCT to practice. Can J Nurs Res. 2004;36(3):7–18.


  68. Chow JT, Lam K, Naeem A, Akanda ZZ, Si FF, Hodge W. The pathway to RCTs: how many roads are there? Examining the homogeneity of RCT justification. Trials. 2017;18(1):51.



Funding

This work was supported by the National Natural Science Foundation of China (grant No. 82074282): “Development of the Methodologies of Objective Performance Criteria Based Single-Armed Trials for the Clinical Evaluation of Traditional Chinese Medicine”. The funders played no role in the study design, data collection and analysis, decision to publish, or the preparation of this paper.

Author information


Contributions

Y.F., M.L., and J.L. conceived and designed the study. M.L., Y.L., Y.W., J.H., Z.L., M.L., R.C., and Y.T. extracted the data. M.L., Z.L., and Q.C. wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Yutong Fei.

Ethics declarations

Ethical approval

An ethics statement is not applicable as this research was based exclusively on published literature.

Consent for publication

Not applicable.

Provenance and peer review

Not commissioned; externally peer reviewed.

Conflict of interest

The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Luo, M., Huang, J., Wang, Y. et al. How fragile the positive results of Chinese herbal medicine randomized controlled trials on irritable bowel syndrome are? BMC Complement Med Ther 24, 300 (2024). https://doi.org/10.1186/s12906-024-04561-8



  • DOI: https://doi.org/10.1186/s12906-024-04561-8

Keywords