Specifically, our first concern is with the stated objectives of the review: “…to evaluate the use of manual therapy for clinical conditions in the pediatric population, assess the methodological quality of the studies found, and synthesize findings based on health condition.” These objectives are vague and do not comply with accepted frameworks for formulating research questions to assess the effectiveness of interventions [2, 3]. In particular, the clinical conditions should be identified in the research question, since other conditions (e.g., cardiovascular disease, renal disease) were not included in the review. “Assess the methodological quality of the studies found” is not a research question or objective. Furthermore, “synthesize findings based on health condition” is neither clear nor specific, and is not reproducible.
Second, the search strategy presented in Table 1 is not comprehensive and does not meet the current standards for systematic review literature searching, as defined in the Peer Review of Electronic Search Strategies (PRESS). Specifically, important search terms that could identify studies conducted in the pediatric population were not included (e.g., newborn, infant, youth). Moreover, according to PRESS, the search strategy should be developed with the assistance of a health sciences librarian and reviewed by a second health sciences librarian using the PRESS checklist. Although the authors mention that three librarians were involved, it is unclear whether and how the librarians reviewed each other’s work using the PRESS checklist. Furthermore, the authors claimed that they searched several databases (as listed in Table 1). However, we are concerned about the validity of this statement because ScienceDirect and McCoy Press are journal publishers, not databases. Therefore, we cannot exclude the possibility that the search strategy led to significant selection bias in the included literature.
Third, this review used cut-off points to classify the quality of evidence (i.e., “low quality study if the score was between 0 and 33.3%, medium quality if the score was between 33.4 and 66.6%, and high quality if the score was above 66.6%”). Although cut-points are commonly used to classify levels of evidence, it is important to understand the limitations of this methodology. One important limitation is that the impact of the biases identified by the items is not weighted (i.e., each item is given the same weight regardless of its impact on study results). Therefore, a study with a high score can be deemed to be of high quality even if it suffers from a fatal methodological flaw (e.g., differential attrition rates leading to differential ascertainment of outcomes).
Fourth, the results of each study were categorized to indicate whether a treatment led to: 1) improvement; 2) no improvement; or 3) no difference in participants’ outcomes. However, the criteria used to make these judgements are not clear and can be neither critically appraised nor replicated. Specifically, what consideration was given to statistical significance versus clinical importance, and to within- versus between-group differences? Also, the main descriptive results presented in the data extraction tables (Tables 6–10) provide only a general summary of the results, and the information necessary to interpret them is lacking (e.g., control interventions, effect sizes, statistical significance, precision of estimates, clinical importance). We could not identify any supplemental tables providing such information about control interventions and effect sizes.