
Content validity of manual spinal palpatory exams - A systematic review



Many health care professionals use spinal palpatory exams as a primary and well-accepted part of the evaluation of spinal pathology. However, few studies have explored the validity of spinal palpatory exams. To evaluate the status of the current scientific evidence, we conducted a systematic review to assess the content validity of spinal palpatory tests used to identify spinal neuro-musculoskeletal dysfunction.


A review of eleven databases and a hand search of peer-reviewed literature published between 1966 and 2002 were undertaken. Two blinded reviewers abstracted pertinent data from the retrieved papers, using a specially developed quality-scoring instrument. Five papers met the inclusion/exclusion criteria.


Three of the five papers included in the review explored the content validity of motion tests. Two of these papers focused on identifying the level of fixation (decreased mobility) and one focused on range of motion. All three studies used a mechanical model as a reference standard. Two of the five papers included in the review explored the validity of pain assessment using the visual analogue scale or the subjects' own report as reference standards. Overall the sensitivity of studies looking at range of motion tests and pain varied greatly. Poor sensitivity was reported for range of motion studies regardless of the examiner's experience. A slightly better sensitivity (82%) was reported in one study that examined cervical pain.


The lack of acceptable reference standards may have contributed to the weak sensitivity findings. Given the importance of spinal palpatory tests as part of the spinal evaluation and treatment plan, effort is required by all involved disciplines to create well-designed and implemented studies in this area.



Injuries of the spine and back are classified as the most frequent cause of limited activity among people younger than 45 years [1, 2]. Approximately 10 percent of the adult population has neck pain at any one time [3], and 80% of the population will experience low back pain (LBP) at some time in their lives [4]. Five to 10 percent of the workforce is off work annually because of LBP. Indeed, LBP is second only to headache among the leading causes of pain. Approximately 80–90% of LBP is mechanical (non-organic musculoskeletal dysfunction) in origin [5]. Patients with mechanical spinal pain often seek and receive spinal manipulation by chiropractic, osteopathic and allopathic clinicians, physical therapists or other health care professionals [6].

Health care professionals have utilized spinal palpatory diagnostic procedures and manual manipulative treatment for several millennia to treat back injury and pain [7, 8]. Along with the history of illness and physical exam, examiners utilize specific spinal palpatory diagnostic tests in order to identify spinal neuro-musculoskeletal dysfunction. Spinal neuro-musculoskeletal dysfunction refers to an alteration of spinal joint position, motion characteristics and/or related palpable paraspinal soft tissue changes. The scientific committee of the International Federation of Manual Medicine has stated: "beneficial outcomes and effectiveness of spinal manipulative procedures rely on appropriate and skilled treatment that is based on an accurate diagnosis, which in turn depends upon the accuracy of the palpatory procedures used" [9].

Spinal palpatory procedures have been described in journals [10–12] and textbooks [13–20]. The most common are static palpation of anatomical landmarks for symmetry; palpation of spinal vertebral joints before, during and after active and passive motion tests; and palpatory assessment of spinal and paraspinal soft tissue for abnormalities or altered sensitivity.

Several narrative reviews of the literature on the validity and reliability of spinal palpatory diagnostic procedures have been published [21–28]. However, most reviews are discipline-specific despite the fact that similar spinal palpatory procedures are used across disciplines. Only two systematic reviews of spinal palpatory validity studies have been published [29, 30]. One was a limited review of chiropractic literature on palpatory diagnostic procedures for the lumbar-pelvic spine [29] and the other concentrated on validity studies at the sacroiliac joint [30]. An annotated bibliography [31] and a systematic review of the primary reliability research studies published between 1971 and 2001 are in progress.

Validity and reliability are concepts that are often used interchangeably, but the concepts are quite different. Validity is the accuracy of a measurement of the true state of a phenomenon [32], while reliability measures the concordance, consistency or repeatability of outcomes [25]. However, even if a measurement is consistent and reliable, it is not necessarily valid (e.g., an arrow may consistently hit the target area, but never hit the bulls-eye).

There are various types of validity studies. The concept of validity differs in qualitative and quantitative research [32]. Though it can be argued that palpatory diagnostic procedures are subjective and therefore qualitative, investigators in the field believe they can measure a physiological phenomenon that can be detected by objective means. They maintain that studies addressing the validity of spinal palpatory diagnostic tests are quantitative studies. The types of quantitative validity studies can be distinguished as follows: face validity, construct validity, criterion validity and content validity.

Face validity is the extent to which a test appears to measure what it is supposed to measure. In other words, whether the proposed test seems to provide a reasonable measure of the concept it is intended to measure. For example, spinal vertebral joint motion palpation tests, which aim to detect the presence of hypomobility, have face validity because they seem to be reasonable measures of the concept they are intended to measure [33]. Face validity studies have been criticized for being subjective, intuitive and unsubstantiated. Troyanovich and Harrison [33] pointed out that in spite of the common perception or belief that motion tests are valid and reliable for assessment of presence or absence of restricted vertebral motion, there was no evidence to support this concept. Thus, palpatory vertebral motion diagnostic tests are prime examples of tests accepted on face validity.

Construct validity is the extent to which a test identifies the concept or trait of that which is being measured. A construct is a hypothetical or conceptual idea that may be used to label or explain observed phenomena [34]. For example, taking a dysfunctional vertebral joint as the concept, a test demonstrating the ability to identify the presence or absence of that concept or its related components is said to have construct validity. Feinstein describes construct validity as an appraisal of the effectiveness with which a measure does its job in describing an existing or established construct; i.e., does the measure behave the way one would predict on the basis of the concept it represents? For example, Jull et al. [35] compared cervical spinal static palpation to diagnostic nerve blocks with anesthesia. The construct is that tenderness upon provocative palpation is related to local nerve irritation and nerve conductivity. A local anesthetic nerve block of related spinal segments showed that the identified tender spots no longer elicited a pain response. Thus, they demonstrated that there is a high degree of correlation between the palpatory test that identified a tender spot and the ability of the anesthesia to reverse the results of the provocative test. Therefore, the pain provocative palpatory tests used were demonstrated to have high construct validity.

Construct validity, however, is an artificial framework that is not directly observable [27]. To establish construct validity of a test or measure, the researcher must determine the extent to which the measure correlates with other measures designed to measure the same thing and whether the measure behaves as expected. Construct validity studies do not measure the same phenomena that palpatory procedures are designed to measure (i.e., resistance to digital pressure or motion), but similar phenomena that are believed to be related to the palpable phenomena. Many construct validity studies on diagnostic spinal palpatory tests compare a test's results to another measurement of abnormal physiology in the same region. Studies using thermography [36], electromyography [37], and coronary angiography [38] fall into this category.

There are other examples of construct validity studies using instruments to measure skin temperature, electrical skin resistance and/or gross range of motion to discern a dysfunctional vertebral segment. These measurements are then compared to those obtained by another examiner who utilizes one or several palpatory procedures that assess resistance to joint motion or paraspinal soft tissue abnormalities to help to discern a dysfunctional vertebral segment. Or, one examiner uses pain provocation, and the other palpatory motion restriction sense to assess for a dysfunctional vertebral segment.

Criterion validity measures the extent to which an intervention allows a researcher to predict behavioral or pathological outcomes. Criterion validity studies, therefore, do not measure the phenomenon being palpated, but attempt to correlate the findings of a palpatory procedure with another measurable outcome, such as diagnosed visceral disease. For example, Beal [39] and Tarr [40] studied the ability of physicians using spinal palpatory procedures to identify, or predict, which patients had visceral disease related to the spinal findings of altered structure, motion and/or soft tissue.

Content validity is the extent to which a measure adequately and comprehensively measures what it claims to be measuring. Although Troyanovich and Harrison [41] consider face and content validity as synonymous, there is an important distinction: content validity studies employ a reference standard.

A reference standard (also called "gold standard") is a measure accepted by consensus of content experts as the best available for determining the presence or absence of a particular phenomenon. When there is no perfect reference standard, as in the case of measurement of a patient's sense of pain provocation, i.e., pressing on a "tender point" or "trigger point", then pragmatic criteria can be used as a reference standard [42]. The visual analog pain scale has been used as a pragmatic reference standard for palpatory pain provocation tests.

Ideally, content validity studies attempt to compare a test with a reference standard of the same phenomenon as that which is being palpated, i.e., palpable abnormalities in structure, motion and soft tissue. The Chiropractic Mercy Center Consensus Conference held in January 1993 identified and rated the value of various measurement instruments related to spinal joint functional assessment that could be used as reference standards [43]. Based on their critical review of the literature, Troyanovich and Harrison [44] suggested postural assessment instruments and radiographic measurement as valid, reliable and clinically useful objective measurement tools to help identify dysfunctional spinal vertebral joints.

Based on this brief review, it appears that construct and criterion validity studies do not measure the phenomenon being palpated. Instead they attempt to correlate the findings of a palpatory procedure with another measurable outcome. On the other hand, content validity studies measure the same phenomenon as that which is being palpated. Given how important it is to know whether the diagnostic tests used in palpatory exams are valid, we conducted a systematic review to assess the content validity of spinal palpatory tests used to identify spinal neuro-musculoskeletal dysfunction.


Study setting

The study was conducted at the Susan Samueli Center for Complementary and Alternative Medicine (University of California-Irvine [UCI]). A multi-disciplinary team of clinicians, researchers, a statistician, and a health sciences librarian participated in the systematic review. The clinicians represented content area expertise in osteopathic and chiropractic medicine, family medicine, and clinical research. In addition, the researchers had expertise and experience in evidence-based medicine, research design and methodology.

Inclusion / Exclusion Criteria

The study inclusion/exclusion criteria were adapted and modified from those published previously by the Cochrane Collaboration [45] and others [46, 47]. Studies included in the review met the following six criteria: 1) the studies pertained to manual spinal (cervical, thoracic, lumbar, and surrounding para-spinal soft tissue but not the sacrum or pelvis) palpation procedures; 2) the studies included measurement of validity or accuracy of spinal palpation, where validity was defined as the capability of the manual spinal palpation procedure to do what it is supposed to do and accuracy was defined as a measure of how well it actually does that (content validity); 3) the studies were dissertations or primary research studies published in a peer-reviewed journal; 4) the document could be written in any language; 5) the primary research must have been published or accepted for publication; and 6) all studies were made available between January 1, 1966 and September 30, 2002. Studies were excluded from the review based on the following criteria. First, the data pertained to non-manual procedure(s). Second, the studies included a whole regimen of tests or methods without separate data for each test, and/or the data for the spinal palpatory procedure could not be retrieved. Third, although the document retrieved was relevant to the subject matter, it was anecdotal, speculative, or editorial in nature. Fourth, the document retrieved was inconsistent with the inclusion criteria (see Additional file 2). After review of the retrieved papers, a secondary exclusion criterion, the use of inappropriate statistical tests, was applied. Appropriate statistical tests included: sensitivity and specificity, predictive value, likelihood ratio, diagnostic odds ratio, and Receiver Operating Characteristic (ROC) curve analysis.

Search strategy

A comprehensive strategy was designed to conduct a detailed search of pertinent literature that addressed the study question, "What is the content validity of spinal palpatory tests used to identify spinal neuro-musculoskeletal dysfunction?" Specifics on the search strategy are described in another paper [48]. In brief, our search strategy included both online and manual searches for appropriate literature. For the online search of literature, we defined a detailed search template, which we applied to appropriate databases. The basic search template included MeSH, Descriptors (from MANTIS, Biosis, etc.), Medical Subject headings from CINAHL, and related key terms generated by the investigators from the review team (see Additional file 3). The template broke the research question down into four key concepts: validity/validity findings, spine, palpation procedure, and neuro-musculoskeletal dysfunctions.

Limits for the search template included: human studies, publications in all languages, journal articles (research articles and conference proceedings if in press), dissertations, and publications between January 1, 1966 and September 30, 2002. We applied the search template, with minor modifications to optimize and enhance the search outcome of individual databases, to 11 databases that had potential coverage of the areas of osteopathic medicine, allopathic medicine, chiropractic, and physical therapy. The databases accessed by the project included: PubMed MEDLINE, MANTIS, CINAHL, Web of Science, Current Contents, BIOSIS, EMBase, OCLC FirstSearch, Cochrane, Osteopathic Database, and Index to Chiropractic Literature. The selection of databases was based primarily on the availability of online resources that we could access from our affiliated institution libraries.

In addition to the online literature search strategy, we used manual methods to identify appropriate literature. These manual methods included gleaning references that were cited in studies selected from the online search, consulting experts in the fields of chiropractic and osteopathic medicine, contacting authors of eligible conference abstracts, and manually searching bibliographies of osteopathic text-books and review articles on somatic dysfunction.

Review strategy

We used a three-step selection process to identify articles for the systematic review. First, we reviewed titles identified through the online search, and excluded those which gave no indication that the studies pertained to validity. Second, we reviewed the abstracts of all the remaining studies identified through the application of our search template, and excluded studies that did not meet the inclusion criteria. Third, we reviewed the complete paper and applied the inclusion/exclusion criteria to studies included at step two.

In all, based on the online and manual searches, 48 studies were fully reviewed. Five studies met the inclusion/exclusion criteria for the systematic review. The remaining 43 studies were excluded because they did not study spinal palpation procedures, did not assess content validity, and/or did not use appropriate statistical tests (see Additional file 1). Several of the abstracts reviewed at step two of the selection process did not make the study's focus clear (spinal palpation, type of validity studied).

Review instruments

Two instruments were developed to extract the data and assess the quality of the studies reviewed. The instruments were developed taking into consideration previously published guidelines [49, 50] and instruments [51–55]. To maximize objectivity in the evaluation of paper quality, a checklist of quality factors was developed and transformed into a quality assessment instrument. The factors were grouped into 7 major components of quality: study subjects, examiner characteristics, the reference standard used, palpatory test, study conditions, data analysis and presentation of results (see Table 2).

Table 2 Quality scoring criteria, total weight and total score assigned.

Detailed information on the 7 components identified to denote internal validity and quality of a study were abstracted and scored. In terms of the subject characteristics, we considered criteria such as their socio-demographic description, presentation characteristics and severity of symptoms, selection criteria and sample size determination procedures, sample size and recruitment procedures. Information regarding the examiners pertained to their selection criteria, sample size, and background. The reference standard (if used) and palpatory procedure information pertinent to the quality scoring included a description of the tests, their reliability and expected outcomes, and definition of positive or negative test results. The study conditions were documented with regards to consensus on and description of the palpatory procedure, the training of examiners in the procedure, and blinding of examiners and subjects. For information on the data analysis and results, we abstracted information on the type of statistical procedure(s) used to assess validity and how the results were displayed and described.

The quality assessment instrument focused mainly on the internal validity, taking into consideration biases reported previously namely: selection, performance, measurement, and attrition bias. A weight was assigned to each criterion based on a group consensus. A maximum score of 100 points was set. In designing this instrument we differentiated between quality of an article (i.e. conduct of the trial and reproducibility) and validity, which relates to the ability of the study to answer the research question. The data extraction and quality assessment instruments were structured to mirror each other and facilitate the review and scoring.

Using the quality assessment instrument, each article was reviewed and scored on the seven major components, discussed above, by two blinded reviewers (title, names of author(s) and journal were removed). The quality scores included an "absolute" score (i.e., total points received on all seven components of the quality assessment form) and a "relative" score (i.e., [absolute score / total score that could be obtained] × 100). The relative score was especially important for studies in which certain quality scoring components were inapplicable (e.g., the subject criteria were inapplicable for studies that used mechanical models or measures). An article's score (absolute or relative) indicated its quality in terms of its internal validity criteria (whether conclusions drawn from the study are likely to be unbiased) and the authors' explicit description of the study. Although important, the quality score does not imply a study's significance or impact (in terms of findings or relevance to the discipline). Based on prior recommendations, the overall quality of studies was assessed through the summary scores, and the methodological issues pertinent to the internal validity of a study were assessed individually and their influence explored [55].
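The relative score described above can be sketched in a few lines of Python. This is only an illustration of the stated formula; the function name, the 15-point weight of the inapplicable component and the 68 points awarded are hypothetical values, not figures from the review instrument.

```python
def relative_score(absolute_score: float, attainable_score: float) -> float:
    """Relative quality score: absolute score as a percentage of the attainable score."""
    if attainable_score <= 0:
        raise ValueError("attainable_score must be positive")
    if not 0 <= absolute_score <= attainable_score:
        raise ValueError("absolute_score must lie between 0 and attainable_score")
    return absolute_score / attainable_score * 100


# Hypothetical values, for illustration only (not taken from the reviewed studies).
MAX_TOTAL = 100            # maximum score stated in the text
inapplicable_weight = 15   # assumed weight of a component that does not apply
absolute = 68.0            # assumed points awarded on the applicable components

attainable = MAX_TOTAL - inapplicable_weight
print(f"relative score: {relative_score(absolute, attainable):.1f}")
```

Dividing by the attainable score rather than by 100 keeps a mechanical-model study from being penalized for components, such as subject characteristics, that cannot apply to it.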

A pilot test of the data extraction and quality assessment instruments was conducted on four articles randomly selected from the 48 studies evaluated during step three of the study selection process. After completion of the pilot test, we made changes to further clarify and simplify the instruments. For the final review, the articles were blinded to journal, title and author, and randomly assigned to a pair of reviewers. In all, six reviewers (three pairs) conducted the final review, abstracted pertinent data and scored each article based on the quality assessment instrument.

We used descriptive statistics on the quality assessment data to determine agreement or disagreement within each pair of reviewers, and to present the data. The descriptive statistics included the standard deviation (S.D.) to mean ratio, histograms and variability. To gauge consensus between the paired reviewers on the scoring of each article, we expressed the S.D. of their two scores as a percentage of the mean score. Agreement on quality scores was defined as an S.D./Mean ratio of less than 10% between the paired reviewers' scores on an article. When the S.D./Mean ratio equaled or exceeded 10%, the pair of reviewers attempted to reach a consensus on each of the criteria where disagreement existed. When reviewers failed to arrive at a consensus on the quality score, two content experts reviewed and scored the topic in contention by consensus.
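The agreement rule can be illustrated with a short Python sketch. The text does not state whether a sample or population standard deviation was used; the sketch assumes the sample S.D. (`statistics.stdev`), and the function names and example scores are hypothetical.

```python
from statistics import mean, stdev

AGREEMENT_THRESHOLD = 10.0  # percent, the cut-off stated in the text


def sd_mean_ratio(scores: list[float]) -> float:
    """S.D./Mean ratio of the reviewers' scores, expressed as a percentage.

    Assumption: the sample standard deviation is used (not specified in the text).
    """
    m = mean(scores)
    if m == 0:
        raise ValueError("mean score is zero; ratio is undefined")
    return stdev(scores) / m * 100


def reviewers_agree(score_a: float, score_b: float,
                    threshold: float = AGREEMENT_THRESHOLD) -> bool:
    """True when the paired scores differ by less than the threshold ratio."""
    return sd_mean_ratio([score_a, score_b]) < threshold


# Hypothetical score pairs, for illustration only.
for pair in [(80.0, 84.0), (60.0, 75.0)]:
    ratio = sd_mean_ratio(list(pair))
    status = "agreement" if reviewers_agree(*pair) else "consensus discussion needed"
    print(f"{pair}: S.D./Mean = {ratio:.1f}% -> {status}")
```

Under this assumption, a pair is flagged for consensus discussion as soon as the ratio reaches 10%, which mirrors the "equaled or exceeded" wording of the rule.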


Study description

A total of five studies, from the 48 articles retrieved and reviewed, met our inclusion criteria for content validity and are discussed in this study (Table 3). The remaining 43 studies [56–98] were retrieved, reviewed and excluded from our study because they either did not address manual palpation procedure(s), did not pertain to content validity but focused on construct, predictive, or criterion validity, or used inappropriate statistics (see Additional file 1). Four studies were published in 4 different journals and the fifth study included is a dissertation [99]. Two studies were unfunded (1 dissertation [99] and 1 that did not report any funding [100]). Two studies [101, 102] were funded by a Research Council and a liability insurance provider, and one study [103] was funded by the Chiropractic Advancement Association.

Table 3 Included studies: details Examiner / Subject / Design / & Examiner blinding


The three motion palpation studies were done in the United Kingdom. All three studies utilized mechanical models as the study subjects as well as the reference standard. The two pain studies were done in Sweden. One study [101] recruited only pregnant female subjects (n = 200, representing a 90% response rate: 200/222), while the other study [102] recruited an entirely male population (n = 75; the response rate was not reported) with acute (< 1 week) neck pain.


Senior chiropractic students and/or experienced (>3 yrs) practitioners were the examiners in the three motion palpation studies. One physical therapist was the examiner in the cervical spine pain provocation study. The lumbar spine pain provocation study [101] did not specify the background of the examiner(s).


All the studies used a prospective study design. In 4 studies the examiners were blinded to fixation levels or clinical presentation. In one pain study [101] blinding was not described.


Among the three studies using mechanical models, 2 [100, 103] looked at intersegmental motion restriction, and one [99] looked at the ability to determine fixation levels. The mechanical model was the reference standard used.

The two pain studies used digital pressure and percussion to elicit pain. Visual Analog Scale (VAS) and pain reported by subjects were used as reference standards. Reliability of the palpation procedure was not reported in any papers with the exception of 1 [103] looking at motion palpation in a mechanical model.

Quality Scoring Findings

In general, the quality score indicates the rigor with which the science was presented in the paper. Quality scores of included studies ranged from 45.5 to 82 out of a possible 100. The overall quality of the included studies was good for those focusing on motion palpation (69.5–82), and fair for those looking at pain (45.5–55.5) (see Table 4). Discussion of examiners and study conditions were the two major areas where weakness was noted in the two pain studies, but not in the motion palpation studies. Statistical tests used were adequate for all studies (this was one of the inclusion criteria). All studies were done in the 1990s; hence the time factor was not felt to be contributive.

Table 4 Average Quality Scores given in each of the 7 major criteria and the total and relative scores for each included article.

Study findings

Motion Palpation Tests

The three studies examining motion palpation were similar in using a mechanical model as the reference standard and focusing on the lumbar spine only. While two studies used similar examiner groups and motion test, the third study [99] looked only at one group of examiners using two different motion test procedures.

Two studies [100, 103] looked at intersegmental motion restriction, using sagittal and coronal motion as determined by two groups of chiropractic examiners with different experience levels (senior students and practitioners). Both studies presented data on sensitivity (ability of a test to detect correctly restricted motion segments) and specificity (ability of a test to detect correctly unrestricted motion segments). The sensitivity for both groups in each study varied between 0.510 and 0.636, and the specificity from 0.868 to 0.902, indicating less ability to detect restricted motion segments than unrestricted motion segments. The sensitivity for practitioners in both studies was poor (0.478 and 0.526). For students, the sensitivity was lower in the Harvey study (0.538) than the Jensen study (0.720).

Based on the data provided in each of the studies, we calculated the positive and negative predictive values (PPV; NPV) and the likelihood ratio (LR) for each group. The PPV was less than 50.0% in both studies, for both groups (42.3–46.2%) and for each subgroup, while the NPV was greater than 80% (83–93.7%), supporting the above observation that these tests are better at detecting unrestricted than restricted motion (Table 5).
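The quantities discussed here follow from a 2×2 comparison of the palpatory test against the reference standard. The Python sketch below shows the standard definitions; the class name and the counts are hypothetical and are not data from the included studies.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a palpatory test against a reference standard."""
    tp: int  # restricted segments correctly called restricted
    fp: int  # unrestricted segments wrongly called restricted
    fn: int  # restricted segments wrongly called unrestricted
    tn: int  # unrestricted segments correctly called unrestricted

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def lr_positive(self) -> float:
        # Sensitivity divided by the false positive rate (1 - specificity).
        false_positive_rate = self.fp / (self.fp + self.tn)
        if false_positive_rate == 0:
            return math.inf
        return self.sensitivity() / false_positive_rate

    def lr_negative(self) -> float:
        # False negative rate (1 - sensitivity) divided by specificity.
        false_negative_rate = self.fn / (self.tp + self.fn)
        spec = self.specificity()
        if spec == 0:
            return math.inf
        return false_negative_rate / spec


# Hypothetical counts, for illustration only (not from the reviewed studies).
table = TwoByTwo(tp=30, fp=40, fn=20, tn=310)
print(f"sensitivity={table.sensitivity():.3f} specificity={table.specificity():.3f}")
print(f"PPV={table.ppv():.3f} NPV={table.npv():.3f}")
print(f"LR+={table.lr_positive():.2f} LR-={table.lr_negative():.2f}")
```

With counts of this shape, where unrestricted segments far outnumber restricted ones, a test can show a high NPV and a PPV below 50% at the same time, which is the pattern reported for the motion palpation studies.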

Table 5 Statistical analysis for Motion Palpation Studies using students and experienced practitioners

The third motion palpation study [99] looked at intersegmental motion restriction as determined by lateral flexion and posterior-anterior springing (PAS). Examiners were 50 senior chiropractic students. Sensitivity for lateral flexion was 41.2% and for PAS 42.8%, while specificity for lateral flexion was 61.5% and PAS was 62.2% indicating that the motion palpation procedures utilized were neither sensitive nor specific for detecting spinal segmental motion restriction. The calculated PPV (< 31.0%) and NPV (73.7% for both tests) supported this conclusion.

Pain Provocation

The two studies differed in procedure location (cervical vs. thoracic & lumbar), reference standard (VAS vs. subjective patient report), provocation test used and population studied.

The cervical study [102] assessed presence or absence of pain as reported by the subjects upon palpation of their facet joints. Sensitivity (ability of the test to identify presence of pain in subjects reporting pain symptoms) was 82% and specificity (ability of the test to identify the absence of pain in asymptomatic subjects) was 79%, the PPV was 62% and NPV was 91%. The results indicate that the test procedure, as performed, is moderately good at identifying subjects with neck pain and very good at identifying asymptomatic subjects.
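Because predictive values depend on how common the condition is in the sample, it can help to see how a reported sensitivity and specificity translate into likelihood ratios and into PPV and NPV at a given prevalence. The Python sketch below uses the sensitivity (82%) and specificity (79%) reported for the cervical study; the prevalence of 0.30 is an assumption chosen purely for illustration and is not a figure taken from that study.

```python
import math


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = math.inf if specificity == 1 else sensitivity / (1 - specificity)
    lr_neg = math.inf if specificity == 0 else (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) at a given prevalence, using Bayes' theorem."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined: no positive or no negative results")
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Sensitivity and specificity as reported for the cervical study in the text.
SENS, SPEC = 0.82, 0.79
ASSUMED_PREVALENCE = 0.30  # illustrative assumption, not a value from the study

lr_pos, lr_neg = likelihood_ratios(SENS, SPEC)
ppv, npv = predictive_values(SENS, SPEC, ASSUMED_PREVALENCE)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%} at prevalence {ASSUMED_PREVALENCE:.0%}")
```

Rerunning `predictive_values` with a lower assumed prevalence lowers the PPV and raises the NPV, which is one reason predictive values from a single study sample should be generalized with caution.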

The thoracic and lumbar spine study [101] used the VAS as the reference standard and assessed the relationship between the clinical back status and reported pain locations during and after pregnancy. Two types of pain provocation tests were used: digital pressure (within 5 cm of the midline) and lumbar percussion. In the thoracic region, digital pressure (DP) sensitivity was 17.8%, specificity was 98.5%, calculated PPV was 72.2% and NPV was 84.4%. In the lumbar region: DP sensitivity was 21.2%, specificity was 96.19%, calculated PPV was 61.76% and NPV was 80.83%; lumbar percussion sensitivity was 5.1%, specificity was 100%, calculated PPV was 100% and NPV was 78.4% (see Table 6). These results suggest that the thoracic DP test was better at identifying asymptomatic than symptomatic subjects. Both tests performed in the lumbar region were unable to discriminate adequately between subjects.

Table 6 Spinal focus of the study, Reference standard used, Primary outcome, Statistics, and Author's conclusion.


To the best of our knowledge, this is the first comprehensive systematic review of literature on the content validity of spinal palpatory procedures. To reiterate, it is imperative to focus on studies assessing content validity of procedures since, by definition, they attempt to measure the same phenomenon as that which is being palpated. Studies with a focus on other forms of validity (i.e., face, construct and criterion), although important, provide information which does not directly answer the question, "Does the procedure (i.e., palpation) measure (or assess) the phenomenon it is supposed to assess?" but attempt to correlate the findings of a palpatory procedure with another measurable outcome.

The systematic review revealed several methodological, reporting and research issues which severely constrained integrative, qualitative and quantitative evaluations such as systematic reviews and meta-analysis. The evaluation of the validity of spinal palpatory procedures has a number of methodological challenges. In particular, there is no agreed upon reference or "gold" standard measuring device for spinal palpatory procedures. A reference standard is the best available independently established test/procedure used to determine the presence or absence of a phenomenon. In the absence of well-established reference standards, one would use other research designs, such as pragmatic criteria (e.g., pain scales), independent expert panels, clinical follow-up (delayed type cross sectional study), standardized protocols or prognostic criteria [104]. One may also use the most reproducible and reliable test or the most experienced examiner as a reference standard. Some designs utilize invasive procedures, e.g., surgery, histopathology or angiography or a combination of tests to serve as a reference standard.

It is important to identify a reference standard to which a palpatory diagnostic test is compared to ensure that it actually measures what it purports to measure (i.e., that a test for resistance to motion actually measures resistance to motion). Spinal palpatory diagnostic procedures, such as vertebral joint motion restriction assessment, are difficult to measure objectively in humans. The concept of a neuro-musculoskeletal spinal dysfunction that is corrected by non-invasive manual spinal manipulation has no agreed-upon reference standard. Typically, a conglomerate of findings of altered position, motion characteristics and paraspinal soft tissue feel is necessary to make the diagnosis. Altered position can be validated by X-rays. Altered motion has been difficult to validate due to the difficulty of finding a suitable reference standard. However, in order to assess an examiner's ability to discern resistance to vertebral joint motion, the plastic spinal model with an artificially fixed vertebral segment has been employed as a reference standard. Altered tissue feel can be validated in part by measuring skin moisture, temperature, friction, and resistance to pressure. A reference standard used for palpatory pain provocation tests has been the visual analog or numeric pain scale [105–107].

Given that face, construct and criterion validity studies do not measure the phenomenon being palpated, but attempt to correlate the findings of a palpatory procedure with another measurable outcome, only content validity studies, which attempt to measure the same phenomenon as that which is being palpated, were included in this systematic review.

Physicians (orthopedists, physiatrists, neurologists, emergency medicine, family medicine, sports medicine, etc.), chiropractors, massage therapists, osteopaths, and physical therapists use manual palpatory exams regularly in their practice. However, very few studies (n = 5) have attempted to assess the content validity (as defined in this paper) of these widely used tests. Among the few validity studies identified, motion palpation tests were evaluated only by chiropractors and pain studies only by physical therapists.

In this review, five studies focused on three types of tests: fixation (n = 2), range of motion (n = 1) and pain (n = 2). The quality scores of the motion palpation studies were good; however, all the tests had poor sensitivity. This indicates that the motion palpatory tests (intersegmental, lateral flexion and posterior-anterior springing) are not able to identify areas of fixation or motion restriction. A poor positive predictive value (PPV) supported this finding. The pain provocation studies reported good validity for evaluation of pain in the cervical region but not in the lumbar area. This result confirms the results of a previously published study [108] indicating a higher sensitivity for identifying pain in the cervical region compared to the lumbar spine.

Unfortunately, most of the reported study results are not comparable due to variability in the palpatory tests, terminology, research design, methodology and statistical analysis utilized. These inconsistencies make it difficult to rate the relative value of their results. There is a worldwide concerted effort underway to rectify this problem. The International Federation of Manual Medicine (FIMM), an international organization of physicians and surgeons who practice manual medicine, held its General Assembly in Chicago in July 2001. At that meeting, its Scientific Committee reported that its top priority is to promote validity, reliability, sensitivity and specificity studies of spinal palpatory diagnostic procedures. The committee recently developed guidelines ("Protocol Formats") on how to perform high-quality validity and reliability studies of spinal palpatory procedures, which are available on the FIMM web site [9]. They recommend the use of valid palpatory tests so that homogeneous populations with spinal musculoskeletal dysfunction can be selected and treated as part of a controlled clinical trial. The results of these trials can subsequently be combined using meta-analysis and would help formulate guidelines for the practice of spinal manipulation.

It is difficult to translate these results into the clinical setting because of the limited number of studies and the narrow range of anatomical sites and populations studied. Also, an argument could be advanced that the use of a mechanical model may not have external validity when applied to human subjects. All three motion palpation studies used a mechanical model as both the subject and the reference standard, and focused on the lumbar spine. Findings indicate poor validity of the motion palpation tests. The two pain studies are of fair to poor quality. One focused on examining pain in the lumbar spine of pregnant women, and the other on pain in the cervical spine among men with acute injuries.

To translate these results into the clinical setting, additional studies exploring the content validity of spinal palpatory exams, using accepted reference standards, are needed. Identifying a perfect (error-free) reference standard for each palpatory test is challenging. Even widely accepted reference standards (e.g., histopathology) are imperfect [109]. Therefore, identifying a perfect reference standard is not as important as identifying an acceptable reference standard. Content experts in this field should come to an agreement on acceptable reference standards for spinal palpatory tests.
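The effect of an imperfect reference standard on measured accuracy can be illustrated with a short calculation. The sketch below is not a method used in this review; it assumes that the errors of the palpatory test and of the reference standard are conditionally independent given the true (unobserved) condition, and all input values are hypothetical.

```python
from typing import Tuple


def apparent_accuracy(
    se_test: float, sp_test: float,
    se_ref: float, sp_ref: float,
    prevalence: float,
) -> Tuple[float, float]:
    """Sensitivity and specificity a test appears to have when scored
    against an imperfect reference standard.

    Assumption: test and reference errors are conditionally independent
    given the true condition.
    """
    p, q = prevalence, 1.0 - prevalence
    # Probability that both test and reference are positive / negative,
    # summed over truly affected (p) and truly unaffected (q) subjects.
    both_pos = p * se_test * se_ref + q * (1 - sp_test) * (1 - sp_ref)
    both_neg = p * (1 - se_test) * (1 - se_ref) + q * sp_test * sp_ref
    # Probability that the reference alone is positive / negative.
    ref_pos = p * se_ref + q * (1 - sp_ref)
    ref_neg = p * (1 - se_ref) + q * sp_ref
    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical values: a test with true sensitivity and specificity of 0.90,
# judged against a reference with sensitivity 0.80 and specificity 0.90.
se_apparent, sp_apparent = apparent_accuracy(0.90, 0.90, 0.80, 0.90, prevalence=0.30)
print(f"apparent sensitivity: {se_apparent:.3f}")
print(f"apparent specificity: {sp_apparent:.3f}")
```

Under these assumed values, both apparent measures fall below the true value of 0.90, which shows how errors in the reference standard alone can make a test look less accurate than it is. With a perfect reference (sensitivity and specificity of 1.0) the function returns the true values.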

This review is unique in several respects: a) it was conducted cooperatively by a multidisciplinary team of researchers and content experts; b) it was not limited to any specific discipline or language; c) its focus on content validity is practical and clinically relevant to practitioners and researchers; and d) great effort and detail went into the development of the search strategy, inclusion/exclusion criteria and quality-scoring instrument.

The search strategy included 11 databases and was run three times using general and specific keywords and strategies to verify results. The quality-scoring instrument was developed taking into consideration the strengths and weaknesses of published instruments, recommendations by the QUOROM [110] and CONSORT [111, 112] statements, and the Cochrane criteria. In addition, this study makes a contribution to the field of manipulation, and to medicine in general, by highlighting the limited research and reference standards in this field. It also provides future researchers with a guideline for designing a successful content validity study.

As with a majority of reviews, this is a retrospective review, which makes it susceptible to potential sources of bias (publication quality). The focused definition used for content validity limits the studies that are included in this review. However, this strategy allowed more clarity since only content validity studies were included in this systematic review. Despite the safeguards used to make our search inclusive (multiple databases, hand search, review by experts, and multiple searches), a few published studies not indexed in these databases could have been missed.

The quality assessment tool used for this review was developed by this team of researchers based on their evaluation of the literature and on feedback from methodologists and statisticians. Although we feel that the instrument is well balanced and unbiased, it might have over- or underestimated the quality of certain papers. When comparing the quality scores assigned to studies included in this paper to scores assigned to the same papers in another systematic review [27], one notes that our scores are consistently lower.


Despite the use of manual spinal palpation by many health care disciplines, very few studies have investigated the ability of these procedures to measure what they are intended to measure (content validity). Given the high frequency of spinal pathology and the use of these diagnostic methods to investigate it, well-designed studies are needed. For the practice of evidence-based medicine, it is important to assess the efficacy and effectiveness of procedures usually and customarily used in clinical practice. To this end, established benchmarks for the validity and reliability of procedures are essential.

This comprehensive systematic review has highlighted serious gaps in our knowledge about the accuracy of spinal palpatory procedures. The findings have implications for research, clinical practice, and policy. From the research perspective, researchers across disciplines need to bring more rigor to the definition of the study questions, methods and measures, implementation procedures, and reporting. The absence of well-identified reference standards and possible technical difficulties in conducting these studies might have contributed to the scarcity of such studies.

From the clinical perspective, the findings suggest poor sensitivity of the range of motion and pain diagnostic tests in the evaluation of spinal dysfunction. From a policy perspective, given that manual procedures are a cornerstone of diagnostic and therapeutic interventions across disciplines, professional societies and associations need to enact continuing medical education and research guidelines to address the efficacy of spinal palpatory procedures.


  1. 1.

    Andersson GB: Epidemiologic aspects on low-back pain in industry. Spine. 1981, 6: 53-60.

    CAS  PubMed  Google Scholar 

  2. 2.

    Loeser JD, Volinn E: Epidemiology of low back pain. Neurosurg Clin N Am. 1991, 2: 713-718.

    CAS  PubMed  Google Scholar 

  3. 3.

    Hadler NM: Illness in the workplace: the challenge of musculoskeletal symptoms. J Hand Surg [Am]. 1985, 10: 451-456.

    CAS  Google Scholar 

  4. 4.

    Andersson GB: Epidemiological features of chronic low-back pain. Lancet. 1999, 354: 581-585. 10.1016/S0140-6736(99)01312-4.

    CAS  PubMed  Google Scholar 

  5. 5.

    Deyo RA: Rethinking strategies for acute low back pain. Emergency Medicine. 1995, 38-56.

    Google Scholar 

  6. 6.

    Hart LG, Deyo RA, Cherkin DC: Physician office visits for low back pain. Frequency, clinical evaluation, and treatment patterns from a U.S. national survey. Spine. 1995, 20: 11-19.

    CAS  PubMed  Google Scholar 

  7. 7.

    Schiotz EH, Cyriax JH: Manipulation past and present : with an extensive bibliography. 1975, London: Heinemann Medical

    Google Scholar 

  8. 8.

    Lomax E: A Historical Perspective from Ancient Times to the Modern Era. The Research Status of Spinal Manipulative Therapy. Edited by: M G. 1975, Bethesda: Maryland: US Dept. of Health, Education and Welfare, 11-17.

    Google Scholar 

  9. 9.

    Patijn J, Editor: FIMM (International Federation for Manual/Musculoskeletal Medicine) Scientific Committee: Reproducibility and Validity Studies of Diagnostic Procedures in Manual/Musculoskeletal Medicine for Low Back Pain Patients. Protocol formats. []

  10. 10.

    Dinnar U, Beal MC, Goodridge JP, Johnston WL, Karni Z, Mitchell FL, Upledger JE, McConnell DG: Description of fifty diagnostic tests used with osteopathic manipulation. J Am Osteopath Assoc. 1982, 81: 314-321.

    CAS  PubMed  Google Scholar 

  11. 11.

    Walker BF, Buchbinder R: Most commonly used methods of detecting spinal subluxation and the preferred term for its description: a survey of chiropractors in Victoria, Australia. J Manipulative Physiol Ther. 1997, 20: 583-589.

    CAS  PubMed  Google Scholar 

  12. 12.

    Walker BF: The Reliability of Chiropractic methods used for the Detection of Spinal Subluxation. An Overview of the Literature. Aust Chiro Osteo. 1996, 5: 12-22.

    CAS  Google Scholar 

  13. 13.

    Ward RC: Foundations for osteopathic medicine. 2003, Philadelphia: Lippincott Williams & Wilkins

    Google Scholar 

  14. 14.

    Chaitow L: Palpation skills : assessment and diagnosis through touch. 1997, New York: Churchill Livingstone

    Google Scholar 

  15. 15.

    Bergman T, Peterson D, Lawrence D: Chiropractic technqiues. Principles and procedures. 1993, New York: Churchill-Livingstone

    Google Scholar 

  16. 16.

    Kaltenborn F: The spine: basic evaluation and mobilization techniques. 1993, Minneapolis, MN: OPTP

    Google Scholar 

  17. 17.

    Maitland G, ed: Maitland's vertebral manipulation. 2000, Oxford : Butterworth-Heinemann

    Google Scholar 

  18. 18.

    Lewit K: Manipulations therapy in rehabilitation of the locomotor system. 1985, Boston: Butterworths

    Google Scholar 

  19. 19.

    Maigne R, ed: Douleurs d'origine vertébrale et traitements par manipulations. English. Orthopedic medicine : a new approach to vertebral manipulations. 1972, Springfield, Ill.: C.C. Thomas

    Google Scholar 

  20. 20.

    Wadsworth C: Manual examination and treatment of the spine and extremities. 1988, Baltimore: Williams and Wilkins

    Google Scholar 

  21. 21.

    Johnston WL: Interexaminer reliability studies: spanning a gap in medical research – Louisa Burns Memorial Lecture. J Am Osteopath Assoc. 1982, 81: 819-829.

    CAS  PubMed  Google Scholar 

  22. 22.

    Gonnella C, Paris SV, Kutner M: Reliability in evaluating passive intervertebral motion. Phys Ther. 1982, 62: 436-444.

    CAS  PubMed  Google Scholar 

  23. 23.

    Panzer DM: The reliability of lumbar motion palpation. J Manipulative Physiol Ther. 1992, 15: 518-524.

    CAS  PubMed  Google Scholar 

  24. 24.

    Keating JC, Jacobs GE: Inter- and intraexaminer reliability of palpation for sacroiliac joint dysfunction. J Manipulative Physiol Ther. 1989, 12: 155-158.

    PubMed  Google Scholar 

  25. 25.

    Haas M: The reliability of reliability. J Manipulative Physiol Ther. 1991, 14: 199-208.

    CAS  PubMed  Google Scholar 

  26. 26.

    Russell R: Diagnostic palpation of the spine: a review of procedures and assessment of their reliability. J Manipulative Physiol Ther. 1983, 6: 181-183.

    CAS  PubMed  Google Scholar 

  27. 27.

    Alley J: The clinical value of motion palpation as a diagnostic tool. Journal of the Canadian Chiropractic Association. 1983, 27: 97-100.

    PubMed Central  Google Scholar 

  28. 28.

    Huijbregts P: Spinal motion palpation: a review of reliability studies. The Journal of Manual & Manipulative Therapy. 2002, 10: 24-39.

    Google Scholar 

  29. 29.

    Hestbaek L, Leboeuf-Yde C: Are chiropractic tests for the lumbo-pelvic spine reliable and valid? A systematic critical literature review. J Manipulative Physiol Ther. 2000, 23: 258-275. 10.1067/mmt.2000.106097.

    CAS  PubMed  Google Scholar 

  30. 30.

    van der Wurff P, Meyne W, Hagmeijer RH: Clinical tests of the sacroiliac joint. Man Ther. 2000, 5: 89-96. 10.1054/math.1999.0229.

    CAS  PubMed  Google Scholar 

  31. 31.

    Seffinger M, Adams A, Najm W, Dickerson V, Mishra S, Reinsch S, Murphy L: Spinal palpatory diagnostic procedures utilized by practitioners of spinal manipulation: Annotated bibliography of reliability studies. Journal of the Canadian Chiropractic Association. 2003, 47: 89-105.

    Google Scholar 

  32. 32.

    Winter G: A comparative discussion of the notion of 'validity' in qualitative and quantitative research. The Qualitative Report. 2000, 4: []

    Google Scholar 

  33. 33.

    Troyanovich SJ, Harrison DD, Harrison DE: it's time to accept the evidence. J Manipulative Physiol Ther. 1998, 21: 568-571.

    CAS  PubMed  Google Scholar 

  34. 34.

    Feinstein A: Clinemetrics. 1987, New Haven: Yale University Press

    Google Scholar 

  35. 35.

    Jull G, Bogduk N, Marsland A: The accuracy of manual diagnosis for cervical zygapophysial joint pain syndromes. Med J Aust. 1988, 148: 233-236.

    CAS  PubMed  Google Scholar 

  36. 36.

    Kelso AF, Grant RG, Johnston WL: Use of thermograms to support assessment of somatic dysfunction or effects of osteopathic manipulative treatment: preliminary report. J Am Osteopath Assoc. 1982, 82: 182-188.

    CAS  PubMed  Google Scholar 

  37. 37.

    Vorro J, Johnston WL: Clinical biomechanic correlates of cervical dysfunction: Part 4. Altered regional motor behavior. J Am Osteopath Assoc. 1998, 98: 317-323.

    CAS  PubMed  Google Scholar 

  38. 38.

    Kelso AF, Larson NJ, Kappler RE: A clinical investigation of the osteopathic examination. J Am Osteopath Assoc. 1980, 79: 460-467.

    CAS  PubMed  Google Scholar 

  39. 39.

    Beal MC: Palpatory testing for somatic dysfunction in patients with cardiovascular disease. J Am Osteopath Assoc. 1983, 82: 822-831.

    CAS  PubMed  Google Scholar 

  40. 40.

    Tarr RS, Feely RA, Richardson DL, Mulloy AL, Nelson KE, Perrin WE, Allin EF, Efrusy ME, Greenstein SI, Vatt RD: A controlled study of palpatory diagnostic procedures: assessment of sensitivity and specificity. J Am Osteopath Assoc. 1987, 87: 296-301.

    CAS  PubMed  Google Scholar 

  41. 41.

    Troyanovich SJ: The reliability and validity of chiropractic assessment procedures. Chiropractic Technique. 1996, 8: 10-13.

    Google Scholar 

  42. 42.

    Knottnerus J, ed: The evidence base of clinical diagnosis. 2002, London : BMJ Books

    Google Scholar 

  43. 43.

    Haldeman S, Chapman-Smith D, Peterson DJ, eds: Guidelines for chiropractic quality assurance and practice parameters. 1993, Burlingame, CA: Aspen Publishers

    Google Scholar 

  44. 44.

    Troyanovich SJ, Harrison DD: In reply to letter to the editor. J Manipulative Physiol Ther. 1999, 22: 182-183.

    Google Scholar 

  45. 45.

    Clarke M, Oxman A, eds: The Reviewers' Handbook 4.1.6. 2003, Oxford: The Cochrane Collaboration, 4.1.6

    Google Scholar 

  46. 46.

    Khan K, ter Riet G, Glanville J, Sowden A, Kleijnen J, eds: undertaking Systematic Reviews of Research on Effectiveness – CRD's Guidance for those Carrying Out or Commissioning Reviews – CRD Report Number 4. 2001, York Publishing Services, Ltd, 2

    Google Scholar 

  47. 47.

    Deville WL, Buntinx F, Bouter LM, Montori VM, De Vet HC, Van Der Windt DA, Bezemer P: Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Med Res Methodol. 2002, 2: 9-10.1186/1471-2288-2-9.

    PubMed  PubMed Central  Google Scholar 

  48. 48.

    Murphy L, Reinsch S, Najm W, Dickerson V, Seffinger M, Adams A, Mishra S: Searching biomedical databases on Complementary and Alternative Medicine: the use of controlled vocabulary among authors, indexers, and investigators. (In review).

  49. 49.

    Juni P, Altman DG, Egger M: Systematic reviews in health care: Assessing the quality of controlled clinical trials. Bmj. 2001, 323: 42-46. 10.1136/bmj.323.7303.42.

    CAS  PubMed  PubMed Central  Google Scholar 

  50. 50.

    Deeks JJ: Systematic reviews in health care: Systematic reviews of evaluations of diagnostic and screening tests. Bmj. 2001, 323: 157-162. 10.1136/bmj.323.7305.157.

    CAS  PubMed  PubMed Central  Google Scholar 

  51. 51.

    Shekelle PG, Adams AH, Chassin MR, Hurwitz EL, Brook RH: Spinal manipulation for low-back pain. Ann Intern Med. 1992, 117: 590-598.

    CAS  PubMed  Google Scholar 

  52. 52.

    Cook DJ, Sackett DL, Spitzer WO: Methodologic guidelines for systematic reviews of randomized control trials in health care from the Potsdam Consultation on Meta-Analysis. J Clin Epidemiol. 1995, 48: 167-171. 10.1016/0895-4356(94)00172-M.

    CAS  PubMed  Google Scholar 

  53. 53.

    Koes B, Assendelft WJ, van der Heijden GJ, Bouter LM, Knipschild PG.: Spinal manipulation and mobilization for back and neck pain: a blinded review. BMJ. 1991, 303: 1298-1303.

    CAS  PubMed  PubMed Central  Google Scholar 

  54. 54.

    Schulz KF, Chalmers I, Hayes RJ, Altman DG: Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995, 273: 408-412. 10.1001/jama.273.5.408.

    CAS  PubMed  Google Scholar 

  55. 55.

    Juni P, Witschi A, Bloch R, Egger M: The hazards of scoring the quality of clinical trials for meta-analysis. JAMA. 1999, 282: 1054-1060. 10.1001/jama.282.11.1054.

    CAS  PubMed  Google Scholar 

  56. 56.

    Beal MC, Kleiber GE: Somatic dysfunction as a predictor of coronary artery disease. J Am Osteopath Assoc. 1985, 85: 302-307.

    CAS  PubMed  Google Scholar 

  57. 57.

    Beal MC, Vorro J, Johnston WL: Chronic cervical dysfunction: correlation of myoelectric findings with clinical progress. J Am Osteopath Assoc. 1989, 89: 891-900.

    CAS  PubMed  Google Scholar 

  58. 58.

    Braun B, Schiffman EL: The validity and predictive value of four assessment instruments for evaluation of the cervical and stomatognathic systems. J Craniomandib Disord. 1991, 5: 239-244.

    CAS  PubMed  Google Scholar 

  59. 59.

    Breen A, Muggleton : Lumbar Spine Motion Palpation Compared with Objective Intervertebral Motion Analysis Preliminary Findings. In Process.

  60. 60.

    Brunarski DJ: Chiropractic biomechanical evaluations: validity in myofascial low back pain. J Manipulative Physiol Ther. 1982, 5: 155-161.

    CAS  PubMed  Google Scholar 

  61. 61.

    Bush KW, Collins N, Portman L, Tillett N: Validity and intertester reliability of cervical range of motion using inclinometer measurements. The Journal of Manual & Manipulative Therapy. 2000, 8: 52-61.

    Google Scholar 

  62. 62.

    Byfield D, Mathisasen J: A Preliminary Study Investigating the Accuracy of Bony Landmark Identification in the Lumbar Spine. European Journal Of Chiropractic. 1991, 39: 105-109.

    Google Scholar 

  63. 63.

    Cox JM, Gorbis S, Dick LM, Rogers JC, Rogers FJ: Palpable musculoskeletal findings in coronary artery disease: results of a double-blind study. J Am Osteopath Assoc. 1983, 82: 832-836.

    CAS  PubMed  Google Scholar 

  64. 64.

    Dott GA, Hart CL, McKay C: Predictability of sacral base levelness based on iliac crest measurements. J Am Osteopath Assoc. 1994, 94: 383-390.

    CAS  PubMed  Google Scholar 

  65. 65.

    Ebraheim NA, Inzerillo C, Xu R: Are anatomic landmarks reliable in determination of fusion level in posterolateral lumbar fusion?. Spine. 1999, 24: 973-974. 10.1097/00007632-199905150-00008.

    CAS  PubMed  Google Scholar 

  66. 66.

    Gracovetsky SA, Newman NM, Richards MP, Asselin S, Lanzo VF, Marriott A: Evaluation of clinician and machine performance in the assessment of low back pain. Spine. 1998, 23: 568-575. 10.1097/00007632-199803010-00009.

    CAS  PubMed  Google Scholar 

  67. 67.

    Gregory P, Hayek R, Mann-Hayek A: Correlating motion palpation with functional X-ray findings in patients with low back pain. Australasian Chiropractic & Osteopathy. 1998, 7: 15-19.

    Google Scholar 

  68. 68.

    Haas M, Peterson D, Hoyer D, Ross G: Muscle testing response to provocative vertebral challenge and spinal manipulation: a randomized controlled trial of construct validity. J Manipulative Physiol Ther. 1994, 17: 141-148.

    CAS  PubMed  Google Scholar 

  69. 69.

    Haas M, Panzer D, Peterson D, Raphael R: Short-Term Responsiveness of Manual Thoracic End-Play Assessment to Spinal Manipulation: A Randomized Controlled Trial of Construct Validity. Journal Of Manipulative And Physiological Therapeutics. 1995, 18: 582-589.

    CAS  PubMed  Google Scholar 

  70. 70.

    Jende A, Peterson CK: Validity of static palpation as an indicator of atlas transverse process asymmetry. European Journal of Chiropractic. 1997, 45: 35-42.

    Google Scholar 

  71. 71.

    Johnston WL, Vorro J, Hubbard RP: Clinical/biomechanic correlates for cervical function: Part I. A kinematic study. J Am Osteopath Assoc. 1985, 85: 429-437.

    CAS  PubMed  Google Scholar 

  72. 72.

    Jull G, Bogduk N, Marsland A: The accuracy of manual diagnosis for cervical zygapophysial joint pain syndromes. Med J Aust. 1988, 148: 233-236.

    CAS  PubMed  Google Scholar 

  73. 73.

    Jull G, Treleaven J, Versace G: Manual examination: is pain provocation a major cue for spinal dysfunction. Australian Physiotherapy. 1994, 40: 159-165.

    CAS  Google Scholar 

  74. 74.

    Kawchuk G, Herzog W: The reliability and accuracy of a standard method of tissue compliance assessment. J Manipulative Physiol Ther. 1995, 18: 298-301.

    CAS  PubMed  Google Scholar 

  75. 75.

    Keating L, Lubke C, Powell V, Young T, Souvlis T, Jull G: Mid-thoracic tenderness: a comparison of pressure pain threshold between spinal regions, in asymptomatic subjects. Man Ther. 2001, 6: 34-39. 10.1054/math.2000.0377.

    CAS  PubMed  Google Scholar 

  76. 76.

    Leboeuf-Yde C, Kyvik KO: Is it possible to differentiate people with or without low-back pain on the basis of tests of lumbopelvic dysfunction?. J. Manipulative Physiol Ther. 2000, 23: 160-167.

    CAS  PubMed  Google Scholar 

  77. 77.

    Leboeuf-Yde C, van Dijk J, Franz C, Hustad SA, Olsen D, Pihl T, Robech R, Skov Vendrup S, Bendix T, Kyvik KO: Motion palpation findings and self-reported low back pain in a population-based study sample. J Manipulative Physiol Ther. 2002, 25: 80-87. 10.1067/mmt.2002.122330.

    PubMed  Google Scholar 

  78. 78.

    Leclaire R, Esdaile JM, Jequier JC, Hanley JA, Rossignol M, Bourdouxhe M: Diagnostic accuracy of technologies used in low back pain assessment. Thermography, triaxial dynamometry, spinoscopy, and clinical examination. Spine discussion 1331. 1996, 21: 1325-1330. 10.1097/00007632-199606010-00009.

    CAS  Google Scholar 

  79. 79.

    Lucchetti CA: Palpation of the C7 Vertebral Spinous Process. An Inter- And Intra-examiner Reliability and Accuracy Study. Anglo-European College of Chiropractic. 1991-1992.

  80. 80.

    Lundberg G, Gerdle B: The relationships between spinal sagittal configuration, joint mobility, general low back mobility and segmental mobility in female homecare personnel. Scand J Rehabil Med. 1999, 31: 197-206. 10.1080/003655099444362.

    CAS  PubMed  Google Scholar 

  81. 81.

    Maher CG, Latimer J, Adams R: An investigation of the reliability and validity of posteroanterior spinal stiffness judgments made using a reference-based protocol. Physical Therapy. 1998, 78: 829-837.

    CAS  PubMed  Google Scholar 

  82. 82.

    Mayer TG, Kondraske G, Beals SB, Gatchel RJ: Spinal range of motion. Accuracy and sources of error with inclinometric measurement. Spine. 1997, 22: 1976-1984. 10.1097/00007632-199709010-00006.

    CAS  PubMed  Google Scholar 

  83. 83.

    McPartland JM, Goodridge JP: Counterstrain and traditional osteopathic examination of the cervical spine compared. Journal of Bodywork & Movement Therapies. 1997, 1: 173-178.

    Google Scholar 

  84. 84.

    Mennell J: The validation of the diagnosis "joint dysfunction" in the synovial joints of the cervical spine. Journal Of Manipulative And Physiological Therapeutics. 1990, 13: 7-12.

    CAS  PubMed  Google Scholar 

  85. 85.

    Nansel DD, Jansen RD: Concordance between galvanic skin response and spinal palpation findings in pain-free males. J Manipulative Physiol Ther. 1988, 11: 267-272.

    CAS  PubMed  Google Scholar 

  86. 86.

    Olson KA, Paris SV, Spohr C, Gorniak G: Radiographic assessment and reliability study of the craniovertebral sidebending test. Journal of Manual & Manipulative Therapy. 1998, 6: 87-96.

    Google Scholar 

  87. 87.

    Olson SL, O'Connor DP, Birmingham G, Broman P, Herrera L: Tender point sensitivity, range of motion, and perceived disability in subjects with neck pain. J Orthop Sports Phys Ther. 2000, 30: 13-20.

    CAS  PubMed  Google Scholar 

  88. 88.

    Osterbauer PJ, Long K, Ribaudo TA, Petermann EA, Fuhr AW, Bigos SJ, Yamaguchi GT: Three-dimensional head kinematics and cervical range of motion in the diagnosis of patients with neck trauma. J Manipulative Physiol Ther. 1996, 19: 231-237.

    CAS  PubMed  Google Scholar 

  89. 89.

    Phillips DR, Twomey LT: A comparison of manual diagnosis with a diagnosis established by a uni-level lumbar spinal block procedure... this study was presented in part at the 8th Biennial Conference of the MPAA, in 1993. Manual Therapy. 1996, 1: 82-87. 10.1054/math.1996.0254.

    CAS  PubMed  Google Scholar 

  90. 90.

    Simmonds MJ, Kumar S, Lechelt E: Use of a spinal model to quantify the forces and motion that occur during therapists' tests of spinal motion. Phys Ther. 1995, 75: 212-222.

    CAS  PubMed  Google Scholar 

  91. 91.

    Swerdlow B, Dieter JN: An evaluation of the sensitivity and specificity of medical thermography for the documentation of myofascial trigger points. Pain. 1992, 48: 205-213. 10.1016/0304-3959(92)90060-O.

    CAS  PubMed  Google Scholar 

  92. 92.

    Tarr R, Feely R, Richardson D: A controlled study of palpatory diagnostic procedures: assessment of sensitivity and specificity. Journal Of American Osteopathic Association. 1987, 87: 296-302.

    CAS  Google Scholar 

  93. 93.

    Verhagen AP, Lanser K, de Bie RA, de Vet HC: Whiplash: assessing the validity of diagnostic tests in a cervical sensory disturbance. J Manipulative Physiol Ther. 1996, 19: 508-512.

    CAS  PubMed  Google Scholar 

  94. 94.

    Viikari-Juntura E, Takala EP, Riihimaki H, Malmivaara A, Martikainen R, Jappinen P: Standardized physical examination protocol for low back disorders: feasibility of use and validity of symptoms and signs. J Clin Epidemiol. 1998, 51: 245-255. 10.1016/S0895-4356(97)00266-7.

    CAS  PubMed  Google Scholar 

  95. 95.

    Visscher CM, Lobbezoo F, de Boer W, van der Zaag J, Verheij JG, Naeije M: Clinical tests in distinguishing between persons with or without craniomandibular or cervical spinal pain complaints. Eur J Oral Sci. 2000, 108: 475-483. 10.1034/j.1600-0722.2000.00916.x.

    CAS  PubMed  Google Scholar 

  96. 96.

    Vorro J, Johnston WL: Clinical biomechanic correlates for cervical function: Part II. A myoelectric study. J Am Osteopath Assoc. 1987, 87: 353-367.

    CAS  PubMed  Google Scholar 

  97. 97.

    Vorro J, Johnston WL, Hubbard RP: Clinical biomechanic correlates for cervical function: Part III. Intermittent secondary movements. J Am Osteopath Assoc. 1991, 91: 145-146.

    CAS  PubMed  Google Scholar 

  98. 98.

    Vorro J, Johnston WL: Clinical biomechanic correlates of cervical dysfunction: Part 4. Altered regional motor behavior. J Am Osteopath Assoc. 1998, 98: 317-323.

    CAS  PubMed  Google Scholar 

  99. 99.

    Moruzzi S: Accuracy of Two Different Motion Palpation Procedures for Determining Fixations in the Lumbar Spine Using an Articulated Spinal Model. Anglo-European College of Chiropractic. 1992-1993.

  100. 100.

    Harvey D, Byfield D: Preliminary Studies with a Mechanical Model for the Evaluation of Spinal Motion Palpation. Clinical Biomechanics. 1991, 6: 79-82.

    CAS  PubMed  Google Scholar 

  101. 101.

    Kristiansson P, Svardsudd K: Discriminatory power of tests applied in back pain during pregnancy. Spine discussion. 1996, 21: 2337-2343. 10.1097/00007632-199610150-00006.

    CAS  Google Scholar 

  102. 102.

    Sandmark H, Nisell R: Validity of five common manual neck pain provoking tests. Scand J Rehabil Med. 1995, 27: 131-136.

    CAS  PubMed  Google Scholar 

  103. 103.

    Jensen K, Gemmell H, Thiel H: Motion Palpation Accuracy Using a Mechanical Spinal Model. European Journal Of Chiropractic. 1993, 41: 67-73.

    Google Scholar 

  104. 104.

    Knottnerus A: The evidence base of clinical diagnosis. 2002, London: BMJ Books, January

    Google Scholar 

  105. 105.

    Olson SL, O'Connor DP, Birmingham G, Broman P, Herrera L: Tender point sensitivity, range of motion, and perceived disability in subjects with neck pain. J Orthop Sports Phys Ther. 2000, 30: 13-20.

    CAS  PubMed  Google Scholar 

  106. 106.

    Price DD, McGrath PA, Rafii A, Buckingham B: The validation of visual analogue scales as ratio scale measures for chronic and experimental pain. Pain. 1983, 17: 45-56. 10.1016/0304-3959(83)90126-4.

    CAS  PubMed  Google Scholar 

  107. 107.

    Huskisson EC: Measurement of pain. J Rheumatol. 1982, 9: 768-769.

  108.

    Keating L, Lubke C, Powell V, Young T, Souvlis T, Jull G: Mid-thoracic tenderness: a comparison of pressure pain threshold between spinal regions, in asymptomatic subjects. Man Ther. 2001, 6: 34-39. 10.1054/math.2000.0377.

  109.

    Walter S, Irwig L, Glasziou PP: Meta-Analysis of Diagnostic Tests With Imperfect Reference Standards. J Clin Epidemiol. 1999, 52: 943-951. 10.1016/S0895-4356(99)00086-4.

  110.

    Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF: Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of Reporting of Meta-analyses. Lancet. 1999, 354: 1896-1900. 10.1016/S0140-6736(99)04149-5.

  111.

    Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I, Pitkin R, Rennie D, Schulz KF, Simel D, Stroup DF: Improving the quality of reporting of randomized controlled trials. The CONSORT statement. JAMA. 1996, 276: 637-639. 10.1001/jama.276.8.637.

  112.

    Moher D, Schulz KF, Altman D: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. JAMA. 2001, 285: 1987-1991. 10.1001/jama.285.15.1987.


This work was supported by a grant from the 41st Fund and was carried out at the Susan Samueli Center for Complementary and Alternative Medicine, University of California, Irvine.

The authors would like to thank D.V. Gokhale, Ph.D., for his statistical input, and Joseph Vorro, Ph.D., and William L. Johnston, D.O., for their review and comments.

Author information



Corresponding author

Correspondence to Wadie I Najm.

Additional information

Competing interests

None declared.

Authors' contributions

All listed authors worked collectively on the design of the study and development of the review instruments. WIN, MAS, and SIM carried out article reviews, data extraction, analysis of the data, and drafted the manuscript. VD carried out paper reviews, data extraction and analysis. SR carried out article reviews and worked on the paper search methodology. LSM oversaw the database searches, and carried out all the article tabulation. AA participated in the design of the study, instrument development, and methodology. AFG participated in development of the review instruments and advised on statistical conduct of the study.

Electronic supplementary material


Additional File 1: Studies reviewed and excluded from this study, organized by the type of validity under which they were classified. (DOC 134 KB)

Additional File 2: Inclusion and exclusion criteria developed and used for this review. (DOC 22 KB)


Additional File 3: Key terms and search strategy used to identify articles in the databases used in this review. (DOC 31 KB)

About this article

Cite this article

Najm, W.I., Seffinger, M.A., Mishra, S.I. et al. Content validity of manual spinal palpatory exams - A systematic review. BMC Complement Altern Med 3, 1 (2003).


  • Positive Predictive Value
  • Content Validity
  • Spinal Manipulation
  • Lateral Flexion
  • Pain Provocation