Vert Mooney, MD

Vert Mooney died yesterday afternoon on his way home from work, apparently from a heart attack or stroke.  He was a pioneer in so many aspects of rehabilitation and one of the world’s foremost spine surgeons, a wonderful husband and father, and a friend and mentor whose absence will be deeply felt.

I woke up in the wee hours this morning feeling his absence.  His voice is still fresh for me: “Onward and upward, man!”

I’m certain that many aspects of Vert that I will miss will come to mind in the coming days, but the very first that I’ve noticed is how much I value his firm graciousness and his insistence on respect for all opinions.  As a pioneer in medicine, it wasn’t uncommon for him to be attacked by vested interests and by people whose cages he enjoyed rattling.

I recall a scientific meeting many years ago in which we presented several research papers to about 500 orthopedic surgeons and then took questions.  Our work was obviously controversial because we had scientifically demonstrated the efficacy of alternatives to expensive surgical procedures; not exactly what spine surgeons wanted to hear.  One of our group was so concerned about the reception of his paper that he actually fainted at the lectern and had to be revived.  After we presented, Vert was the moderator, taking questions from the floor.  Immediately he was hit with “questions” that were really diatribes by angry, red-faced surgeons who were used to telling other people what was what.  Vert, with deep roots in the scientific and academic communities and as a founder and past president of all of the major pertinent professional associations, simply responded with, “Thank you for your question” and asked for the microphone to be passed.  He was polite and not dismissive, allowing people to have their say, trusting that our findings, based on good research, would stand on their own, which they did.  As the diatribes diminished and actual questions began to surface, he encouraged all of us, but most especially the junior members of our research group, to respond, which, with Vert having our back, we were able to do.

That was a very special moment for me and provided a template for how to be a mentor and senior scientist.  Being a pioneer is fun, but it is often difficult, and the absolute best way to defuse difficult situations is with grace.  Vert was firm, not backing away from a fight, but always treating everyone in the conversation with grace and respect.

Today, I send prayers to Vert’s family and many friends for our shared loss, and I thank God for his gift of Vert’s presence.  Millions have benefited from his work, many of us directly, and the world is so much better because he led and inspired us.

Is FCE Valid?

This question must be addressed within the context of all issues that determine the worthiness of every functional evaluation procedure endorsed or recognized as crucial by the major professional associations and government bodies that have addressed the issue of functional evaluation, such as the American Psychological Association (American Psychological Association, American Educational Research Association, & National Council on Measurement in Education, 1999), the National Institute for Occupational Safety and Health (Chaffin, Herrin, & Keyserling, 1978), the American Physical Therapy Association (American Physical Therapy Association, 1997), the American Congress of Rehabilitation Medicine (Johnston, Keith, & Hinderer, 1992), and the American College of Sports Medicine (American College of Sports Medicine, 2000). These issues, presented in hierarchical order, are:

1. Safety – Given the known characteristics of the evaluee, the procedure should not be expected to lead to injury;
2. Reliability – The test score should be dependable across evaluators, evaluees, and the date or time of test administration;
3. Validity – The interpretation of the test score should be able to predict or reflect the evaluee’s performance in a target setting. The formal definition of validity is “the degree to which all of the accumulated evidence supports the intended interpretation of test scores for the intended purpose”. [3]
4. Practicality – The cost of the test procedure should be reasonable and customary. Cost is measured in terms of the direct expense of the test procedure plus the amount of time required of the evaluee;
5. Utility – The usefulness of the procedure is the degree to which it meets the needs of the evaluee and referrer.

This hierarchy requires that each of the factors presented earlier must be maintained as subsequent factors are addressed. For example, it is not permissible to sacrifice safety for the sake of practicality. In addition, the first four factors must be adequately addressed for the purpose of the evaluation to be achieved.
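
For readers who think in code, the ordering rule above can be sketched as a simple check. This is only an illustration of the logic, not a published scoring method: the names `HIERARCHY`, `first_unmet_factor`, and `can_have_utility` are hypothetical, and the sketch assumes each factor can be reduced to a yes/no judgment, which is a simplification.

```python
# Illustrative sketch only; names and the yes/no simplification are assumptions.

# The four senior factors, in the hierarchical order given in the list above.
HIERARCHY = ("safety", "reliability", "validity", "practicality")


def first_unmet_factor(assessment):
    """Return the most senior factor judged inadequate, or None if all are met.

    `assessment` maps a factor name to True (adequately addressed) or False.
    A factor missing from the mapping is treated as not yet addressed.
    """
    for factor in HIERARCHY:
        if not assessment.get(factor, False):
            return factor
    return None


def can_have_utility(assessment):
    """Utility is possible only when all four senior factors are addressed."""
    return first_unmet_factor(assessment) is None


# Hypothetical example: a cheap, dependable test that says little about the job.
example = {"safety": True, "reliability": True, "validity": False, "practicality": True}
print(first_unmet_factor(example))  # the most senior factor that was not met
print(can_have_utility(example))
```

Because the loop walks the factors from most senior to least, a shortfall in safety is always reported before a shortfall in practicality, which mirrors the rule that a senior factor may not be sacrificed for a junior one.
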
When applied to work disability, functional capacity evaluation has three primary purposes. The first purpose is to determine whether or not the evaluee is able to return to work at his or her usual and customary job and, if not, to identify what the evaluee needs to improve or the employer needs to modify before a return to work is reasonable. If there is not a job to which the evaluee can return, the second purpose is to identify functional abilities that could be used in alternate occupations. The third purpose is to quantify functional limitations in terms that are useful in the disability determination process.

By what standard can each of the individual factors be measured? In the past, the answer to this question was readily available, with simple statistical standards used for each factor. However, since the 1999 publication of the APA/AERA/NCME Standards and the adoption of the new definition of “validity” in the Standards by the United States Equal Employment Opportunity Commission (EEOC, 2008), a simple answer is no longer acceptable. Best practices in evaluation require a broad look at the test factors hierarchy, with a focus on validity that must now take into account a broad range of issues with the overriding goal being increased fairness in the evaluation process.

The reader is encouraged to obtain the APA/AERA/NCME standards for specific information. In the meantime, one way to judge whether the factors have been adequately addressed is to ask whether utility was achieved. Given that utility is the ultimate factor, requiring that all of the other factors be adequately addressed, if there is utility the factors in the hierarchy must have been handled appropriately. If there is not utility, one or more of the factors was not handled appropriately.

What this means in practice is that we must always start by asking whether there was a useful outcome. For example, in the 1986 workers’ compensation study (Matheson, 1986), functional capacity evaluation results were helpful in resolving several cases after each injured worker had received a permanent and stationary rating with work restrictions. Although individual indicators for each of the factors in the hierarchy were available and presented in the original research, the ultimate determination of test worthiness is that the FCE process led to a useful outcome. Vert Mooney, M.D., is fond of saying that “the test result has got to tell you what to do next”. There are many tests that are reliable and valid in and of themselves that don’t guide the professionals in terms of what to do next and don’t have utility in terms of contributing to a useful outcome. Adoption of this broader view of the properties of tests, beyond their psychometric properties, is likely to lead to a new generation of functional capacity evaluation tests and test batteries that are more likely to meet the needs of evaluees and referral sources.

Impairment Ratings in California


In an otherwise excellent review and analysis of the effect of new case law in California on the use of the fifth edition of the Guides to the Evaluation of Permanent Impairment, the authors take a pot shot at Functional Capacity Evaluation.  They report, “Functional capacity evaluations (FCE) are of no value in rating permanent impairment or permanent disability within the context of workers’ compensation litigation; the results of these assessments are often unreliable.  FCE protocols vary in quality and in the context of determination of benefits and litigation it is common for examinees to under-demonstrate their capabilities.  Studies have conclusively shown that FCE performance does not predict sustained return to work in claimants with chronic back pain.  Scientific scrutiny has additionally demonstrated that work restrictions which are based on FCEs are harmful to the health of examinees.  FCEs are not a comparable method of evaluation of impairment and therefore do not rebut the Guides.”

The authors are arguing that impairment ratings based on the Guides should not be open to rebuttal with better information in the formula that the state of California uses to develop a disability rating.

I would like to open up a civil discussion that throws light on these important issues, beginning with a few principles on which we can likely agree.

The central principle is that every disability rating system must fairly lead to compensation that reflects the actual work disability.

Another principle is that better decisions are made when the information provided is pertinent to the question.

Another principle is that there is always a trade-off in test-based decision-making between safety, reliability, validity, and practicality and, when the trade-off is handled poorly, utility suffers.


The argument put forth by Brigham and Uehlein goes to the distinction between reliability and validity, with the former putting a ceiling on the latter, but not providing a substitute for the latter.  With regard to the present issue, the Guides is unquestionably more reliable than ad hoc impairment ratings by a physician, even one who is very well trained.  That is the strength of the Guides.  However, good reliability only sets the stage for good validity.  Measuring systems can be devised that are very accurate but produce information that contributes little to the final decision, which is an issue of validity.  Said another way, to the degree that information is related to the question, the decision will be correct.  In the present case, everyone agrees that the Guides provides precursor information rather than pertinent information.  Impairment is not directly related to disability, but is pertinently related to functional limitation, which is directly related to disability.  To the degree that functional limitation information is reliable, its validity will always be superior to that of impairment information.

The Test Factors Hierarchy is a helpful thought tool.  In any testing situation, four factors presented in hierarchical order must be adequately addressed in order for utility to occur.  These factors in order are Safety, Reliability, Validity, and Practicality.  There is always a tension among these factors, with satisfaction of the senior factors more important, but all requiring consideration.  When working with a person who has a medical impairment, Safety is always an important consideration.  The interaction between a particular person and a particular test may lead to a safety risk that is inappropriate to undertake; the test is thus not administered.  Reliability is the next factor in the hierarchy and must be established so that Validity can occur.  However, many strategies to improve reliability vitiate validity, causing dependable data to be meaningless to the issue at hand.  If the information is not pertinent to the question posed, there is no validity, even if the information is reliable.  Practicality is the fourth factor and comprises all the costs associated with the test.  In recent years, practicality has had inordinate importance in workers’ compensation decision-making, threatening the utility of information-collecting processes such as functional capacity evaluation.

The authors of the current paper provide four peer-reviewed scientific references to support their critique of functional capacity evaluation.  Of course, many other studies that reach different conclusions have been published, including several in which I participated.  I would like to attempt to throw more light on this subject by describing my first peer-reviewed study in this area. 

In 1986, a study entitled “Evaluation of Lifting and Lowering Capacity” was published in the Vocational Evaluation and Work Adjustment Bulletin.  We presented data from 132 consecutive functional capacity evaluations, 84% of which were within the workers’ compensation context in California.  Although each of these injured workers was permanent and stationary, only four had received recommendations concerning the amount and type of work that they should be able to perform, as distinguished from performance restrictions.  Even when work restrictions were provided, they were not useful in planning a return to work; for example, “restricted from heavy lifting and repeated bending and stooping”.  A subset (24) of the evaluations compared the injured worker’s FCE performance to job demands, with 14 of these resulting in a finding that he or she would be able to perform the work indicated.  Based on average case costs at that time, we calculated a net of $2.15 saved for every dollar spent on functional capacity evaluation.

Perhaps most importantly, the work restrictions of the physician were found to not predict function.  In the most conservative approach possible, only those 34 subjects who had injuries to the lumbosacral spine were considered.  Their work restrictions were compared with the actual maximum safe and dependable lift demonstrated by each injured worker, which resulted in a Spearman Rank Order Correlation of r = -.159.
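
For readers unfamiliar with the statistic, a Spearman rank-order correlation can be sketched in a few lines. This is a minimal illustration, not the analysis from the 1986 study: the function names are hypothetical, the sample numbers are invented for demonstration, and coding a physician's work restriction as an ordinal severity score is an assumption made here only so that the two columns can be ranked.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant (no rank variation).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented illustration data, NOT from the study:
# restriction severity (1 = least restrictive) and demonstrated lift in pounds.
restriction_level = [1, 2, 2, 3, 3, 4]
demonstrated_lift = [40, 55, 25, 60, 30, 45]
print(round(spearman_rho(restriction_level, demonstrated_lift), 3))
```

A value near +1 or -1 means that one ranking closely tracks the other, while a value near zero means that knowing a person's rank on one measure says little about his or her rank on the other.
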

The 1986 paper concludes with, “Use of the results of the evaluation procedures are suggested as an alternative to the physician’s report of work restrictions as a basis of vocational exploration and as a means of job assignment after injury.”  This recommendation is precisely on target 23 years later, and is reflected in the new case law.

As we move into the future, thoughtful people who are concerned about fairness are encouraged to use the Test Factors Hierarchy to evaluate test-based decision-making systems.  One way to look at the most recent case law is that it is an attempt to return to this hierarchy, in which the Practicality of the process is given subordinate importance.  In the paper, a quotation from a letter from Governor Arnold Schwarzenegger to the insurance commissioner begins, “Given our current economic environment…”.  Of course we must consider the economic environment, but we must also respect the tacit contract that is the foundation of all workers’ compensation systems, which allows the injured worker to receive fair treatment if he or she is willing to give up the opportunity to sue the employer for damages.  Obviously, this is a system that can work better than the tort system, but it requires fair and equitable treatment for all.


