Predictive validity of evidence-based practices in supported employment: a systematic review and meta-analysis

Publication Date: 12 December 2016
Author: Helen Lockett, Geoffrey Waghorn, Rob Kydd, David Chant
Subject: Health & social care, Mental health
Helen Lockett is a Doctoral Student at the Department of Psychological Medicine, School of Medicine, The University of Auckland, Auckland, New Zealand and Strategic Policy Advisor at The Wise Group, Hamilton, New Zealand.
Geoffrey Waghorn is the Head of Social Inclusion and Recovery Research at Queensland Centre for Mental Health Research (QCMHR), The Park Centre for Mental Health, Brisbane, Australia and Associate Professor (adjunct) with the School of Applied Psychology, Griffith University, Brisbane, Australia.
Rob Kydd is a Professor at the Department of Psychological Medicine, School of Medicine, The University of Auckland, Auckland, New Zealand.
David Chant is a Consultant Statistician based in Launceston, Australia.
Purpose: The purpose of this paper is to explore the predictive validity of two measures of fidelity to the
individual placement and support (IPS) approach to supported employment.
Design/methodology/approach: A systematic review and meta-analysis was conducted of IPS
programs. In total, 30 studies provided information characterizing 69 cohorts and 8,392 participants.
Predictive validity was assessed by a precision and negative prediction analysis and by multivariate analysis
of deviance.
Findings: Fidelity scores on the IPS-15 scale of 60 or less accurately predicted poor outcomes, defined as
43 percent or less of participants commencing employment, in 100 percent of cohorts. Among cohorts with
IPS-15 fidelity scores of 61-75, 63 percent attained good employment outcomes, defined as 44 percent or
more commencing employment. A similar pattern emerged from the precision analysis of the smaller sample
of IPS-25 cohorts. Multivariate analysis of deviance for studies using the IPS-15 scale examined six cohort
characteristics. Following adjustment for fidelity score, only fidelity score (χ² = 15.31, df = 1, p < 0.001) and
author group (χ² = 35.01, df = 17, p = 0.01), representing an aspect of cohort heterogeneity, remained
associated with commencing employment.
Research limitations/implications: This study provides evidence of moderate, yet important, predictive
validity of the IPS-15 scale across diverse international and research contexts. The smaller sample of IPS-25
studies limited the analysis that could be conducted.
Practical implications: Program implementation leaders are encouraged to first focus on attaining good
fidelity, then supplement fidelity monitoring with tracking the percentage of new clients who obtain
competitive employment over a pre-defined period of time.
Originality/value: The evidence indicates that good fidelity may be necessary but not sufficient for good
competitive employment outcomes.
Keywords: Fidelity, Severe mental illness, Supported employment, Predictive validity,
Evidence-based practices
Paper type: Literature review
Received 13 December 2015
Revised 28 March 2016 and 13 June 2016
Accepted 25 July 2016
DOI 10.1108/MHRJ-12-2015-0040, VOL. 21 NO. 4 2016, pp. 261-281, © Emerald Group Publishing Limited, ISSN 1361-9322
The quality of program implementation can influence program effectiveness (Durlak and DuPre,
2008) and this is well known in the context of supported employment for people with severe
mental illnesses (Bond, 2007). There is a critical stage in developing any evidence-based
psychosocial rehabilitation program where it becomes necessary to specify the conditions and
requirements for good program implementation. Without such specifications, any previously
demonstrated program efficacy could erode through poor implementations to the point where
poor outcomes are misconstrued as ineffectiveness (Bond, 2007). Sometimes implementation
specifications evolve into a formal measure of quality of adherence to original program principles
and practices. These are often described as measures of program fidelity (Mowbray et al., 2003).
A measure of program fidelity enables differences in program performance to be compared and
contrasted while controlling for the quality of implementation. Fidelity measures are also useful
for identifying ways in which programs can be developed further in order to be even more
effective. To be useful for these purposes, fidelity measures need to demonstrate adequate
reliability and validity. One particularly relevant property is predictive validity: the extent
to which higher program fidelity scores are uniquely associated with good program
outcomes and lower scores are uniquely associated with poor program outcomes
(Mowbray et al., 2003).
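One way to make this definition concrete is to compute two proportions for a fidelity cut-off: precision (the share of high-fidelity cohorts that attain good outcomes) and negative prediction (the share of low-fidelity cohorts that attain poor outcomes). The following is a minimal sketch only, not the authors' analysis code. The cut-offs (IPS-15 scores of 61 or more, and 44 percent or more commencing employment) are taken from the abstract; the `Cohort` class, the function name, and all cohort values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Cohort:
    fidelity: int        # total IPS-15 score (possible range 15-75)
    pct_employed: float  # percent of the cohort commencing employment

GOOD_FIDELITY_MIN = 61   # scores of 61-75 treated as good fidelity
GOOD_OUTCOME_MIN = 44.0  # 44 percent or more treated as a good outcome

def predictive_summary(cohorts):
    """Return (precision, negative_prediction) for the cut-offs above.

    Either value is None when no cohort falls in the relevant fidelity band.
    """
    high = [c for c in cohorts if c.fidelity >= GOOD_FIDELITY_MIN]
    low = [c for c in cohorts if c.fidelity < GOOD_FIDELITY_MIN]
    precision = (
        sum(c.pct_employed >= GOOD_OUTCOME_MIN for c in high) / len(high)
        if high else None
    )
    negative_prediction = (
        sum(c.pct_employed < GOOD_OUTCOME_MIN for c in low) / len(low)
        if low else None
    )
    return precision, negative_prediction

# Hypothetical cohorts, for illustration only (not data from the review).
cohorts = [
    Cohort(72, 61.0),
    Cohort(66, 38.0),
    Cohort(64, 50.0),
    Cohort(58, 30.0),
    Cohort(49, 41.0),
]
precision, negative_prediction = predictive_summary(cohorts)
print(precision, negative_prediction)
```

In this invented example, two of the three high-fidelity cohorts reach the good-outcome threshold and both low-fidelity cohorts fall below it, which mirrors the "necessary but not sufficient" pattern described in the abstract.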
Examples of fidelity measures are found in the individual placement and support (IPS) approach
to supported employment. This is a specialized form of supported employment for people with
severe mental illnesses. IPS has demonstrated efficacy for attaining competitive employment
outcomes (Bond et al., 2012a, 2008). It is a well-defined and extensively researched program,
supported by findings from over 20 randomized controlled trials (Drake and Bond, 2014),
12 systematic reviews, including meta-analyses (Marshall et al., 2014), and a recent Cochrane
review (Kinoshita et al., 2013). These studies consistently show that IPS is significantly more
effective than alternative approaches to vocational rehabilitation in terms of the proportion of
participants commencing competitive employment.
Two IPS fidelity scales have been developed to date. Both scales, along with their supporting
documents, aim to provide a clear specification of the intervention and how to implement its
practices and principles. The first scale, a 15-item version (the IPS-15), was revised in 2008 to 25
items (the IPS-25), also known as the supported employment fidelity scale. The latest 25-item version
is supported by an instruction manual on how best to assess the program and how to administer and code
the scale (Swanson and Becker, 2013).
The IPS-15 and IPS-25 scales are sufficiently different to require separate examination. Both
scales have a similar structure and both have common and unique questions. In both scales each
item has a five-point (1-5) rating with anchor descriptions for each. A score of 5 represents good
implementation, and a score of 1 indicates poor implementation, or a failure to implement that
practice. The score range on the IPS-15 is 15-75, while the score range on the IPS-25 is 25-125.
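The scoring arithmetic described above can be sketched as follows. This is an illustration under stated assumptions: the total score is taken to be the simple sum of the item ratings (which is consistent with the stated ranges), ratings are assumed to be whole numbers from 1 to 5, and the function names are hypothetical rather than part of any published scoring tool.

```python
def score_range(n_items):
    """Minimum and maximum total score when each item is rated 1-5."""
    return n_items * 1, n_items * 5

def total_fidelity_score(item_ratings, n_items):
    """Sum the item ratings after checking their count and values."""
    if len(item_ratings) != n_items:
        raise ValueError(f"expected {n_items} ratings, got {len(item_ratings)}")
    if any(r not in (1, 2, 3, 4, 5) for r in item_ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return sum(item_ratings)

print(score_range(15))  # IPS-15
print(score_range(25))  # IPS-25

# Hypothetical IPS-15 ratings, for illustration only.
example_ratings = [5, 4, 4, 5, 3, 4, 5, 5, 4, 3, 4, 5, 4, 4, 5]
print(total_fidelity_score(example_ratings, 15))
```

Because the two scales differ in length, totals are not directly comparable between them, which is one practical reason for examining the scales separately.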
The psychometric properties of the IPS-15 include good inter-rater reliability, and good
discriminant validity for distinguishing between IPS and non-IPS programs (Bond et al., 1997).
Predictive validity of the IPS-15 was examined in a meta-analysis of ten studies which reported
employment outcomes and total fidelity scores. Six of ten studies reported an association
between employment outcomes and fidelity. However, across all ten studies the proportion of
variance in employment outcomes explained by fidelity score ranged from 5 to 58 percent (Bond
et al., 2011). This result is problematic and suggests that the predictive validity of the scale is
either unstable or not sufficiently universal, through being too dependent on contextual factors.
It is possible that differences in how employment outcomes are measured account for the varied
strength of prediction. For example, some studies used a point-in-time measure of participant
employment status rather than tracking a defined cohort over time. The point estimates were
from quarterly reports of the proportion employed among the current caseload (Becker et al.,
2001, 2006). Another study used vocational rehabilitation case closure as a proxy measure for
employment outcomes (Hepburn and Burns, 2007), again without defining the cohort. Yet
another study measured the difference between the employment proportion in the IPS
intervention site and the employment proportion among controls, as a measure of strength of
effect (Burns et al., 2006, 2007; Catty et al., 2008).
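The difference between a point-in-time measure and a defined-cohort measure can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Participant` record, the month-based timeline, the function names, and the example data are hypothetical and do not come from any of the cited studies.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Participant:
    enrolled: int                  # month of program entry
    job_start: Optional[int]       # month the first competitive job began, if any
    job_end: Optional[int]         # month that job ended; None if ongoing or no job
    exited: Optional[int] = None   # month of leaving the caseload, if any

def point_in_time_rate(participants, month):
    """Proportion of the current caseload who are employed in a given month."""
    caseload = [
        p for p in participants
        if p.enrolled <= month and (p.exited is None or p.exited > month)
    ]
    if not caseload:
        return None
    employed = [
        p for p in caseload
        if p.job_start is not None
        and p.job_start <= month
        and (p.job_end is None or p.job_end > month)
    ]
    return len(employed) / len(caseload)

def cohort_commencement_rate(participants, intake_start, intake_end, follow_up):
    """Proportion of a defined intake cohort who commence a job within follow_up months."""
    cohort = [p for p in participants if intake_start <= p.enrolled <= intake_end]
    if not cohort:
        return None
    commenced = [
        p for p in cohort
        if p.job_start is not None and p.job_start - p.enrolled <= follow_up
    ]
    return len(commenced) / len(cohort)

# Hypothetical participants, for illustration only.
participants = [
    Participant(0, 3, 6),            # worked months 3-6, still on the caseload
    Participant(0, 5, None),         # employed from month 5 onward
    Participant(1, None, None),      # never commenced employment
    Participant(2, 4, 8, exited=9),  # worked, then left the caseload
    Participant(10, 11, None),       # later intake, currently employed
]
snapshot = point_in_time_rate(participants, month=12)
cohort_rate = cohort_commencement_rate(participants, 0, 2, follow_up=12)
print(snapshot, cohort_rate)
```

With these invented records the two measures disagree: the snapshot counts only people who are on the caseload and employed in month 12, while the cohort measure counts everyone in the defined intake who commenced a job during follow-up. This is one way in which differences in outcome measurement could change the apparent strength of prediction.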
The psychometric properties of IPS-25 were examined using a large data set collected from
79 ongoing IPS implementation sites across the USA. Because it was developed more recently to
address the limitations of the IPS-15 scale, the IPS-25 is expected to have greater predictive
validity than the IPS-15. Predictive validity was assessed using total fidelity scores and quarterly
reports of the proportion of the caseload employed at each site. A monotonic positive relationship
was found between total IPS-25 score and the point-in-time proportions employed (r = 0.34,
p < 0.01) (Bond et al., 2012b). Although the IPS-25 scale had strengths it did not show better