Journal of Food and Nutritional Disorders ISSN: 2324-9323

Research Article, J Food Nutr Disor Vol: 3 Issue: 2

A Systematic Review of Effect Modification in Trials of Behavioral Interventions to Increase Fruit and Vegetable Consumption

Di H. Cross1, Youngmee Kim2*, K.M. Venkat Narayan3, Lance A. Waller4, Rachel E. Patzer5 and Carol J. Rowland Hogue6
1Thomson Reuters, Custom Analytics, Rockville, USA
2University of Miami, Department of Psychology, Coral Gables, USA
3Emory University, Hubert Department of Global Health, Atlanta, USA
4Emory University, Department of Biostatistics and Bioinformatics, Atlanta, USA
5Emory University, Department of Surgery, Atlanta, USA
6Emory University, Department of Epidemiology, Atlanta, USA
Corresponding author: Youngmee Kim
Department of Psychology, University of Miami, 5665 Ponce de Leon Blvd, Coral Gables, Florida 33124-0751, USA
Tel: (305) 284-5439
E-mail: [email protected]
Received: October 12, 2013 Accepted: March 13, 2014 Published: March 17, 2014
Citation: Cross DH, Kim Y, Narayan KMV, Waller LA, Patzer RE, et al. (2014) A Systematic Review of Effect Modification in Trials of Behavioral Interventions to Increase Fruit and Vegetable Consumption. J Food Nutr Disord 3:2. doi:10.4172/2324-9323.1000135


Abstract

Although the health benefits of fruit and vegetable consumption (FVC) have been well documented, the evidence supporting the effectiveness of behavioral interventions to increase FVC has not been as consistent. This review aimed to identify individual- and environmental-level factors that may explain systematic differences in treatment effects in randomized controlled trials (RCTs) of behavioral interventions to increase FVC through effect modification (EM), and to discuss the utility of examining EM in the context of an RCT.

Keywords: Effect modification; Randomized controlled trial; Fruit and vegetable consumption


Introduction

Although chronic disease risk is known to be related to health behaviors such as fruit and vegetable consumption (FVC) [1], poor health behaviors are pervasive. Less than one quarter of the US population engages in FVC at levels recommended for chronic disease prevention [2]. Furthermore, randomized controlled trials (RCTs) of interventions to increase FVC have had limited success [3], suggesting that there is a need to identify alternative ways of improving FVC behavior.
One area that has received increasing attention is the identification of effect modification [4-6] – also called moderation [6] and intervention- or treatment-effect heterogeneity [7,8]. Effect modification (EM), as it will be referred to in this review, is a difference in the association between a variable, E, and an outcome, O, across different levels of an effect modifier, X [9]. This is different from confounding of the relationship between E and O by X, where X is a predictor of the outcome, O, and also associated with the variable E [9,10]. In this latter case, the association between E and O is the same across levels of X. Confounding and EM are also distinct from mediation, where E causes X, and X in turn causes the outcome, O [6].
EM has typically been studied using two methods. The first is stratified analysis, where the association between E and O is reported for each level of X [9]. Meaningful, albeit subjectively determined, differences between these associations would suggest that EM is present. The second method is the examination of statistical interaction. In this case, a statistical model is used to study variation in the dependent variable O using E, X, and the product term between E and X as independent variables. In this method, EM is objectively defined as a statistically significant contribution of the product term between E and X [11].
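As an illustration of the two approaches, the following minimal Python sketch uses a small hypothetical dataset with a binary E (intervention arm) and a binary X (modifier). The data values and function names are assumptions made for illustration and do not come from any reviewed trial. For binary E and X, the coefficient on the product term in the saturated linear model O = b0 + b1·E + b2·X + b3·E·X equals the difference between the stratum-specific effects, so the sketch computes that point estimate directly rather than fitting a model. It does not compute a significance test.

```python
from statistics import mean

def stratified_effects(rows):
    """Stratified analysis: E-O association (mean O for E=1 minus mean O
    for E=0) reported separately for each level of X."""
    effects = {}
    for level in sorted({x for _, x, _ in rows}):
        treated = [o for e, x, o in rows if x == level and e == 1]
        control = [o for e, x, o in rows if x == level and e == 0]
        effects[level] = mean(treated) - mean(control)
    return effects

def product_term_estimate(rows):
    """Point estimate of the E*X coefficient in the saturated linear model.
    For binary E and X it equals the effect in X=1 minus the effect in X=0."""
    effects = stratified_effects(rows)
    return effects[1] - effects[0]

# Hypothetical rows of (E, X, O), with O in servings/day.
rows = [
    (0, 0, 3.0), (0, 0, 3.2), (1, 0, 3.4), (1, 0, 3.6),
    (0, 1, 3.0), (0, 1, 3.4), (1, 1, 4.2), (1, 1, 4.4),
]

print(stratified_effects(rows))
print(product_term_estimate(rows))
```

In practice the product-term approach would also require a standard error for the estimate, which is where the two methods diverge in how EM is declared.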
While both methods have been used to investigate randomized trials, there have been few systematic evaluations of the motivation for such analyses and the subsequent findings. The purpose of this review is threefold: (1) to document EM observed in behavioral intervention trials to increase FVC, (2) to examine the rationale for EM analyses, and (3) to explore the utility of these analyses in improving intervention design and resource allocation.

Materials and Methods

Search Strategy
A search strategy (Appendix 1) was adapted from previous systematic reviews to identify RCTs of behavioral interventions to increase FVC [3,12-14]. Results were restricted to studies performed among adult, non-institutionalized populations, indexed in the CINAHL, MEDLINE, EMBASE, and PsycINFO databases and the Cochrane Central Register of Controlled Trials, and published between January 1, 1990 and December 31, 2008.
The titles and abstracts of all articles identified from the search strategy were evaluated by two reviewers (DHC, REP). Eligibility was determined based on study aims, study population, and intervention type. Studies were excluded if the intervention was a pharmaceutical intervention, if there was an environmental modification or policy change component of the intervention, if the intervention was performed exclusively among smokers, or if the study was designed for participants with specific chronic disease conditions (hypercholesterolemia, diabetes or impaired glucose tolerance, hypertension, cancer survivors, etc.) other than overweight or obesity. Studies that recruited from the general population and included some participants with chronic diseases were not excluded from the analysis. No restrictions were made based on participant recruitment or intervention delivery setting. Although most studies were randomized trials (individual- or cluster-randomized), studies with pre- and post-intervention measurements only and no control group were not excluded.
Full texts of studies considered to be eligible based on abstracts were retrieved and examined to determine whether individual-level fruit and vegetable consumption was reported for baseline and follow-up evaluations in servings/day, grams/day, or energy-adjusted daily intake, and whether any modification of the intervention effect was examined. Articles reporting only differences in intervention effect across levels of baseline behavior or levels of intervention dose were not included in the analyses. References from included articles were searched for additional eligible articles. Articles that reported only baseline demographic information or baseline FVC, or that described only the study design and motivations of otherwise eligible interventions, were entered into the ISI Web of Science database (Thomson Reuters, New York, NY) to identify articles citing the original article and reporting follow-up data. Those that met the inclusion criteria were included in the review.
Data Extraction
Bibliographic information, intervention design and setting, study population and sample size, a description of the intervention and control (if applicable), evaluation time-points, intervention effect, effect modifier examined, and the direction of the effect modification were extracted from all eligible studies. Information reported in the text identifying different articles originating from the same study (study name, sample size, intervention description, etc.) was noted, and study characteristics of such articles were reported together.
The direction of effect modification was denoted as (0) for no evidence of effect modification, (+) for a greater intervention effect associated with a greater value of the effect modifier or a greater intervention effect in the non-reference category in comparison to the reference category, or (-) for a smaller intervention effect associated with a greater value of the effect modifier or a smaller intervention effect in the non-reference category of the effect modifier in comparison to the reference category. For articles using statistical interaction to assess effect modification, statistical significance as reported by the authors was used to determine the presence of effect modification. Among articles employing stratified methods, intervention effects that differed from one another by more than two standard errors were recorded as meaningful effect modification.
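The coding rule for stratified analyses can be sketched as a small Python function. This is an illustration only: the text does not state which standard error served as the yardstick, so the sketch takes it as a caller-supplied argument `se`, and the function name, argument names, and example standard error are hypothetical.

```python
def em_direction(effect_reference, effect_comparison, se, threshold=2.0):
    """Code the direction of effect modification for a stratified analysis.

    effect_reference  -- intervention effect in the reference category
    effect_comparison -- intervention effect in the non-reference category
    se                -- standard error used as the yardstick (assumption:
                         the source does not say which SE was used)
    Returns "0", "+", or "-" following the notation in the text.
    """
    difference = effect_comparison - effect_reference
    if abs(difference) <= threshold * se:
        return "0"  # strata within two SEs of one another: no evidence of EM
    return "+" if difference > 0 else "-"

# Hypothetical effects in servings/day with a hypothetical SE of 0.2.
print(em_direction(0.28, 0.81, se=0.2))
print(em_direction(0.64, 0.15, se=0.2))
print(em_direction(0.50, 0.60, se=0.2))
```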
Study quality or risk of bias was assessed for each article by examining six sources of potential bias: participant blinding, assessment tools for FVC, randomization success, sample size, loss to follow-up and the appropriateness of statistical analyses. Since participant blinding to the intervention was not possible, and because all assessments of FVC were self-reported, the evaluation of the potential risk of bias focused on four sources of bias: randomization success, sample size, loss to follow-up, and statistical analysis. Studies likely to result in bias from two or more of the above sources were categorized as high risk of bias, while those with one potential source were categorized as medium risk of bias. Articles with little risk of bias from any of the four sources were categorized as low risk of bias.
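The categorization rule described above can be written as a short Python sketch. The key names for the four sources of bias are hypothetical labels chosen for this example. How each source was judged to be a likely source of bias is not specified in the text and is left to the caller.

```python
# The four sources of bias retained for the assessment (labels are hypothetical).
BIAS_SOURCES = (
    "randomization_success",
    "sample_size",
    "loss_to_follow_up",
    "statistical_analysis",
)

def risk_of_bias(flags):
    """Map per-source judgments to an overall category.

    flags -- dict mapping a source name to True if that source is judged
             likely to introduce bias; missing sources count as False.
    """
    n_flagged = sum(1 for source in BIAS_SOURCES if flags.get(source, False))
    if n_flagged >= 2:
        return "high"
    if n_flagged == 1:
        return "medium"
    return "low"

print(risk_of_bias({"loss_to_follow_up": True, "sample_size": True}))
print(risk_of_bias({"loss_to_follow_up": True}))
print(risk_of_bias({}))
```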
No attempt was made to examine un-published studies or studies published in conference proceedings. Because of the diversity of study characteristics, and the variability in the effect modifiers measured, no attempt was made to arrive at any summary measures of association.


Results

Full texts were retrieved for 468 eligible articles, of which 162 reported FVC. Among those reporting on FVC, 28 articles (17.3%) with data from 23 distinct trials reported examining EM (Table 1) [15-42].
Table 1: Characteristics of included studies. Note: EM: Effect modification; S: Stratified analysis; PT: Product term; Loss to f/up: Loss to follow-up; RCT: Randomized controlled trial; cRT: cluster-randomized trial; WIC: Women, Infants and Children; NCIS: National Cancer Information Service; HMO: Health Maintenance Organization; 1: No change in fruit vs. 0.2 pieces/day decline; 0.5 gm decline in vegetables vs. 10.4 gm/day decline in intervention and controls respectively.
Of the 23 trials, 17 were individually randomized (73.9%) while six (26.1%) were cluster-randomized. Notable studies excluded were the Working Well Trial [43], the Healthy Directions–Small Business Study [44], and WellWorks-2 [45], all of which included components of environment or policy change; High-5 [46], Gimme 5 [47], CATCH [48], and Take-5 [49], which examined FVC in children; and one study, performed among recipients of congregate meals, that examined intervention effect across levels of rurality of study sites [50]. Eight articles were rated as low risk of bias, 16 as medium, and four as high risk of bias (Table 1).
In all, five trials reported no significant intervention effect at the end of the trial [29,40-42,51]. Of the remaining 18, 16 trials reported a statistically significant intervention effect, of which six reported an intervention effect of 0.5 srv/day or less; one trial did not compare the randomized groups and thus reported only change in FVC in the dietary intervention group [15]; and one trial reported a smaller decline in FVC in the intervention group than in the control group [39].
A total of 39,515 participants were recruited for participation (N range: 32 to 5,041; median: N=1,359), with loss to follow-up within studies ranging from 0.0% [15] to 73.9% [27]. Study settings included churches, clinics or primary care facilities, health maintenance organizations, the internet, telephone directories, community-based organizations, and worksites. The majority of studies were performed in clinical settings (N=12, 52.2%). Study evaluation ranged from less than one month [42] to 24 months [17,19,21,51], and intervention duration ranged from those involving only a one-time exposure [29,39,42] to those administered over a period of 24 months [51]. With only two exceptions [28,37], interventions with evaluations at or beyond 12 months reported larger intervention effects. Eleven trials were conducted among at-risk populations (low-income, minority, etc.).
The majority of articles reporting on potential effect modifiers focused on demographic variables (Table 2), with 13 studies examining change in FVC by education, 12 by sex, 11 by age, and eight by race/ethnicity.
Table 2: Potential effect modifiers examined: demographic characteristics. Note: 1: 1.2 srv/day greater increase among women in intervention than men in intervention. No comparison with control group. 2: Most effective among white (0.73 srv/day greater in intervention) and other (1.72 srv/day greater increase). May be due to unstable estimates in the “other” group. 3: Intervention most effective among those married (+0.81 srv/day), widow, divorced, or other (0.92 srv/day). Less effective among those single (+0.28 srv/day). 4: Intervention most effective among whites (+0.71 srv/day at 6 months), less effective among Hispanics (+0.28 srv/day at 6 months) and blacks (+0.55 srv/day at 6 months). 5: Alone or with others. 6: Retirement status. 7: More effective among those living alone or with adults (0.64 srv/day) than those with children (0.15 srv/day). 8: Blacks increased -0.39 srv/day in controls, 0.09 in intervention. Whites increased -0.02 srv/day in controls, 0.27 srv/day in intervention. 9: Poverty status. 10: Respondent’s and respondents’ parents’ country of birth. 11: Crowding in household.
Despite the number of articles examining differences by demographic variables, only three studies concluded there was EM by education [16,27,29] (23.1%), two by sex [15,30] (16.7%), five by age [16,19,23,27,40] (45.5%), and one by race/ethnicity [16] (12.5%). All studies concluding that there were meaningful differences in intervention effect were rated as having a medium or low risk of bias, while none rated with a high risk of bias reported significant effect modification. Of the four articles that used stratified analysis to examine demographic variables [15,16,19,27], all (100.0%) suggested differences in intervention effect across at least one variable. In contrast, 15 studies employed statistical significance of a product term to evaluate effect modification, and five [23,29,30,34,40] (33.3%) concluded there was at least one variable across which the intervention effect was significantly different. Although many studies using statistical interaction to examine EM investigated multiple variables (range one to five), only one article [23] found more than one significant source of EM.
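The proportions quoted above follow directly from the counts given in the text, as the short Python check below shows. Only the counts are taken from the source. The dictionary layout and function name are choices made for this sketch.

```python
# (number concluding EM, number examining the variable or using the method),
# as reported in the text.
counts = {
    "education": (3, 13),
    "sex": (2, 12),
    "age": (5, 11),
    "race/ethnicity": (1, 8),
    "stratified analysis": (4, 4),
    "product term": (5, 15),
}

def percent(found, examined):
    """Share of studies concluding EM, as a percentage to one decimal place."""
    return round(100 * found / examined, 1)

for label, (found, examined) in counts.items():
    print(f"{label}: {found}/{examined} = {percent(found, examined)}%")
```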
After demographic variables, psychosocial variables were next most commonly examined. Stage of change was the most frequently evaluated, with analyses in eight articles [17,23,26-28,32,37,42] reporting on 10 distinct studies (Table 3).
Table 3: Potential effect modifiers examined: psychosocial, social-contextual, health, and food habits. Note: 1: Reported as significant in paper. However, estimated effects are within 2 SEs of one another. 2: About 1 srv/day increase among all stages except maintenance which had no change. 3: Greatest intervention effect among contemplators (~1.6 srv/day greater intervention effect) than pre-contemplators (0.4 srv/day) or preparation (~-0.2 srv/day). 4: 0.5 srv/day greater intervention effect among pre-action than action group.
Five studies [23,26,27,32,37] concluded that there were meaningful differences in intervention effect (62.5%). Of these five studies, two [27,37] used stratified analysis. However, the direction of association was inconsistent. Three studies reported an increase in intervention effect with increasing stage of change (from precontemplator to maintenance) [23,27,32] and two reported the reverse [26,37].
There were generally no differences reported by baseline intention to change FVC behavior (examined in two studies) [18,23], and only one [36] of three studies examining baseline motivation [24,36,39] reported a significant difference in intervention effect. In addition, one study examined intervention effect by need for cognition [38] and another examined intervention effect by self-efficacy [34]. Neither found differential intervention effects. One study examining autonomy showed a significantly greater intervention effect with greater baseline autonomy [40].
In addition to demographic and psychosocial variables, studies also examined social-contextual variables and health-related indicators, albeit more rarely. One study examined social networks, social norms and food security and reported no difference in intervention effect across levels of these variables [34]. Four studies examined smoking status [16,23,27,31] and three examined baseline BMI status [23,39,51]. Of these studies, only one reported any difference in intervention effect, with a smaller intervention effect among smokers [27]. One study found a greater intervention effect with increasing level of participant food responsibility, but no difference by number of restaurant meals consumed [23].
Investigator-cited justification for examining effect modification varied by the effect modifier examined. Those studies examining psychosocial effect modifiers commonly cited validation of behavioral models as the motivation for the analysis [17,26,28,32,42]. Examination of demographic variables, when a motivation was explicitly stated, was motivated by determining the generalizability of results [20,33,41], identifying subgroups for subsequent targeting of the intervention [27,41], determining public health importance or impact [30], or simply because such effect modification had been reported in previous studies [25]. One study, performed by Sorensen et al., cited findings from observational studies as the motivation for examining effect modification [34]. For a large proportion of articles examined [15,16,18,19,21,23,29,35,37,51] (N=10, 35.7%), there was no explicitly stated motivation for examining effect modification.


Discussion

Despite increasing interest in studying EM in RCTs [4,5,7], there are few articles that report on such findings in the behavioral intervention literature. To the authors’ knowledge, systematic examination of EM analysis in the effect of interventions to increase FVC has been performed in only one other review article, conducted by Oldroyd et al. [52]. There was no overlap between the articles included by Oldroyd et al. and those in this study because of differences in inclusion criteria, in particular their inclusion of studies conducted among children [46,53,54]. Oldroyd et al. concluded that few articles reported examining intervention effect modification, consistent with our finding that fewer than one out of every five studies reported examining EM.
However, it is possible that these analyses are more commonly performed than they are reported. This is likely because EM often fails to reach statistical significance. Many studies included in this analysis did indeed report no statistically significant effect modification. However, barring the circumstances where a homogeneous intervention effect may be expected for an intervention designed for and administered within a homogeneous sample, such a lack of heterogeneity may be counter-intuitive given the conventional wisdom that “one size does not fit all” [8,55]. This suggests either that the assumption that interventions should have heterogeneous effects is wrong, or that the methods by which heterogeneity is quantified and identified are wrong.
Indeed, these findings suggest that less reliance on statistical significance in favor of careful examination of the magnitude of and patterns in the intervention effect across subgroups may be more likely to lead investigators to accurately conclude that there is meaningful modification of the intervention effect. Because analysis of EM is often conducted as part of a secondary aim rather than as a primary aim of an intervention study, lack of statistical significance is likely due to inadequate statistical power. It is well known that stratified analysis is likely to be underpowered for detecting a significant intervention effect [56]. As an additional disadvantage, stratified analysis requires multiple testing and introduces the possibility of obtaining false positive results [5]. These reasons are often stated as support for the alternative method: examining the statistical significance of the product term in a statistical model [5]. However, the power for detecting such an effect is dependent on the proportion of participants simultaneously exposed to the intervention and the potential modifier. Again, such an analysis would likely be underpowered.
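To make the power argument concrete, the following Python sketch compares the standard error of an interaction contrast with that of the overall intervention effect. It rests on simplifying assumptions that are not taken from the reviewed trials: a continuous outcome with a common standard deviation `sigma`, 1:1 randomization, a binary modifier with the same prevalence in both arms, and a simple difference-in-means analysis. All function names are hypothetical.

```python
from math import sqrt

def se_main_effect(n_total, sigma=1.0):
    """SE of the difference in means between two equal-sized arms."""
    n_arm = n_total / 2
    return sigma * sqrt(1 / n_arm + 1 / n_arm)

def se_interaction(n_total, modifier_prevalence, sigma=1.0):
    """SE of the difference between the stratum-specific effects.
    Each arm is split into participants with and without the modifier,
    and the four cell variances add."""
    n_arm = n_total / 2
    n_with = n_arm * modifier_prevalence
    n_without = n_arm * (1 - modifier_prevalence)
    return sigma * sqrt(2 / n_with + 2 / n_without)

def se_ratio(n_total, modifier_prevalence):
    """How many times larger the interaction SE is than the main-effect SE."""
    return se_interaction(n_total, modifier_prevalence) / se_main_effect(n_total)

for prevalence in (0.5, 0.25, 0.1):
    print(prevalence, round(se_ratio(400, prevalence), 2))
```

Under these assumptions the ratio simplifies to 1/sqrt(p(1-p)), where p is the modifier prevalence. It is 2 when the modifier splits the sample evenly, so matching the precision of the main effect would take roughly four times the sample size, and it grows as the modifier becomes rarer.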
In addition, multiple testing remains a problem given the number of potential modifiers often examined. As such, both methods – examining effect modification through a statistical interaction operationalized as a product term in the model, or through stratified analysis – offer similar disadvantages in traditional trial analyses. However, while the direct statistical testing possible with the analysis of statistical interaction is often stated as an advantage of that method, the use of a single criterion of statistical significance – particularly where power is nearly guaranteed to be less than adequate – oversimplifies the picture. In contrast, stratified analysis demands examination of magnitudes of the intervention effects across subgroups precisely because direct comparison is not automatically made in the analysis. Given limited statistical power, stratified analysis may be more interpretable and therefore more accessible to broader audiences, and more likely to lead to conclusions useful for informing future studies.
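The multiple-testing concern can be quantified with a short Python sketch. It assumes the tests are independent and each is conducted at the same level `alpha`, which is an idealization. The example of five modifiers mirrors the upper end of the range of variables examined per article, and the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests,
    each conducted at level alpha, when no true effect modification exists."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the familywise rate at or
    below alpha (Bonferroni correction)."""
    return alpha / n_tests

for n_tests in (1, 3, 5):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```

A corrected threshold controls false positives but lowers power further, which compounds the problem described above.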
However, beyond merely identifying effect modification, there is also a question of motivation. Why is the identification of EM important? What do these analyses tell us about the intervention? What, if anything, do they tell us about the effect modifier?
Although demographic variables were the most commonly studied effect modifiers, few authors explicitly stated a motivation for examining them. In addition, many investigators may have examined effect modification by demographic variables but did not report their results due to statistically null findings. Whether null findings are reported may depend on the motivation for investigating EM. The most commonly cited reasons for investigating demographic variables were to identify subgroups among which the intervention had a different effect, or to determine whether the estimated intervention effect was generalizable. If no differences in intervention effect were found across demographic variables, investigators with these aims would see little reason to report having examined effect modification.
If differences in intervention effect are identified across demographic variables, limited resources can then be spent most efficiently by administering the intervention to segments of the population among whom the intervention was most effective. However, this does not address the public health problem of pervasive, poor health behaviors because sub-populations among whom interventions are less effective (often including underserved or high-risk populations) are precisely the segments of the population in most need of help. Perhaps motivated by these findings, targeted interventions have been designed for such populations, often with moderate success. However, beyond targeting sub-groups, little else has been done with the information garnered from effect modification across demographic characteristics. Indeed, this would appear to make sense since these characteristics are not modifiable. However, investigators should be wary, as demographic characteristics may merely serve as proxies for other modifiable characteristics in a more complex causal mechanism [57,58].
Among investigators of psychosocial modifiers, the most commonly cited motivation was testing or validating a behavioral model. In this case, modifying variables are of primary interest, not only to identify groups that will benefit from the intervention in its current form, but also to improve the design of interventions in the future by understanding the mechanisms by which the intervention under investigation did and did not work. This approach is commendable, but has not been taken full advantage of in the literature, its application having been restricted to psychosocial variables. While behavioral models may be a resource for identifying variables predictive of behavior, the importance of factors predictive of behavior change is increasingly recognized [59]. In addition to individuals’ motivations and abilities to execute a behavior, external or environmental factors such as opportunities for behaviors are also important in changing behavior [59,60]. One resource for identifying such characteristics is the observational literature, where environmental factors such as access to grocers [61], pricing of produce [62], and presence of fast food chains [63], among others [58], have been studied. However, only one study included in this review [64] explicitly cited the observational literature as the motivator for examining potential effect modifiers. Furthermore, this was the only study that examined environmental-level effect modifiers.
Investigators must make better use of the observational literature as a resource for identifying modifiable, environmental-level variables which are potential modifiers of the effects of behavioral interventions. This would aid in designing studies and allocating resources. Furthermore, the identification of environmental-level characteristics may aid in addressing the public health problem of pervasive, poor health behaviors by identifying a means by which existing interventions can be made to be more effective on a population level. In much the same way that physicians may first address a modifying risk factor before prescribing a drug to an individual with elevated risks of adverse effects or attenuated benefits, public health policymakers may address modifiable, environmental-level risk factors on a population level before administering other individual- or policy-level interventions.
Based on these findings, we make four recommendations.
First, we recommend that investigators more consistently report when effect modification is investigated, even when findings are not statistically significant.
Second, we recommend that statistical significance should not be the only criterion by which meaningful effect modification is identified. In particular, patterns and trends in intervention effect or differences in magnitude should be given more weight in light of the necessary, external constraints on power to detect EM in RCTs. Such reporting will document apparent associations and allow for more focused hypotheses in future studies with appropriate power for studying EM. This is consistent with the findings in this review, where the potential for EM was more often reported among studies employing stratified analysis (100%) than among those relying exclusively on statistical significance of an interaction term (33%).
Third, the choice of variables for which effect modification is examined should be guided by specific causal hypotheses. Investigators should be aware of and report their motivation for investigating EM, and identify relevant potential modifiers based on that motivation.
Fourth and last, we recommend that investigators expand the list of potential modifiers to include environmental-level characteristics. This makes it possible not only to identify populations where interventions will be most effective, but also to change population-level characteristics in order to make interventions potentially more effective in all segments of the population. This can most effectively be accomplished by incorporating the knowledge obtained from the observational literature with data obtained from randomized trials.

