Introduction

Lung cancer remains a major public health challenge in North Carolina, with incidence and mortality rates among the highest in the United States. According to the Centers for Disease Control and Prevention (CDC), the age-adjusted rate of lung and bronchus cancer in North Carolina is 57.3 per 100,000 people (95% confidence interval [CI]: 56.1–58.6). Lung cancer is the leading cause of cancer-related deaths in the state,1 with significant variation in incidence, stage at diagnosis, and mortality across North Carolina’s 100 counties. The proportion of lung cancer cases diagnosed at a distant stage ranges from 35.1% to 60.9%, and in 15 counties, more than 50% of cases are diagnosed at an advanced stage.1 This late-stage diagnosis is strongly associated with increased lung cancer mortality at the county level.2

Smoking remains the primary risk factor for lung cancer and is strongly associated with socioeconomic deprivation.3 Evidence suggests that individuals living in socioeconomically disadvantaged communities are more likely to initiate smoking,4 face barriers to cessation,5 and experience higher overall smoking prevalence.6 Socioeconomic disadvantage—characterized by low income, limited education, and poor living conditions—is closely related to social determinants of health (SDOH), which significantly influence health outcomes.7

Lung cancer mortality decreased persistently each year from 1990 to 2020.8 However, individuals who reside in rural counties have not experienced a decrease in age-adjusted lung cancer mortality to the same degree as other demographic groups.8 Individuals who smoke and reside in rural counties with persistent poverty experience the highest lung cancer mortality.9 Research has rarely explored how the intersection of socioeconomic disadvantage, rurality, and smoking affects the incidence and mortality of lung cancer.

The objective of this study was to examine the association between county-level socioeconomic deprivation, rurality, and lung cancer incidence in North Carolina and to assess the extent to which adult smoking prevalence mediates this relationship. By quantifying the proportion of the deprivation–lung cancer association that can be explained by smoking, we aim to better understand how behavioral risk factors interact with structural determinants to shape the geographic distribution of lung cancer burden.

Methods

This study is a cross-sectional analysis utilizing publicly available county-level data from multiple sources. Demographic variables, including percent Black and percent female, were obtained from the Area Health Resource Files (AHRF, 2020–2021).10 Adult smoking prevalence was obtained from the Robert Wood Johnson Foundation County Health Rankings (2021).11 Cancer incidence data, as well as the rural-urban continuum variable (Urban or Rural, 2023),12 were obtained from the CDC State Cancer Profiles, which provides age-adjusted lung cancer incidence rates (per 100,000 people per year) for 2017–2021, along with average annual counts of total lung cancer cases and late-stage diagnoses in North Carolina. One county was excluded from the analysis because the CDC suppressed its late-stage cancer counts owing to a low number of cases, leaving a total of 99 counties for analysis.

Late-stage lung cancer was defined as cases diagnosed at the regional or distant stage, consistent with CDC and Surveillance, Epidemiology, and End Results (SEER) Program staging conventions.13 The Rural-Urban Continuum Codes (RUCC) variable obtained from the CDC State Cancer Profiles website is based on the U.S. Department of Agriculture’s nine-point Rural–Urban Continuum Codes, with RUCC codes 1–3 classified as Urban and codes 4–9 as Rural. All data used in this study are publicly available, and in accordance with guidance from the Office for Human Research Protections and the University and Medical Center Institutional Review Board of East Carolina University, this work was deemed not human subjects research under 45 CFR 46.
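The urban/rural dichotomy described above can be illustrated with a minimal Python sketch. The function name `classify_rucc` is hypothetical and is not part of any CDC or USDA tool; only the cut points (codes 1–3 Urban, codes 4–9 Rural) come from the text.

```python
def classify_rucc(code: int) -> str:
    """Collapse a nine-point Rural-Urban Continuum Code into Urban or Rural.

    Hypothetical helper for illustration; the cut points follow the
    classification stated in the Methods (1-3 Urban, 4-9 Rural).
    """
    if not 1 <= code <= 9:
        raise ValueError(f"RUCC must be between 1 and 9, got {code}")
    return "Urban" if code <= 3 else "Rural"


# Show the label assigned to each of the nine codes.
for rucc in range(1, 10):
    print(rucc, classify_rucc(rucc))
```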

The primary exposure was the County Deprivation Index (CDI), a validated composite measure of socioeconomic disadvantage that incorporates indicators of income, education, and employment.14 Higher CDI values indicate greater socioeconomic deprivation. The CDI, recently validated in a national study, demonstrated strong predictive validity for multiple health outcomes—including cancer and lung cancer mortality—and outperformed other established indices when explaining county-level variation in health disparities.

The primary outcome variables were: 1) age-adjusted incidence rate of all-stage lung cancer, per 100,000 population (2017–2021); and 2) age-adjusted incidence rate of late-stage lung cancer (regional or distant stage), per 100,000 population (2017–2021). Both outcomes were modeled as continuous variables and reflect the overall burden of disease and the extent of delayed diagnosis at the county level. Age-adjusted rates account for differences in population-age structures across counties.
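To illustrate why age adjustment matters, the sketch below applies direct standardization, a common way to compute age-adjusted rates. It is an assumption that this mirrors the procedure behind the published rates used here; the function name, age bands, counts, and standard-population weights are all made up for the example.

```python
def age_adjusted_rate(cases, population, std_weights, per=100_000):
    """Directly standardized rate: weighted sum of age-specific rates.

    cases, population: counts per age band for one county (hypothetical).
    std_weights: standard-population weights per age band (hypothetical).
    """
    if not (len(cases) == len(population) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age band")
    total_weight = sum(std_weights)
    rate = 0.0
    for c, p, w in zip(cases, population, std_weights):
        rate += (w / total_weight) * (c / p)  # weight x age-specific rate
    return rate * per


# Hypothetical county with three age bands (young, middle, old).
cases = [2, 30, 90]
population = [40_000, 30_000, 15_000]
std_weights = [0.55, 0.30, 0.15]  # assumed standard age distribution

crude = sum(cases) / sum(population) * 100_000
adjusted = age_adjusted_rate(cases, population, std_weights)
print(f"crude rate: {crude:.1f}  age-adjusted rate: {adjusted:.1f}")
```

Because the weights are the same for every county, two counties with identical age-specific rates receive the same adjusted rate even if one has an older population.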

Continuous variables were summarized using the mean and standard deviation (SD), overall and stratified by urban versus rural counties. Group comparisons were conducted using general linear models. To estimate the association between socioeconomic deprivation and lung cancer incidence, we fit linear mixed-effects models using PROC MIXED. We included a random intercept for rural-urban classification (RUCC) to account for differences in baseline lung cancer incidence between rural and urban counties. All models were adjusted for percent Black and percent female and were fit separately for all-stage and late-stage lung cancer incidence.

We conducted an exploratory causal mediation analysis using PROC CAUSALMED to evaluate whether adult smoking prevalence statistically mediates the association between CDI and lung cancer incidence. Although labeled as “causal,” the analysis was not designed to establish a definitive cause-and-effect relationship. Rather, it serves as a statistical decomposition of the total effect into direct and indirect pathways under a set of strong assumptions, including no unmeasured confounding between the exposure, mediator, and outcome. Therefore, results should be interpreted as hypothesis-generating and descriptive of potential mechanisms, not as confirmation of mediation in a causal sense.

CDI was specified as the exposure, smoking prevalence as the mediator, and age-adjusted incidence as the continuous outcome. Models assumed a normal distribution and identity link for both the mediator and outcome. Covariates in both the mediator and outcome models included RUCC, percent Black, and percent female.

We estimated the total effect, natural direct effect (NDE), natural indirect effect (NIE), and the percentage of the total effect mediated by smoking prevalence. Confidence intervals were derived using 1000 bootstrap resamples with a fixed random seed. Results are presented as fixed-effect estimates from the mixed-effects models and as decomposed mediation effects for each outcome.
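The decomposition described above can be sketched in Python under simplifying assumptions. With linear models and no exposure–mediator interaction, the indirect effect reduces to the product of the exposure coefficient in the mediator model and the mediator coefficient in the outcome model. This sketch is not PROC CAUSALMED: it omits the covariates (RUCC, percent Black, percent female), uses a percentile bootstrap, and runs on synthetic data whose parameters are arbitrary. All function and variable names are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance; cov(u, u) is the sample variance."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def decompose(x, m, y):
    """Return (total, direct, indirect) from OLS fits of m ~ x and y ~ x + m."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx                               # exposure -> mediator slope
    det = sxx * smm - sxm ** 2
    direct = (sxy * smm - smy * sxm) / det      # exposure slope given mediator
    b = (smy * sxx - sxy * sxm) / det           # mediator slope given exposure
    indirect = a * b
    return direct + indirect, direct, indirect


def bootstrap_ci(x, m, y, n_boot=1000, seed=12345, alpha=0.05):
    """Percentile bootstrap CI for the indirect effect, resampling units."""
    rng = random.Random(seed)                   # fixed seed for reproducibility
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        estimates.append(decompose(xs, ms, ys)[2])
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[round(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Synthetic data only: 99 units, arbitrary parameters, not the study data.
data_rng = random.Random(0)
x = [data_rng.gauss(0.3, 0.8) for _ in range(99)]              # "deprivation"
m = [22 + 3.0 * xi + data_rng.gauss(0, 1.5) for xi in x]       # "smoking %"
y = [25 + 4.0 * xi + 1.7 * mi + data_rng.gauss(0, 6.0)
     for xi, mi in zip(x, m)]                                  # "incidence"

total, direct, indirect = decompose(x, m, y)
ci_low, ci_high = bootstrap_ci(x, m, y)
print(f"total={total:.2f} direct={direct:.2f} indirect={indirect:.2f}")
print(f"indirect 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"percent mediated: {100 * indirect / total:.1f}%")
```

In this linear setting the total effect equals the slope from regressing the outcome on the exposure alone, so the direct and indirect pieces always sum to it.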

All statistical tests were two-sided, and significance was defined as P < .05. Analyses were performed using SAS Statistical Software version 9.4 (SAS Institute, Cary, NC).

Results

The average annual age-adjusted incidence rate of all-stage lung cancer across these counties was 66.5 cases per 100,000 (SD: 10.6), with rates ranging from 44.6 to 91.8 cases per 100,000. The average annual age-adjusted incidence rate of late-stage lung cancer was 45.4 per 100,000 people (SD: 4.8), with a range of 26.7 to 63.5 per 100,000 people. Table 1 summarizes and compares key county-level characteristics of the study cohort in North Carolina.

Table 1. Comparison of Rural and Urban Counties in North Carolina: Lung Cancer Burden, Demographic, Socioeconomic, and Healthcare Access
Variable | Rural Counties | Urban Counties | P value | Overall
Number of counties | 54 | 45 | n/a | 99
Population (2019 estimate)a | 2,266,520 | 8,217,548 | n/a | 10,484,068
 | Mean (SD) | Mean (SD) | | Mean (SD)
Annual age-adjusted incidence rate of lung cancer (per 100,000 people) | 68.0 (10.5) | 64.7 (10.6) | .1242 | 66.5 (10.6)
Annual age-adjusted incidence rate of late-stage lung cancerb (per 100,000 people) | 46.7 (7.6) | 43.8 (8.6) | .0794 | 45.4 (4.8)
Percent late-stage lung cancer | 68.6 (4.8) | 67.6 (4.9) | .3107 | 68.1 (4.8)
Percentage of adult population who smokec | 23.3 (2.4) | 20.4 (2.7) | < .0001 | 22.0 (2.9)
Percentage of the population that is female | 50.8 (1.6) | 51.1 (1.4) | .3581 | 51.0 (1.5)
Percentage of the population that is Black/African American | 22.2 (18.9) | 18.3 (13.2) | .2307 | 20.4 (16.6)
CDI (County Deprivation Index) | 0.668 (0.697) | –0.104 (0.662) | < .0001 | 0.317 (0.781)

a. CAINC1 Personal Income Summary: Personal Income, Population, Per Capita Personal Income file, U.S. Bureau of Economic Analysis (BEA), Regional Economic Measurement Division. https://apps.bea.gov/regional/downloadzip.cfm
b. Late stage is defined as cases determined to be regional or distant.
c. County Health Rankings and Road Maps. 2021 County Health Rankings: State Reports. University of Wisconsin Population Health Institute; Robert Wood Johnson Foundation. Accessed March 18, 2025. http://www.countyhealthrankings.org/app/north-carolina/2019/overview

No statistically significant differences were observed between rural (n = 54) and urban counties (n = 45) in either all-stage (68.0 versus 64.7 cases per 100,000; P = .1242) or late-stage lung cancer incidence (46.7 versus 43.8 cases per 100,000; P = .0794). Demographic characteristics were similar across rural and urban counties, whereas smoking prevalence and socioeconomic conditions differed substantially. Rural counties had a significantly higher percentage of adults who smoke (23.3% versus 20.4%; P < .0001) and greater socioeconomic deprivation, as indicated by a higher mean CDI score (0.668 versus –0.104; P < .0001). Despite these differences, the proportion of lung cancer cases diagnosed at a late stage was similar between rural and urban counties: 68.6% (SD: 4.8%) versus 67.6% (SD: 4.9%), respectively (P = .3107). Thus, although rural counties had a numerically higher rate of late-stage lung cancer incidence, neither that rate nor the proportion of cases diagnosed at an advanced stage differed significantly by rurality.

Figure 1 presents county-level choropleth maps of North Carolina depicting the spatial distribution of the CDI, adult smoking prevalence, and all-stage lung cancer incidence. As shown in the top panel, socioeconomic deprivation, measured using the CDI,14 was highest in the eastern and south-central regions of the state. These same regions also exhibited elevated smoking prevalence (middle panel), with many counties reporting adult smoking rates exceeding 25%. The final panel shows the geographic distribution of all-stage lung cancer incidence, with higher rates similarly concentrated in eastern and southeastern counties. A strong positive correlation was observed between CDI and smoking prevalence (Pearson r = 0.851; P < .0001), indicating that more socioeconomically disadvantaged counties had substantially higher smoking rates.

Figure 1. County-Level Choropleth Maps of North Carolina Depicting the Spatial Distribution of the County Deprivation Index, Adult Smoking Prevalence, and All-Stage Lung Cancer Incidence

Multivariable Regression

County-level socioeconomic deprivation, as measured by the County Deprivation Index (CDI), was positively and significantly associated with both all-stage (β = 10.54, SE = 1.58; P < .0001) and late-stage lung cancer incidence (β = 8.4, SE = 1.2; P < .0001). These findings indicate that more socioeconomically disadvantaged counties experienced higher lung cancer incidence rates.

When smoking prevalence was added to the model, the effect estimates for CDI were substantially attenuated and no longer statistically significant. For all-stage lung cancer, the CDI estimate decreased to β = 3.87 (SE = 3.16; P = .224), while smoking prevalence showed a positive and statistically significant association (β = 1.73, SE = 0.71; P = .0168). A similar pattern of attenuation was observed for late-stage incidence.

Two-way interactions between CDI and smoking prevalence were not statistically significant for either all-stage (P = .460) or late-stage models (P = .576), supporting a uniform association across levels of smoking.

These results are consistent with the hypothesis that smoking prevalence may lie on the causal pathway linking socioeconomic deprivation to lung cancer burden.

Mediation Analysis

Exploratory causal mediation analysis revealed that smoking prevalence explained a substantial portion of the association between socioeconomic deprivation and lung cancer incidence. For all-stage lung cancer, a one-unit increase in the CDI was associated with an increase of 10.95 cases per 100,000 (95% CI: 8.33 to 13.51; P < .0001). Of this total effect, 6.73 cases per 100,000 (95% CI: 1.14 to 13.44; P = .0126) were attributable to smoking prevalence, representing 61.5% of the total effect (P = .0177). The remaining direct effect of CDI was 4.22 cases per 100,000 (95% CI: –3.01 to 10.68; P = .171). Although not statistically significant, the upper bound of the confidence interval suggests that up to 10.68 cases per 100,000 may be attributable to deprivation independent of smoking.

For late-stage lung cancer incidence, CDI was associated with an increase of 8.81 cases per 100,000 (95% CI: 6.89 to 10.79; P < .0001). The mediated effect through smoking accounted for 5.36 cases per 100,000 (95% CI: 1.11 to 9.80; P = .0073), or 60.9% of the total effect (P = .0106). The direct effect of CDI was 3.45 cases per 100,000 (95% CI: –1.62 to 8.36; P = .131), again with an upper bound suggesting a potential independent contribution of deprivation to lung cancer burden beyond smoking.
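As a quick arithmetic check on the reported decomposition, the sketch below recomputes the direct effect and the percentage mediated from the rounded point estimates quoted above. The helper name `percent_mediated` and the dictionary layout are hypothetical. Because the inputs are rounded to two decimals, the recomputed late-stage percentage comes out near 60.8% rather than the reported 60.9%, presumably a rounding difference.

```python
def percent_mediated(total, indirect):
    """Share of the total effect carried by the indirect pathway, in percent."""
    return 100.0 * indirect / total


# Rounded point estimates quoted in the text (cases per 100,000 per CDI unit).
reported = {
    "all-stage": {"total": 10.95, "indirect": 6.73},
    "late-stage": {"total": 8.81, "indirect": 5.36},
}

for outcome, est in reported.items():
    direct = est["total"] - est["indirect"]   # total = direct + indirect
    share = percent_mediated(est["total"], est["indirect"])
    print(f"{outcome}: direct = {direct:.2f}, mediated = {share:.1f}%")
```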

Discussion

Principal Findings

This study examined the relationship of county-level socioeconomic deprivation, rurality, and adult smoking prevalence with lung cancer incidence in North Carolina. Because lung cancer incidence is strongly associated with residence in rural counties with persistent poverty,9 we sought to understand how smoking prevalence may act as a potential mediator of lung cancer incidence. We found that counties with higher deprivation, as measured by the CDI,14 experienced significantly higher rates of both all-stage and late-stage lung cancer. These patterns were consistent across rural and urban settings. In the mediation analysis, adult smoking prevalence accounted for approximately 61.5% of the CDI effect on all-stage lung cancer incidence and 60.9% of the effect on late-stage incidence.

Smoking is recognized as a major contributor to the development of lung cancer, but lung cancer incidence has also been associated with persistent poverty9,15 and rurality,9,16 although the association with rurality is more controversial.16 To address the potential interactions among these variables, we used mediation analysis to disentangle the possible causal pathways. Mediation analysis decomposes an observed association between an exposure and an outcome into a portion that operates through an intermediate variable and a portion that does not. As anticipated, adult smoking prevalence explained a substantial portion of this association. Notably, however, the total effect of deprivation remained strong, and the upper bounds of the direct effect confidence intervals suggest that up to 10.68 cases per 100,000 for all-stage and 8.36 for late-stage disease may be attributable to deprivation independent of smoking. These findings underscore smoking as a key behavioral pathway through which structural disadvantage contributes to lung cancer burden, while also pointing to the broader, multifactorial influence of deprivation on cancer risk.

We found no significant interaction between CDI and smoking prevalence, supporting a uniform mediation pathway across levels of deprivation and strengthening the validity of our mediation estimates. The persistence of a potential direct effect, even after accounting for smoking, suggests that other mechanisms such as structural disadvantages—occupational hazards,17,18 residential environmental exposures,19 and housing instability20—along with barriers to early detection21 contribute to the incidence of lung cancer.

From a public health perspective, our findings reinforce the importance of targeted tobacco control efforts in socioeconomically disadvantaged areas. However, lasting progress will require interventions that address the upstream determinants of smoking behavior, including economic hardship,22,23 chronic stress,23 and limited access to cessation services.5 Structural interventions such as regulating tobacco retail density,24 enhancing Medicaid coverage for cessation programs,25,26 and investing in community-based prevention27 are likely to be most effective in reducing both smoking rates and lung cancer disparities.

In the present analysis, no statistically significant differences were observed in either overall or late-stage lung cancer incidence between rural and urban counties. The proportion of late-stage diagnoses was also similar, suggesting that disease stage at presentation is comparable across geographic settings. These findings indicate that geographic classification alone (rural versus urban) may not adequately capture the structural and socioeconomic factors driving lung cancer burden in North Carolina. Instead, the County Deprivation Index—a measure of area-level socioeconomic disadvantage—appears to provide a more sensitive indicator of disparities in lung cancer incidence. Interventions to reduce lung cancer burden should therefore prioritize addressing socioeconomic deprivation and tobacco exposure, rather than focusing solely on geographic location.

Study Limitations

This study has several limitations. First, its ecological design precludes individual-level inference. The associations observed at the county level may not reflect relationships at the individual level due to the potential for ecological fallacy—the error of making inferences about individual behaviors or traits based on group-level data—which can lead to incorrect conclusions about health outcomes. Accordingly, while we used causal mediation methods for analytic purposes, the results should be interpreted as statistical decompositions of potential pathways rather than evidence of individual-level causation.

Second, although the CDI is a validated measure of socioeconomic disadvantage, it may not fully capture all relevant structural determinants of health, such as housing conditions, environmental exposures (e.g., tobacco smoke or radon), or occupational risks (e.g., carcinogenic exposures), that may also influence lung cancer incidence. We explored the availability of county-level data for these potential confounders but found that comprehensive, comparable data across all counties were unavailable.

Third, our causal mediation analysis assumes a unidirectional pathway from socioeconomic deprivation to smoking prevalence to lung cancer incidence. While supported by theoretical and empirical literature, this assumption may oversimplify a more complex interplay of shared upstream determinants, cultural norms, and policy environments. The indirect effect of smoking should therefore be interpreted as a statistical decomposition, not definitive proof of mediation.

Fourth, smoking prevalence was measured at the county level using data derived from population estimates, which may obscure within-county variation and mask local policy or cultural effects. Moreover, the county-level smoking prevalence utilized in this analysis may not accurately reflect historical smoking prevalence patterns, as the long latency period of tobacco-associated lung cancer means that current prevalence may differ substantially from exposure levels decades prior to diagnosis.

Fifth, our analysis focused on lung cancer incidence rather than mortality or survival, which may be differentially affected by treatment access, stage at diagnosis, and comorbid conditions. Future research should evaluate whether the relationships observed here extend to outcomes across the cancer care continuum, including stage-specific survival, treatment initiation, and mortality. Finally, although sensitivity analyses were not performed, future studies could examine whether these associations persist using alternative data sources or different years of smoking prevalence estimates to assess the temporal stability of our findings.

Despite these limitations, this study has several important strengths. We applied a validated, composite measure of socioeconomic deprivation14 and used robust linear mixed-effects and causal mediation models to examine the relationship between structural disadvantage, smoking prevalence, and lung cancer incidence at the county level. While smoking is a known risk factor, our analysis quantifies the extent to which it mediates the association between deprivation and cancer burden, revealing that over 60% of the effect may be explained by smoking—but also that a substantial proportion remains unexplained. To our knowledge, this is one of the first studies to apply causal mediation methods to ecological cancer incidence data, offering a novel approach to disentangle behavioral and structural drivers of cancer disparities. Our findings underscore the value of combining analytic rigor with policy-relevant geographic data to guide targeted, multi-level public health interventions.

Conclusion

These results highlight the dual importance of addressing both behavioral and structural determinants of lung cancer. While reducing smoking in high-deprivation counties remains essential, additional efforts are needed to understand and intervene upon non-smoking-related pathways. Future research should explore environmental, occupational, and health care system contributors to lung cancer risk and examine how these relationships influence outcomes such as stage at diagnosis, survival, and mortality. Such research may also help identify at-risk populations beyond current smokers in North Carolina who do not meet current US Preventive Services Task Force recommendations for lung cancer screening28 but may nonetheless benefit from low-dose computed tomographic screening.29

The intersection of high smoking prevalence and high socioeconomic deprivation, as measured by the CDI, may warrant targeted interventions in North Carolina. Incorporating CDI data into state public health planning and Medicaid-expansion initiatives could guide allocation of screening resources, tobacco cessation efforts, and community outreach, ultimately reducing disparities in lung cancer outcomes across the state.


Financial support

This study was funded, in part, by the Brody Brothers Endowment Seed/Bridge Grant at East Carolina University, Greenville, North Carolina, USA.

Disclosure of interests

No relevant conflicts.