One of the earliest reports of occupational cancer came from Sir Percival Pott in London in 1775. Around the start of the Industrial Revolution, Pott attributed an excess incidence of squamous cell carcinoma of the scrotum among young chimney sweeps to soot. Later studies confirmed that the chemical agents in soot (polycyclic aromatic hydrocarbons, particularly benzo[a]pyrene) were associated with scrotal cancers.1 An identical syndrome was recognized among a group of coal tar and paraffin workers in the latter part of the 19th century. By the first half of the 20th century, there were several reports of respiratory cancer among those working in settings such as nickel refineries, coal carbonization processes, and asbestos products manufacturing.2 Since then, researchers have expanded the list of known or suspected human carcinogens, and widespread concern about the toxicity of asbestos has grown among asbestos workers and members of the general public.3–6 Increased concern regarding the association between the workplace and cancer has prompted innumerable investigations to determine whether specific cancer types among workers may have a common cause or may be a coincidental occurrence.7

In 2017, the International Agency for Research on Cancer (IARC) identified 47 occupational carcinogens as cancer hazards in the workplace, an increase from 28 occupational carcinogens in 2004.8 Cancers are more likely to develop after occupational exposures to the following carcinogens: polycyclic aromatic hydrocarbons (PAHs), heavy metals, tobacco smoke, silica and asbestos dusts, radiation, and chemical by-products of certain industrial processes.9 Several occupations are characterized as having high cancer risks, including construction work; firefighting; furniture making; painting; agricultural work; working in the iron, steel, coal, and rubber industries; and shoe manufacture or repair.9 Because of the specialized and repetitive nature of these types of work, occupational exposures often involve chronic and repeated contact with harmful substances at higher doses than the average consumer would normally encounter.7 For example, firefighters appear to have higher rates of respiratory, digestive, and urinary cancers when compared to the general US population, likely due to repeated exposure to combustion by-products and fire-extinguishing chemicals.10 According to the International Association of Fire Fighters, cancer was responsible for two-thirds of line-of-duty deaths from 2002 through 2019.11 In recognition of the association between the firefighting occupation and cancer, North Carolina Governor Roy Cooper signed House Bill 535 (known as The Firefighters Fighting Cancer Act of 2021) into law, sending much-needed financial help to firefighters diagnosed with cancer.12 However, Florida’s cancer registry recently reported on the challenges of industry and occupation (I/O) reporting among firefighters, especially female firefighters.13

The Role of the North Carolina Central Cancer Registry

The North Carolina Central Cancer Registry (NC CCR) is a population-based reporting system that serves as the sole repository of complete cancer incidence data for the State of North Carolina. Population-based cancer surveillance is critical for cancer control activities aimed at reducing the morbidity and mortality of cancer, the second leading cause of death in the United States.14 The NC CCR collects, processes, and analyzes data on cancer cases diagnosed among North Carolina residents. This information furthers our understanding of cancer and is used to develop strategies and policies for its prevention, treatment, and control. The availability of data on cancer in the state allows health care providers, public health officials, epidemiologists, legislators, researchers, medical students, and others to analyze demographic and geographic factors that affect cancer risk, early detection, and effective treatment of cancer patients.

I/O data specifically are used to estimate cancer burden by industry and occupation, identify industries and occupations at high risk for cancer, generate hypotheses about occupational risk factors for further research, guide etiologic and intervention research and practices, serve as additional measures of socioeconomic status, and help identify worksite-related groups in which cancer screening or prevention activities may be beneficial. The data also help determine where early detection, educational, and other cancer-related programs should be directed. Partners developing these programs include the North Carolina Cancer Prevention and Control Branch, the North Carolina Advisory Committee on Cancer Coordination and Control, the North Carolina Comprehensive Cancer Control Program, and the North Carolina Breast and Cervical Cancer Control Program. The NC CCR plays an important role in developing data-driven objectives for the North Carolina Comprehensive Cancer Control Action Plan15 and continues to serve as the key source of population-based data to evaluate cancer control efforts in North Carolina.

Cancer is a reportable disease in every state in the United States. In North Carolina, cancer has been a reportable disease since 1947. North Carolina General Statute, Chapter 130A, Article 7 specifies that all health care facilities, hospitals, physician offices, and providers that detect, diagnose, or treat cancer or non-malignant brain or central nervous system tumors must report eligible cases to the NC CCR within six months of diagnosis.16 The Cancer Registries Amendment Act, US Public Law 102-515 (1992), requires the collection of “information on the industrial or occupational history of the individuals with the cancers, to the extent such information is available from the same record”.17 For each diagnosis of cancer, the NC CCR collects detailed information on the diagnosis, such as the anatomic site of the tumor, stage at diagnosis, cell type of the cancer, and first course of treatment following initial diagnosis. The NC CCR also collects demographic information on each patient, such as age at diagnosis, gender, ethnicity, race, residence at diagnosis, place of birth, occupation, and industry.

The NC CCR has contributed data to identify and verify cancer cases in longitudinal cohort studies that examine environmental and workplace exposures that may be associated with cancer risk. Two such studies with widely published research are the Agricultural Health Study (AHS)18–20 and the World Trade Center (WTC) Health Program.21 AHS researchers noted links to kidney cancer and aggressive prostate cancers with pesticide use18–20 and suggested that exposure to aromatic amine pesticides such as imazethapyr may also increase the risk of colon and bladder cancers among farmers.19 After the WTC attacks on September 11, 2001, the WTC Health Program Registry was formed to provide an avenue for long-term research on individuals who self-identified as being exposed to the 9/11 disaster.21

Epidemiologic investigations are often designed around a suspected link between a specific occupation and certain types of cancer. Individuals within a given occupation are identified and then tracked to determine if cancer develops. Cancer registries are pivotal to supplying data on cancer diagnoses for these studies through linkages with the study cohort. Some health care providers understand the importance of collecting industry and occupation (I/O) data from patients and reporting this information to cancer registries. However, when I/O information is not reported, missing I/O data may impede the identification of other possible occupational cancer risks. If I/O data in cancer registries were more complete, researchers could use the surveillance data in place of a population-based retrospective cohort study to find out what occupations/industries are represented among patients in the cancer registry. Currently, it is difficult to conduct hypothesis-driven occupational health research using cancer registry data to study patients with known cancer.

Study Purpose

The objective of this paper was to examine and summarize industry and occupation (I/O) data reporting completeness of the top cancer types among patients in North Carolina. To our knowledge, no peer-reviewed study to date has used quantitative data to evaluate the completeness of I/O data collected by the NC CCR.

Methods

The patient’s usual industry and occupation (I/O) are required fields for every case reported to the NC CCR. Usual I/O are collected using two mutually exclusive free-text fields. Usual industry is defined as the primary activity conducted by that business. Data reporters are asked to identify the relevant component if an industry performs more than one component, such as a business that conducts manufacturing, wholesale, retail, and service activities. If the primary activity of a business where the patient worked is unknown, the name of the company is recorded. Usual occupation is defined as the kind of work performed during most of the patient’s working life before a diagnosis of cancer. If the usual I/O are not found in the medical record documentation, the current I/O may be recorded. If no information is available, the term “Unknown” is recorded.

The authors evaluated the percentage of missing I/O data as a measure of completeness. Industry or occupation data specified as “Unknown,” “No information,” “Not available,” “Not documented,” “Not on file,” “None,” “Not listed,” and blank data fields were considered as “missing” (I/O data not reported). The ‘Industry Reported’ frequencies and percentages (Table 1) were computed for both 2020 and 2021 diagnosis years. To determine the number of cases for which the industry was reported, the number of cases for which the industry was missing was subtracted from the total number of cases for the given year of diagnosis. A similar procedure was used for the ‘Occupation Reported’ variable using the frequency of ‘Usual Occupation.’
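The completeness calculation described above can be sketched as follows. This is a minimal illustration in Python rather than the SAS code used for the analysis; the function names and sample values are hypothetical, and the case-insensitive, whitespace-trimmed matching of the "missing" terms is an assumption.

```python
# Terms treated as "missing" in the Methods. Matching is assumed to be
# case-insensitive after trimming surrounding whitespace.
MISSING_TERMS = {
    "unknown", "no information", "not available", "not documented",
    "not on file", "none", "not listed",
}


def is_missing(value):
    """Return True if a free-text I/O field counts as not reported."""
    if value is None:
        return True
    text = value.strip().lower()
    return text == "" or text in MISSING_TERMS


def reported_summary(values):
    """Compute reported cases as total cases minus missing cases."""
    total = len(values)
    missing = sum(1 for v in values if is_missing(v))
    reported = total - missing
    percent = 100.0 * reported / total if total else 0.0
    return {
        "total": total,
        "missing": missing,
        "reported": reported,
        "percent_reported": percent,
    }


# Hypothetical Usual Industry entries for five cases
sample = ["Furniture manufacturing", "Unknown", "", "Not on file", "Fire department"]
summary = reported_summary(sample)
```

The same function could be applied to the Usual Occupation field, once per diagnosis year, to produce the 'Occupation Reported' frequencies and percentages.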

To promote consistency in the collection of I/O terms, the NC CCR requires that data reporters follow the guidelines specified in A Cancer Registrar’s Guide to Collecting Industry and Occupation.22 This guide specifies consistent terminology for special circumstances such as minors, homemakers, military personnel, self-employed and retired individuals, and cases in which the usual occupation or industry is not known. The 2021 NC CCR I/O cancer data were evaluated to determine the percentage of cases with complete I/O data for each of the top 10 cancer types diagnosed. At the time of analysis, 2021 was the most recent diagnosis year finalized for data release and use. The cancer types evaluated were cancers of the colon, lung, prostate, breast, bladder, kidney, corpus uteri, and pancreas; melanoma; and non-Hodgkin’s lymphoma. Inclusion criteria were 1) individuals diagnosed with cancer and 2) individuals who were at least 15 years old and therefore eligible to work according to North Carolina labor laws.23

North Carolina Vital Statistics started collecting I/O fields on death certificates in 2020. We conducted a subanalysis to examine the potential for using death certificate data to improve the completeness of cancer data. Cancer cases in the NC CCR diagnosed from 1995 to 2021 were linked to death certificate data for deaths occurring from 2020 through 2022 using LinkPlus, a probabilistic linkage software program. The key matching variables used were first name, last name, middle name, social security number, date of birth, race, and address at the time of diagnosis. I/O data were analyzed for completeness separately for each data source. All data analyses were conducted using SAS 9.4.
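To illustrate the general idea behind probabilistic linkage, the sketch below scores a candidate record pair using agreement weights in the style of the Fellegi-Sunter model. It does not describe how LinkPlus works internally; the m and u probabilities, the field names, the subset of matching variables, and the sample records are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical probabilities for each matching variable:
# m = probability the field agrees among true matches,
# u = probability the field agrees among non-matches.
FIELD_PROBS = {
    "first_name": (0.95, 0.01),
    "last_name": (0.96, 0.005),
    "ssn": (0.98, 0.0001),
    "date_of_birth": (0.97, 0.001),
}


def match_weight(cancer_rec, death_rec):
    """Sum log2 agreement or disagreement weights over fields present in both records."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = cancer_rec.get(field), death_rec.get(field)
        if not a or not b:
            continue  # a missing value contributes no evidence either way
        if a.strip().lower() == b.strip().lower():
            total += math.log2(m / u)  # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


# Hypothetical records (not real identifiers)
cancer = {"first_name": "Ann", "last_name": "Lee",
          "ssn": "000-00-0001", "date_of_birth": "1950-02-03"}
same_person = {"first_name": "ann", "last_name": "Lee",
               "ssn": "000-00-0001", "date_of_birth": "1950-02-03"}
other_person = {"first_name": "Bob", "last_name": "Ray",
                "ssn": "000-00-0002", "date_of_birth": "1961-07-09"}

score_same = match_weight(cancer, same_person)
score_other = match_weight(cancer, other_person)
```

In a scheme like this, pairs scoring above a chosen threshold would be accepted as matches, and borderline pairs would typically be set aside for manual review.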

Results

Data completeness was low, with 18% to 47% of patients having occupation data and 19% to 51% of patients having industry data. For the cancer types evaluated, the maximum age was 102 years, and only 0.22% were aged 18 years or younger, largely eliminating younger individuals who may not have entered the workforce as a contributor to missing I/O data. For 2020 and 2021 data, more than 50% of the cases for the cancer types evaluated in the NC CCR database were missing Usual Occupation data (Figure 1). The cancer type with the highest percentage missing in the Usual Occupation field was melanoma (82%). Similarly, more than 50% of the cases were missing Usual Industry data for every cancer type evaluated except female breast cancer (49%), as shown in Figure 1. Melanoma also had the highest percentage (81%) of missing data in the Usual Industry field.

When cancer records from the NC CCR were linked with death certificate records from NC Vital Statistics, 20,018 deaths (18%) out of 109,677 total deaths in 2020 were matched with cancer records. In 2021, 20,225 (16%) out of 122,066 total deaths matched. In 2022, 20,367 (17%) out of 116,406 total deaths matched with cancer records from the NC CCR. I/O data were included in about 41% of the 2020 death records, increasing to 99.9% in the 2021 and 2022 death records following the implementation of the North Carolina electronic death record system.

Discussion

Our evaluation of NC CCR data shows that the completeness of I/O data varies greatly by cancer type. Melanoma had the lowest percentage of complete data. A large percentage of melanoma cases in NC CCR data are reported by physician offices only, where collecting I/O from the patient may not be a priority. All health care providers can play a role in improving the completeness of I/O data by including this information in patients’ health records.

NC CCR’s ability to contribute to studies on occupational exposure and cancer risk relies on complete cancer case reporting and accurate I/O information. Though required for all cancer diagnoses, the collection of I/O data from the patient is not standard practice by health care providers, hindering the ability to accurately study the relationship between occupation and cancer risk. As I/O is not a standard data field in the patients’ health records, it is often not available to the data reporter for reporting to the NC CCR.

I/O information is usually extracted through a thorough review of physician or nurse notes within the health record. I/O data are more likely to be noted in a patient’s health record if the provider or patient believes that the cancer development is related to an occupational exposure. Selective reporting may create a biased picture of the actual relationship between cancer outcomes and occupation, where evidence for well-known associations between occupational exposures and cancer outcomes may be reinforced.

New etiological discoveries may be obscured for cancers with a rapid progression and high mortality such as pancreatic cancer. Long-term cancer cohort studies are time-consuming and expensive and can be burdensome to already suffering patients. In contrast, if cancer registries could accurately collect I/O for all cancer patients, cancer researchers would have a wealth of data collected with minimal staff and patient burden. Having a complete picture of cancer patients’ exposures would allow researchers to gain a deeper understanding of potential causal factors for cancer types for which we have weak or uncertain causal information, such as pancreatic cancer.

Collecting information on a patient’s I/O is also helpful for providers treating and preventing workplace exposures. For example, several researchers have identified shift work as a risk factor for some forms of cancer, specifically breast cancer.24–26 An IARC Working Group reported data from 8 studies designed to assess the relationship between shift work and the risk of breast cancer. The data revealed a modest increase in risk among women who worked night shifts for an extended period of time, as compared to women who worked daytime hours.25 Occupations affected by shift work include cosmetologists, flight attendants, nurses, and agricultural workers.26

By asking about a cancer patient’s I/O, we can get better insight into their exposure and the conditions under which it occurred, which may impact their treatment options. Even before a cancer diagnosis, gathering information on a patient’s I/O can help a provider understand where to focus patient education for cancer prevention and can lead to early detection by alerting the provider to a need to remain vigilant regarding the development of related cancers.

Resources and Solutions

Armenti and colleagues (2010) demonstrated that focused training for registrars on the importance of collecting I/O data improves cancer registry I/O data completeness.27 Through a partnership with the Wake Area Health Education Center (AHEC), the North Carolina Division of Public Health offers free, archived trainings with contact hour credits for health care providers on the collection of I/O data fields (see the North Carolina Department of Health and Human Services, Division of Public Health’s Occupational and Environmental Epidemiology webinars and trainings at https://epi.dph.ncdhhs.gov/oee/trainings.html). Additionally, a tip sheet on the nuances of collecting these data is available (see the Division of Public Health’s Tip Sheet for Collecting Occupation and Industry Data at https://epi.dph.ncdhhs.gov/oee/oii/docs/TipSheet_IndOccupation_FINAL.pdf).

Electronic health record (EHR) vendors also have a role in this data quality improvement initiative; they can routinely include I/O as essential demographic fields in their EHR platforms. Having a dedicated place to enter this information not only emphasizes the importance of collecting these fields but also saves data extractors time, because they can locate the information without searching unwieldy notes fields.

Limitations

The data presented here only examine the completeness of the top 10 cancer types, which the NC CCR focuses on for regular, in-depth analyses. Diseases such as mesothelioma and silicosis are highly associated with occupational exposure but are rare in North Carolina. Examining the completeness of I/O for mesothelioma, a cancer known to be related to occupational exposure to asbestos, may yield an interesting comparison for which one would expect near 100% compliance with collecting these variables.

In addition, since cancer cases are not reported to the NC CCR until the patient has completed their diagnosis and treatment management plan, we only conducted the analysis on 2020 and 2021 cancer incidence data. Fortunately, I/O are now being collected on death certificates in North Carolina due to the 2020 implementation of the NC Database Application for Vital Events (DAVE), an electronic vital record reporting system. The National Institute for Occupational Safety and Health’s (NIOSH) National Occupational Mortality Surveillance (NOMS) program advocates for the reporting of I/O data on death certificates. However, a prior study suggested that 30%–50% of I/O data on death certificates were misclassified compared to I/O collected prior to death via interviews with cancer patients.28 An examination of agreement between current employment at the time of death and employment at the time of cancer diagnosis would be an important next step to determine whether death certificate data are reliable substitutes in the absence of cancer registry data.

Usual or longest period of employment may be more relevant to disease development and progression than employment at time of death or even diagnosis. Other exposures may occur from hobbies, volunteer/unpaid activities, and temporary or part-time work assignments, which may also impact a person’s exposure profile. Lastly, some missing information on I/O may be informative, such as whether the person is not working due to being retired, a homemaker, a student, disabled, or unemployed. A blank field, in contrast, gives no information about the person’s exposures.

Conclusion

For researchers to identify an increased risk of work-related, adverse cancer outcomes, I/O completeness in cancer registry data needs to be improved. Patient registration staff, clinical staff, and providers should routinely ask cancer patients about their I/O and record it in the EHR. Vendors should include I/O as essential demographic fields in their EHR platforms to promote uniform categorization and data retrieval. Data reporters should apply best practices to improve record review procedures and documentation completeness in their cancer data reporting. Training resources are now available to the health care community on how to collect I/O data. Linkages with external occupational data sources continue to be essential for conducting studies among at-risk populations. For researchers seeking supplemental data on I/O of people diagnosed with cancer, death records may be a valuable source of I/O data. A more complete understanding of population-level cancer trends by I/O can inform evidence-based occupational health interventions, regulations, and policies to prevent occupational cancers and make workplaces safer for everyone in North Carolina.


Acknowledgments

The collection and analysis of cancer registry data was supported by funding from the Centers for Disease Control and Prevention (CDC) National Program of Cancer Registries (NPCR) Award Number NU58DP007121-02. The project described was also supported by Award Number 5 NU50CK000530-04-00 from the National Institute for Occupational Safety and Health (NIOSH) through the CDC. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIOSH/CDC.

Declaration of interests

The authors declare no conflicts of interest related to this manuscript.