A New Method of Identifying Pathologic Complete Response After Neoadjuvant Chemotherapy for Breast Cancer Patients Using a Population-Based Electronic Medical Record System.

Title: A New Method of Identifying Pathologic Complete Response After Neoadjuvant Chemotherapy for Breast Cancer Patients Using a Population-Based Electronic Medical Record System.
Publication Type: Journal Article
Year of Publication: 2023
Authors: Wu G, Cheligeer C, Brisson A-M, Quan MLynn, Cheung WY, Brenner D, Lupichuk S, Teman C, Basmadjian RBarkev, Popwich B, Xu Y
Journal: Ann Surg Oncol
Volume: 30
Issue: 4
Pagination: 2095-2103
Date Published: 2023 Apr
ISSN: 1534-4681
Keywords: Algorithms, Breast Neoplasms, Electronic Health Records, Female, Humans, Neoadjuvant Therapy, Retrospective Studies
Abstract

BACKGROUND: Accurate identification of pathologic complete response (pCR) from population-based electronic narrative data in a timely and cost-efficient manner is critical. This study aimed to derive and validate a set of natural language processing (NLP)-based machine-learning algorithms to capture pCR from surgical pathology reports of breast cancer patients who underwent neoadjuvant chemotherapy (NAC).

METHODS: This retrospective cohort study included all invasive breast cancer patients who underwent NAC and subsequent curative-intent surgery during their admission at all four tertiary acute care hospitals in Calgary, Alberta, Canada, between 1 January 2010 and 31 December 2017. Surgical pathology reports were extracted and processed with NLP. Decision tree classifiers were constructed and validated against chart review results. Machine-learning algorithms were evaluated with a set of performance metrics including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, area under the receiver operating characteristic curve (AUC), and F1 score.
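
The general idea of classifying pathology reports with a decision tree over text features can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the abstract does not describe the features, preprocessing, or tree depth used. The sketch fits a depth-one decision tree (a decision stump) on token-presence features by minimizing weighted Gini impurity. The report snippets, labels, and all function names are invented for the example.

```python
import re

def tokenize(text):
    """Lowercase a report and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def majority(labels):
    """Majority class of a non-empty list of 0/1 labels (ties go to 1)."""
    return int(2 * sum(labels) >= len(labels))

def fit_stump(reports, labels):
    """Choose the token whose presence/absence split has the lowest weighted Gini impurity."""
    token_sets = [tokenize(r) for r in reports]
    best = None
    for tok in sorted(set().union(*token_sets)):
        present = [y for s, y in zip(token_sets, labels) if tok in s]
        absent = [y for s, y in zip(token_sets, labels) if tok not in s]
        if not present or not absent:
            continue  # token does not split the data
        score = (len(present) * gini(present) + len(absent) * gini(absent)) / len(labels)
        if best is None or score < best[0]:
            best = (score, tok, majority(present), majority(absent))
    if best is None:
        raise ValueError("no token splits the training reports")
    _, tok, if_present, if_absent = best
    return tok, if_present, if_absent

def predict(stump, report):
    """Return 1 (pCR) or 0 (no pCR) for one report."""
    tok, if_present, if_absent = stump
    return if_present if tok in tokenize(report) else if_absent

# Invented toy snippets (NOT real pathology reports); 1 = pCR, 0 = residual disease.
train_reports = [
    "No residual invasive carcinoma identified. Treatment effect present.",
    "Residual invasive ductal carcinoma, 12 mm. Partial treatment response.",
    "No residual invasive or in situ carcinoma. Lymph nodes negative.",
    "Invasive ductal carcinoma with minimal treatment effect.",
]
train_labels = [1, 0, 1, 0]

stump = fit_stump(train_reports, train_labels)
predictions = [predict(stump, r) for r in train_reports]
```

On four toy snippets a single token separates the classes perfectly, which says nothing about real-world performance. In practice a model like this would be validated against chart review on held-out reports, as the study describes.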

RESULTS: The study included 351 female patients. Of these patients, 102 (29%) achieved pCR after NAC. The high-sensitivity model achieved a sensitivity of 90.5% (95% confidence interval [CI], 69.6-98.9%), a PPV of 76% (95% CI, 59.6-87.2%), an accuracy of 88.6% (95% CI, 78.7-94.9%), an AUC of 0.891 (95% CI, 0.795-0.987), and an F1 score of 82.61. The high-PPV algorithm reached a sensitivity of 85.7% (95% CI, 63.7-97%), a PPV of 81.8% (95% CI, 63.4-92.1%), an accuracy of 90% (95% CI, 80.5-95.9%), an AUC of 0.888 (95% CI, 0.790-0.985), and an F1 score of 83.72. The high-F1 score algorithm obtained a performance equivalent to that of the high-PPV algorithm.
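
To make the relationship between the reported metrics concrete, the sketch below computes them from confusion-matrix counts. The counts are an assumption: the abstract does not report them, and they were back-calculated here as one set of integers (on a hypothetical 70-patient validation set with 21 pCR cases) that reproduces the reported point estimates. The function and variable names are illustrative.

```python
def metrics(tp, fp, fn, tn):
    """Compute point metrics (percentages, except balanced_accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": 100 * sensitivity,
        "specificity": 100 * specificity,
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        "accuracy": 100 * (tp + tn) / (tp + fp + fn + tn),
        # F1 is the harmonic mean of PPV and sensitivity, i.e. 2TP / (2TP + FP + FN).
        "f1": 100 * 2 * tp / (2 * tp + fp + fn),
        # For a classifier that outputs hard labels, the ROC curve has a single
        # operating point and its AUC equals (sensitivity + specificity) / 2.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }

# ASSUMED counts, inferred from the reported percentages; not given in the abstract.
high_sensitivity = metrics(tp=19, fp=6, fn=2, tn=43)
high_ppv = metrics(tp=18, fp=4, fn=3, tn=45)

for name, m in [("high-sensitivity", high_sensitivity), ("high-PPV", high_ppv)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Under these assumed counts, the rounded sensitivity, PPV, accuracy, and F1 values match the figures reported above, and the balanced accuracy matches the reported AUCs (0.891 and 0.888). The confidence intervals are not reproduced here, since the abstract does not state which interval method was used.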

CONCLUSION: The developed algorithms demonstrated excellent accuracy in identifying pCR from surgical pathology reports of breast cancer patients who received NAC treatment.

DOI: 10.1245/s10434-022-12955-6
Alternate Journal: Ann Surg Oncol
PubMed ID: 36542249
PubMed Central ID: 7299787
