A New Classification of Benign, Premalignant, and Malignant Endometrial Tissues Using Machine Learning Applied to 1413 Candidate Variables

Benign normal (NL), premalignant (endometrial intraepithelial neoplasia, EIN), and malignant (cancer, EMCA) endometria must be precisely distinguished for optimal management. EIN was previously defined objectively by a regression model that incorporated manually traced histologic variables to predict clonal growth and cancer outcomes. Results from that early computational study were used to revise the subjective diagnostic criteria for endometrial precancer that are currently in use. Here we use automated feature segmentation and updated machine learning algorithms to develop a new classification algorithm. Endometrial tissue from 148 patients was randomly separated into 72-patient training and 76-patient validation cohorts encompassing all 3 diagnostic classes. We applied image analysis software to keratin-stained endometrial tissues to automatically segment whole-slide digital images into epithelium, cells, and nuclei and to extract the corresponding variables. A total of 1413 variables were culled to 75 based on random forest classification performance in a 3-group (NL, EIN, EMCA) model. This algorithm classifies cases with 3-class error rates of 0.04 (training set) and 0.058 (validation set), and 2-class (NL vs. EIN+EMCA) error rates of 0.016 (training set) and 0 (validation set). The 4 most heavily weighted variables are surrogates of those previously identified in manual-segmentation machine learning studies (stromal and epithelial area percentages, and normalized epithelial surface length...
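
The relationship between the 3-class and 2-class error rates reported above can be illustrated with a minimal sketch. It assumes the usual definition of error rate (misclassified cases divided by total cases) and uses only the Python standard library. The function names, the seed, and the toy labels are hypothetical and are not taken from the study; the random forest itself is not reproduced here. The sketch shows a random 72/76 split of 148 patient identifiers and how merging EIN and EMCA into one group removes any EIN-versus-EMCA confusions from the error count.

```python
import random


def split_patients(patient_ids, n_train, seed=0):
    """Randomly split patient IDs into training and validation cohorts.

    The seed is an illustrative assumption that makes the shuffle repeatable.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


def error_rate(true_labels, predicted_labels):
    """Fraction of cases whose predicted label differs from the true label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)


def collapse_to_two_class(label):
    """Map the 3 diagnostic classes onto NL vs. EIN+EMCA."""
    return "NL" if label == "NL" else "EIN+EMCA"


def two_class_error_rate(true_labels, predicted_labels):
    """Error rate after merging EIN and EMCA into a single group."""
    return error_rate(
        [collapse_to_two_class(t) for t in true_labels],
        [collapse_to_two_class(p) for p in predicted_labels],
    )


# 148 hypothetical patient IDs split into 72 training and 76 validation cases.
train_ids, validation_ids = split_patients(range(148), n_train=72)

# Toy labels (not study data): two cases are confused between EIN and EMCA.
toy_true = ["NL", "NL", "EIN", "EIN", "EMCA", "EMCA"]
toy_pred = ["NL", "NL", "EIN", "EMCA", "EMCA", "EIN"]

three_class_error = error_rate(toy_true, toy_pred)
two_class_error = two_class_error_rate(toy_true, toy_pred)
```

In the toy labels the only mistakes are between EIN and EMCA, so they count toward the 3-class error but disappear once the two classes are merged. This is one way a 2-class error rate can fall below the 3-class rate for the same predictions.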
Source: International Journal of Gynecological Pathology - Category: Pathology - Tags: PATHOLOGY OF THE CORPUS: ORIGINAL ARTICLES - Source Type: research