Neural network model predicts death among patients on liver transplant waitlist
A machine learning algorithm using neural networks (NN) appears to be more accurate than the older Model for End-Stage Liver Disease (MELD) score in predicting mortality among patients on the liver transplant (LT) waitlist, according to a study.
“The MELD-Na score, which consists of four variables—serum total bilirubin, international normalized ratio, creatinine, and sodium level—was designed to predict the severity of liver disease… However, the current MELD-Na score-based allocation model has lots of limitations,” said co-author Shunji Nagai, a transplant surgeon at Henry Ford Hospital and Henry Ford Cancer Institute in Detroit, who presented the study at The Liver Meeting Digital Experience by the American Association for the Study of Liver Diseases (AASLD 2020).
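The four variables Nagai lists combine into a single score. The study itself does not give the formula; the sketch below follows the widely used UNOS-style MELD-Na calculation, and the exact clamping and rounding conventions here are assumptions for illustration:

```python
import math

def meld_na(bilirubin, inr, creatinine, sodium):
    """Illustrative UNOS-style MELD-Na calculation (not taken from the
    study; clamping and rounding conventions are assumptions)."""
    # Lab values below 1.0 are raised to 1.0; creatinine is capped at 4.0
    bilirubin = max(bilirubin, 1.0)
    inr = max(inr, 1.0)
    creatinine = min(max(creatinine, 1.0), 4.0)

    # Base MELD score
    meld = round(10 * (0.957 * math.log(creatinine)
                       + 0.378 * math.log(bilirubin)
                       + 1.120 * math.log(inr)
                       + 0.643))

    # Sodium adjustment applies only above MELD 11; Na is clamped to 125-137
    if meld > 11:
        na = min(max(sodium, 125), 137)
        meld = round(meld + 1.32 * (137 - na) - 0.033 * meld * (137 - na))
    return meld
```

With all labs normal the score bottoms out at 6, which is why, as Nagai notes, severely ill cirrhosis patients can still carry deceptively low scores when their complications are not captured by these four variables.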
“We have seen many liver cirrhosis patients whose MELD scores were low but suffered from life-threatening complications due to liver cirrhosis and actually could not have a chance of a liver transplant,” he added.
Nagai and his team used data from the United Network for Organ Sharing’s Organ Procurement and Transplantation Network (OPTN/UNOS) registry, which included records for 194,299 patients listed for LT between 27 February 2002 and 31 December 2018. They used data subsets to generate four separate NN models, each created to predict mortality at a different timeframe: 30, 90, 180, and 365 days. Patients were excluded if they received LT before the outcome timeframe, had liver cancer, received MELD exceptions, or were listed for combined organ transplants other than liver-kidney.
The researchers then combined the Liver Data and the Liver Wait List History files in the OPTN/UNOS registry to select a total of 44 variables, including recipient characteristics, trends in liver and kidney function during waiting time, UNOS regions, and registration year. The NN models did not include age, ethnicity, or gender, to avoid assigning waitlist priority based on these factors.
Using random sampling, Nagai and colleagues split the data for each model into training, validation, and test datasets in a 60:20:20 ratio. They evaluated the performance of the models using the area under the receiver operating characteristic curve (AUC-ROC) and the area under the precision-recall curve (PR-AUC).
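The split-and-evaluate workflow can be sketched in plain Python. Only the 60:20:20 proportions come from the study; the function names, toy data, and from-scratch metric implementations below are illustrative, not the authors' code:

```python
import random

def split_60_20_20(records, seed=0):
    """Randomly partition records into train/validation/test (60:20:20),
    mirroring the split described in the study."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_val = round(0.6 * n), round(0.2 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive (death) outscores
    a random negative (Mann-Whitney formulation); ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pr_auc(labels, scores):
    """PR-AUC via average precision: the mean of precision evaluated at
    the rank of each positive case."""
    ranked = sorted(zip(labels, scores), key=lambda t: -t[1])
    tp, precisions = 0, []
    for rank, (y, _) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            precisions.append(tp / rank)
    return sum(precisions) / len(precisions)
```

PR-AUC is the more informative of the two metrics here because waitlist death is a rare outcome, and precision-recall curves are more sensitive to performance on the minority class.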
The AUC-ROC of the NN prediction models was 0.949 for 30-day, 0.928 for 90-day, 0.915 for 180-day, and 0.899 for 365-day mortality, while the corresponding PR-AUC values were 0.689, 0.730, 0.769, and 0.823. [AASLD 2020, abstract 3]
The 90-day mortality NN model outperformed the MELD score for both AUC-ROC and PR-AUC, as well as for recall (sensitivity), negative predictive value (NPV), and F1 score. Specifically, the 90-day mortality model predicted more waitlist deaths, with a higher sensitivity of 0.833 compared with 0.308 for the MELD score (p<0.001). Conversely, the MELD score performed better on specificity and precision.
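The metrics being contrasted here all derive from the same four confusion-matrix counts. The sketch below shows those definitions; the counts in the usage example are illustrative, not the study's data:

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive the metrics compared in the study from confusion-matrix
    counts (tp/fp/tn/fn), where a 'positive' is a waitlist death.
    Sensitivity and NPV favoured the NN model in the study; specificity
    and precision favoured the MELD score."""
    sensitivity = tp / (tp + fn)   # recall: share of deaths correctly flagged
    specificity = tn / (tn + fp)   # share of survivors correctly cleared
    precision   = tp / (tp + fp)   # PPV: flagged cases that truly died
    npv         = tn / (tn + fn)   # cleared cases that truly survived
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "npv": npv, "f1": f1}
```

The trade-off reported in the study follows directly from these definitions: a model that flags more patients as high risk catches more deaths (higher sensitivity) at the cost of more false alarms (lower specificity and precision).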
Of note, the researchers compared the performance metrics by breaking the test dataset into multiple subsets based on age, gender, ethnicity, region, diagnosis group, and listing year. The 90-day mortality model performed significantly better than the MELD score across all subsets of data for predicting waitlist mortality.
“In the future, if these advanced technologies are introduced into the liver allocation system, liver waitlist ranking would better reflect patients’ medical urgency and this should lead to lower waitlist mortality,” Nagai said.