Improving mortality prediction for intensive care unit patients using text mining techniques

author: Primož Kocbek, Fakulteta za zdravstvene vede, Univerza v Mariboru
published: Dec. 8, 2017,   recorded: October 2017,   views: 700



Numerous severity assessment scores for estimating in-hospital mortality in the intensive care unit (ICU) have been developed over the last 40 years. In this study, we predicted 1-month mortality in chronic kidney disease (CKD) patients using the open Medical Information Mart for Intensive Care III (MIMIC-III) database. Additionally, we examined how predictive performance and interpretability change when moving from the baseline model used in ICUs to a more complex model that adds features extracted from textual nursing notes: simple features such as unigrams and bigrams, as well as more advanced features. For the latter, the MetaMap extraction tool was used to extract medical concepts based on the Unified Medical Language System (UMLS) terminology. As the baseline model, we used a logistic regression classifier built using the Simplified Acute Physiology Score II (SAPS II), age and gender. The baseline model was then compared to a regularized logistic regression classifier built using the simple and the more complex additional features. The Area Under the ROC Curve (AUC) improved from 0.761 for the baseline to 0.782 when unigram and bigram frequencies were used to build the model. In a similar scenario, where unigram and bigram frequencies were replaced with Term Frequency–Inverse Document Frequency (TF-IDF) feature values, AUC increased further to 0.786. This work shows an opportunity to extract new knowledge in the form of unigrams, bigrams or concepts from textual notes, accompanied by regression coefficient values that can be interpreted as relations between the features and the outcome. The combination of both can provide added value in decision support systems in ICU departments, where data is collected in electronic medical records (EMRs) in real time.
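To make the feature types in the abstract concrete, the following is a minimal Python sketch of unigram and bigram extraction, a basic TF-IDF weighting, and a pairwise AUC computation. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`ngrams`, `tfidf`, `auc`), the whitespace tokenization, the idf formula ln(N / df), and the toy notes are all invented here, and the abstract does not specify which TF-IDF variant or tokenizer was used.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return lowercase unigrams and bigrams from whitespace-split text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams.extend(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return grams


def tfidf(notes):
    """Map each note to {ngram: tf * idf}, assuming idf = ln(N / df)."""
    counts = [Counter(ngrams(note)) for note in notes]
    # Document frequency: number of notes in which each n-gram appears.
    df = Counter(term for c in counts for term in c)
    n_docs = len(notes)
    return [
        {term: tf * math.log(n_docs / df[term]) for term, tf in c.items()}
        for c in counts
    ]


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Invented toy notes for illustration only; not MIMIC-III data.
notes = [
    "patient stable overnight",
    "patient unresponsive overnight",
    "family meeting held",
]
weights = tfidf(notes)
```

In a setup like the one described, such n-gram weights would be appended to the baseline features (SAPS II, age, gender) before fitting a regularized logistic regression, and `auc` would be evaluated on held-out predicted probabilities. Under this weighting, an n-gram that appears in every note receives a weight of zero, which is one reason TF-IDF can behave differently from raw frequencies.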

Download slides: sikdd2017_kocbek_mortality_prediction_01.pdf (805.9 KB)
