The 7th International Symposium on Intelligent Data Analysis

Novelty Detection in Patient Histories: Experiments with Measures Based on Text Compression

author: Ole Edsberg, Norwegian University of Science and Technology

Description

Reviewing a patient history can be very time-consuming, partly because of the large number of consultation notes. Often, most of the notes contain little new information. Tools that facilitate this and other tasks could be built if novel notes could be detected automatically. We propose using measures based on text compression, as an approximation of Kolmogorov complexity, to classify note novelty. We define four compression-based measures and eight other measures, and evaluate their ability to predict the presence of previously unseen diagnosis codes associated with the notes in patient histories from general practice. The best measures show promising classification ability, which, while not enough to serve alone as a clinical tool, might be useful as part of a system that takes more data types into account. The best individual measure was the normalized asymmetric compression distance between the concatenated prior notes and the current note.
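As a rough illustration of the compression-based idea, the sketch below computes the standard normalized compression distance (NCD) and one possible asymmetric variant, using zlib as the compressor. The abstract does not give the formula for the normalized asymmetric compression distance, so the `nacd` form here is an assumption, not the talk's confirmed definition. The compressor choice, the function names, and the sample notes are likewise invented for illustration.

```python
import zlib


def csize(data: bytes) -> int:
    """Compressed length in bytes, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard (symmetric) normalized compression distance."""
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def nacd(prior: bytes, current: bytes) -> float:
    """Assumed asymmetric form: extra bytes needed to encode `current`
    after `prior`, relative to encoding `current` on its own."""
    return (csize(prior + current) - csize(prior)) / csize(current)


# Hypothetical notes, made up for this example.
prior = (b"Patient reports persistent cough and mild fever. "
         b"Prescribed rest and fluids. "
         b"Follow-up in one week if symptoms continue. ")
repeat = b"Patient reports persistent cough and mild fever. Prescribed rest and fluids."
novel = (b"New complaint: sharp lower back pain radiating to left leg; "
         b"suspected sciatica, referred for MRI.")

score_repeat = nacd(prior, repeat)
score_novel = nacd(prior, novel)
```

Under this assumed form, a note that mostly repeats the concatenated prior notes compresses well against them and scores low, while a note with new content scores closer to 1. A threshold on the score could then serve as a simple novelty classifier.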

Slides
0:00 Novelty Detection in Patient Histories: Experiments with Measures Based on Text Compression
0:32 The Problem pt 1
1:41 The Problem pt 2
2:22 Evaluation Criterion
3:57 Example pt 1
4:09 Example pt 2
4:12 Methods
5:31 Preliminaries: Kolmogorov Complexity
6:51 Preliminaries: Information Distance
7:47 NCD - The Normalized Compression Distance
8:26 Our Proposal: Normalized Asymmetric Compression Distance (NACD)
10:33 How to Use these Distances to Rate Novelty
11:06 TF-IDF Vector Space Measure
11:27 Trivial Measures
11:36 Combined Measure
11:50 Results: Areas under the ROC Curves
13:21 Results: The ROC Curves
14:11 Conclusions
14:47 Future Work
15:46 Thanks for Listening!
