Combining Information Retrieval and Information Extraction for Medical Intelligence
Description
Global epidemic and medical surveillance is an essential function of Public Health
agencies, whose primary aim is to protect the public from major health threats. To
perform this function effectively one requires timely and accurate medical information
from a wide range of sources. In this work we present a system designed to monitor
disease epidemics by analyzing textual reports, mostly in the form of news, available on
the Web. The system rests on two major components: MedISys, based on Information
Retrieval (IR) technology, and PULS, an Information Extraction (IE) system.
The Medical Information System, MedISys, is an automatic tool that gathers reports
concerning Public Health from thousands of Internet sources world-wide in 32
languages, classifies them according to hundreds of categories, detects trends across
categories and languages, and notifies users. MedISys compiles quantitative summaries
of the latest reports on a variety of diseases, bioterrorism, toxins, bacteria, hemorrhagic
fevers, viruses, medicines, water contaminations, animal diseases, Public Health organisations,
etc. The system categorises all documents according to about 200 classes of
health threats, using pre-defined weighted Boolean queries, or alerts. It uses statistical
procedures to detect a sudden increase in the volume of articles in any of the classes.
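The two steps just described, categorisation by weighted Boolean alerts and detection of a sudden rise in article volume, can be illustrated with a minimal sketch. It is not MedISys's actual implementation: the alert definition, the weights, the threshold, and the z-score test are all assumptions chosen for illustration, and the function names `alert_fires` and `is_spike` are hypothetical.

```python
import re
from statistics import mean, stdev

# Hypothetical alert: weighted keywords, a veto term, and a firing threshold.
# None of these values come from MedISys; they are illustrative only.
ALERT = {
    "name": "cholera",
    "weights": {"cholera": 3.0, "outbreak": 1.0, "diarrhoea": 1.0},
    "exclude": {"novel"},
    "threshold": 3.0,
}


def alert_fires(text, alert):
    """Return True if the summed weights of matched terms reach the threshold."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    if tokens & alert["exclude"]:
        return False  # a veto term is present, so the alert cannot fire
    score = sum(w for term, w in alert["weights"].items() if term in tokens)
    return score >= alert["threshold"]


def is_spike(history, today, z_threshold=3.0):
    """Flag today's article count if it lies far above the historical mean.

    Assumption: a simple z-score against the sample standard deviation
    stands in for whatever statistical procedure the real system uses.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today > mu
    return (today - mu) / sigma > z_threshold
```

For example, `alert_fires("Cholera outbreak reported in the region", ALERT)` is true, and a daily count of 20 against a history hovering around 5 is flagged by `is_spike`.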
MedISys is part of the Europe Media Monitor (EMM) product family [2], developed
at the EC’s Joint Research Centre (JRC), which also includes NewsBrief, a live news
aggregation system, and NewsExplorer, a news summary and analysis system [1].
MedISys has already proved to be a useful and effective tool, which attracts
thousands of users daily. IE technology is a natural direction for further enhancing the
functionality that MedISys offers. One reason for this is that IE is able to deliver information
about specific incidents of the diseases, whereas IR returns entire matched
documents (with an indication of which alerts fired). Another reason is that IE could boost
precision, since keyword-based queries may trigger on documents which are off-topic
but happen to mention the alert keywords in unrelated contexts, while pattern matching in IE
ensures that the keywords appear in relevant contexts only.
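
The precision argument above can be made concrete with a small sketch contrasting a keyword match with a contextual pattern. It is not the PULS engine: the single regular expression, the example sentences, and the function names `keyword_match` and `extract_event` are assumptions made for illustration, and a real IE system relies on far richer linguistic patterns and knowledge bases.

```python
import re


def keyword_match(text, keyword="cholera"):
    """IR-style test: fires whenever the keyword appears anywhere in the text."""
    return keyword in text.lower()


# Hypothetical IE-style pattern: the disease name must appear in an
# incident context of the form "<count> cases|deaths of <disease> in <Location>".
EVENT_PATTERN = re.compile(
    r"(?P<count>\d+)\s+(?:new\s+)?(?:cases|deaths)\s+of\s+"
    r"(?P<disease>[Cc]holera)\s+in\s+(?P<location>[A-Z][a-z]+)"
)


def extract_event(text):
    """Return the incident attributes as a dict, or None if no incident is described."""
    match = EVENT_PATTERN.search(text)
    return match.groupdict() if match else None


on_topic = "Health officials confirmed 12 cases of cholera in Harare this week."
off_topic = "The novel Love in the Time of Cholera was reissued."

# Both sentences trigger the keyword query, but only the first yields an event.
print(keyword_match(on_topic), extract_event(on_topic))
print(keyword_match(off_topic), extract_event(off_topic))
```

The keyword test accepts both sentences, whereas the pattern returns structured attributes (count, disease, location) for the first and nothing for the second, which is the sense in which pattern matching can raise precision.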
| Slides | |
| 0:00 | Combining Information Retrieval and Information Extraction for Medical Intelligence |
| 2:13 | Outline |
| 3:12 | Users and motivation |
| 5:36 | Information vs. Intelligence |
| 7:52 | Combination of Technologies |
| 8:21 | Outline - MedISys: Information Retrieval |
| 8:41 | Medical Information System - MedISys |
| 10:29 | Public vs. restricted MedISys |
| 12:40 | MedISys - Objective |
| 13:37 | Current Subscribers to MedISys Alerts and Reports include |
| 14:01 | MedISys categories and category types |
| 15:07 | Filtering of Public Health-related news |
| 16:08 | Filtering news by language and sources |
| 16:52 | Aggregation of the multilingual ‘alert’ statistics (1) |
| 17:42 | Aggregation of the multilingual ‘alert’ statistics (2) |
| 18:25 | Aggregation of the multilingual ‘alert’ statistics (3) |
| 18:42 | Alerting functions |
| 19:57 | Alerting functions (2) |
| 20:22 | Medical events in MedISys |
| 22:02 | Outline - PULS: Information Extraction |
| 22:20 | MedISys - Beyond IR |
| 23:58 | Event Extraction review |
| 24:11 | Core IE Engine and Knowledge bases (KBs) |
| 27:10 | Example: Event extraction |
| 30:36 | IE and Semantics: Reference resolution |
| 31:41 | IE and Semantics: Elided attributes |
| 32:57 | PULS |
| 34:26 | PULS |
| 42:29 | Outline - MedISys/PULS Integration |
| 42:35 | MedISys/PULS Integration |
| 44:08 | MedISys + PULS |
| 45:15 | Outline - Information Aggregation |
| 45:28 | Toward Cross-Document Aggregation |
| 47:51 | Distribution of attribute values |
| 51:15 | Confidence |
| 52:52 | Utilizing Confidence |
| 53:50 | Aggregation into Outbreaks |
| 57:02 | Outline - Performance |
| 57:59 | Performance: some preliminary numbers |
| 59:38 | Evaluation of Confidence |
| 61:22 | Evaluation of Outbreak Aggregation |
| 62:01 | Evaluation of Confidence |
| 62:03 | Evaluation of Outbreak Aggregation |
| 62:57 | Outline - Current work |
| 63:07 | Improvements |
| 67:46 | - Questions |
| 68:12 | - Questions |



