NATO Advanced Study Institute on Mining Massive Data Sets for Security

Link Analysis and Text Mining: Current State of the Art and Applications for Counter Terrorism

author: Ronen Feldman, Bar Ilan University

Description

The information age has made it easy to store large amounts of data. The proliferation of documents available on the Web, on corporate intranets, on news wires, and elsewhere is overwhelming. However, while the amount of data available to us is constantly increasing, our ability to absorb and process this information remains constant. Search engines only exacerbate the problem by making more and more documents available in a matter of a few keystrokes. Link Analysis is a new and exciting research area that tries to solve the information overload problem by using techniques from data mining, machine learning, information extraction, text categorization, visualization, and knowledge management.

Slides
0:00 Text Mining and Link Analysis
2:06 Background
2:17 The Information Landscape
3:18 The Information Landscape
5:22 Text Mining
5:54 Let Text Mining Do the Legwork for You
8:29 What Is Unique in Text Mining?
13:47 Document Types
16:46 Text Representations
21:42 How it Works
25:14 Components of IE System
26:29 Intelligent Auto-Tagging
30:41 Business Tagging Example
31:08 Business Tagging Example
31:35 Leveraging Content Investment
49:59 Link Analysis in Textual Networks
50:11 A Complete Link Analysis System
50:33 Types of Link Analysis Questions:
51:06 Sample LD Queries In the Terror Domain
51:14 9/11 example
51:30 Running Example
52:23 Kamada and Kawai’s (KK) Method
52:26 Running Example
52:34 Kamada and Kawai’s (KK) Method
55:08 Finding the shortest Path (from Atta)
55:55 A better Visualization
56:05 Applications of Centrality
56:13 Summary Diagram
59:42 Partitioning of networks
59:43 Cores of the Hijackers Graph
61:07 Structural Equivalence in the Hijackers
61:49 EDis between each pair of terrorists
62:13 Clustering based on structural equivalence
62:42 Block Modeling
62:43 What is Block Modeling
62:59 Visualization of the predicates
63:49 Block Model of 4 blocks
65:48 The related graph
65:56 Shrinking of the network
65:58 Block Model of 6 blocks
65:59 The related graph
66:05 Information Extraction - Theory and Practice
66:21 What is Information Extraction?
66:45 Approaches for Building IE Systems
69:05 Approaches for Building IE Systems
72:23 Mining Discussion Boards
75:07 Connections between Running Shoes
77:10 The Most Central Shoe
78:02 Connecting Cars and Terms
78:32 Clustering Cars
78:41 Clustering Results
79:17 MDS of Brands Lift
80:38 Dendrogram on Brands Lift
80:45 Company Lifts - 6-cluster solution
80:50 Digging in Deeper – Main Stream
80:56 MDS of Main Stream Japanese Car Models -Lift
81:06 Digging in Deeper – Luxury Models
81:08 MDS of Luxury Cars Models (Lifts)
81:16 Self-Supervised Relation Learning from the Web
81:53 KnowItAll (KIA)
83:08 KnowItAll’s Relation Learning
84:48 SRES
86:53 SRES Architecture
88:13 Seeds for Acquisition
89:05 Major Steps in Pattern Learning
92:32 Positive Instances
92:49 Negative Instances II
92:51 Examples
92:58 Additional Instances
92:59 Pattern Generation
93:26 The Pattern Language
94:12 The Generalize Function
94:28 Example
95:04 Generating the Pattern
95:26 Post-processing, filtering, and scoring of patterns
95:35 Content Based Filtering
96:12 Scoring the Patterns
96:25 Sample Patterns - Inventor
96:38 Sample Patterns – CEO (Company/X,Person/Y)
96:42 Shallow Parser mode
97:30 Building a Classification Model
97:37 Building a Classification Model
97:39 Building a Classification Model
97:45 Sample Output
98:26 Cross-Classification Experiment
100:57 Building a Classification Model
101:40 Results!
102:28 More Results
102:30 Inventor Results
102:33 When is SRES better than KIA?
102:50 The Redundancy of the Various Datasets
102:52 True Recall Estimates
103:57 Under Estimation of the recall
104:00 True Recall Estimates
104:53 Conclusions
