NATO Advanced Study Institute on Mining Massive Data Sets for Security

Machine Learning for Intrusion Detection

author: Pavel Laskov, Fraunhofer FIRST

Description

Intrusion detection is one of the core technologies of computer security. The goal of intrusion detection is the identification of malicious activity in a stream of monitored data, which can be network traffic, operating system events or log entries. The majority of current intrusion detection systems (IDS) follow a signature-based approach in which, similar to virus scanners, events are detected that match specific predefined patterns known as "signatures". The main limitation of signature-based IDS is their failure to identify novel attacks, and sometimes even minor variations of known patterns. In addition, the need to maintain signature databases incurs a significant administrative overhead. Machine learning offers a major opportunity to improve the quality and to facilitate the administration of IDS. Supervised learning can be used for automatic generation of detectors without the need to manually define and update signatures. Anomaly detection and other unsupervised learning techniques can detect new kinds of attacks, provided they exhibit unusual character in some feature space. In our contribution, kernel- and distance-based learning algorithms for network intrusion detection will be presented. The two essential parts of our approach are online learning algorithms and feature extraction. The major requirements on the algorithmic part are linear run-time, online learning and data type abstraction. Simple but effective anomaly detection algorithms will be presented that satisfy these requirements (1). Feature extraction algorithms can be reduced to the computation of similarity measures between sequential objects. In order to access features from the application-layer network protocols, in which most modern remote exploits operate, similarity measures are computed directly over the byte streams of TCP connections. Algorithms and data structures will be presented that allow efficient computation of similarity measures in linear time with very low run-time constants and memory consumption (2).
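
To make the idea of anomaly detection over byte streams concrete, here is a minimal sketch in Python. It is not the implementation presented in the lecture. It assumes byte 3-grams as features, a plain hash map (`Counter`) in place of the trie data structure covered in the slides, and the Euclidean distance to a running mean as the anomaly score. The names `ngram_freqs` and `OnlineCentroid` are hypothetical and chosen only for this illustration.

```python
from collections import Counter
import math


def ngram_freqs(data: bytes, n: int = 3) -> dict:
    """Relative frequencies of all contiguous byte n-grams (one pass over data)."""
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values())
    # Payloads shorter than n bytes yield an empty feature vector.
    return {g: c / total for g, c in counts.items()} if total else {}


class OnlineCentroid:
    """Assumed detector: keeps a running mean of sparse feature vectors.

    The anomaly score of a payload is its Euclidean distance to that mean.
    """

    def __init__(self) -> None:
        self.mean: dict = {}
        self.count = 0

    def score(self, x: dict) -> float:
        keys = set(self.mean) | set(x)
        return math.sqrt(
            sum((x.get(k, 0.0) - self.mean.get(k, 0.0)) ** 2 for k in keys)
        )

    def update(self, x: dict) -> None:
        # Incremental mean: m <- m + (x - m) / t, applied per coordinate.
        self.count += 1
        for k in set(self.mean) | set(x):
            m = self.mean.get(k, 0.0)
            self.mean[k] = m + (x.get(k, 0.0) - m) / self.count
```

In use, one would call `update` on the feature vector of each connection payload considered normal, then flag payloads whose `score` exceeds a threshold chosen on validation data. The hash map keeps the sketch short; the abstract's requirement of very low run-time constants and memory consumption is what motivates more specialised structures such as tries.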

Slides
0:00 Machine learning for intrusion detection
0:50 A short programming quiz
9:04 Exploit auctions: an example
10:37 What can we learn from these examples? (1)
10:46 What can we learn from these examples? (2)
11:04 What can we learn from these examples? (3)
11:26 What can we learn from these examples? (4)
11:37 What can we learn from these examples? (5)
12:27 Intrusion detection systems
13:16 Signature-based IDS (1)
13:25 Signature-based IDS (2)
13:44 Signature-based IDS (3)
13:58 Signature-based IDS (4)
15:03 Signature-based IDS (5)
15:05 Signature-based IDS (6)
15:09 Problems of signature-based IDS (1)
15:11 Problems of signature-based IDS (2)
15:21 Problems of signature-based IDS (3)
15:41 Problems of signature-based IDS (4)
16:58 Problems of signature-based IDS (5)
17:44 Problems of signature-based IDS (6)
18:06 Problems of signature-based IDS (7)
18:59 Problems of signature-based IDS (8)
19:02 Problems of signature-based IDS (9)
19:05 Problems of signature-based IDS (10)
19:39 Why machine learning? (1)
19:46 Why machine learning? (2)
20:06 Why machine learning? (3)
20:46 Why machine learning? (4)
21:17 Why not machine learning? (1)
21:20 Why not machine learning? (2)
21:38 Why not machine learning? (3)
22:27 Why not machine learning? (4)
22:46 Why not machine learning? (5)
22:54 Conceptual structure of a learning-based IDS (1)
22:55 Conceptual structure of a learning-based IDS (2)
23:22 Conceptual structure of a learning-based IDS (3)
23:27 Conceptual structure of a learning-based IDS (4)
23:29 Conceptual structure of a learning-based IDS (5)
23:58 Feature extraction from network packets
28:10 Embedding of sequences
30:18 Similarity measures for sequences
31:27 Abstract framework for similarity measures
31:48 Similarity measures for sequences
32:35 Computation of similarity measures (1)
33:18 Computation of similarity measures (2)
33:34 Trie data structure
35:02 Trie matching algorithm (1)
35:11 Trie matching algorithm (2)
35:15 Trie matching algorithm (3)
35:17 Trie matching algorithm (4)
35:27 Trie matching algorithm (5)
35:28 Trie matching algorithm (6)
35:29 Trie matching algorithm (7)
35:31 Trie matching algorithm (8)
35:50 Trie matching algorithm (9)
35:55 Trie matching algorithm (10)
36:25 Anomaly detection algorithms
38:37 Incremental centering in feature space
38:44 Incremental linkage clustering
38:46 Incremental Zeta algorithm
38:47 Evaluation
40:50 Impact of various algorithms
42:04 ROC curves: PESIM 2005 dataset
43:33 ROC curves: DARPA 1999 dataset
44:33 Conclusions and outlook (1)
45:15 Conclusions and outlook (2)
46:08 NIPS 2007 Workshop
46:45 Thank you!
47:02 Questions
