263-5200-00L Data Mining: Learning from Large Data Sets
Semester | Autumn Semester 2015 |
Lecturers | A. Krause |
Periodicity | yearly recurring course |
Language of instruction | English |
Abstract | Many scientific and commercial applications require insights from massive, high-dimensional data sets. This course introduces principled, state-of-the-art techniques from statistics, algorithms, and discrete and convex optimization for learning from such large data sets. The course covers both theoretical foundations and practical applications. |
Objective | Many scientific and commercial applications require us to obtain insights from massive, high-dimensional data sets. In this graduate-level course, we will study principled, state-of-the-art techniques from statistics, algorithms, and discrete and convex optimization for learning from such large data sets. The course will cover both theoretical foundations and practical applications. |
Content | Topics covered: - Dealing with large data (data centers; Map-Reduce/Hadoop; Amazon Mechanical Turk) - Fast nearest neighbor methods (shingling, locality-sensitive hashing) - Online learning (online optimization and regret minimization, online convex programming, applications to large-scale Support Vector Machines) - Multi-armed bandits (exploration-exploitation tradeoffs, applications to online advertising and relevance feedback) - Active learning (uncertainty sampling, pool-based methods, label complexity) - Dimension reduction (random projections, nonlinear methods) - Data streams (sketches, coresets, applications to online clustering) - Recommender systems |
Prerequisites / Notice | Prerequisites: Solid basic knowledge of statistics, algorithms, and programming. Background in machine learning is helpful but not required. |