### Slides - Alex Berg

Machine Learning with Discriminative Methods
Lecture 02 – PAC Learning and tail bounds intro
CS 790-134 Spring 2015
Alex Berg
Today’s lecture
• PAC Learning
• Tail bounds…
Rectangle learning
-
Hypothesis H
-
-
+
+
+
+
-
-
Hypothesis is any axis aligned rectangle. Inside rectangle is positive.
Rectangle learning – Realizable case
-
Actual boundary is also an
axis-aligned rectangle,
“The Realizable Case”
(no approximation error)
Hypothesis H
-
-
+
+
+
+
-
-
Rectangle learning – Realizable case
-
Actual boundary is also an
axis-aligned rectangle,
“The Realizable Case”
(no approximation error)
Hypothesis H
-
-
+
+
+
+
-
-
-
A mistake for the
hypothesis H!
Measure ERROR by the probability of making a mistake.
Rectangle learning – a strategy for a learning
algorithm…
-
-
Hypothesis H
(Output of
learning algorithm
so far…)
-
+
+
+
+
-
-
Make the smallest rectangle consistent with all the data so far.
Rectangle learning – making a mistake
Current
hypothesis
makes a mistake
on a new data
item…
Hypothesis H
(Output of
learning algorithm
so far…)
-
+
+
+
+
+
-
-
Make the smallest rectangle consistent with all the data so far.
Rectangle learning – making a mistake
Current
hypothesis
makes a mistake
on a new data
item…
+
+
Hypothesis H
(Output of
learning algorithm
so far…)
-
+
+
Use probability
+
of such a
mistake (this is
our error
measure) to find
a bound for how
likely it was we
a training
example in this
region…
Make the smallest rectangle consistent with all the data so far.
Very subtle formulation…
R = Actual
decision boundary
R’ = Result of
algorithm so far (after
m sample)
From the Kearns and Vazirani Reading
From the Kearns and Vazirani Reading
PAC Learning
Flashback: Learning/fitting is a process…
Estimating the probability that a tossed coin comes up heads…
The i’th coin toss
Estimator based on n tosses
Estimate is within epsilon
Estimate is not within epsilon
Probability of being bad is inversely proportional to the number of samples…
(the underlying computation is an example of a tail bound)
From Raginsky notes
Markov’s Inequality
From Raginksy’s notes
Chebyshev’s Inequality
From Raginksy’s notes
Not quite good enough…
From Raginksy’s notes
For next class