
TECHNICAL UNIVERSITY OF MUNICH
Department of Mathematics
Prof. Dr. M. M. Wolf, M. C. Caro

Mathematical Foundations of Machine Learning

Summer 2020, Sheet 1 (24.04.2020)

http://www-m5.ma.tum.de/Allgemeines/MA4801 2020S

The following exercises will be discussed in the week from April 27th to May 1st. Problems marked with an asterisk (∗) are optional, i.e. they are not crucial for following the lecture and the other exercises, but provide additional practice.

Homework Exercises

H1.0. (Reading) Read the introduction as well as sections 1.1 and 1.2 in the lecture notes.

H1.1. (Basic tail bounds) Let h : R → [0, ∞) be (Borel-)measurable and let X be a real-valued random variable (on a probability space (Ω, F, P)).

(a) Show that inf{h(y) | y ∈ A} · P[X ∈ A] ≤ E[h(X)] holds for every Borel set A ⊆ R.

(b) Derive from (a) Markov’s inequality, i.e. show that

    P[|X| ≥ a] ≤ E[|X|] / a    for all a > 0.

(c) Derive from (a) Chebyshev’s inequality, i.e. show that

    P[|X − E[X]| ≥ a] ≤ E[(X − E[X])²] / a²    for all a > 0.

(d) Let (Xi)i∈N be a sequence of independent identically distributed (“i.i.d.”) real-valued random variables. Assume that E[X1] and Var(X1) are finite. Show that for every ε > 0

    P[ |(1/n) ∑_{i=1}^n Xi − E[X1]| > ε ] → 0    as n → ∞.

This is called the weak law of large numbers.
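
A quick numerical illustration of (d), assuming Python with NumPy is available; the Bernoulli(0.3) distribution, the threshold ε = 0.05, and the sample sizes below are arbitrary illustrative choices, not part of the exercise:

```python
import numpy as np

rng = np.random.default_rng(0)
p, eps, trials = 0.3, 0.05, 10_000   # illustrative choices

for n in [10, 100, 1_000, 10_000]:
    # empirical means of n i.i.d. Bernoulli(p) samples, repeated `trials` times
    means = rng.binomial(n, p, size=trials) / n
    deviation_prob = np.mean(np.abs(means - p) > eps)
    # Chebyshev applied to the empirical mean (cf. (c) and (d)): Var(X1) / (n eps^2)
    chebyshev_bound = p * (1 - p) / (n * eps**2)
    print(f"n={n:6d}  P(|mean - p| > eps) ≈ {deviation_prob:.4f}  "
          f"Chebyshev bound: {min(chebyshev_bound, 1):.4f}")
```

The deviation probability goes to 0 as n grows, as the weak law of large numbers predicts, and stays below the (often loose) Chebyshev bound.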

H1.2. (Chernoff’s inequality) Let (Xi)i∈N be a sequence of independent Bernoulli random variables with parameters pi, i.e. P[Xi = 1] = pi = 1 − P[Xi = 0]. For N ∈ N, write SN = ∑_{i=1}^N Xi and µN = E[SN].

(a) Show that for every N ∈ N and for t > µN

    P[SN ≥ t] ≤ e^{−µN} (e µN / t)^t .

This is called Chernoff’s inequality.
Hint: Start from P[SN ≥ t] = P[e^{s SN} ≥ e^{s t}] for s > 0 and use Markov’s inequality.

(b) Assume now that the Xi are i.i.d. Bernoullis with parameter p. Compare (analytically or numerically) the tail bounds obtained from (a) with those obtained from Hoeffding’s inequality (see Lemma 1.1 in the lecture notes). Can you see (qualitatively) for which choices of p Chernoff’s inequality is better suited than Hoeffding’s inequality?
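
For part (b), a minimal numerical sketch, assuming Python with NumPy; Hoeffding’s inequality is used here in the standard form P[SN − µN ≥ s] ≤ exp(−2s²/N) for a sum of N independent [0, 1]-valued variables (the exact statement of Lemma 1.1 in the lecture notes may differ), and the values of p, N and t are arbitrary illustrative choices:

```python
import numpy as np

def chernoff_bound(mu, t):
    # Part (a): P[S_N >= t] <= e^{-mu} (e * mu / t)^t, valid for t > mu
    return np.exp(-mu) * (np.e * mu / t) ** t

def hoeffding_bound(mu, t, N):
    # Standard Hoeffding for a sum of N independent [0, 1]-valued variables:
    # P[S_N - mu >= s] <= exp(-2 s^2 / N)
    return np.exp(-2 * (t - mu) ** 2 / N)

N = 1_000
for p in [0.5, 0.1, 0.01]:      # illustrative parameters
    mu = N * p
    t = 2 * mu                  # a tail event at twice the mean
    print(f"p={p:5.2f}  t={t:7.1f}  "
          f"Chernoff: {chernoff_bound(mu, t):.3e}  "
          f"Hoeffding: {hoeffding_bound(mu, t, N):.3e}")
```

Comparing the two columns as p varies hints at the qualitative answer asked for in (b).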

H1.3. (Convergence of the empirical mean to the true mean) Let (Xi)i∈N be a sequence of independent identically distributed (“i.i.d.”) real-valued random variables.

(a) Assume that a < b are s.t. P[X1 ∈ [a, b]] = 1. Show that for every ε > 0

    P[ (1/n) ∑_{i=1}^n Xi − E[X1] ≥ ε ] ≤ exp( −2nε² / (b − a)² ).

Hint 1: Start from

    P[ (1/n) ∑_{i=1}^n Xi − E[X1] ≥ ε ] = P[ exp( s ( (1/n) ∑_{i=1}^n Xi − E[X1] ) ) ≥ e^{sε} ]

for s > 0 and use Markov’s inequality.

Hint 2: Use convexity of the exponential function to show: If Y is a random variable with E[Y] = 0 and P[Y ∈ [a, b]] = 1, then

    E[e^{sY}] ≤ e^{s²(b−a)²/8}

holds for any s > 0.

(b) Assume that a < b are s.t. P[X1 ∈ [a, b]] = 1. Show that for every ε > 0

    P[ |(1/n) ∑_{i=1}^n Xi − E[X1]| ≥ ε ] ≤ 2 exp( −2nε² / (b − a)² ).

(c) Convince yourself that your proof in (a) can also be used (after slight modifications) to prove Hoeffding’s inequality (Lemma 1.1 in the lecture notes).
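
A quick empirical check of the two-sided bound in (b), assuming Python with NumPy; the uniform distribution on [0, 1] and the values of ε and n are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 0.0, 1.0                 # X1 uniform on [a, b]; any bounded distribution works
eps, trials = 0.05, 10_000
true_mean = (a + b) / 2         # E[X1] for the uniform distribution

for n in [50, 200, 1_000]:
    samples = rng.uniform(a, b, size=(trials, n))
    deviations = np.abs(samples.mean(axis=1) - true_mean)
    empirical = np.mean(deviations >= eps)
    bound = 2 * np.exp(-2 * n * eps**2 / (b - a) ** 2)   # two-sided bound from (b)
    print(f"n={n:5d}  empirical: {empirical:.4f}  bound: {min(bound, 1):.4f}")
```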

H1.4. (Bayes’ Theorem and Conditional Probabilities) Suppose that a fraction q ∈ (0, 1) of the whole population suffers from the disease D. Now suppose that a medical research team develops a test for the disease D that comes with the following two promises:

i) If the patient suffers from the disease D, the test will be positive with probability 0.95.
ii) If the patient does not suffer from the disease D, the test will be negative with probability 0.90.

(a) Use Bayes’ Theorem to compute the probability of a patient actually suffering from the disease D, assuming that the test (performed once) gave a positive result. For which values of q is this probability ≥ 0.90?

(b) Explain the result of (a) in your own words (e.g. by thinking about what happens to the probability in (a) as q → 0 or q → 1).

(c) Determine the minimal n ∈ N (depending on q) s.t. the probability of a patient actually suffering from the disease D, assuming that the test performed n times always gave a positive result, is ≥ 0.90.
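
For parts (a) and (c), a minimal computational sketch, assuming Python; the n repeated tests in (c) are treated as conditionally independent given the disease status (the natural reading of the exercise), and the prevalence values q below are arbitrary illustrative choices:

```python
sens, spec = 0.95, 0.90   # promises i) and ii): sensitivity and specificity

def posterior(q, n=1):
    # P(D | n positive results) via Bayes' Theorem, assuming the n tests are
    # conditionally independent given the disease status
    num = (sens ** n) * q
    return num / (num + ((1 - spec) ** n) * (1 - q))

def minimal_n(q, target=0.90):
    # Part (c): smallest n such that the posterior reaches the target
    n = 1
    while posterior(q, n) < target:
        n += 1
    return n

for q in [0.001, 0.01, 0.1, 0.5]:    # illustrative prevalence values
    print(f"q={q:6.3f}  P(D | one positive test) = {posterior(q):.4f}  "
          f"minimal n for >= 0.90: {minimal_n(q)}")
```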

