Unit1 Introduction to Statistics

What you learned

Lec1: What is statistics

Lec2: Probability Redux

Sample average
- used as an estimator
probabilistic tools
1. LLN(Laws(weak and strong) of large numbers)
  - a.s. convergence
  - Convergence in probability
2. CLT(Central limit theorem)
  - Convergence in distribution
3. Hoeffding's inequality
  - valid even when the sample size n is small (even n = 1)
  - a substitute when the CLT cannot be used, although the resulting bound is less sharp than the CLT's
4. Consistent estimator
5. Gaussian distribution
  - PDF, CDF
  - Affine transformation
  - Standardization
  - Symmetry
  - Table(CDF of Standard normal distribution)
  - Quantiles
6. Three types of convergence
  1. Almost surely(a.s.) convergence
  2. Convergence in probability
  3. Convergence in distribution
7. Addition, multiplication, division
  - Almost surely(a.s.) convergence and Convergence in probability
8. Addition, multiplication, division (Slutsky's theorem)
  - Convergence in distribution
9. Continuous mapping theorem
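
For reference, the three main tools in this list can be written out as follows. This is a sketch in my own notation (the symbols \mu, \sigma^2, a, b and \bar{X}_n are assumptions, not taken from the lecture slides), for i.i.d. X_1, ..., X_n with mean \mu and finite variance \sigma^2.

```latex
\begin{align*}
\bar{X}_n &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
\text{LLN:}\quad & \bar{X}_n \xrightarrow[n\to\infty]{\text{a.s.},\ \mathbb{P}} \mu \\
\text{CLT:}\quad & \sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow[n\to\infty]{(d)} \mathcal{N}(0,1) \\
\text{Hoeffding } (X_i \in [a,b] \text{ a.s.}):\quad &
\mathbb{P}\big(|\bar{X}_n - \mu| \ge \varepsilon\big)
\le 2\exp\!\left(-\frac{2n\varepsilon^2}{(b-a)^2}\right)
\quad \text{for all } \varepsilon > 0,\ n \ge 1
\end{align*}
```

Note that Hoeffding's inequality needs bounded variables but holds for every n, while the CLT is only a statement about the limit.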

What you noticed

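Applying the CLT to the sample average shows that, after centering and scaling, it converges in distribution to a Gaussian distribution. The sampled r.v.s do not need to be Gaussian themselves; any distribution with finite variance works.
When the sample size is too small for the CLT to apply, use Hoeffding's inequality.
Both the CLT and Hoeffding's inequality are used to measure how close the estimator (the sample average) is to the unknown population expectation.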
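
To see the trade-off between the two tools in numbers, here is a minimal sketch comparing the half-width of an interval around the sample average obtained from the CLT with the one obtained from Hoeffding's inequality. It assumes Bernoulli data (values in [0, 1], standard deviation at most 0.5); the function names and the chosen values of n and alpha are my own, for illustration only.

```python
import math
from statistics import NormalDist


def clt_half_width(n, alpha, sigma):
    """Asymptotic half-width q_{alpha/2} * sigma / sqrt(n) suggested by the CLT."""
    q = NormalDist().inv_cdf(1 - alpha / 2)
    return q * sigma / math.sqrt(n)


def hoeffding_half_width(n, alpha, a=0.0, b=1.0):
    """Half-width eps solving 2 * exp(-2 * n * eps^2 / (b - a)^2) = alpha."""
    return (b - a) * math.sqrt(math.log(2 / alpha) / (2 * n))


alpha = 0.05
sigma_max = 0.5  # largest possible standard deviation of a Ber(p) variable

for n in (1, 10, 100, 1000):
    # The CLT value for small n is only shown for comparison; it is not justified there.
    print(n, round(clt_half_width(n, alpha, sigma_max), 4),
          round(hoeffding_half_width(n, alpha), 4))
```

The Hoeffding half-width is wider for every n, which is the price paid for a guarantee that does not rely on n being large.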

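Other

I need to review linear algebra
- matrix products
- inner product, cross product
- linear independence, linear dependence
- rank, and how to compute it
  - when it gets tedious, use WolframAlpha: www.wolframalpha.com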

References

a.s. ja.wikipedia.org
Hoeffding's inequality seetheworld1992.hatenablog.com
On convergence in probability kriver-1.hatenablog.com

Unit2 Parametric Inference

What you learned

Lec3: Parametric Statistical Models

Trinity of statistical inference
1. Estimation
2. Confidence intervals
3. Hypothesis testing
The goal of statistics is to learn the distribution of an r.v.
discrete r.v.s

A statistical model is a pair consisting of a sample space and a family of probability distributions.
well specified
parametric
non-parametric
semi parametric is a hybrid model
- nuisance parameter
Linear regression model
Cox proportional hazards model (a survival model)
identifiable

Lec4: Parametric Estimation and Confidence Intervals

Definitions
- Statistic
  - Any measurable function of the sample
  - Rule of thumb: if you can compute it exactly once the data are given, it is measurable.
- Estimator of theta
  - Any statistic whose expression does not depend on theta (only on the data)
- conditions for an estimator to be weakly (resp. strongly) consistent
- conditions for asymptotic normality
  - an estimator is itself an r.v., and it too can be approximated by a normal distribution
  - the variance of that approximating normal distribution is the asymptotic variance
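
Written out, the two conditions look like this. This is a sketch in my own notation: \hat{\theta}_n for the estimator based on n observations and \sigma^2 for the asymptotic variance are assumed symbols.

```latex
\begin{align*}
\text{weakly consistent:}\quad & \hat{\theta}_n \xrightarrow[n\to\infty]{\mathbb{P}} \theta \\
\text{strongly consistent:}\quad & \hat{\theta}_n \xrightarrow[n\to\infty]{\text{a.s.}} \theta \\
\text{asymptotically normal:}\quad & \sqrt{n}\,\big(\hat{\theta}_n - \theta\big)
\xrightarrow[n\to\infty]{(d)} \mathcal{N}(0, \sigma^2)
\end{align*}
```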
Bias of an estimator
Risk (or quadratic risk)
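- the workflow is to compute the variance and the bias first, then obtain the risk from them
- I think this means the same thing as MSE, but I wonder whether the two terms should be kept distinct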
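
The reason the variance and the bias are computed first is the decomposition below; the quadratic risk is the quantity usually called the mean squared error (MSE). The notation \hat{\theta}_n and R(\hat{\theta}_n) is my own assumption.

```latex
\begin{align*}
\operatorname{bias}(\hat{\theta}_n) &= \mathbb{E}[\hat{\theta}_n] - \theta \\
R(\hat{\theta}_n) &= \mathbb{E}\big[(\hat{\theta}_n - \theta)^2\big]
= \operatorname{Var}(\hat{\theta}_n) + \operatorname{bias}(\hat{\theta}_n)^2
\end{align*}
```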
Confidence intervals(C.I.)
- confidence interval of level 1 - alpha for theta
  - any random interval whose boundaries do not depend on theta
  - an interval that contains the true value theta with probability at least 1 - alpha
- C.I. of asymptotic level 1 - alpha for theta
  - any random interval whose boundaries do not depend on theta
  - an interval that satisfies the condition above in the limit as the sample size n goes to infinity
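
In symbols, the two definitions can be sketched as follows, where \mathcal{I} (resp. \mathcal{I}_n) denotes a random interval whose boundaries depend only on the data; these symbols are my own notation.

```latex
\begin{align*}
\text{level } 1-\alpha:\quad &
\mathbb{P}_\theta\big[\mathcal{I} \ni \theta\big] \ge 1 - \alpha
\quad \text{for all } \theta \\
\text{asymptotic level } 1-\alpha:\quad &
\lim_{n\to\infty} \mathbb{P}_\theta\big[\mathcal{I}_n \ni \theta\big] \ge 1 - \alpha
\quad \text{for all } \theta
\end{align*}
```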
A confidence interval for the kiss example
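- the case where the observations follow Ber(p)
- start from the CLT: the standardized estimator (sample average) is approximately standard normal
- the normal approximation alone does not give a usable C.I., because the resulting interval still depends on the parameter (here, the true value p)
- there are three ways to get around this
  1. Solution 1. Conservative bound
  2. Solution 2. Solving the (quadratic) equation for p
    - in practice, compute it numerically rather than with the quadratic formula
  3. Solution 3. plug-in
    - by Slutsky, plug in the estimator in place of the true value p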
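
The three solutions can be sketched in code as follows. This is a minimal illustration, not the course's implementation: the function name `bernoulli_cis` and the inputs (p_hat = 0.6, n = 100) are hypothetical values chosen for the example, and q denotes the standard normal quantile of order 1 - alpha/2.

```python
import math
from statistics import NormalDist


def bernoulli_cis(p_hat, n, alpha=0.05):
    """Return the (conservative, solved, plug-in) asymptotic C.I.s for p."""
    q = NormalDist().inv_cdf(1 - alpha / 2)

    # Solution 1: bound p(1 - p) by its maximum 1/4.
    h1 = q / (2 * math.sqrt(n))
    conservative = (p_hat - h1, p_hat + h1)

    # Solution 2: n (p_hat - p)^2 <= q^2 p (1 - p) rearranges to
    # (1 + q^2/n) p^2 - (2 p_hat + q^2/n) p + p_hat^2 <= 0;
    # the C.I. is the interval between the two roots.
    a = 1 + q * q / n
    b = -(2 * p_hat + q * q / n)
    c = p_hat * p_hat
    root = math.sqrt(b * b - 4 * a * c)
    solved = ((-b - root) / (2 * a), (-b + root) / (2 * a))

    # Solution 3: replace p by p_hat in the variance (justified by Slutsky).
    h3 = q * math.sqrt(p_hat * (1 - p_hat) / n)
    plug_in = (p_hat - h3, p_hat + h3)

    return conservative, solved, plug_in


conservative, solved, plug_in = bernoulli_cis(p_hat=0.6, n=100)
print(conservative)
print(solved)
print(plug_in)
```

The conservative interval is never narrower than the plug-in one, since p_hat(1 - p_hat) is at most 1/4.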

What you noticed

Choosing which distribution is appropriate is the first step of statistical modeling.
For a discrete random variable, a useful cue is the support: is it finite or infinite?

Lec5: Delta Method and Confidence Intervals

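Review of C.I.s
- given a 95% interval and a 98% interval, the 98% one is not necessarily the wider
- two C.I.s of the same 50% level can have different widths. Given the shape of the normal density, the interval is shortest when its midpoint sits at the center of the distribution
- a C.I. that is only valid as n → ∞ is called an asymptotic confidence interval (so it need not hold at, say, n = 1)
- how should we read the statement "[0.34, 0.57] is a 95% confidence interval"?
  - the probability that the unknown parameter p lies in this interval is either 0 or 1, not 0.95
  - be careful with realized C.I.s
  - even so, [0.34, 0.57] is still called a 95% C.I.
  - it is merely the realization, as a deterministic interval, of a random C.I. built with 1 - alpha = 0.95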
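
The first two points can be checked numerically on the standard normal distribution. The sketch below is my own illustration (the helper `interval_length` and the grid of left-tail masses are assumptions): it measures the length of an interval that leaves a given probability to its left and holds a given level inside.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def interval_length(left_mass, level=0.5):
    """Length of the interval with mass `left_mass` to its left and `level` inside."""
    lo = Z.inv_cdf(left_mass)
    hi = Z.inv_cdf(left_mass + level)
    return hi - lo


# Among 50% intervals, scan the mass left of the interval from 0.01 to 0.49.
masses = [k / 100 for k in range(1, 50)]
lengths = {m: interval_length(m) for m in masses}
best = min(lengths, key=lengths.get)
print(best, lengths[best])  # the shortest interval is the symmetric one

# A lopsided 95% interval can be longer than a symmetric 98% interval.
lopsided_95 = interval_length(0.0499, level=0.95)
symmetric_98 = interval_length(0.01, level=0.98)
print(lopsided_95, symmetric_98)
```

So the level alone does not determine the width; how the remaining probability is split between the two tails matters as well.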
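Model for the waiting time for the Red Line T at Kendall station (delta method)
- measure the time between train arrivals (that is, the waiting time until the next train)
- model each of these waiting times
- assume the following
  - Mutually independent
  - exponential distribution with parameter lambda
- then estimate lambda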
- lack of memory
  - why would I use exponential?
    - It's a very common distribution for inter-arrival times
    - main reason "lack of memory"
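- as the expectation of the exponential distribution (1/lambda) shows, applying the LLN and then the CLT to the sample average alone does not estimate lambda: the sample average estimates 1/lambda
- this is where the delta method comes in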
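
The "lack of memory" property and the expectation that causes the difficulty can be written out as follows. This is a sketch with my own notation: T is a single waiting time, T ~ Exp(\lambda), and \bar{T}_n is the sample average of n waiting times.

```latex
\begin{align*}
\mathbb{P}(T > s + t \mid T > s)
&= \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}}
= e^{-\lambda t}
= \mathbb{P}(T > t), \qquad s, t \ge 0 \\
\mathbb{E}[T] &= \frac{1}{\lambda}, \qquad
\operatorname{Var}(T) = \frac{1}{\lambda^2}, \qquad
\bar{T}_n \xrightarrow[n\to\infty]{\text{a.s.}} \frac{1}{\lambda}
\end{align*}
```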

delta method
- this is important
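- suppose a sequence of random variables, centered at theta and rescaled by sqrt(n), converges in distribution to a normal distribution
  - such a sequence is said to be asymptotically normal around theta
- next, consider a function g that is continuously differentiable at theta
- then the sequence obtained by applying g to the sequence above is also asymptotically normal, around g(theta)
- the derivation of the delta method uses a Taylor expansion
- for the exponential distribution, the estimator is the reciprocal of the sample average. Analyzing this estimator uses the delta method in addition to the LLN and CLT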
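
Stated in symbols, with the exponential case worked through. This is a sketch in my own notation: Z_n is the sequence, \sigma^2 its asymptotic variance, and \bar{T}_n the sample average of the waiting times.

```latex
\begin{align*}
\sqrt{n}\,(Z_n - \theta) \xrightarrow[n\to\infty]{(d)} \mathcal{N}(0, \sigma^2)
\quad &\Longrightarrow \quad
\sqrt{n}\,\big(g(Z_n) - g(\theta)\big) \xrightarrow[n\to\infty]{(d)}
\mathcal{N}\big(0,\ g'(\theta)^2 \sigma^2\big) \\[4pt]
\text{Exponential case:}\quad
Z_n = \bar{T}_n,\ \theta = \tfrac{1}{\lambda},\ \sigma^2 = \tfrac{1}{\lambda^2},\
& g(x) = \tfrac{1}{x},\ g'(\theta) = -\tfrac{1}{\theta^2} = -\lambda^2 \\
\sqrt{n}\,\Big(\frac{1}{\bar{T}_n} - \lambda\Big)
&\xrightarrow[n\to\infty]{(d)}
\mathcal{N}\big(0,\ \lambda^4 \cdot \lambda^{-2}\big) = \mathcal{N}(0, \lambda^2)
\end{align*}
```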
frequentist interpretation
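- when the experiment is repeated many times, about 95% of the resulting C.I.s contain the true value lambda
- recording 1 when the C.I. contains lambda and 0 otherwise gives a sequence like 1111011101111...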
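
A small simulation makes this interpretation concrete. The sketch below assumes exponential waiting times and uses the plug-in interval lam_hat ± q * lam_hat / sqrt(n), where lam_hat is the reciprocal of the sample average and its asymptotic standard deviation lambda / sqrt(n) is estimated by plug-in. The function name, the seed and the parameter values are my own choices for illustration.

```python
import math
import random
from statistics import NormalDist


def coverage_indicators(lam, n, trials, alpha=0.05, seed=0):
    """Return a list of 1/0 flags: 1 if the realized C.I. contains the true lam."""
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(1 - alpha / 2)
    hits = []
    for _ in range(trials):
        sample = [rng.expovariate(lam) for _ in range(n)]
        lam_hat = n / sum(sample)           # reciprocal of the sample average
        half = q * lam_hat / math.sqrt(n)   # plug-in half-width
        hits.append(1 if lam_hat - half <= lam <= lam_hat + half else 0)
    return hits


hits = coverage_indicators(lam=2.0, n=200, trials=2000)
print("".join(map(str, hits[:30])))  # a string of mostly 1s with occasional 0s
print(sum(hits) / len(hits))         # empirical coverage, expected to be near 0.95
```

Each individual realized interval either contains lambda or does not; the 95% describes the long-run fraction of 1s.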

What you noticed

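A random interval that depends on the parameter is not actually a C.I.
The C.I. is the interval between the endpoints obtained with the three solutions, once they are evaluated numerically (realizations).
So the first step is always to distinguish whether a C.I. is random or a realized, deterministic interval.
From HW2: a sum of independent normally distributed random variables is also normally distributed.
The Gaussian distribution appears in HW2.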

18.6501x Fundamentals of Statistics (Unit1-2) checklist
