
Sec M


Uploaded by

Alexander Qu
Copyright © All Rights Reserved
MODEL SELECTION (ESL ch. 7)

With small p we can look at a few models and make a choice based on p-values. We should be cautious about multiple testing: use the Bonferroni correction. If you use K tests, run each at level $\alpha/K$, where $\alpha$ is the desired overall level. With $A_i$ the event that test $i$ does not falsely reject,

$$P\Big(\bigcap_{i=1}^{K} A_i\Big) = 1 - P\Big(\bigcup_{i=1}^{K} A_i^c\Big) \ge 1 - \sum_{i=1}^{K} P(A_i^c) = 1 - K\cdot\frac{\alpha}{K} = 1 - \alpha.$$

For large p this does not work as well. Model selection often takes the following structure:

STRUCTURAL RISK MINIMIZATION
1) Set up a sequence of models $M_0 \subset M_1 \subset M_2 \subset \dots \subset M_K$, where $\subset$ indicates a form of complexity nesting.
2) For each model, estimate the prediction performance.
3) Select the model with the best prediction performance.

E.g., BEST-SUBSET REGRESSION with p variables. In step (1), for each $k = 0, \dots, p$, identify the model with exactly k variables and the smallest RSS. In step (2), we cannot simply use the RSS to choose the model $M_k$.

BIAS-VARIANCE TRADEOFF (ESL ch. 7)

Define $\mathrm{Err} = E\,(Y - \hat f(X))^2$, the prediction error averaged over new data $(X, Y)$. Compare it with the training error.

[Figure: Err and training error plotted against model complexity (e.g., subset size).]

For any given training set, the training error tends to decrease as we increase complexity [...]. We can study the behavior of Err by looking at Err at $X = x_0$. Suppose $Y = f(X) + \varepsilon$, $\varepsilon \sim (0, \sigma^2)$. Then

$$\mathrm{Err}(x_0) = E\big[(Y - \hat f(x_0))^2 \mid X = x_0\big] = \sigma^2 + \big[E\hat f(x_0) - f(x_0)\big]^2 + E\big[\hat f(x_0) - E\hat f(x_0)\big]^2,$$

where $E\hat f(x_0) - f(x_0)$ is the bias of $\hat f$ at $x_0$ and $E[\hat f(x_0) - E\hat f(x_0)]^2$ is its variance at $x_0$. Averaging over $X$,

$$\mathrm{Err} = \sigma^2 + E_X\,\mathrm{Bias}^2\big(\hat f(X)\big) + E_X\,\mathrm{Var}\big(\hat f(X)\big),$$

that is, irreducible error + integrated squared bias + integrated variance.

Back to the picture: low complexity gives high bias and low variance; high complexity gives low bias and high variance. Picking a model amounts to a bias-variance tradeoff: choose the complexity that gives the best prediction error.

In practice, the training error $\mathrm{RSS}/n$, with $\mathrm{RSS} = \sum_i (y_i - \hat y_i)^2$, will be biased downward for Err. Remedies:
1) Add a correction to the RSS to fix this bias (Cp, AIC, BIC).
2) Use an external sample to estimate Err (cross-validation).

The RSS is biased for Err (actually for $\mathrm{Err}_{in}$). Suppose $Y = f(X) + \varepsilon$, $\varepsilon \sim (0, \sigma^2)$. At the x's in the training data, $\hat y = Hy$. Writing $f = (f(x_1), \dots, f(x_n))^T$ and using $E\varepsilon = 0$ to drop the cross terms,

$$E(\mathrm{RSS}) = E\,y^T(I-H)y = f^T(I-H)f + E\,\varepsilon^T(I-H)\varepsilon.$$

Note: here we are looking at the error at $x_1, \dots, x_n$, sometimes called the in-sample error, $\mathrm{Err}_{in}$.

$$E\,\varepsilon^T(I-H)\varepsilon = E\,\mathrm{tr}\big[(I-H)\varepsilon\varepsilon^T\big] = \mathrm{tr}\big[(I-H)\,E\,\varepsilon\varepsilon^T\big] = \sigma^2\,\mathrm{tr}[I-H].$$

If we fit a linear model with p parameters, this is $(n-p)\sigma^2$, so

$$E(\mathrm{RSS}) = f^T(I-H)f + (n-p)\sigma^2.$$

(From "Linear Statistical Models", p. 174:)

Both functions are generic and compute the change in AIC (Akaike, 1974)

$$\mathrm{AIC} = -2\,(\text{maximized log-likelihood}) + 2\,(\#\,\text{parameters}).$$

Since the log-likelihood is defined only up to a constant depending on the data, this is also true of AIC. For a regression model with n observations, p parameters and normally-distributed errors the log-likelihood is

$$L(\beta, \sigma^2; y) = \text{const} - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\,\|y - X\beta\|^2$$

and on maximizing over $\beta$ we have

$$L(\hat\beta, \sigma^2; y) = \text{const} - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\,\mathrm{RSS}.$$

Thus if $\sigma^2$ is known, we can take

$$\mathrm{AIC} = \frac{\mathrm{RSS}}{\sigma^2} + 2p + \text{const},$$

but if $\sigma^2$ is unknown,

$$\mathrm{AIC} = n\log(\mathrm{RSS}/n) + 2p + \text{const}.$$

For known $\sigma^2$ it is conventional to use Mallows' $C_p$,

$$C_p = \mathrm{RSS}/\sigma^2 + 2p - n$$

(Mallows, 1973), and in this case addterm and dropterm label their output as Cp.

For an example consider removing the four-way interaction from the complete model and assessing which three-way terms might be dropped next.

    > [Link] <- aov(log(Days + 2.5) ~ .^4, quine)
    > [Link] <- update([Link], . ~ . - Eth:Sex:Age:Lrn)
    > dropterm([Link], test = "F")
    Single term deletions
                 Df  Sum of Sq     RSS      AIC  F Value    Pr(F)
    <none>                      64.099  -68.184
    Eth:Sex:Age   3     0.9739  65.073  -71.982   0.6077  0.61125
    Eth:Sex:Lrn   1     1.5788  65.678  -66.631   2.9557  0.08816
    Eth:Age:Lrn   2     2.1284  66.227  -67.415   1.9923  0.14087
    Sex:Age:Lrn   2     1.4662  65.565  -68.882   1.3725  0.25743

Clearly dropping Eth:Sex:Age most reduces AIC but dropping Eth:Sex:Lrn would increase it. Note that only non-marginal terms are included; none are significant in a conventional F-test.
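As a rough check on how the AIC column in the dropterm table relates to the RSS column, the unknown-variance formula $n\log(\mathrm{RSS}/n) + 2p$ can be evaluated directly. The sketch below is in Python rather than S; the function names are hypothetical, and the sample size of 146 is an assumption about the quine data that the notes do not state.

```python
import math


def aic_gaussian(rss, n, p):
    """AIC up to an additive constant when sigma^2 is unknown: n*log(RSS/n) + 2p."""
    return n * math.log(rss / n) + 2 * p


def mallows_cp(rss, sigma2, p, n):
    """Mallows' Cp for a known error variance: RSS/sigma^2 + 2p - n."""
    return rss / sigma2 + 2 * p - n


def aic_change_on_drop(rss_full, rss_reduced, df_dropped, n):
    """AIC(reduced) - AIC(full) when a term with df_dropped parameters is removed.

    The parameter count p of the full model cancels in the difference.
    """
    return n * math.log(rss_reduced / rss_full) - 2 * df_dropped


N_OBS = 146        # assumption: number of rows in the quine data (not given in the notes)
RSS_FULL = 64.099  # RSS of the model without the four-way interaction (from the table)

# (degrees of freedom, RSS after dropping the term), read off the table above
DROPS = {
    "Eth:Sex:Age": (3, 65.073),
    "Eth:Sex:Lrn": (1, 65.678),
    "Eth:Age:Lrn": (2, 66.227),
    "Sex:Age:Lrn": (2, 65.565),
}

for term, (df, rss) in DROPS.items():
    change = aic_change_on_drop(RSS_FULL, rss, df, N_OBS)
    print(f"{term:12s} change in AIC = {change:+.3f}")
```

A negative change means that dropping the term lowers AIC. Under the stated assumption, only Eth:Sex:Age and Sex:Age:Lrn give a negative change, which matches the ordering of the AIC column.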
Alternatively we could start from the simplest model and consider adding terms to reduce Cp; in this case the choice of scale parameter is important, since the simple-minded choice is inflated and may over-penalize complex models.

[The rest of the excerpt is cut off at the right margin of the scan. The legible fragments show an addterm call on the simplest model, with single-term additions of Eth, Sex, Age and Lrn; the start of a section on automated model selection, whose function (name truncated) requires a fitted model and a scale estimate; and a further dropterm call involving Sex:Age, Eth:Sex:Lrn and Eth:Age:Lrn.]

Now consider obtaining a new $y_i^{new}$ at each $x_i$:

$$\mathrm{Err}_{in} = \frac{1}{n}\sum_i E\,(y_i^{new} - \hat y_i)^2 = \sigma^2 + \frac{1}{n}\sum_i \big(E\hat y_i - f(x_i)\big)^2 + \frac{1}{n}\sum_i \mathrm{var}(\hat y_i)$$

(similar to the decomposition of $\mathrm{Err}(x_0)$). Now, for a linear fit $\hat y = Hy$ with p parameters,

$$\sum_i \mathrm{var}(\hat y_i) = \mathrm{tr}\,\mathrm{cov}(\hat y) = \mathrm{tr}\big(H\,\sigma^2 H^T\big) = \sigma^2\,\mathrm{tr}(H) = p\,\sigma^2,$$

so

$$\mathrm{Err}_{in} = \sigma^2 + \text{AVE BIAS}^2 + \frac{p}{n}\,\sigma^2,$$

whereas

$$E(\mathrm{RSS})/n = \sigma^2 + \text{AVE BIAS}^2 - \frac{p}{n}\,\sigma^2 \quad\longrightarrow\quad \text{Mallows (1973)}.$$

In general the gap between the two is $\frac{2}{n}\sum_i \mathrm{cov}(y_i, \hat y_i)$, which here equals $2p\sigma^2/n$.

BIAS-VARIANCE TRADEOFF

$$\mathrm{MSE}\big[\hat f(x_0)\big] = E\big[\hat f(x_0) - f(x_0)\big]^2 = \big[E\hat f(x_0) - f(x_0)\big]^2 + \mathrm{var}\,\hat f(x_0) = \text{bias}^2 + \text{variance}.$$

- If the linear model is correct, the least-squares $\hat f(x_0)$ is unbiased and has minimum variance amongst all unbiased estimators (Gauss).
- But there can be biased estimators with smaller MSE.
- Generally, by regularizing (shrinking, dampening, controlling) the estimator in some way, its variance will be reduced; if the corresponding increase in bias is small, this will be worthwhile.
- Examples: subset selection, ridge, lasso.
- In reality, models are almost never correct. So in addition, there is an additional MODEL BIAS between the closest member of the model class and the truth.

[Figure: model space, with linear models nested in polynomials.]

PROSTATE CANCER

97 observations, 8 predictors; the response is log psa. Randomly split the data into training and test sets:
Train: 67 observations. Test: 30 observations.

Here are the results of a variety of methods:

    Term          LS      Best Subset  Ridge    Lasso    PCR      PLS
    Intercept     2.480   2.495        2.467    2.477    2.513    2.452
    lcavol        0.680   0.740        0.389    0.545    0.544    0.440
    lweight       0.305   0.367        0.238    0.237    0.337    0.351
    age          -0.141               -0.029            -0.152   -0.017
    lbph          0.210                0.159    0.098    0.213    0.248
    svi           0.305                0.217    0.165    0.315    0.252
    lcp          -0.288                0.026            -0.053    0.078
    gleason      -0.021                0.042             0.230    0.003
    pgg45         0.267                0.123    0.059   -0.053    0.080
    Test Error    0.586   0.574        0.540    0.491    0.527    0.636
    Std. Error*   0.184   0.156        0.168    0.152    0.122    0.172

    * standard error of the test error estimate.

SUBSET SELECTION
- Best subset
- Forward stepwise
- Backward stepwise

BEST SUBSET SELECTION
1) For each model size k, pick the model with the smallest RSS on the training data. This defines a sequence of models $M_0, M_1, M_2, \dots, M_p$.
2) Pick k to minimize an estimate of the prediction error of $M_k$.
3) Return $M_k$.

Questions: Can we use the RSS to pick k? Best subset can only be done for p up to about 40: combinatorial explosion.

FORWARD STEPWISE SELECTION
0) Start with $M_0$, the null model.
1) For $k = 0, 1, \dots, p-1$: augment $M_k$ by including the variable, from the $p-k$ not in $M_k$, that decreases the RSS the most, yielding $M_{k+1}$.
2) Pick k as above and return $M_k$.

- $O(p^2)$ models fitted.
- Greedy algorithm.
- Works even if p > N [...].

BACKWARD STEPWISE SELECTION
As above, but sequentially remove variables one at a time, starting from the full LS fit (p < N required).

MODEL ASSESSMENT
Both best-subset and forward-stepwise (and backward) are defined up to the subset size k. We call this a TUNING PARAMETER. It is like a COMPLEXITY KNOB.
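The forward stepwise steps above, with the subset size k acting as the complexity knob, can be sketched in a few lines. This is a minimal illustration, not the notes' own code: it uses successive orthogonalization (each candidate column is residualized on the variables already chosen), so the drop in RSS from adding variable j is $(z_j^T r)^2 / (z_j^T z_j)$. The function names and the toy data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def residualize(v, z):
    """Return v minus its projection onto z."""
    c = dot(v, z) / dot(z, z)
    return [a - c * b for a, b in zip(v, z)]


def forward_stepwise(columns, y):
    """Greedy forward selection.

    columns: dict mapping variable name -> list of values.
    Returns [(name, rss), ...] in order of entry, starting with the
    intercept-only model M_0.
    """
    ones = [1.0] * len(y)
    r = residualize(y, ones)  # residual of M_0
    z = {name: residualize(col, ones) for name, col in columns.items()}
    path = [("(intercept)", dot(r, r))]
    while z:
        # Drop in RSS from adding each remaining (residualized) variable.
        gains = {
            name: dot(zj, r) ** 2 / dot(zj, zj)
            for name, zj in z.items()
            if dot(zj, zj) > 1e-12  # skip columns already explained
        }
        if not gains:
            break
        best = max(gains, key=gains.get)
        zb = z.pop(best)
        r = residualize(r, zb)  # update the residual
        z = {name: residualize(zj, zb) for name, zj in z.items()}
        path.append((best, dot(r, r)))
    return path


# Toy data (hypothetical): y depends exactly on x1 and x2, x3 is irrelevant.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
x3 = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]

for name, rss in forward_stepwise({"x1": x1, "x2": x2, "x3": x3}, y):
    print(f"{name:12s} RSS = {rss:.4f}")
```

The training RSS never increases along the path, which is why the RSS alone cannot be used to pick k.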
Most methods for model selection have such a parameter [...].

We would like to:
1) pick a value for the tuning parameter [...];
2) estimate the prediction performance of the chosen model.

The best approach is to have independent TEST DATA SETS, ideally two: one for 1) and another for 2).

Often we cannot afford to spare data for a test set. In this case we can try:
1) methods which estimate the prediction error analytically [...];
2) CROSS-VALIDATION.
   a) Leave-one-out (LOO):

$$\mathrm{CV} = \frac{1}{n}\sum_{i=1}^{n}\big(y_i - \hat f^{(-i)}(x_i)\big)^2,$$

where $\hat f^{(-i)}$ is the fit computed with observation i left out.
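Leave-one-out cross-validation can be sketched by refitting n times, each time predicting the held-out point. The example below is a minimal illustration with two hypothetical candidate models (intercept only, and a simple least-squares line); the function names and data are assumptions made for the sketch.

```python
def fit_mean(xs, ys):
    """Intercept-only model: predict the training mean everywhere."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Simple least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def loo_cv(fit, xs, ys):
    """Leave-one-out CV: average squared error on each held-out point."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Refit without observation i, then predict it.
        f = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - f(xs[i])) ** 2
    return total / n


# Toy data (hypothetical): points lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print("LOO CV, intercept only:", loo_cv(fit_mean, xs, ys))
print("LOO CV, line:          ", loo_cv(fit_line, xs, ys))
```

The model with the smaller CV score would be chosen; here that is the line, since the data are exactly linear.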
