MA7511 Cheatsheet PDF

Title	MA7511 Cheatsheet
Author	Keo Low
Course	Design and Analysis of Experiments
Institution	Nanyang Technological University
Pages	2
File Size	300.8 KB
File Type	PDF
Total Downloads	229
Total Views	284


Summary

Hirsch Index (of a scholar) -has published h papers, each of which has been cited at least h times -compare with works in the same field -simultaneous measure of quality and sustainability of scientific output 1. guess a value, h 2. if # of papers cited ≥ h times > h, h = h + 1; if # of papers cited ≥ h times < h, ...

Description

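Sampling

Central Limit Theorem: for the average $\bar{x}$ of n observations from a population with mean µ and variance σ²,

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ (approximately, for large n)

or

$\bar{x} \sim N(\mu, \sigma^2/n)$

for variance, $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ (DoF: n-1)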

Hirsch Index (of a scholar)
-has published h papers, each of which has been cited at least h times
-compare with works in the same field
-simultaneous measure of quality and sustainability of scientific output
1. guess a value, h
2. if # of papers cited ≥ h times > h, h = h + 1
   if # of papers cited ≥ h times < h, h = h - 1

Numerical Data
Population Random Variables: $\sum_x p(x) = 1$
Mean: $E(x) = \mu = \sum_x x\,p(x)$
Variance: $V(x) = \sigma^2 = \sum_x (x - \mu)^2\,p(x)$
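The guess-and-adjust rule for the Hirsch index can be sketched in Python as below. This is a minimal illustration rather than part of the cheatsheet: the names `count_cited_at_least` and `h_index` are hypothetical, and the stopping rule (keep the largest h for which at least h papers are cited at least h times) is an added assumption so that the up/down adjustment always terminates.

```python
def count_cited_at_least(citations, h):
    """Number of papers cited at least h times."""
    return sum(1 for c in citations if c >= h)


def h_index(citations, guess=1):
    """Adjust an initial guess up or down until it is the largest valid h."""
    h = max(0, guess)
    # Guess too low: h + 1 is still supported by enough papers, so step up.
    while count_cited_at_least(citations, h + 1) >= h + 1:
        h += 1
    # Guess too high: fewer than h papers reach h citations, so step down.
    while h > 0 and count_cited_at_least(citations, h) < h:
        h -= 1
    return h
```

For citation counts `[10, 8, 5, 4, 3]` the sketch returns 4: four papers are cited at least 4 times, but only three are cited at least 5 times.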

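Discrete Data Distributions
1. Uniform distribution (a, b are integers): $f(x) = \frac{1}{b-a+1}$, $\mu = \frac{a+b}{2}$, $\sigma = \sqrt{\frac{(b-a+1)^2 - 1}{12}}$
2. Binomial distribution (success (p) or failure), n = # trials, x = desired # successes: $f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, $\mu = np$, $\sigma^2 = np(1-p)$
3. Poisson distribution, λ = # occurrences in an interval, x = desired # occurrences: $f(x) = \frac{e^{-\lambda}\lambda^x}{x!}$, $\mu = \sigma^2 = \lambda$

Continuous Data Distributions
$P(a \le x \le b) = \int_a^b f(x)\,dx$, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, $\mu = \int_{-\infty}^{\infty} x f(x)\,dx$, $\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$
1. Normal distribution, $x \sim N(\mu, \sigma^2)$: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$
2. Standard N.D., $z = \frac{x-\mu}{\sigma} \sim N(0, 1)$; Cumulative, $\Phi(z) = P(Z \le z)$
3. χ² distribution, $\chi^2 = z_1^2 + z_2^2 + \dots + z_k^2$ (DoF: k)
4. t distribution, $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$, DoF = n-1
5. F distribution, $F = \frac{\chi_u^2/u}{\chi_v^2/v}$ (DoF: u, v)

Sampling from a N.D. - average of n observations
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
if σ is unknown: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$, $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ (DoF: n-1)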
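As a rough illustration of sampling from a normal distribution, the sketch below computes the z statistic when σ is known and the t statistic (DoF: n-1) when σ is unknown. The function name `sample_mean_statistic` and its arguments are hypothetical, chosen only for this example.

```python
import math
import statistics


def sample_mean_statistic(data, mu0, sigma=None):
    """Return (statistic, dof) for the mean of `data` against mu0.

    With sigma given, the statistic is z and dof is None.
    Without sigma, the sample standard deviation s is used and dof = n - 1.
    """
    n = len(data)
    xbar = statistics.fmean(data)
    if sigma is not None:
        return (xbar - mu0) / (sigma / math.sqrt(n)), None
    s = statistics.stdev(data)  # uses the n - 1 denominator
    return (xbar - mu0) / (s / math.sqrt(n)), n - 1
```

The returned statistic is then compared with the normal or t table value at the chosen α.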

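Decision (H0) \ truth	H0 true	H0 false
Accepted	p = 1-α	Type II, p = β
Rejected	Type I, p = α	p = 1-β = power

Simple Comparative Experiment
-2 independent sample populations
-use t-test, H0: µ1-µ2=0
-variance sum law: $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$ or $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}}$ (DoF: n1+n2-2)
Weighted average variance (pooled): $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$, $s_p = \sqrt{s_p^2}$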
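The pooled-variance t-test for two independent samples can be sketched as follows. The function name `pooled_t` is hypothetical, and the sketch assumes the two variances are close enough to pool.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Return (t, dof, pooled_variance) for H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1)  # s1^2 with n1 - 1 denominator
    v2 = statistics.variance(sample2)  # s2^2 with n2 - 1 denominator
    dof = n1 + n2 - 2
    # Weighted average of the two sample variances
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / dof
    diff = statistics.fmean(sample1) - statistics.fmean(sample2)
    t = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, dof, sp2
```

The magnitude of t is compared with the two-tailed t table value at DoF = n1+n2-2.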

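Heterogeneity of variance ($\sigma_1^2 \neq \sigma_2^2$)
-insig. if one σ is less than 4 times larger than the other
-insig. if ...
-if sig., use t-distribution table, DoF = min[(n1-1),(n2-1)]

Power of test, δ = d × f(n)
$d = \frac{\mu_1 - \mu_0}{\sigma}$; $f(n) = \sqrt{n}$ (one sample) or $\sqrt{n/2}$ (two samples of size n)

Confidence Level
$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$

2 Related Samples
-pair samples accordingly, $d_i = x_{1i} - x_{2i}$
$\bar{d} = \frac{1}{n}\sum d_i$, $t = \frac{\bar{d}}{s_d/\sqrt{n}}$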
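For two related samples, the paired t statistic works on the differences within each pair. The sketch below is illustrative only. The function name `paired_t` is hypothetical, and it assumes the two lists are already paired in the same order.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, dof) for paired observations; n = number of pairs."""
    if len(first) != len(second):
        raise ValueError("samples must be paired one-to-one")
    diffs = [x1 - x2 for x1, x2 in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.fmean(diffs)
    s_d = statistics.stdev(diffs)  # standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1
```

With n pairs, the statistic is compared with the t table at DoF = n-1.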

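Fisher's LSD: $LSD = t_{\alpha/2,\,DoF}\sqrt{\frac{2\,MS_{Err}}{n}}$
DoF = DoF (MSE), n - replicates in the level
-if $|\bar{y}_{i.} - \bar{y}_{j.}| < LSD$, there's no sig. difference between levels i and j

Bonferroni Procedure
Probability of at least 1 Type I error < # tests × α' = cα'
→ use α' = α/c for each t test

Choice of Sample Size (iterative)
-use operating characteristic curves
1. Calculate mean of sample, µ
2. Calculate $\tau_i = \mu_i - \mu$ for each level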
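The Bonferroni bound can be sketched numerically as below. The helper names are hypothetical, and the example assumes the c tests are all pairwise comparisons among a level means, which is an assumption made for illustration.

```python
def pairwise_comparisons(a):
    """Number of distinct pairs among a level means: c = a(a-1)/2."""
    return a * (a - 1) // 2


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level so that c * alpha' does not exceed alpha."""
    return alpha / n_tests
```

With 5 levels there are 10 pairwise tests, so an overall α of 0.05 gives α' = 0.005 for each t test.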

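Designed Experiments
Basics: # levels in each factor (a, b, c, …)
Run (R) - 1 set of levels for all factors
Replicate (n): R = a·b·c…, N = R·n

Single Factor Experiment
-balance out effect of nuisance variables by randomizing order of observations
Level	Observations (n per level)	Totals	Average
1	y11 y12 … y1n	y1.	ӯ1.
2	y21 y22 … y2n	y2.	ӯ2.
…	…	…	…
a	ya1 ya2 … yan	ya.	ӯa.
		y..	ӯ.. = y../N
Level mean, $\mu_i$ (estimated by ӯi.); Random Error, $\epsilon_{ij} = y_{ij} - \mu_i$

Randomized Block Design
-block runs according to different levels of a known nuisance factor
-effect of known nuisance factor is nullified by spreading
Level	Block 1	Block 2	…	Total	Average
1	y11	y12	…	y1.	ӯ1.
2	y21	y22	…	y2.	ӯ2.
…	…	…	…	…	…
a	ya1	ya2	…	ya.	ӯa.

ANOVA sources, Sum of Squares and DoF by design
Unbalanced experiment
Total	$\sum_i\sum_j y_{ij}^2 - y_{..}^2/N$	N-1 (1)
Factor	$\sum_i y_{i.}^2/n_i - y_{..}^2/N$	a-1 (2)
Error	(1) – (2)	(1) – (2)
Randomized Block Design (known nuisance)
Factor	$\frac{1}{b}\sum_i y_{i.}^2 - y_{..}^2/N$	a-1
Block	$\frac{1}{a}\sum_j y_{.j}^2 - y_{..}^2/N$	b-1
Unk. Error	by subtraction	N-a-b+1
Total	$\sum_i\sum_j y_{ij}^2 - y_{..}^2/N$	N-1
General Factorial Design (e.g. 3 factors)
Total	$\sum y_{ijkl}^2 - \frac{y_{....}^2}{abcn}$	abcn-1
A	$\frac{1}{bcn}\sum_i y_{i...}^2 - \frac{y_{....}^2}{abcn}$	a-1
B	$\frac{1}{acn}\sum_j y_{.j..}^2 - \frac{y_{....}^2}{abcn}$	b-1
AB	$\frac{1}{cn}\sum_i\sum_j y_{ij..}^2 - \frac{y_{....}^2}{abcn} - SS_A - SS_B$	(a-1)(b-1)
AC	similar to AB	(a-1)(c-1)
ABC	by subtraction of all lower-order terms from the cell totals	(a-1)(b-1)(c-1)
Err	by subtraction	abc(n-1)
2^k Design: A, B, AB, Err; Effect; Curvature
Nested B(A): B under A
$F_0 > F_{\alpha,u,v}$ to reject H0

Kruskal-Wallis statistic
$H = \frac{1}{S^2}\left[\sum_{i=1}^{a}\frac{R_{i.}^2}{n_i} - \frac{N(N+1)^2}{4}\right]$, $S^2 = \frac{1}{N-1}\left[\sum_i\sum_j R_{ij}^2 - \frac{N(N+1)^2}{4}\right]$
-if no ties, $H = \frac{12}{N(N+1)}\sum_{i=1}^{a}\frac{R_{i.}^2}{n_i} - 3(N+1)$
Ri. – sum of ranks in ith level
-if $H > \chi^2_{\alpha,a-1}$, there's sig. difference between levels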
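The Kruskal-Wallis statistic (rank all observations in ascending order, use average ranks for ties, then compare H with χ² at a-1 DoF) can be sketched as follows. The function names are hypothetical. The sketch uses the general form with S², which reduces to the no-ties formula when every rank is distinct.

```python
def average_ranks(values):
    """Ranks in ascending order, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis(levels):
    """Return (H, dof) for a list of levels, each a list of observations."""
    values = [y for obs in levels for y in obs]
    ranks = average_ranks(values)
    N = len(values)
    correction = N * (N + 1) ** 2 / 4
    s2 = (sum(r * r for r in ranks) - correction) / (N - 1)
    if s2 == 0:
        raise ValueError("all observations are tied; H is undefined")
    total, start = 0.0, 0
    for obs in levels:
        r_sum = sum(ranks[start:start + len(obs)])  # R_i. for this level
        total += r_sum ** 2 / len(obs)
        start += len(obs)
    return (total - correction) / s2, len(levels) - 1
```

The χ² critical value is not computed here; it is read from a table at the returned DoF.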

Goodness-of-Fit Test symbols
Ei – expected # freq. in interval i
Oi – observed # freq. in interval i
k - # intervals
p - # parameters

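Single-factor model: $y_{ij} = \mu + \tau_i + \epsilon_{ij}$
Overall mean, µ; ith level effect, $\tau_i$; Null hypothesis, H0: $\tau_1 = \tau_2 = \dots = \tau_a = 0$

ANOVA Table
Mean Square, MS = SS/DoF (not applicable to Total)
FA = MSA/MSErr or MSA/MSUnknown Err
FA > Fα,u,v to reject H0 (Factor A has sig. effect on response)
	Source	Sum of Squares	DoF
Idea (basic)	Total	$\sum_i\sum_j (y_{ij} - \bar{y}_{..})^2$	N-1 (1)
	Factor	$\sum_i\sum_j (\bar{y}_{i.} - \bar{y}_{..})^2$	a-1 (2)
	Error	$\sum_i\sum_j (y_{ij} - \bar{y}_{i.})^2$	(1) – (2)
Use (single factor)	Total	$\sum_i\sum_j y_{ij}^2 - y_{..}^2/N$	N-1 (1)
	Factor	$\frac{1}{n}\sum_i y_{i.}^2 - y_{..}^2/N$	a-1 (2)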
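The single-factor ANOVA table can be reproduced with the computational ("Use") formulas, as in the sketch below. The function name `one_way_anova` and the dictionary keys are hypothetical. Dividing each level total by its own n_i lets the same code handle unequal replicates.

```python
def one_way_anova(levels):
    """Single-factor ANOVA from a list of levels (each a list of observations)."""
    a = len(levels)
    N = sum(len(obs) for obs in levels)
    grand_total = sum(sum(obs) for obs in levels)
    correction = grand_total ** 2 / N  # y..^2 / N
    ss_total = sum(y * y for obs in levels for y in obs) - correction
    ss_factor = sum(sum(obs) ** 2 / len(obs) for obs in levels) - correction
    ss_error = ss_total - ss_factor  # (1) - (2)
    ms_factor = ss_factor / (a - 1)
    ms_error = ss_error / (N - a)
    return {
        "ss_total": ss_total,
        "ss_factor": ss_factor,
        "ss_error": ss_error,
        "dof": (a - 1, N - a),
        "f0": ms_factor / ms_error,
    }
```

H0 is rejected when `f0` exceeds the F table value at the returned DoF pair (u, v).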

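Choice of Sample Size (continued)
DoF: $\nu_1 = a-1$, $\nu_2 = a(n-1)$
3. $\Phi^2 = \frac{n\sum_i \tau_i^2}{a\sigma^2}$
4. Guess n, and match with operating characteristic curves for β, until β reaches satisfactory result

Kruskal-Wallis Test
-normality assumption not justified, so no F test
-rank observations in ascending order
-in case of tie, use average rank

2 Related Samples: n - # pairs, DoF: n-1

Goodness-of-Fit Test
$\chi_0^2 = \sum_{i=1}^{k}\frac{(O_i - E_i)^2}{E_i}$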
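A minimal sketch of the goodness-of-fit statistic follows. The function name `chi_square_gof` is hypothetical, and the DoF of k - p - 1 (k intervals, p estimated parameters) is the usual textbook convention, stated here as an assumption because the DoF expression itself is not legible in the source.

```python
def chi_square_gof(observed, expected, n_params=0):
    """Return (chi2, dof) from observed and expected interval frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per interval")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - n_params - 1  # assumed convention: k - p - 1
    return chi2, dof
```

The fit is rejected when the statistic exceeds the χ² table value at that DoF.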

Error and Residual
Error: $\epsilon_{ij} = y_{ij} - \mu_i$
Residual: $e_{ij} = y_{ij} - \bar{y}_{i.}$ (a1)
Check on error:
1. Normal Distribution
-rank residuals in ascending order: ek, ek+1, …
-for each ek, Pk = (k-0.5)/N
-plot (ek, Pk) on normal probability paper
-straight line → residual is normally distributed
2. Independent Distribution
-plot residual vs running order
-random → independently distributed
3. Zero Mean Value, yes if (a1) is used
4. Constant Variance
-plot residuals vs runs
-range approx. equal → constant variance

Fisher's Test (Least Significant Difference Test)
-if overall F0 is sig., pairwise comparison between individual level means with two-tailed t test
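The normality check on residuals can be sketched as below: compute residuals about each level average, rank them, and assign Pk = (k-0.5)/N. The function names are hypothetical. Converting Pk to a standard normal score with `statistics.NormalDist().inv_cdf` stands in for normal probability paper, which is an assumption about how the plot would be drawn in software.

```python
import statistics


def residuals(levels):
    """Residual e_ij = y_ij minus the average of its own level."""
    return [y - statistics.fmean(obs) for obs in levels for y in obs]


def probability_plot_points(res):
    """Pair each ranked residual e_k with P_k = (k - 0.5) / N."""
    ordered = sorted(res)
    N = len(ordered)
    return [(e, (k - 0.5) / N) for k, e in enumerate(ordered, start=1)]


def normal_scores(points):
    """Replace P_k by its standard normal quantile for a linear-scale plot."""
    nd = statistics.NormalDist()
    return [(e, nd.inv_cdf(p)) for e, p in points]
```

If the (residual, score) pairs fall close to a straight line, the residuals are consistent with a normal distribution.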

DoF: each effect in 2^k Design, 1; nested B(A), a(b-1)

Basic Effects (blocking) / Generators (fractional design)
-for 2^(k-1), choose the highest interaction
# factors	Block size / # runs	# blocks	Basic effects / generators
5	8	4	ADE,BCE / ABD,ACE
6	16	4	ABCE,BCDF
6	8	8	ABD,ACE,BCF
7	32	4	ABCDF,ABCE
7	16	8	ABCE,BCDF,ACDG
7	8	16	ABD,ACE,BCF,ABCG

Pros of Methods
Factorial Design
1) efficiency > 1-at-a-time method
2) avoids misleading conclusions when interaction may be present
3) allows effects of a factor to be estimated at several levels of other factors, yielding conclusions that are valid over a range of experimental conditions
2^k Design
1) substantially reduces total # runs for a given # factors, which is useful for screening out factors that are not effective at the early stages of an experiment
2) formulae for SS are simpler
3) 'effect' can be used in place of ANOVA
Fractional Design
1) sig. reduces # of runs
2) eliminates negligible factors by projecting a fractional factorial design to a full factorial design, giving a stronger experiment in the active factors that remain
3) by combining a sequence of small fractional factorial designs, main effects and interactions can be isolated and the researcher can learn about the process as it goes along

Factorial Design
-several factors, all runs investigated
-e.g. # runs for 2 factors = ab, total observations = abn

2^k Design
-each factor only 2 levels, k factors, # runs = 2^k
Run	A	B	Response	Total
(1)	-	-	1.2, 1.3	2.5
a	+	-	2.1, 2.2	4.3
b	-	+	1.7, 1.8	3.5
ab	+	+	2.7, 2.8	5.5
-main effect A → effect of A at all levels of other factors
-simple effect A → effect of A at particular levels of each of the other factors
1. calculate interaction
2. if interaction ≈ 0, effect of A = main effect A; else, effect of A must be studied separately at different levels of other factors
-y+ - response at high level
-y- - response at low level
Main effect of A, $A = \bar{y}_{A+} - \bar{y}_{A-}$
Contrast of A, $Contrast_A = ab + a - b - (1)$
-matching sign under column A with respective row total
Effect of A, $A = \frac{Contrast_A}{2^{k-1} n}$
Interaction AB, $AB = \frac{ab + (1) - a - b}{2^{k-1} n}$
$SS_A = \frac{(Contrast_A)^2}{2^k n}$

Projection of 2^k Design
-reduce # factors (drop a factor)
-merge rows with same run notation
-simplify ANOVA

Single Replicate 2^k Design
-# p-factor interactions = $\binom{k}{p}$
-p ≥ 3, high order interaction (check for interaction effect)
$SS_{Err} = \sum SS_{\text{high order interactions}}$ (similar for DoF)

Center point in 2^k Design
-check if 2^k design is good enough (relationship is linear)
-yc – observed value at center point
-yF – interpolated result
-estimate the errors
-$SS_{Err} = \sum_{i=1}^{n_c} (y_{c,i} - \bar{y}_c)^2$, DoF = nc-1
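The sign-matching rule for contrasts in a 2^k design can be sketched with the row totals from the 2-factor example (2.5, 4.3, 3.5, 5.5 with n = 2 replicates). The function names are hypothetical. A run label such as `ab` is read as "A high, B high", and `(1)` as "all factors low".

```python
def contrast(totals, term):
    """Sum of row totals, each signed by the product of its factor signs."""
    result = 0.0
    for run, total in totals.items():
        sign = 1
        for factor in term.lower():
            # A factor is at its high level when its letter is in the run label
            sign *= 1 if factor in run else -1
        result += sign * total
    return result


def effect(totals, term, k, n):
    """Effect = contrast / (2^(k-1) * n)."""
    return contrast(totals, term) / (2 ** (k - 1) * n)


def sum_of_squares(totals, term, k, n):
    """SS = contrast^2 / (2^k * n)."""
    return contrast(totals, term) ** 2 / (2 ** k * n)


totals = {"(1)": 2.5, "a": 4.3, "b": 3.5, "ab": 5.5}
```

With these totals, `effect(totals, "A", 2, 2)` evaluates to about 0.95, B to about 0.55 and AB to about 0.05. The interaction is close to zero, so the main effect of A describes the effect of A reasonably well.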

Blocking in 2^k Design
-from +/- table, look at highest order interaction
-put all runs with -ve sign into block 1 and +ve into block 2
-preserves info of main effects
-SSblock = SS of basic and induced effects

Fractional Design
-DO NOT CALCULATE MAIN EFFECT AS BEFORE
-defining relation: I = ABC (generator)
-aliases: A·I = A·ABC = A²BC → A = BC
-joint effect (of A and BC): $l_A \rightarrow A + BC$
-if aliased with high order interaction, the joint effect is attributed to the main effect (high order interaction assumed negligible)
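Aliases in a fractional design follow from multiplying an effect by the defining word and cancelling squared letters. A minimal sketch, assuming effects are written as strings of upper-case factor letters and `I` denotes the identity (the function name `multiply` is hypothetical):

```python
def multiply(word1, word2):
    """Product of two effect words, with squared letters cancelling (A*A = I)."""
    letters1 = set(word1.replace("I", ""))  # "I" is the identity, not a factor
    letters2 = set(word2.replace("I", ""))
    product = letters1 ^ letters2  # letters appearing in exactly one word
    return "".join(sorted(product)) or "I"
```

For the defining relation I = ABC, `multiply("A", "ABC")` gives `"BC"`, so A is aliased with BC. Multiplying two generators such as ABD and ACE gives the induced word BCDE.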

Taguchi's Method
-determines optimum control parameters for levels of uncontrollable factors
-emphasizes reduction of variation
-separates controllable and uncontrollable factors
Steps
1. Decide on controllable and uncontrollable factors and their levels
2. Select appropriate array and the table

Response Surface Method
-response $y = f(x_1, x_2) + \epsilon$
-f determined by least square method
-optimize response using f
Steps
1. Fit a response surface function to the data
-use coded variables for 2^k design, $x = \frac{\xi - (\xi_{high} + \xi_{low})/2}{(\xi_{high} - \xi_{low})/2}$
2. If input far from optimum, use linear model to rapidly estimate the optimum point
-steepest ascent/descent, $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
-find gradient vector, $\nabla\hat{y} = (\beta_1, \beta_2)$
3. Determine step size Δ based on β1/β2, translate to natural variable system and carry out experiment for n observations until the observations no longer increase/decrease
4. Carry out 2^k experiment to determine new β1 and β2
-if |β1| and |β2| are large, decide new β1/β2
-if |β1| and |β2| are small, go to 2nd order search
5. 2nd order search, get data from CCD
Central Composite Design
-2^k factorial design
-2k axial observations
-axial spacing, $\alpha = (2^k)^{1/4}$ (rotatable design)
-c center-point observations (3~5)
-total observations, 2^k + 2k + c
6. Find optimum point
$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2$
Setting $\partial\hat{y}/\partial x_1 = \partial\hat{y}/\partial x_2 = 0$ gives, in matrix form,
$\begin{bmatrix} 2\beta_{11} & \beta_{12} \\ \beta_{12} & 2\beta_{22} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -\beta_1 \\ -\beta_2 \end{bmatrix}$
AX = B → X = A⁻¹B
7. Check stationary point
-max, min, saddle, reflexive
-by contour/surface plot or canonical analysis
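For a fitted two-variable second-order model, the stationary point X = A⁻¹B and its type can be sketched as below. The function name and argument order are hypothetical. The classification uses the sign of det(A) and of 2β11, a simple second-derivative test standing in for a full canonical analysis.

```python
def stationary_point(b1, b2, b11, b22, b12):
    """Solve A X = B for y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2."""
    # A holds the second derivatives; B is the negated linear coefficients
    a11, a12, a21, a22 = 2 * b11, b12, b12, 2 * b22
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system: no unique stationary point")
    r1, r2 = -b1, -b2
    # Cramer's rule for the 2x2 system
    x1 = (r1 * a22 - a12 * r2) / det
    x2 = (a11 * r2 - a21 * r1) / det
    if det < 0:
        kind = "saddle"
    elif a11 < 0:
        kind = "maximum"
    else:
        kind = "minimum"
    return (x1, x2), kind
```

For ŷ = 10 + 2x1 + 4x2 - x1² - 2x2², the sketch returns the point (1, 1) and labels it a maximum. The coded result still has to be translated back to natural variables.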

Taguchi's Method (steps 2 to 5)
-orthogonal array used
3. Calculate row average and row SN for each run
-nominal the best: $SN = 10\log_{10}\left(\frac{\bar{y}^2}{s^2}\right)$
-smaller the better: $SN = -10\log_{10}\left(\frac{1}{n}\sum_{i=1}^{n} y_i^2\right)$
-larger the better: $SN = -10\log_{10}\left(\frac{1}{n}\sum_{i=1}^{n} \frac{1}{y_i^2}\right)$
-larger SN → smaller variation
4. Use (a) primitive method, (b) marginal graph method OR (c) empirical optimization to decide best combination
(a) max. SN, then optimum yield
(b) marginal graph for SN and then yield
(c) if marginal graphs for SN and yield contradict each other, work out which are control factors (sig. for SN) → highest SN, then which are signal factors (insig. for SN, sig. for yield) → opt. yield, then which are other factors (insig. for SN and yield) → based on other considerations
5. If best run not included, do a confirmation test on it with n observations...
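The row signal-to-noise calculation and the primitive "max. SN" choice can be sketched as follows. The three SN definitions are the commonly used Taguchi forms, assumed here because the formulas in the source are not legible, and all function names are hypothetical.

```python
import math
import statistics


def sn_larger_the_better(ys):
    """SN = -10 log10( mean(1 / y^2) )."""
    return -10 * math.log10(statistics.fmean(1 / (y * y) for y in ys))


def sn_smaller_the_better(ys):
    """SN = -10 log10( mean(y^2) )."""
    return -10 * math.log10(statistics.fmean(y * y for y in ys))


def sn_nominal_the_best(ys):
    """SN = 10 log10( ybar^2 / s^2 )."""
    return 10 * math.log10(statistics.fmean(ys) ** 2 / statistics.variance(ys))


def best_run(rows, sn=sn_larger_the_better):
    """Primitive method: index of the run (row) with the highest SN."""
    return max(range(len(rows)), key=lambda i: sn(rows[i]))
```

Each row holds the observations of one run of the orthogonal array. The run returned by `best_run` is only a starting point when the marginal graphs for SN and yield disagree.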