JMP help sheet
Course: Statistics
Institution: Copenhagen Business School

Summary

A guide on how to use JMP for the Statistics course.


Description

JMP

- Histogram : Analyze -> Distribution -> Desired Column as the Y-Variable -> Distribution header -> Stack

- Confidence Intervals : Header -> Confidence Interval -> Alpha Level

- Barplot : Start with a histogram -> right click the header -> Histogram -> Separate Bars

- Count axis : Histogram Options -> Show Percent

- Ordering a barplot : Right click on the variable heading -> Value Ordering -> with that you can build a Pareto plot

- Another way of creating a Pareto plot : Analyze -> Quality and Process -> Pareto Plot

- For adding a probability axis : Header -> Histogram Options -> Density Axis

- Ordering columns : Right click on the column name -> Sort ; Ascending means starting with the first ( lowest ) value , Descending with the last ( highest )

- Exclude/Include : Select Rows which should get excluded/included -> Right Click on a selected row and choose Exclude/Include

- Histogram stratified by a variable : Analyze -> Distribution -> variable for the histogram in Y -> variable which separates the histograms in "By"

- Correlation : There are many ways to do it , one of them ( correlation matrix ) is -> Analyze -> Multivariate Methods -> Multivariate

1

- Bivariate/Oneway/Contingency/Logistic : Analyze -> Fit Y by X

- Overlay : Graph -> Overlay Plot -> either Bivariate or Oneway is possible -> the Group By variable decides how they are separated -> Overlay header -> Overlay Groups

- Proportions Test : Analyse -> Distribution -> Qualitative variable on Y -> Quantitative variable on Freq -> Turn it into a barplot -> Header -> Test probabilities
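JMP does this from the menus ; as a sketch of the arithmetic behind "Test Probabilities" , here is the Pearson goodness-of-fit statistic computed by hand in Python ( the counts and hypothesised probabilities are made up ):

```python
# Pearson chi-squared goodness-of-fit statistic, as a by-hand
# illustration of what a proportions test computes.
# Counts and hypothesised probabilities below are made up.

def chi2_goodness_of_fit(observed, probs):
    """Return the Pearson chi-squared statistic and its df."""
    n = sum(observed)
    expected = [p * n for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

observed = [18, 55, 27]        # counts per category
probs = [0.25, 0.50, 0.25]     # hypothesised probabilities
stat, df = chi2_goodness_of_fit(observed, probs)
print(f"chi2 = {stat:.3f}, df = {df}")
```

Compare the statistic against a chi-squared distribution with df = number of categories minus one.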

- Confidence Interval for proportions : Header -> Confidence Interval -> Alpha Level

- Two sample t-test : Fit Y by X -> qualitative variable as X -> quantitative variable as Y -> Oneway -> Header -> t Test ( without assuming equal variances ) or Means/Anova/Pooled t ( assuming equal variances )
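As a sketch of the difference between the two t-test options above , the statistics can be computed by hand ( Python , made-up samples ):

```python
# Pooled t-test (assumes equal variances) vs Welch t-test
# (does not assume equal variances). Sample data are made up.
from statistics import mean, variance

def pooled_t(x, y):
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5
    return t, nx + ny - 2            # t statistic and its df

def welch_t(x, y):
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df                     # Welch-Satterthwaite df

group_a = [5.1, 4.8, 5.5, 5.0, 4.9]
group_b = [4.2, 4.6, 4.1, 4.4]
print(pooled_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

Note that the Welch df is usually fractional and smaller than the pooled df.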

- Pearson Chi-Squared test : Fit Y by X -> Both variables qualitative -> Mosaic Plot -> Look at the Pearson value at the bottom of the output -> Df in the row above
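The Pearson value JMP reports can be reproduced by hand from the contingency table ( Python sketch , made-up counts ):

```python
def pearson_chi2(table):
    """Pearson chi-squared statistic and df for a contingency
    table given as a list of rows of counts."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = sum((table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
               / (row_tot[i] * col_tot[j] / n)
               for i in range(len(table)) for j in range(len(table[0])))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Made-up 2x2 table of counts
stat, df = pearson_chi2([[20, 30], [40, 10]])
print(f"chi2 = {stat:.3f}, df = {df}")
```

The df is ( rows - 1 ) x ( columns - 1 ) , matching the "Df in the row above" remark.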

- Excluding single data: Right click on value in the scatterplot and choose exclude -> To fit the regression line again -> Header -> Script -> Redo analysis

2

- Predicted values from the regression : create a scatterplot -> Fit Line -> right click on the Linear Fit box header -> Save Predicteds

- Prediction Limits : Right click on the Linear Fit box header -> Indiv Confidence Limit Formula ( used for single observations )

- Mean Limits : Right click on the Linear Fit box header -> Mean Confidence Limit Formula ( used for means )

- Change linear regression into polynomial regression : Right click on the Bivariate header -> Fit Polynomial -> choose the power

- Check if the linear regression model fits the data : Linear Fit header -> Plot Residuals -> check whether the residuals have a mean of zero and are normally distributed

- Alternative way of plotting the residuals : Linear Fit header -> Save Residuals -> Graph -> Graph Builder -> plot residuals (Y) against predicted values (X) -> inspect for signs of systematic behaviour

• To see if there is a relationship between the dependent and independent variable , look at the correlation , which is the square root of R^2. It is important to decide which sign the correlation has , so look at the fitted line : the slope has the same sign as the correlation

• To see if the effect of age ( in this example ) is statistically different from zero , look at the p-value of the slope ( t-distributed , with 26 df here ) ; since it is clearly below 5 % , we conclude that there is an effect
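The quantities above ( slope , residuals averaging zero , and correlation = sign of the slope times the square root of R^2 ) can be sketched by hand ( Python , made-up data ):

```python
# Least-squares fit by hand: slope, intercept, residuals and R^2.
# Data are made up for illustration.
from statistics import mean

def fit_line(x, y):
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    syy = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - sum(e ** 2 for e in resid) / syy
    return slope, intercept, resid, r2

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
slope, intercept, resid, r2 = fit_line(x, y)
corr = (1 if slope > 0 else -1) * r2 ** 0.5   # same sign as the slope
print(slope, intercept, r2, corr)
```

For a least-squares fit the residuals always sum to zero , so any pattern seen in the residual plot is about their spread and shape , not their average.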

- Multiple Regression : Analyze -> Fit Model -> response variable to Y -> explanatory variables as model effects ( when an interaction effect is desired -> mark the columns of the explanatory variables -> Macros -> Full Factorial ) -> Emphasis -> Minimal Report

• The F-Ratio and the p-value below it show whether the parameters are jointly significant ( at least one of them ) ; here df = 2,76

• The single parameters are tested below with t-tests

• Confidence intervals of the parameters : Response header -> Regression Reports -> Show All Confidence Intervals

- Check the model : Response header -> Row Diagnostics -> Plot Residual by Row

- Alternative : Response header -> Save Columns -> Studentized Residuals -> do the same with the predicted values ( also available under Save Columns ) -> Graph -> Graph Builder -> plot residuals against predicted values

- QQ-Plot : Analyze -> Distribution -> residuals as the Y variable -> Header -> Normal Quantile Plot

- Predicted values and CI for multiple regression : response header from the Fit model output -> save columns -> Prediction Formula ( for the predicted values from the regression ) -> also to get Prediction Intervals -> response header -> Indiv Confidence Limit formula

- Significant effect : Same procedure as with the single-parameter regression ; to see if one independent variable has an effect , look at its p-value ( better use the Indicator Function Parameterisation ). In this case , with df = 25 , the p-value is significant for occupation

- Interaction effect : To see whether the effect of an independent variable on the dependent variable differs between groups , include the interaction effect. Here the p-value of the interaction effect is not significant , which leads to the conclusion that the effect of age on blood pressure does not differ between the groups

- Oneway ANOVA ( one quantitative / one qualitative variable ) : Analyze -> Fit Y by X -> qualitative variable for X -> quantitative variable for Y -> Oneway header -> Means/Anova

- The F-Ratio and the related p-value test for a significant difference between the groups ( the levels of the qualitative variable )

- Degrees of freedom : (3,48)
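The F-Ratio and its df can be reproduced by hand ( Python sketch , made-up groups ):

```python
# One-way ANOVA by hand: between- and within-group sums of
# squares give the F-ratio and its (df1, df2). Data are made up.
from statistics import mean

def oneway_anova(groups):
    """F-ratio and (df1, df2) for a one-way ANOVA."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, (k - 1, n - k)

groups = [[3.1, 2.9, 3.4], [4.0, 4.2, 3.8], [2.5, 2.7, 2.4]]
f, df = oneway_anova(groups)
print(f, df)
```

df1 is the number of groups minus one , df2 the total number of observations minus the number of groups ; hence the (3,48) above corresponds to 4 groups and 52 observations.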

4

- Check for variance homogeneity ( same standard deviation in the four groups ) : Oneway header -> Unequal Variances. The Levene test statistic is 0,7495 with df (3,48) , which gives a non-significant p-value -> no evidence of different standard deviations
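As a sketch of what the Levene statistic is ( under the common formulation , which may differ in details from JMP's implementation ) : a one-way ANOVA performed on the absolute deviations of each observation from its group mean ( Python , made-up groups ):

```python
# Levene-type statistic: one-way ANOVA on absolute deviations
# from the group means. Data are made up for illustration.
from statistics import mean

def levene_stat(groups):
    devs = [[abs(x - mean(g)) for x in g] for g in groups]
    n = sum(len(g) for g in devs)
    k = len(devs)
    grand = mean([d for g in devs for d in g])
    ss_b = sum(len(g) * (mean(g) - grand) ** 2 for g in devs)
    ss_w = sum((d - mean(g)) ** 2 for g in devs for d in g)
    return (ss_b / (k - 1)) / (ss_w / (n - k)), (k - 1, n - k)

groups = [[3.1, 2.9, 3.4, 3.0], [4.0, 4.2, 3.8, 4.1], [2.5, 2.7, 2.4, 2.6]]
print(levene_stat(groups))
```

A small statistic ( non-significant p-value ) means no evidence of different spreads , as in the example above.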

- Finding which groups differ from each other : Oneway header -> Compare Means -> All Pairs, Tukey HSD -> the result is at the very bottom of the output. Here there is no evidence that any group differs from another , as one can see from the p-value of each comparison

- If / else in JMP : create a new column -> change the Data Type to Character -> choose Formula -> Conditional -> If. Use "" when working with text values

- Twoway ANOVA : Analyze -> Fit Model -> quantitative variable as Y -> qualitative variables as X variables -> ( if an interaction effect is desired -> mark the columns and click Macros -> Full Factorial ) -> Emphasis -> Minimal Report. The degrees of freedom for the F-test : df1 at Effect Tests , df2 at ANOVA -> Error

Marginal and sequential tests give exactly the same conclusion when used correctly.

Marginal ( effect ) test
• Read in any order
• Significant effects stay significant : they do not change if we refit the model
• When one effect is non-significant , drop it and refit the model until only significant effects are included
• The hierarchy does not have to be respected

5

Sequential test
• Read the ANOVA table from the bottom up
• Stop the first time you reach a significant effect
• Respect the hierarchy ! - H0 : no interaction - H0 : no effect of the second factor - H0 : no effect of the first factor
• If you find something significant , you can conclude that this factor is truly significant
• To test the rest of the parameters , each model term has to be tested as if it were entered last into the model , so switch the order so that the term that was first now comes last
• If it is still non-significant we can drop it as well ; now all remaining parameters are significant

Example

Marginal
• Fit the Two-way ANOVA ; the Effect Tests are shown directly
• We do not have to respect any hierarchy when doing the marginal model
• We can directly see that size is significant , and we already know that this is going to stay like this
• Next we drop one non-significant parameter ; we choose the interaction term
• As predicted , "size" is still significant
• Next we drop "sector" because p > 5 %
• Now every model term is significant

Sequential
• Fit the Two-way ANOVA -> Header -> Estimates -> Sequential Tests
• We respect the hierarchy and start by looking at the interaction effect , which is non-significant
• We can go one up and see that "size" is significant , so we stop there and have to refit the model to test the last term
• We do this by treating "sector" as if it were entered last , changing the order
• Now we can see that "sector" is not significant , although it looked like it in the first output
• We drop the non-significant factor "sector" as well
• Now every model term is significant

Important to know : both results are equal ; they are just different ways of getting to the same conclusion !

- For the final model , and at every stage , we can look at the parameter estimates ; we rather use the "Indicator" method , which is found at Header -> Estimates -> "Indicator"

- Confidence intervals can be created by right clicking on the "Indicator" output -> Columns

- The final output ( the slope of the model terms ) equals a two sample t-test with equal variances

- Indicator Function Parameterization is good to use when there is one categorical explanatory variable ; otherwise the regular Parameter Estimates are also fine
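The equivalence claimed above ( the slope of a 0/1 indicator term equals the mean difference , the quantity the pooled two-sample t-test examines ) can be sketched by hand ( Python , made-up groups ):

```python
# With a single 0/1 indicator regressor, the least-squares slope
# is exactly the difference of the two group means.
# Data are made up for illustration.
from statistics import mean

group0 = [5.1, 4.8, 5.5, 5.0]
group1 = [4.2, 4.6, 4.1, 4.4]
x = [0] * len(group0) + [1] * len(group1)   # indicator coding
y = group0 + group1

mx, my = mean(x), mean(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sxy / sxx

print(slope, mean(group1) - mean(group0))   # the two values coincide
```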

6

Logistic Regression: Binary response -> Y variable binary ( values of 0 and 1)

- Analyze -> Fit Model -> Qualitative variable as Y -> explanatory variables as “Construct Model effects”

- When the outcome is binary , change the Personality to "Generalized Linear Model" and choose Binomial as the distribution
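As a rough sketch of what the logistic model estimates ( the log-odds of a 0/1 outcome as a linear function of a covariate ) , here is a minimal fit by plain gradient ascent on the log-likelihood ; the data and learning settings are made up , and JMP's own maximum-likelihood fit is of course more sophisticated:

```python
# Minimal logistic regression via gradient ascent on the
# log-likelihood. Data and learning settings are made up.
import math

def fit_logistic(x, y, steps=20000, lr=0.01):
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1 / (1 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p          # gradient w.r.t. intercept
            g1 += (yi - p) * xi   # gradient w.r.t. slope
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]   # binary response
b0, b1 = fit_logistic(x, y)
print(b0, b1)                   # b1 > 0: odds of y = 1 rise with x
```

A positive slope means the fitted probability of y = 1 increases with x , which is what the significance test on the slope is about.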

- We are only interested in the p-value , which tells us if there is something significant in the model

Which test to use

Numerical response variable :
• 1 group : one sample t-test ( t-distribution )
• 2 groups : 2 sample t-test , whether the standard deviations are equal or not
• more than two groups : oneway ANOVA ( F-distribution )
• 1 covariate : simple linear regression
• more than 1 covariate : multiple linear regression
• covariates + factors : two factors -> Twoway ANOVA ; covariates + factors -> ANCOVA / regression
• paired : paired t-test

Categorical response variable :
• 1 group : test one probability ( normal distribution )
• 2 groups : test the difference in probabilities ; Pearson chi-squared is also possible if required
• more than two groups : Pearson chi-squared
• covariates : logistic regression ( two-category outcome )
• paired : McNemar test

Template for answers concerning tests / CIs :
• Which test are you using
• Hypothesis -> two sided only
• Distribution
• Degrees of freedom
• Standard error
• Test score
• p-value
• Conclusion
- Are the units correct ? … thousands , logs etc.
- Does the conclusion answer the question clearly ?
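The McNemar statistic for paired categorical data is simple enough to compute by hand : it uses only the two discordant counts of the paired 2x2 table ( Python sketch , made-up counts ):

```python
# McNemar statistic for paired binary data: only the discordant
# pair counts b (0->1) and c (1->0) enter the formula.
def mcnemar_stat(b, c):
    """Chi-squared statistic (1 df) for discordant pair counts."""
    return (b - c) ** 2 / (b + c)

# Made-up counts: 15 pairs changed 0->1, 5 pairs changed 1->0
print(mcnemar_stat(15, 5))   # -> 5.0
```

Compare the statistic against a chi-squared distribution with 1 df.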

Things to remember :
• When using regression , have you used the Parameter Estimates and Indicator Estimates correctly ?
• Are the differences in colour still visible when printing in black and white ?
• Conclusions are with 95 % confidence , not 100 % !
• Did you assume equal standard deviations when it was required to do so ?

8...


Similar Free PDFs