May 19 SPSS Commands

SPSS

Week 1: 

 

·      Data cleaning 

·      Visualizing categorical ordinal data in a Frequency Table or Bar Chart: 

o   Analyze - - >  Descriptive - - >  Frequencies 

o   Analyze - - >  Descriptive - - >  Frequencies - - >  Charts - - >  Bar Chart 

·      Visualizing numerical continuous data in a Frequency Table or Histogram: 

o   Analyze - - >  Descriptive - - >  Frequencies - - >  Statistics (choose Mean, Median, Mode, Quartiles, St Deviation, Min, Max, Range) 

o   Analyze - - >  Descriptive - - >  Frequencies - - >  Charts - - >  Histogram

·      Visualizing numerical data in a Box Plot: 

o   Graphs - - >  Boxplot - - >  Simple - - >  add variable of interest (Y) into Variables - - >  add group variable into Category axis - - >  OK

 

Week 2: 

 

·      Confidence Intervals:                 

o   Analyze - - >  Descriptive Statistics - - >  Explore - - >  put variable in Dependent list - - >  Statistics 

 

Week 3: 

 

·      One Sample T-test (means): 

o   Analyze - - >  Compare means - - >  One sample t-test - - >  add variable of interest - - >  add in the known test value - - >  OK

§  Output: one table for Descriptive Stats, one table for t-test
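The one-sample t-test SPSS runs here can be sketched by hand; a minimal Python sketch with made-up sample data and test value (SPSS reports the same mean, SD, t, and df in its two output tables):

```python
# Hand computation of a one-sample t-test, mirroring the SPSS output
# (Analyze -> Compare Means -> One-Sample T Test).
# The sample data and test value below are hypothetical.
import math
import statistics

sample = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3, 5.2, 4.7]  # hypothetical measurements
test_value = 5.0                                    # known value to compare against

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)    # sample SD (n-1 denominator), as SPSS reports
se = sd / math.sqrt(n)           # standard error of the mean
t = (mean - test_value) / se     # t statistic with n-1 degrees of freedom

print(f"mean={mean:.3f}, SD={sd:.3f}, t={t:.3f}, df={n - 1}")
```

SPSS then compares t against the t distribution with n-1 df to produce the p-value.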

·      One Sample Chi-Squared (x^2) test (proportions):

o   Analyze - - >  Non-parametric tests - - >  Chi-square 

o   If an expected split is suggested: Recall button - - >  Chi-square - - >  enter the expected proportions under 'Expected Values' (if the expected split is 80/20, type 80 and then 20) - - >  OK
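The statistic behind this menu path can be checked by hand; a sketch with hypothetical observed counts and the 80/20 expected split from the example above:

```python
# Hand computation of the one-sample chi-square statistic, mirroring
# Analyze -> Nonparametric Tests -> Chi-square with Expected Values 80 and 20.
# The observed counts are hypothetical.
observed = [70, 30]          # hypothetical counts in each category
expected_props = [0.8, 0.2]  # the 80/20 split entered under 'Expected Values'

total = sum(observed)
expected = [p * total for p in expected_props]          # [80.0, 20.0]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square = {chi_sq:.2f}, df = {len(observed) - 1}")
```

SPSS compares chi_sq against the chi-square distribution with (categories - 1) df.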

 

 

Week 4: 

·      Checking normality (numerical continuous): 

o   Analyze - - >  Non-parametric tests - - >  Legacy Dialogs - - >  1 Sample K-S - - >  enter variables - - >  OK 

·      Independent Samples T-test: 

o   Check suitability: Data - - >  Split file - - >  Split file by groups (ie Gender) - - >  OK

§  SPSS can now show us frequencies for each gender separately

§  Recall - - >  Frequencies OR Analyze - - >  Descriptive Stats - - >  Frequencies - - >  Charts - - >  Histogram 

§  Re-unite the data after checking suitability: Recall - - >  Split File - - >  Analyze all cases

o   Independent samples t-test: Analyze - - >  Compare means - - >  Independent samples t test - - >  add variable of interest and grouping variable - - >  Define Groups (use values that gender has been coded ie 0,1) - - >  Continue - - >  OK

·      Paired Samples T-test: 

o   Check suitability: 

§  Create a new variable for the difference between two variables: Transform - - >  Compute Variable - - >  give the new variable a name in Target Variable - - >  Add the two variables in Numeric Expression separated by a subtract (-) sign - - >  OK

§  Check suitability of new variable: Recall - - >  Frequencies OR Analyze - - >  Descriptive Stats - - >  Frequencies - - >  add new variable - - >  Charts - - >  histogram - - >  OK

o   Paired Samples T-test: Analyze - - >  Compare Means - - >  Paired Samples T-test - - >  add variables of interest into Paired Variables - - >  OK
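The two steps above (compute the difference variable, then test it against 0) can be sketched numerically; the before/after scores below are hypothetical:

```python
# Paired-samples t-test computed by hand: take the per-subject difference
# (as in Transform -> Compute Variable) and run a one-sample t-test on it
# against 0. Scores are hypothetical before/after values.
import math
import statistics

before = [120, 132, 118, 125, 140, 126]
after  = [115, 130, 117, 120, 135, 124]

diffs = [b - a for b, a in zip(before, after)]   # the new difference variable
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)
t = mean_d / (sd_d / math.sqrt(n))               # t with n-1 df
print(f"mean difference={mean_d:.3f}, t={t:.3f}, df={n - 1}")
```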

 

Non-Parametric Chi-squared test:

·      One Sample Chi Square test:

o    Analyze - - >  non-parametric tests - - >  Legacy Dialogs - - >  Chi-square

·      Pearson’s Chi-Square test: 

o   Analyze - - >  Descriptive Stats - - >  Crosstabs - - >  add variable to row and column - - >  Statistics - - >  Chi-square - - >  Cells - - >  Observed, Column - - >  OK

§  Can also do Observed, Row to look at inversely 

§  Cells - - >  untick Observed, tick Expected - - >  OK

·      McNemar Chi-Square test: 

o   Crosstabs - - >  add in variables to Rows and Columns - - >  Statistics - - >  McNemar - - >  Cells - - >  Observed, Total - - >  OK
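The classic McNemar statistic behind this procedure uses only the discordant cells of the paired 2x2 table; a sketch with hypothetical counts (note SPSS itself switches to an exact binomial test when discordant counts are small):

```python
# Classic McNemar chi-square for a paired 2x2 table.
# Hypothetical table:
#              After: yes   After: no
# Before yes      a=30        b=12
# Before no       c=5         d=20
b, c = 12, 5                       # discordant pairs only
mcnemar = (b - c) ** 2 / (b + c)   # chi-square with 1 df
print(f"McNemar chi-square = {mcnemar:.3f} (df = 1)")
```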

 

 

Week 5: 

·      Wilcoxon Signed Rank Test: 

o   Check suitability: Analyze - - >  Descriptive Stats - - >  Frequencies - - >  Statistics, Charts - - >  OK

§  Non-normal distribution - - >  use non-parametric test, equality of medians

o   Analyze - - >  Non-parametric tests - - >  One sample - - >  Fields - - >  Add variable of interest - - >  Settings - - >  Customize tests - - >  compare median (enter hypothesized median) - - >  Run 

·      Mann-Whitney U Test (Wilcoxon Sum Rank)

o   Data - - >  Split file (ie based on gender) 

o   Check suitability: Analyze - - >  Descriptive stats - - >  Frequencies - - >  Charts (histogram) - - >  OK

o   Recall button to unsplit file - - >  Analyze all cases 

o   Analyze - - >  Non parametric test - - >  Independent samples - - >  Fields - - >  Add variable of interest and grouping variable - - >  Settings - - >  Customized tests - - >  Mann Whitney - - >  Run 

·      Wilcoxon Matched-Pair Signed Rank Test: 

o   Analyze - - >  Non parametric Tests - - >  Related Samples - - >  Fields - - >  Add in matched pair variables - - >  Settings - - >  Customize tests - - >  Wilcoxon Matched-pair signed rank (2 samples) - - >  Run

 

Exact Test (assumptions violated)

·      One Sample Chi square test: Analyze - - >  Non parametric Test - - >  Legacy Dialogs - - >  Chi-square - - >  add variable of interest - - >  check coding and add values - - >  OK

o   If assumptions do not hold: Recall - - >  Chi-square - - >  Exact - - >  Exact test - - >  OK

·      Pearson’s Chi-Square test: 

o   Create crosstabs (ie gender): Analyze - - >  Descriptive Stats - - >  Crosstabs - - >  add variables of interest into Row and Column - - >  OK

o   Pearson Chi-Square test: 

§  In crosstabs - - >  Statistics - - >  Chi-square - - >  Cells - - >  Observed, Column - - >  OK

§  If assumptions do not hold: refer to Fisher’s exact test in output 

·      McNemar Chi Square test (Binomial Test): 

o   Crosstabs - - >  add variables to Row and Column - - >  Statistics - - >  McNemar

§  Interpret McNemar Bowker test 

 

📌 KEY REMINDERS:

  • Use t-tests when data is continuous and normally distributed
  • Use Wilcoxon/Mann-Whitney when data is not normal or ordinal
  • Use χ²/Fisher/McNemar for categorical data (like counts, yes/no)
  • Paired = same people; independent = different people

 

Week 6: 

 

·      Scatter plot: 

o   Graphs - - >  Scatter/Dot - - >  Simple scatter - - >  enter X and Y - - >  Label cases by “ID” - - >  OK 

§  Double click to add line of best fit 

·      Correlation Coefficient: 

o   Pearson's CC: 

§  Analyze - - >  Correlate - - >  Bivariate - - >  enter variables - - >  Pearson’s - - >  OK 

o   Spearman’s CC: 

§  Check suitability of data: Analyze - - >  Descriptive stats - - >  Frequencies - - >  St Dev, min, max, mean, median, mode - - >  Charts - - >  Histogram 

§  Analyze - - >  Correlate - - >  Bivariate - - >  enter variables - - >  Spearman - - >  OK

·      Linear Regression Model: 

o   Analyze - - >  Regression - - >  Linear - - >  add IV and DV, Case Labels ID - - >  Statistics - - >  Estimates, Confidence Intervals - - >  Continue 

·      Predicting a variable:

o   Add value into data set 

o   Analyze - - >  Regression - - >  Linear - - >  add in IV and DV - - >  Save - - >  Unstandardized, Prediction Intervals (mean for a group, individual for a given person) - - >  Continue - - >  OK

·      Dummy Variables (when a categorical variable has more than 2 levels): 

o   Transform - - >  Recode into Different Variables - - >  new variable name and code - - >  old and new values - - >  code 0s and 1s - - >  Change - - >  repeat for each new variable (n-1 dummies for n categories)
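The n-1 dummy-coding rule can be sketched outside SPSS; a minimal example with a hypothetical 3-level variable (reference category "A", so two dummies are created):

```python
# Sketch of n-1 dummy coding for a 3-level categorical variable, mirroring
# Transform -> Recode into Different Variables. Labels are hypothetical.
data = ["A", "C", "B", "A", "C"]   # reference category: "A"

# One dummy per non-reference level (n-1 = 2 new variables)
dummy_b = [1 if x == "B" else 0 for x in data]
dummy_c = [1 if x == "C" else 0 for x in data]
print(dummy_b)  # [0, 0, 1, 0, 0]
print(dummy_c)  # [0, 1, 0, 0, 1]
```

A case with 0 on both dummies is the reference category; this is why only n-1 dummies are needed.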

·      Linear Regression with Dummy Variables: 

o   Analyze - - >  Regression - - >  Linear - - >  Add IV and DVs , Case label - - >  Statistics - - >  Estimates, Confidence Intervals - - >  Continue 

 

 

Week 7: 

 

·      Multiple Linear Regression: 

o   Analyze - - >  Regression - - >  Linear - - >  add DV and IVs - - >  Statistics, Confidence Intervals - - >  OK

·      R squared: 

o   Analyze - - >  Regression - - >  Linear 

·      Checking assumptions for multiple linear regression model: 

o   Analyze - - >  Regression - - >  Linear - - >  add DV and IVs - - >  Plots - - >  Histogram, Normal probability plot, Produce all partial plots, ZRESID (Y), ZPRED (X) - - >  ok

·      Predicting a value: 

o   Enter data 

o   Analyze - - >  Regression - - >  Linear - - >  Save - - >  Unstandardized, Individual - - >  OK 

o   View predicted value and 95% CI (low and upper limit values) in the dataset

 

 

Week 8: 

·      Baron & Kenny Steps: 

o   Steps 1-2 (Simple Linear Regression)

o   Steps 3-4 (Multiple Linear Regression)

 

Week 9: 

·      Creating an interaction term: 

o   Transform - - >  Compute variable - - >  write name of interaction term in Target variable - - >  enter (___ * ___ ) in Numeric Expression - - >  OK
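The cross-product that Compute Variable builds is simple arithmetic; a sketch with hypothetical values and a binary moderator:

```python
# Creating an interaction (cross-product) term by hand, mirroring
# Transform -> Compute Variable with (x1 * z) in Numeric Expression.
# Values are hypothetical.
x1 = [2.0, 4.0, 6.0]
z  = [0, 1, 1]    # e.g. a binary moderator

interaction = [a * b for a, b in zip(x1, z)]
print(interaction)  # [0.0, 4.0, 6.0]
```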

·      Estimating the Interaction Effect: 

o   Analyze - - >  Regression - - >  Linear - - >  add DV and IV (and interaction term)

·      Categorical variables with more than two categories: 

o   Recoding into Dummy variables: 

§  Transform - - >  Recode into Different Variables 

o   Create interaction terms

o   Run multiple linear regression with DV and IV’s (including interaction terms)

·      Sorting to find Outliers: 

o   Sort - - >  Ascending

o   Descriptives - - >  min, max

o   Graphs - - >  Regression Variable Plot - - >  DV in vertical axis, IV in horizontal axis, label

·      Run a regression with and without outlier

o   Select cases - - >  remove outlier - - >  re-run regression 

o   DFBETA and DFFIT: 

§  Analyze - - >  Regression - - >  Linear - - >  Add DV and IV - - >  Save - - >  Influence Statistics - - >  Standardized DBETA, Standardized DFFIT - - >  OK

 

 

Week 10: 

·      Contingency table: 

o   Analyze - - >  Descriptive stats - - >  Crosstabs - - >  add variable of interest (DV or outcome) into Columns, add IV to Rows - - >  Cells - - >  Observed, Column - - >  OK

·      Pearson’s Chi Square: 

o   Recall - - >  Crosstabs - - >  Statistics - - >  change Observed to Expected cell counts - - >  OK

·      Risk: 

o   Create a contingency table - - >  Statistics - - >  Risk - - >  OK
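The Risk output can be reproduced by hand from the 2x2 table; a sketch with hypothetical counts:

```python
# Risk, relative risk, and odds ratio from a 2x2 contingency table,
# matching what SPSS reports under Crosstabs -> Statistics -> Risk.
# Hypothetical counts:
#              Outcome: yes   Outcome: no
# Exposed          a=20          b=80
# Unexposed        c=10          d=90
a, b, c, d = 20, 80, 10, 90

risk_exposed   = a / (a + b)                     # 0.20
risk_unexposed = c / (c + d)                     # 0.10
relative_risk  = risk_exposed / risk_unexposed   # 2.0
odds_ratio     = (a * d) / (b * c)               # 2.25
print(relative_risk, odds_ratio)
```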

·      Binary Logistic Regression: 

o   Analyze - - >  Regression - - >  Binary Logistic - - >  add DV and covariate - - >  Categorical - - >  add categorical covariate - - >  Reference Category - - >  “First” 

o   Options - - >  CI for exp - - >  Continue - - >  OK

o   Exp(B) is the odds ratio 

o   Nagelkerke R squared is the % of variation that can be explained by the model 

·      Goodness of Fit: 

o   (1) Classification table: 

§  Binary logistic regression - - >  Categorical variables - - >  Options - - >  Classification plots - - >  classification cut-off (usually 0.5) - - >  OK

·      Shows sensitivity (true positives) and specificity (true negatives)

o   (2) Hosmer and Lemeshow (only with multiple predictors): 

§  Binary Logistic Regression - - >  Options - - >  Classification Plots, Hosmer Lemeshow goodness of fit - - >  OK

·      Non-significant p-value (> 0.05) = good fit 

 

 

≤ 4 categories = treat as categorical data; ≥ 5 categories = treat as continuous data 

Symmetrical – report on mean and SD

Skewed – report on median and min/max and IQR

 

Type I error = false positive 

Type II error = false negative 

 

Power = probability of rejecting a false null

 

If the CI (for a difference) includes 0, the result is not significant

 

Correlation – direction (+/-) and magnitude [-1,1]

 

Pearson’s Correlation Coefficient (parametric/normal) 

Spearman’s Correlation Coefficient (nonparametric/skewed) – measures the monotonic relationship 

 

R: degree of simple correlation

R^2 (goodness of fit): % of variation explained, does not indicate causation

R^2 adjusted: better indicator, higher one should be selected 

 

Simple Linear Regression 

Y = B0 + B1x + e

 

Homoscedasticity: variance of the errors is constant (equally spread across values of the IV) 

 

Multiple Linear Regression

Y = B0 + B1x1 + B2x2 + e

Fits a regression plane

 

Confounder: causes both IV and DV 

 

Assumptions for Linearity: 

1.        The relationship between the DV and each continuous IV is linear.

2.        Residuals (error terms) should be normally distributed. (histogram)

3.        Homoscedasticity: stability in variance of residuals. (ZRESID and ZPRED)

4.        Independent observations

 

Mediator: third variable (x2), explains a portion of association

C’ : direct effect 

a*b : indirect effect (mediated effect)

c : total effect = c’ + (a*b) 
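The identity c = c' + (a*b) can be checked with a tiny numeric example; the path coefficients below are made up for illustration:

```python
# Numeric check of the mediation identity c = c' + (a*b),
# with hypothetical path coefficients.
a_path  = 0.50   # x1 -> M
b_path  = 0.40   # M -> Y (controlling for x1)
c_prime = 0.30   # direct effect of x1 on Y (controlling for M)

indirect = a_path * b_path   # mediated effect a*b
total = c_prime + indirect   # total effect c
print(f"indirect={indirect:.2f}, total effect c={total:.2f}")
```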

 

path a: M=B0 + B1x1 + e

path b: Y=B0 + B2M + B3x1 + e

path c': c' = B3 (direct effect of x1, controlling for M)

 

Baron & Kenny Steps: 

1.        Test path c (x1 - - > Y) using simple linear regression to get c (the total effect) 

2.        Test path a (x1 - -> M) using simple linear regression to get B1

3.        Test path b (M - - > Y, controlling for x1) using multiple linear regression to get B2

4.        Test path c’ (x1 - -> Y, controlling for M) using multiple linear regression to get B3. 

 

Step 1 – not essential for establishing mediation

Steps 2+3 – essential 

 

Complete mediation: c' is not significantly different from 0 (p > 0.05), i.e. no association between x1 and Y when controlling for M 

 

Partial mediation: B3 is significantly different from 0 (p < 0.05) and c’ is smaller than c

 

Sobel Test of indirect effect: based on a z statistic. If |z| > 1.96, reject the null that the indirect effect is 0
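The Sobel z combines the path a and path b estimates with their standard errors; a sketch with made-up numbers (the formula z = a*b / sqrt(b²·SEa² + a²·SEb²) is the standard Sobel form):

```python
# Sobel test of the indirect effect:
# z = a*b / sqrt(b^2*SEa^2 + a^2*SEb^2)
# Coefficients and standard errors are hypothetical.
import math

a, se_a = 0.50, 0.10   # path a estimate and its standard error
b, se_b = 0.40, 0.12   # path b estimate and its standard error

z = (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
print(f"Sobel z = {z:.3f}")  # reject H0 of no indirect effect if |z| > 1.96
```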

 

Modifier (moderator) (z): has an interaction effect on assoc between y and x1. 

 

Establishing moderation: 

1.        Create a new variable (interaction term/cross product) (x1 * z) 

2.        Add new variable to linear regression model 

a.        Y = B0 + B1x1 + B2z + B3(x1*z)

3.        Test coefficient B3 

 

B1:  effect of x1 on Y when z=0

B2: effect of z on Y when x1=0

B3: difference of effect of x1 on Y by levels of z

 

Effect of x1 = b1 + (b3 * z)

Effect of z = b2 + (b3 * x1)
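The simple-slope formulas above are plain arithmetic once the coefficients are estimated; a sketch with hypothetical B values:

```python
# Simple-slope arithmetic for moderation: the effect of x1 at a given
# level of z is B1 + B3*z. Coefficients are hypothetical.
b1, b2, b3 = 2.0, 1.5, 0.8

effect_x1_at_z0 = b1 + b3 * 0   # effect of x1 when z = 0
effect_x1_at_z1 = b1 + b3 * 1   # effect of x1 when z = 1
print(effect_x1_at_z0, effect_x1_at_z1)
```

If B3 = 0 the two slopes coincide, i.e. no moderation.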

 

Odds: ratio of with versus without event (# of times outcome occurs / # of times doesn’t occur)

Risk: probability of occurrence (# of times outcome occurs / total # of possible outcomes)

Relative risk: probability of outcome in exposure versus no exposure

 

Nagelkerke R^2: % of variation explained by the model 

 

Odds = exp(L) 

Probability = exp(L) / (1 + exp(L))
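Converting the logit L from a logistic model into odds and probability follows directly from these two formulas; a sketch with a hypothetical logit value:

```python
# Converting a logit L from binary logistic regression into odds and
# probability: odds = exp(L), p = exp(L) / (1 + exp(L)).
import math

L = 0.0                                  # hypothetical logit (linear predictor)
odds = math.exp(L)                       # odds of the event
prob = math.exp(L) / (1 + math.exp(L))   # probability of the event
print(odds, prob)                        # L = 0 gives odds 1.0, probability 0.5
```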

 

Sensitivity (true positives) = TP / (TP + FN)

 

Specificity (true negatives) = TN / (TN + FP) 
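Both rates come straight from the classification table; a sketch with hypothetical confusion-matrix counts:

```python
# Sensitivity and specificity from a classification table,
# matching the formulas above. Counts are hypothetical.
tp, fn = 45, 5     # true positives, false negatives
tn, fp = 80, 20    # true negatives, false positives

sensitivity = tp / (tp + fn)   # proportion of actual positives detected
specificity = tn / (tn + fp)   # proportion of actual negatives detected
print(sensitivity, specificity)
```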

 

Hosmer & Lemeshow test for goodness of fit, produces a chi-square statistic

                  Nonsignificant (P >0.05) means GOOD FIT 

 

 
