### FAQs

What are the most common ways of summarising a variable?

What are the most common measures of variability?

What is a normal distribution?

How do I produce a bar chart of a nominal variable?

How do I produce a histogram for a continuous variable?

I have heard a lot about Scatterplot. What is this and how can I produce it in SPSS?

How do I produce and modify a 2D pie chart in SPSS of a nominal variable?

How can I produce a clustered bar chart of two nominal variables?

How to produce boxplots for several variables on the same plot?

How to produce a stacked bar for several variables on the same plot?

What is an Independent-samples t test (two-sample t test) and when can it be used?

What is paired-samples t test (dependent t test) and when can it be used?

How do I perform a normality test in SPSS?

How do I perform a homogeneity of variance test in SPSS?

How do I perform an Independent-Samples T Test in SPSS?

How do I perform a paired samples t test (Dependent T Test) in SPSS?

I have paired sample data that are not normally distributed. Which test should I use in SPSS?

How do I perform a One-Sample T Test in SPSS?

How do I produce the Pearson or Spearman Rho correlation coefficient in SPSS?

**In terms of statistics what is a scale of measurement? I have heard a lot about nominal, ordinal and interval scale of measurements. What are these?**

A scale of measurement is simply the way you use numbers to record data for analysis. On a nominal scale you assign numbers to categories, e.g. 1=male and 2=female, or rock types coded as 1=sedimentary, 2=metamorphic and 3=igneous. The numbers are just labels and have no real meaning, and their order does not matter. On an ordinal scale the order matters, e.g. 1^{st}, 2^{nd}, 3^{rd}. Ordinal values can also be words such as ‘bad’, ‘medium’ and ‘good’. An interval scale refers to quantitative measurements such as temperature, weight and height. It is worth understanding scales of measurement because they influence the type of analysis that you can do.

**What are the most common ways of summarising a variable?**

It depends on the scale of measurement of the variable. For a nominal variable, a frequency table (showing counts and percentages) and the mode (the most common value) are usually enough. For an ordinal variable, a frequency table, the mode and the median are enough. For an interval variable, most statistical measures can be used.
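As a rough illustration outside SPSS, the summaries above can be sketched in Python using only the standard library. The data values and the helper name `frequency_table` are made up for this example.

```python
from collections import Counter
from statistics import median, mode


def frequency_table(values):
    """Return {category: (count, percent)} for a nominal or ordinal variable."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


# Hypothetical nominal variable: rock type (order has no meaning)
rocks = ["igneous", "sedimentary", "igneous", "metamorphic", "igneous"]
rock_table = frequency_table(rocks)
rock_mode = mode(rocks)  # most common category

# Hypothetical ordinal variable coded 1=bad, 2=medium, 3=good
ratings = [1, 2, 2, 3, 3, 3, 2]
rating_median = median(ratings)  # the median is meaningful because order matters

print(rock_table, rock_mode, rating_median)
```

The median is computed only for the ordinal variable because it relies on the values having a meaningful order.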

**What are the most common measures of variability?**

The most common measures of variability are the range, interquartile range, variance and standard deviation.
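To make these four measures concrete, here is a minimal Python sketch using the standard library. The data values and the function name `variability` are assumptions for illustration. Quartiles can be computed in several ways, so the interquartile range may differ slightly from the value SPSS reports.

```python
from statistics import quantiles, stdev, variance


def variability(values):
    """Return the four common measures of spread for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points (default method)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,                  # interquartile range
        "variance": variance(values),    # sample variance (divides by n - 1)
        "sd": stdev(values),             # sample standard deviation
    }


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
summary = variability(data)
print(summary)
```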

**What is a normal distribution?**

The normal distribution is a theoretical concept symbolised by the familiar bell-shaped curve. It is really a family of distributions, each defined by its mean and standard deviation. It plays an important role in statistical inference. Some statistical procedures in SPSS assume that your data is normally distributed, that is, that your data is taken from a normal population.
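A short Python sketch can show what "a family of distributions" means: each member is fixed by a mean and a standard deviation. The mean of 100 and standard deviation of 15 below are assumed values chosen only for illustration.

```python
from statistics import NormalDist

# One member of the normal family (assumed mean 100, standard deviation 15)
scores = NormalDist(mu=100, sigma=15)

# Proportion of the population within one and two standard deviations of the mean
within_one_sd = scores.cdf(115) - scores.cdf(85)
within_two_sd = scores.cdf(130) - scores.cdf(70)

print(round(within_one_sd, 3), round(within_two_sd, 3))
```

Whatever mean and standard deviation you choose, roughly 68 percent of values fall within one standard deviation of the mean and roughly 95 percent within two.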

**How do I produce a bar chart of a nominal variable?**

A bar chart is the correct graph for a nominal variable. To produce a bar chart of a nominal variable called *Employment Category*, from the menu bar select **Graphs -> Legacy Dialogs -> Bar…**, click on **Simple**, then click on **Define**. From the variables list, select *Employment Category [jobcat]* and click the arrow (**>**) to transfer it under **Category Axis:**. Then click **OK** to generate the graphic.

**How do I produce a histogram for a continuous variable?**

To produce a **Histogram of Current Salary with Normal Curve**, from the menu bar select **Graphs -> Legacy Dialogs -> Histogram…**. From the variables list, select *Current Salary [salary]* and click the arrow (**>**) to transfer it under **Variable:**. Select **Display normal curve** by a single click on the check box. Then click **OK** to generate the graphic.

**I have heard a lot about Scatterplot. What is this and how can I produce it in SPSS?**

A statistical association between two variables is most easily seen in a diagram called a scatterplot. A scatterplot is simply a cloud of points in which each point shows one case's values on the two variables under investigation.

To produce a scatterplot of *variable1* against *variable2*, from the menu bar select **Graphs -> Legacy Dialogs -> Scatter/Dot…**, click on **Simple**, then click on **Define**. From the variables list, select *variable1* and click the arrow (**>**) to transfer it under **Y Axis:**. From the variable list again, select *variable2* and click the arrow (**>**) to transfer it under **X Axis:**. Then click **OK** to generate the graphic.

**How do I produce and modify a 2D pie chart in SPSS of a nominal variable?**

A pie chart is another chart that is suitable for a nominal variable. To produce a 2D pie chart for *variable1* made up of 3 levels, from the menu bar select **Graphs -> Legacy Dialogs -> Pie…**. Select **Summaries for groups of cases** by a single click on the radio button. Click on **Define**. From the variables list, select *variable1* and click the arrow (**>**) to transfer it under **Define Slices by:**. Then click **OK** to generate the graphic.

To modify the chart, double-click on the pie to make it editable (the pie will be displayed in the **Chart Editor** window). In this window select **Elements -> Show Data Labels**. Select **Percent (or Count)** from the displayed dialogue box. Click on the green arrow on the right. Still in the dialogue box, click **Apply** and then **Close**. You will now be back in the Chart Editor window. From the menu bar select **File -> Close**.

**How can I produce a clustered bar chart of two nominal variables?**

From the menu bar select **Graphs -> Legacy Dialogs -> Bar -> Clustered -> Define**. From the variable list transfer *variable1* to **Category Axis:**. From the variable list again transfer *variable2* to **Define Clusters by:**. Then click **OK** to generate the graphic.

**How to produce a boxplot for one continuous (interval) variable by one categorical variable? That is, for each level of the categorical variable I want to see a boxplot on the same plot.**

- From the menu bar select **Graphs -> Legacy Dialogs -> Boxplot…**, click on **Simple**, click on **Summaries for groups of cases**, and click on **Define**.
- From the variables list, select the variable, e.g. *Current Salary [salary]*, and click the arrow (**>**) to transfer it under **Variable:**.
- From the variable list again, select a categorical variable, e.g. *gender*, and click the arrow (**>**) to transfer it under **Category Axis:**.
- Then click **OK** to generate the graphic.

**How to produce boxplots for several variables on the same plot?**

- From the menu bar select **Graphs -> Legacy Dialogs -> Boxplot…**, click on **Simple**, click on **Summaries of separate variables**, and click on **Define**.
- From the variables list, select the variables, e.g. *Current Salary [salary]*, and click the arrow (**>**) to transfer them under **Variable:**. Select and transfer more variables as necessary.
- Then click **OK** to generate the graphic.

**How do I produce survival curves in SPSS?**

If your data is already in SPSS, follow the instructions below to produce survival curves. Note that in the instructions below *variable* means your own variable in the data file. If you don't have a factor (grouping) variable for comparison, ignore instructions 7-10.

1. To run a **Kaplan-Meier Survival Analysis**, from the menus choose **Analyze → Survival → Kaplan-Meier...**
2. Select *variable* as the **Time** variable.
3. Select *variable* as the **Status** variable.
4. Click **Define Event**.
5. Under **Value(s) Indicating Event Has Occurred** type *1* in the text area next to **Single value:**.
6. Click **Continue**.
7. Select *variable* as a **Factor**.
8. Click **Compare Factor**.
9. Select **Log rank**, **Breslow**, and **Tarone-Ware**.
10. Click **Continue**.
11. Click **Options** in the **Kaplan-Meier** dialog box.
12. Select **Quartiles** in the **Statistics** group and **Survival** in the **Plots** group.
13. Click **Continue**.
14. Click **OK** in the **Kaplan-Meier** dialog box.

**How to produce a stacked bar for several variables on the same plot?**

- From the menu bar select **Graphs -> Legacy Dialogs -> Bar…**, click on **Stacked**, click on **Summaries of separate variables**, and click on **Define**.
- From the variables list, select the variables, e.g. *var1*, and click the arrow (**>**) to transfer them under **Bars Represent:**. Select and transfer more variables as necessary. The default statistic is the **mean**. You can change to any statistic by clicking on **Change Statistic...** when it is active. If it is not active, click on any variable under **Bars Represent:**.
- Note that you also need a variable under **Category Axis**, such as gender.
- Then click **OK** to generate the graphic.

**What is an Independent-samples t test (two-sample t test) and when can it be used?**

This is one specific example of a group of tests in statistics known as t tests. It is used to compare the means of one variable for two groups of cases. As an example, a practical application would be to find out the effect of a new drug on blood pressure. Patients with high blood pressure would be randomly assigned to two groups, a placebo (control) group and a treatment (experimental) group. The placebo group would receive conventional treatment while the treatment group would receive a new drug that is expected to lower blood pressure. After treatment for a couple of months, the two-sample t test is used to compare the average blood pressure of the two groups. Note that each patient is measured once and belongs to one group. You use this test for normally distributed data and when the variances of the two groups are equal.
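To show what the test computes, the sketch below works out the equal-variance (pooled) t statistic by hand in Python. The blood pressure readings and the function name `independent_t` are hypothetical. SPSS also reports a p-value, which this sketch does not calculate.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """Return (t, degrees of freedom) for the equal-variance two-sample t test."""
    n1, n2 = len(group1), len(group2)
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / std_error
    return t, n1 + n2 - 2


# Hypothetical systolic blood pressure after treatment
placebo = [150, 148, 155, 160, 152]
treatment = [140, 138, 145, 150, 142]

t_stat, df = independent_t(placebo, treatment)
print(round(t_stat, 3), df)
```

Each patient contributes one value to exactly one list, which mirrors the requirement that each patient is measured once and belongs to one group.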

**What is a paired-samples t test (dependent t test) and when can it be used?**

This is one specific example of a group of tests in statistics known as t tests. It is used to compare the means of two variables for a single group. The procedure computes the difference between the values of the two variables for each case and tests whether the average difference differs from zero. For example, you may be interested in evaluating the effectiveness of a mnemonic method on memory recall. Subjects are given a passage from a book to read. A few days later, they are asked to reproduce the passage and the number of words is noted. Subjects are then sent to a mnemonic training session. They are then asked to read and reproduce the passage again and the number of words is noted. Thus each subject has two measures, often called before (pre) and after (post) measures.

An alternative design for which this test is used is a matched-pairs or case-control study. For example, in a blood pressure study, patients and controls might be matched by age, that is, a 64-year-old patient with a 64-year-old control group member. Each record in the data file will contain the response from the patient and the response from his or her matched control subject. You use this test for normally distributed data.
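The calculation behind the paired test can be sketched in a few lines of Python: take the difference for each subject, then test whether the mean difference is zero. The recall scores and the function name `paired_t` are made up for illustration, and the p-value that SPSS reports is not computed here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees of freedom) for the paired-samples t test."""
    # One difference per subject: after minus before
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical number of words recalled by five subjects
pre = [20, 25, 18, 30, 22]   # before mnemonic training
post = [26, 29, 21, 38, 26]  # after mnemonic training

t_paired, df_paired = paired_t(pre, post)
print(round(t_paired, 3), df_paired)
```

The two lists must be in the same subject order, because the test works on the pairwise differences rather than on the two sets of scores separately.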

**What is a one-sample t test and when can it be used?**

This is one specific example of a group of tests in statistics known as t tests. It is used to compare the mean of one variable with a known or hypothesised value. In other words, the One-Sample T Test procedure tests whether the mean of a single variable differs from a specified constant. For instance, you might be interested in testing whether the average IQ of some 50 students differs from an IQ of 125, or how the average salary in Newcastle compares to the national average.
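As a minimal sketch of the arithmetic, the one-sample t statistic divides the gap between the sample mean and the test value by the standard error of the mean. The IQ scores and the function name `one_sample_t` below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, test_value):
    """Return (t, degrees of freedom) for the one-sample t test."""
    n = len(values)
    std_error = stdev(values) / sqrt(n)  # standard error of the mean
    return (mean(values) - test_value) / std_error, n - 1


iq_scores = [128, 131, 122, 127, 132]  # hypothetical sample
t_one, df_one = one_sample_t(iq_scores, 125)
print(round(t_one, 3), df_one)
```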

**I hear a lot about p-value. What is it and how do I interpret it?**

The p stands for probability, so the p-value is a probability between zero and one. It is the probability of obtaining a result at least as extreme as yours if the null hypothesis were true, and it helps you to draw a conclusion from the statistical test you perform. The three common situations are:

- If the p-value is greater than 0.05, the null hypothesis is not rejected and the result is not significant.
- If the p-value is less than 0.05 but greater than 0.01, the null hypothesis is rejected and the result is significant at the 5 percent level.
- If the p-value is smaller than 0.01, the null hypothesis is rejected and the result is significant at the 1 percent level.

The null and alternative hypotheses are simply two opposing statements that you make concerning your research question.
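The three situations can be written as a small decision rule. The Python sketch below is only an illustration of the conventional 0.05 and 0.01 cut-offs. The function name `interpret_p` is hypothetical.

```python
def interpret_p(p_value):
    """Map a p-value to the conventional conclusion at the 5% and 1% levels."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "reject the null hypothesis (significant at the 1 percent level)"
    if p_value < 0.05:
        return "reject the null hypothesis (significant at the 5 percent level)"
    return "do not reject the null hypothesis (not significant)"


for p in (0.20, 0.03, 0.004):
    print(p, "->", interpret_p(p))
```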

**How do I perform a normality test in SPSS?**

Follow these steps to perform the normality test:

- From the menu bar select **Analyze -> Descriptive Statistics -> Explore…**.
- Transfer a continuous variable, e.g. *blood pressure* [*bloodpres*], to **Dependent List:**.
- Transfer a grouping variable, e.g. *gender*, to **Factor List:**.
- From **Display** click on **Plots**. Then click on **Plots…**.
- Under **Descriptive** deselect **Stem-and-leaf**.
- Select **Normality plots with tests**.
- Click on **Continue**. Click on **OK**.

Examine the result in the table **Tests of Normality**. For a small sample size (n≤50) use the Shapiro-Wilk statistic. For a large sample size (n>50) use the Kolmogorov-Smirnov statistic. Note that a p value (usually under the column Sig.) greater than 0.05 means there is no significant departure from normality, so you can treat your data as normally distributed.

**How do I perform a homogeneity of variance test in SPSS?**

Follow these steps to perform the homogeneity of variance test:

- Select **Analyze -> Compare Means -> One-Way ANOVA…**.
- Transfer a continuous variable, e.g. *blood pressure* [*bloodpres*], to **Dependent List:**.
- Transfer a grouping variable, e.g. *gender*, to **Factor:**.
- Click on **Options** and select **Homogeneity of variance test**.
- Click **Continue** and click **OK**.

Examine the table **Test of Homogeneity of Variances**. Note that if the p value is greater than 0.05 then the variances of the two groups can be treated as equal. Ignore the table **ANOVA**, which is also produced as part of this procedure.

**How do I perform an Independent-Samples T Test in SPSS?**

This test is suitable only for normally distributed data and when the variances of the two groups are equal. Follow these steps to perform the test:

- Select **Analyze -> Compare Means -> Independent-Samples T Test…**.
- Transfer the continuous variable, e.g. *blood pressure*, to **Test Variable(s):**.
- Transfer the grouping variable, e.g. *gender*, to **Grouping Variable:**.
- Click on **Define Groups**. Beside **Group 1:** type *1*. Beside **Group 2:** type *2*.
- Click on **Continue** and click on **OK**.

If you are not sure how to interpret the output, recall the dialogue box via **Analyze -> Compare Means -> Independent-Samples T Test…**. Click on **Help**. Then click on **Show me** in the displayed window to go through a case study. The case study uses a different data file to explain every bit of the output.

**My data is not normal. Which test should I use to find out if there are significant differences between groups?**

You should use the Mann-Whitney U Test. Follow these steps:

- Select **Analyze -> Nonparametric Tests -> 2 Independent Samples…** (in recent versions of SPSS this is under **Nonparametric Tests -> Legacy Dialogs**).
- Transfer the continuous variable, e.g. *blood pressure*, to **Test Variable List:**.
- Transfer the grouping variable, e.g. *gender*, to **Grouping Variable:**.
- Click on **Define Groups**. Beside **Group 1:** type *1*. Beside **Group 2:** type *2*.
- Click on **Continue** and click on **OK**.
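To see why this test does not need normally distributed data, note that it works on ranks rather than on the raw values. The Python sketch below computes the two U statistics by hand. The data and the function names are hypothetical, and the p-value that SPSS reports is not calculated.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    ranked = []
    for value in values:
        less = sum(v < value for v in values)
        equal = sum(v == value for v in values)
        ranked.append(less + (equal + 1) / 2)
    return ranked


def mann_whitney_u(group1, group2):
    """Return (U1, U2); the smaller of the two is conventionally reported as U."""
    ranks = average_ranks(list(group1) + list(group2))  # rank both groups together
    n1, n2 = len(group1), len(group2)
    rank_sum1 = sum(ranks[:n1])
    u1 = rank_sum1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2


# Hypothetical blood pressure readings for two independent groups
u1, u2 = mann_whitney_u([120, 125, 130], [135, 140, 128, 150])
print(u1, u2)
```

A U value close to zero means the two groups barely overlap in rank, which points towards a difference between them.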

**How do I perform a paired samples t test (Dependent T Test) in SPSS?**

This test is also suitable for normally distributed data. There is no need for homogeneity of variance test because we are dealing with the same group. To do the actual test, follow these steps:

- From the menu bar select **Analyze -> Compare Means -> Paired-Samples T Test…**.
- Click on *variable1* and click on the arrow.
- Click on *variable2* and click on the arrow. Click **OK**.

If you are not sure how to interpret the output, recall the dialogue box via **Analyze -> Compare Means -> Paired-Samples T Test…**. Click on **Help**. Then click on **Show me** in the displayed window to go through a case study. The case study uses a different data file to explain every bit of the output.

**I have paired sample data that are not normally distributed. Which test should I use in SPSS?**

You should use the Wilcoxon Signed Ranks test. Follow these steps:

- From the menu bar select **Analyze -> Nonparametric Tests -> 2 Related Samples…** (in recent versions of SPSS this is under **Nonparametric Tests -> Legacy Dialogs**).
- Click on *variable1* and click on the arrow.
- Click on *variable2* and click on the arrow. Click **OK**.
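The idea behind the signed ranks test can be sketched in Python: rank the sizes of the paired differences, then add up the ranks separately for increases and decreases. The scores and function names below are hypothetical, and the p-value that SPSS reports is not computed.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    ranked = []
    for value in values:
        less = sum(v < value for v in values)
        equal = sum(v == value for v in values)
        ranked.append(less + (equal + 1) / 2)
    return ranked


def wilcoxon_rank_sums(before, after):
    """Return (sum of positive ranks, sum of negative ranks) of the differences."""
    # Zero differences carry no information about direction and are dropped
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return positive, negative


# Hypothetical paired scores for five subjects
pre = [20, 25, 18, 30, 22]
post = [26, 24, 21, 38, 20]
w_plus, w_minus = wilcoxon_rank_sums(pre, post)
print(w_plus, w_minus)
```

If the two rank sums are very unequal, most of the larger changes went in one direction, which is evidence against "no change".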

**How do I perform a One-Sample T Test in SPSS?**

To do the test, follow these steps:

- From the menu bar select **Analyze -> Compare Means -> One-Sample T Test…**.
- Select the continuous variable, e.g. *Intelligence Quotient* [*iq*], and click on the arrow.
- Type a value, e.g. *125*, beside **Test Value:**.
- Click **OK**.

If you are not sure how to interpret the output, recall the dialogue box via **Analyze -> Compare Means -> One-Sample T Test…**. Click on **Help**. Then click on **Show me** in the displayed window to go through a case study. The case study uses a different data file to explain every bit of the output.

A correlation is a statistic used for measuring the strength of a supposed linear association between two variables. The most common correlation coefficient is the **Pearson** correlation coefficient, used to measure the linear relationship between two interval variables that are normally distributed. Generally, the correlation coefficient varies from -1 to +1. If your data is ordinal or not normally distributed, use the **Spearman Rho**. If you have two nominal variables and want to find the relationship (association) between them, use Chi-Square.
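The relationship between the two coefficients can be illustrated in Python: Spearman's rho is the Pearson coefficient calculated on the ranks of the data instead of the raw values. The data and function names below are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    return [
        sum(v < value for v in values) + (sum(v == value for v in values) + 1) / 2
        for value in values
    ]


def spearman_rho(x, y):
    """Spearman's rho: the Pearson coefficient of the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Here the Spearman coefficient reaches 1 because the ordering of the two variables agrees perfectly, while the Pearson coefficient is slightly lower because the relationship is curved rather than linear.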

**How do I produce the Pearson or Spearman Rho correlation coefficient in SPSS?**

To produce the correlation coefficient select **Analyze -> Correlate -> Bivariate…**

This will open the **Bivariate Correlations** dialog box. Transfer the two variables to the **Variables** text box. Select the coefficient you want (**Pearson** or **Spearman**) by a single click on its check box, then click **OK**.

**I hear a lot about one-tailed test and two-tailed test. I don’t know what they are and how to apply them. Any advice will be appreciated.**

A one-tailed test applies in situations where the researcher predicts the direction of the result in advance. For example, when testing a new drug against a placebo, a researcher would want to know whether the new drug is better than the placebo. On the distribution curve, a one-tailed test looks at one tail only, positive or negative. SPSS usually reports a two-tailed p value, so for a one-tailed test you must divide the p value by 2 before you draw your conclusion, provided the result is in the predicted direction.

A two-tailed test applies in situations where the researcher does not know the direction of the result or is interested in both directions. The two-tailed test is more common than the one-tailed test.

You must decide before you collect your data whether you are doing a one-tailed or two-tailed test.
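The halving rule can be sketched as a small Python function. This is a sketch for symmetric test statistics such as t, and the function name `one_tailed_p` is hypothetical. Halving is only valid when the observed effect lies in the direction you predicted.

```python
def one_tailed_p(two_tailed_p, in_predicted_direction):
    """Convert a two-tailed p-value to one-tailed for a symmetric test statistic."""
    half = two_tailed_p / 2
    # If the effect went the opposite way, the one-tailed p-value is large
    return half if in_predicted_direction else 1 - half


print(one_tailed_p(0.08, True))   # effect in the predicted direction
print(one_tailed_p(0.08, False))  # effect in the opposite direction
```

In this example a two-tailed p value of 0.08 is not significant at the 5 percent level, but the one-tailed value is, which is why the choice between the two must be made before the data is collected.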
