Report

Psychology 242, Introduction to Research. Revised 4/5/10.
Dr. David McKirnan, [email protected]
Introduction to statistics #2

How do we “know” about the world?
Plato’s Allegory of the Cave.
Was our hypothesis supported? The critical ratio and the logic of the t-test.
The central limit theorem and sampling distributions.
Correlations and assessing shared variance.
[Image: "The Allegory of the Cave" by Allison Leigh Cassel]

Plato’s Allegory of the Cave
(Plato, Republic, Book VII, 514a-c to 521a-e)
[Image: Plato's Allegory of the Cave, engraving by Jan Saenredam (1565-1607) after a painting by Cornelis Corneliszoon van Haarlem (1562-1638)]

Socrates: And now, I said, let me show in a figure how far our nature is enlightened or unenlightened: "Behold! Human beings living in an underground den, which has a mouth open towards the light and reaching all along the den. Here they have been from their childhood, and have their legs and necks chained so that they cannot move, and can only see before them, being prevented by the chains from turning round their heads."

Plato’s Cave, 2
"Above and behind them a fire is blazing at a distance, and between the fire and the prisoners there is a raised way; and you will see, if you look, a low wall built along the way, like the screen which marionette players have in front of them, over which they show the puppets."
Glaucon: "I see".
"And do you see", I said, "men passing along the wall carrying all sorts of vessels, and statues and figures of animals made of wood and stone and various materials, which appear over the wall? Some of them are talking, others silent."

Plato’s Cave, 3
Glaucon: "You have shown me a strange image, and they are strange prisoners".
"Like ourselves", I replied. "And they see only their own shadows, or the shadows of one another, which the fire throws on the opposite wall of the cave?"
Glaucon: "True", he said. "How could they see anything but the shadows if they were never allowed to move their heads?"

Plato’s Cave, 4
"And of the objects which are being carried in like manner they would only see the shadows?"
Glaucon: "Yes", he said.
"And if they were able to converse with one another, would they not suppose that they were naming what was actually before them?"
Glaucon: "Very true."

Plato’s Cave, 5
"And suppose further that the prison had an echo which came from the other side… would they not be sure to fancy when one of the passersby spoke that the voice which they heard came from the passing shadow?"
Glaucon: "No question", he replied.
"To them", I said, "the truth would be literally nothing but the shadows of the images".
Glaucon: "That is certain."

Plato’s Cave, 6
What does Plato’s Allegory of the Cave tell us about scientific reasoning?
We cannot observe “nature” directly; we only see its manifestations or images:
  We are trapped in a world of immediate sensation.
  Our senses routinely deceive us (they have error).
  We cannot get outside our limited sensations to see the underlying “form” of nature.

Plato’s Cave, 7
Plato’s Cave and scientific reasoning: core limitations of our knowledge about the world.
1. Theories (knowledge structures) address hypothetical constructs… We infer their forms.
2. We study samples of people & places, and try to generalize to the larger population.

Plato’s Cave, 8
1. We study hypothetical constructs: basic “operating principles” of nature, e.g., evolution, gravity, learning, motivation…
  These are processes that we cannot “see” directly…
  …that underlie events that we can observe.
  We use rational analysis (theory) to deduce what the “form” of these processes must be, and how they work.
  We collect evidence to test whether our theory is correct.

Plato’s Cave, 9
Why can’t we just observe “nature” directly?
  We can only observe the effects of hypothetical constructs, not the processes themselves.
  Our theory helps us develop hypotheses about what we should observe if our theory is “correct”.
  We test our hypotheses to infer how nature works.
  Our inferences contain error: we must estimate the probability that our results are due to “real” effects versus chance.
  The link from hypothetical constructs to empirical evidence can be deductive (“top-down”) or inductive (“bottom-up”).

The link between theory & data
Deductive: we begin with a well articulated theory, derive specific hypotheses, and then move to data collection.
Inductive: we begin with empirical observations, then:
  formulate a theory that may account for them,
  develop further testable hypotheses,
  gather more data in a specific hypothesis-testing process.
[Diagram: Theory (hypothetical constructs, not directly observed), specific hypotheses, research methods (operational definitions), and empirical observations. The deductive path runs from theory down to observations; the inductive path runs from observations back up to theory.]

Generalizability
2. Theories are tested with samples, not the entire population.
Just as we infer the hypothetical constructs underlying our observations, we infer how well those results generalize to:
  The larger population our sample is drawn from
  Other physical or social settings
  Other forms of the Independent Variable
  Other outcomes or forms of the Dependent Variable.
As with all inferences, our generalizing beyond the experiment is probabilistic and is subject to error.

Statistics #2, next topic
Was our hypothesis supported? The critical ratio and the logic of the t-test.
The Critical Ratio is central to statistical reasoning

$$\text{Critical ratio} = \frac{\text{The strength of the results (experimental effect)}}{\text{The amount of error variance (“noise”) in the data}}$$

The numerator is what we can actually see or measure.
The denominator is the error in our estimation of what underlies what we see.

The same ratio takes two familiar forms:

$$Z = \frac{X\,(\text{score}) - M\,(\text{group mean})}{S\,(\text{standard deviation})} \qquad\qquad t = \frac{M_{\text{group1}} - M_{\text{group2}}}{\text{Standard error of the } M}$$

In an environment with a lot of error (many “chance” events) we have trouble ensuring that what we see is “real” rather than just chance.
In an environment with less error we can be more confident that an event is “real” rather than chance alone.
Even in an errorful environment a very strong occurrence (a dramatic event) is likely “real”.

The t-Test
t-test: are the Ms of two groups statistically different from each other?
[Figure: two distributions, one centered on the control group M and one on the experimental group M.]
In any experiment the Ms will differ at least a little.
Does the difference we observe reflect “reality”, i.e., is it really due to the independent variable?
Statistically: is the difference between Ms more than we would expect by chance alone?
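The two forms of the critical ratio above (Z for a single score, t for two group means) can be sketched in a few lines of Python. This is only an illustration: the function names and the example numbers are assumptions chosen for the sketch, not part of the lecture materials.

```python
def z_score(x, group_mean, sd):
    """Critical ratio for one score: distance from the mean in SD units."""
    return (x - group_mean) / sd


def t_ratio(mean_group1, mean_group2, standard_error):
    """Critical ratio for two groups: mean difference over its standard error."""
    return (mean_group1 - mean_group2) / standard_error


# Hypothetical numbers, chosen only for illustration.
print(z_score(85, 70, 10))       # a score 15 points above a mean of 70, S = 10
print(t_ratio(4.0, 2.5, 0.75))   # group means of 4 and 2.5, standard error .75
```

In both functions the numerator is the observed effect and the denominator is the error term, which is the point of the critical ratio.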
M differences and the Critical Ratio
We judge research results using the critical ratio:

$$\frac{\text{Difference between } M\text{s for the two groups}}{\text{Variability within groups (error)}}$$

[Figure: two overlapping distributions. The control group is centered on M group1 and the experimental group on M group2; each curve's width shows its within-group variance.]
This reflects Plato’s Cave: does what we see represent reality (a real experimental effect) or mere chance (simple error)?

A statistical use of the critical ratio:

$$\frac{\text{The experimental effect}}{\text{Error variance}} = \frac{\text{Variance between groups}}{\text{What we would expect by chance given the variance within groups}}$$

[Figure: distribution of scores for the control group (M group1) and for the experimental group (M group2), each with its within-group variance marked.]

The Critical Ratio in action
All three graphs have the same difference between groups.
They differ in variance within groups.
The critical ratio helps us determine which one(s) represent a statistically significant difference.
[Figure: three panels showing low variance, medium variance, and high variance.]

Clickers!
A = All of them
B = Low variance only
C = Medium variance
D = High variance
E = None of them

Critical ratio and variances, 1
Critical ratio: gets larger as the variance(s) decreases, given the same M difference…
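To make the low, medium, and high variance comparison concrete, here is a minimal sketch that holds the difference between Ms fixed and varies only the within-group variance. The specific numbers (a mean difference of 1.5, ten people per group, variances of 1, 4, and 16) are assumptions for illustration, and the error term uses the standard-error form of the t-test (the square root of each variance divided by its n, summed).

```python
import math


def standard_error(variance1, n1, variance2, n2):
    """Square root of the summed (variance / n) terms for the two groups."""
    return math.sqrt(variance1 / n1 + variance2 / n2)


def critical_ratio(mean_diff, variance, n_per_group):
    """Mean difference divided by its standard error (equal variances and ns)."""
    return mean_diff / standard_error(variance, n_per_group, variance, n_per_group)


MEAN_DIFF = 1.5      # hypothetical difference between group Ms, held constant
N_PER_GROUP = 10     # hypothetical group size

ratios = {
    label: critical_ratio(MEAN_DIFF, variance, N_PER_GROUP)
    for label, variance in [("low", 1.0), ("medium", 4.0), ("high", 16.0)]
}

for label, ratio in ratios.items():
    print(f"{label:>6} variance: critical ratio = {ratio:.2f}")
```

The same effect produces a smaller critical ratio each time the within-group variance grows, which is why the three graphs can lead to different conclusions.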
Critical ratio and variances, 2
Critical ratio: …also gets larger as the M difference increases, even with the same variance(s).

What do we estimate: experimental effect
In the critical ratio (experimental effect / error variance), the experimental effect is the difference between group Ms.
The M difference (between control & experimental groups) is the same in both data sets.

What do we estimate: error term
In the critical ratio, the error variance is the variability within groups.
Variances differ a lot in the two examples.
[Figure: one data set with high variability, one with low variability.]

Assigning numbers to the critical ratio: numerator

$$\text{Experimental effect} = \text{Difference between group } M\text{s} = (M_{\text{group1}} - M_{\text{group2}}) - 0$$

Assigning numbers to the critical ratio: denominator

$$\text{Error variance} = \text{Variability within groups} = \text{Standard error} = \sqrt{\frac{\text{Variance}_{\text{grp1}}}{n_{\text{grp1}}} + \frac{\text{Variance}_{\text{grp2}}}{n_{\text{grp2}}}}$$

Critical ratio
The experimental effect “adjusted” by the variance.
Yields a score: Z, t, r, etc.
  Positive: grp1 > grp2, or
  Negative: grp1 < grp2.
Any critical ratio [CR] is likely to differ from 0 by chance alone.
  Even in “junk” data two groups may differ.
  We cannot simply test whether Z or t is “absolutely” different from 0.
We evaluate whether the CR is greater than what we expect by chance alone.

When is a critical ratio “statistically significant”?
A large CR is likely not due only to chance; it probably reflects a “real” experimental effect.
  The difference between groups is very large relative to the error (within-group) variance.
A very small CR is almost certainly just error.
  Any difference between groups is not distinguishable from error or simple chance: group differences may not be due to the experimental condition (Independent Variable).
A mid-size CR? How large must it be to assume it did not occur just by chance?
  We answer this by comparing it to a (hypothetical) distribution of possible CRs.

Distributions of Critical Ratios
➔ Imagine you perform the same experiment 100 times.
  You randomly select a group of people.
  You randomly assign ½ to the experimental group, ½ to the control group.
  You run the experiment, get the data, and analyze it using the critical ratio:

$$t = \frac{M_{\text{group1}} - M_{\text{group2}}}{\sqrt{\dfrac{\text{Variance}_{\text{grp1}}}{n_{\text{grp1}}} + \dfrac{\text{Variance}_{\text{grp2}}}{n_{\text{grp2}}}}}$$

  Then you do the same experiment again, with another random sample of people, and get a critical ratio (t score) for those results.
  And you get yet another sample, and get a critical ratio (t score) for those results…

➔ Each time you (hypothetically) run the experiment you generate a critical ratio (CR).
  For a simple 2-group experiment the CR is a t ratio.
  It could just as easily be a Z score, an F ratio, an r…
➔ These critical ratios form a distribution. This is called a Sampling Distribution.
  Most critical ratios will cluster around 0 (M = 0).
  Progressively fewer are greater or less than 0.
  With more observations the sampling distribution becomes “normal”.
  More extreme scores are less likely to occur by chance alone.
[Figure: critical ratios (Z, t…) stacked into a bell-shaped histogram along an axis running from -3 to +3.]

Distributions
We can have a distribution of critical ratios just like we can have a distribution of scores.

Distributions & inference: raw scores, exam 2
This is the distribution of some 242 exam raw scores.
[Figure: histogram of exam 2 raw scores.]
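The "run the same experiment many times" thought experiment described above can be simulated. The sketch below is one way to do it, under stated assumptions: scores are drawn from a normal population with a made-up mean of 50 and SD of 10, there is no real experimental effect, each group has 20 people, and the experiment is repeated 2000 times (rather than 100) so the summary numbers are stable.

```python
import math
import random
import statistics


def t_ratio(group1, group2):
    """Difference between group Ms divided by the standard error."""
    effect = statistics.mean(group1) - statistics.mean(group2)
    error = math.sqrt(
        statistics.variance(group1) / len(group1)
        + statistics.variance(group2) / len(group2)
    )
    return effect / error


def simulate_null_experiments(n_experiments=2000, n_per_group=20, seed=242):
    """Repeat a two-group experiment in which only chance separates the groups."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_experiments):
        # Randomly select people, then randomly assign half to each group.
        people = [rng.gauss(50, 10) for _ in range(2 * n_per_group)]
        rng.shuffle(people)
        ratios.append(t_ratio(people[:n_per_group], people[n_per_group:]))
    return ratios


ratios = simulate_null_experiments()
mean_ratio = statistics.mean(ratios)
share_within_2 = sum(abs(t) < 2 for t in ratios) / len(ratios)

print(f"M of the critical ratios: {mean_ratio:.3f}")
print(f"Share of critical ratios between -2 and +2: {share_within_2:.3f}")
```

Under these assumptions the critical ratios should cluster around 0, with only a small share falling beyond about 2 in either direction. That is the shape the sampling distribution slides describe.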
Exam 2: Z scores
Here are the same scores, shown as Z scores.
What are the odds that these scores were by chance alone? How about these scores?
Z scores are a form of Critical Ratio.
They are standardized:
  Mean, median, mode = .00
  Standard Deviation (S) = 1.0
[Figure: distribution of exam 2 scores expressed as Z scores.]

Distributions of Critical Ratios
After we conduct our experiment and get a result (a critical ratio or t score), our question is: what are the odds that these results are due to chance alone?
[Figure: sampling distribution of critical ratios centered on 0, with larger negative CRs to the left and larger positive CRs to the right.]

Distributions & inference
We infer statistical significance by locating a score along the normal distribution.
A score can be:
  An individual score (‘X’),
  A group M,
  A Critical Ratio such as a Z or t score.
More extreme scores are less likely to occur by chance alone.
[Figure: normal curve with the M of the sampling distribution at the center and progressively less likely scores toward the tails.]

Statistical significance & areas under the normal curve, 1
A Z or t score that exceeds ±1.98 would occur by chance alone less than 5% of the time.
The probability of a critical ratio of ±1.98 is low enough [p < .05] that it likely indicates a “real” experimental effect.
[Figure: normal curve in Z or t scores (standard deviation units) from -3 to +3. 95% of cases lie between -1.98 and +1.98; t < -1.98 in < 2.4% of cases; t > +1.98 in < 2.4% of cases.]

Statistical significance & areas under the normal curve, 2
If Z is < 1.98 the results may occur by chance alone > 5% of the time (i.e., they are not “statistically significant”).
About 68% of cases fall between Z = -1.0 and Z = +1.0.
The probability of Z = 1 occurring by chance is too high for us to conclude that the experimental results are “real”: a Z of +1.0 or more occurs about 16% of the time by chance, and so does a Z of -1.0 or less.
[Figure: normal curve in Z scores (standard deviation units) from -3 to +3, with about 16% of cases in each tail beyond ±1.0.]

Statistical Hypothesis Testing
Null Hypothesis: any difference between the M for the experimental group and the M for the control group is by chance alone.
  Mexp - Mcontrol = 0, except for chance (error variance).
Research Question: in our study, does Mexp - Mcontrol differ from 0 (in either direction) by more than we would expect by chance alone?
Test by calculating the Critical Ratio (t test):

$$t = \frac{(M_{\text{group1}} - M_{\text{group2}}) - 0}{\sqrt{\dfrac{\text{Variance}_{\text{grp1}}}{n_{\text{grp1}}} + \dfrac{\text{Variance}_{\text{grp2}}}{n_{\text{grp2}}}}}$$

The concept underlying the t test is the critical ratio:
  Numerator: how strongly did the independent variable affect the outcome?
  Denominator: how much error variance [“uncertainty”, “noise”] is there in the data?

$$t = \frac{(M_{\text{exp}} - M_{\text{control}}) - 0}{\sqrt{\dfrac{\text{Variance}_{\text{exp}}}{n_{\text{exp}}} + \dfrac{\text{Variance}_{\text{control}}}{n_{\text{control}}}}}$$

For a t-test:
  The experimental effect is the difference between the Ms of the experimental & control groups.
  The error term is the square root of the sum of each group's variance divided by its n, similar to a two-group standard deviation.
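The areas under the normal curve quoted above can be checked with Python's standard library. This sketch assumes a standard normal sampling distribution (mean 0, SD 1), which is the large-sample case; the helper names are illustrative only.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def upper_tail(z):
    """Probability of a critical ratio at least this large by chance alone."""
    return 1.0 - STANDARD_NORMAL.cdf(z)


def two_tailed_p(z):
    """Probability of a critical ratio at least this extreme in either direction."""
    return 2.0 * upper_tail(abs(z))


def central_area(z):
    """Share of cases falling between -z and +z."""
    return 1.0 - two_tailed_p(z)


print(f"Each tail beyond 1.98:    {upper_tail(1.98):.4f}")
print(f"Both tails beyond ±1.98:  {two_tailed_p(1.98):.4f}")
print(f"Between -1.0 and +1.0:    {central_area(1.0):.4f}")
print(f"Tail beyond +1.0:         {upper_tail(1.0):.4f}")
```

A two-tailed probability below .05 corresponds to the "statistically significant" region; a Z of 1.0 leaves far too much area in the tails to meet that criterion.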
t-test

$$t = \frac{\text{Difference between groups}}{\text{Standard error of } M} = \frac{(M_{\text{group1}} - M_{\text{group2}}) - 0}{\sqrt{\dfrac{\text{Variance}_{\text{grp1}}}{n_{\text{grp1}}} + \dfrac{\text{Variance}_{\text{grp2}}}{n_{\text{grp2}}}}}$$

➔ Numerator: how strong is the experimental effect?
➔ Denominator: how much error variance is there?

Writing each variance as a sum of squares (SS) over degrees of freedom (df):

$$t = \frac{(M_{\text{group1}} - M_{\text{group2}}) - 0}{\sqrt{\dfrac{SS_{\text{grp1}} / df_{\text{grp1}}}{n_{\text{grp1}}} + \dfrac{SS_{\text{grp2}} / df_{\text{grp2}}}{n_{\text{grp2}}}}}$$

And writing out SS and df in full:

$$t = \frac{(M_{\text{group1}} - M_{\text{group2}}) - 0}{\sqrt{\dfrac{\sum (X - M_{\text{grp1}})^2 / (n_{\text{grp1}} - 1)}{n_{\text{grp1}}} + \dfrac{\sum (X - M_{\text{grp2}})^2 / (n_{\text{grp2}} - 1)}{n_{\text{grp2}}}}}$$

Standard error:
➔ Calculate the variance for group 1:
  ➔ Sum of squares
  ➔ Divided by degrees of freedom (n - 1)
➔ Divide by n for group 1
➔ Repeat for group 2
➔ Add them together
➔ Take the square root

Compute a t score
Compute the Experimental Effect:
  Calculate the Mean for each group, subtract group2 M from group1 M.
Compute the Standard Error:
  Calculate the variance for each group.

Calculate the Variance using the box method:

   X     M    X - M   (X - M)²
   7     4      3        9
   6     4      2        4
   2     4     -2        4
   1     4     -3        9
   4     4      0        0
   1     4     -3        9
   7     4      3        9
   4     4      0        0
   2     4     -2        4
   6     4      2        4
 Σ = 40       Σ = 0    Σ = 52
 n = 10, M = 40/10 = 4

1. Enter the Scores.
2. Calculate the Mean.
3. Calculate Deviation scores:
   Simple deviations: Σ(X - M) = 0
   Square the deviations to create positive values: Σ Squares = Σ(X - M)² = 52
4. Degrees of freedom: df = [n - 1] = [10 - 1] = 9
5. Apply the Variance formula:

$$S^2 = \frac{\sum (X - M)^2}{df} = \frac{52}{9} \approx 5.78$$

Compute a t score (all steps)
Compute the Experimental Effect:
  Calculate the Mean for each group, subtract group2 M from group1 M.
Compute the Standard Error:
  Calculate the variance for each group.
  Divide each variance by n for the group.
  Add those computations.
  Take the square root of that total.
Compute t:
  Divide the Experimental Effect by the Standard Error.

The effect of error variance on t
Critical ratio: any effect (e.g., the difference between group Ms) is attenuated when there is more error variance. This is reflected in different values of t.
[Figure: two pairs of distributions with M = 2.5 and M = 4; the first pair is narrow, the second wide.]
  Low error variance: M1 - M2 = 4 - 2.5 = 1.5, standard error = .75, so t = 1.5 / .75 = 2.
  High error variance: M1 - M2 = 4 - 2.5 = 1.5, standard error = 1.75, so t = 1.5 / 1.75 = .86.

Clicker!
[Figure: narrow distributions, M = 2.5 and M = 4.]
Why does this have a t value = 2?
a. The difference between the group means is large relative to the variance within each group.
b. The variance within each group is large relative to the difference between the group means.
c. The M of the larger group = 4 and there are 2 groups.
d. t is a random number.

Clicker, 2
[Figure: wide distributions, M = 2.5 and M = 4.]
Why does this have a t value = .86?
a. The difference between the group means is large relative to the variance within each group.
b. The variance within each group is large relative to the difference between the group means.
c. The M of the larger group = 4 and there are 2 groups.
d. t is a random number.

Sampling distribution & statistical significance
Any 2 group Ms differ at least slightly by chance.
Any t score is therefore > 0 or < 0 by chance alone.
We assume that a t score with less than 5% probability of occurring [p < .05] is not by chance alone.
We calculate the probability of a t score by comparing it to a sampling distribution.
[Figure: sampling distribution of t scores.]

t scores and statistical significance, 1
t = (M1 - M2) / standard error = (4 - 2.5) / .75 = 1.5 / .75 = 2.0
Comparing t to a sampling distribution: about 98% of t values are lower than 2.0.
[Figure: sampling distribution of t scores with t = 2.0 marked; about 98% of t scores lie below it.]

t scores and statistical significance, 2
t = (M1 - M2) / standard error = (4 - 2.5) / 1.75 = 1.5 / 1.75 = .86
About 81% of the distribution of t scores is below .86 (area under the curve = .81).
[Figure: sampling distribution of t scores with t = .86 marked; about 81% of scores lie below it.]

Between v. within group variance: t-test logic
The difference between Ms is the same in the two data sets.
Since the variances differ, we get different t values: t = .86 and t = 2.0.
We make different judgments about whether these t scores occurred by chance.
[Figure: sampling distribution of t scores. t = .86 sits above about 81% of scores; t = 2.0 sits above about 98% of t scores, p < .05.]

Clicker!!
How unlikely must an effect be for us to consider it “statistically significant”, or not simply due to chance?
A = .001
B = .01
C = .05
D = .5
E = 5
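The box-method variance and the two worked t values from the slides above can be reproduced step by step. The sketch below uses the ten scores from the box-method table and the means and standard errors from the worked examples. It approximates the sampling distribution with a standard normal curve, which is an assumption, since the slides do not state the df for these examples.

```python
from statistics import NormalDist

# Box method: the ten scores from the variance table.
scores = [7, 6, 2, 1, 4, 1, 7, 4, 2, 6]

n = len(scores)
mean = sum(scores) / n                          # step 2: the Mean
deviations = [x - mean for x in scores]         # step 3: deviation scores
sum_of_squares = sum(d ** 2 for d in deviations)
df = n - 1                                      # step 4: degrees of freedom
variance = sum_of_squares / df                  # step 5: variance formula

print(f"M = {mean}, sum of squares = {sum_of_squares}, variance = {variance:.2f}")


def t_ratio(mean1, mean2, standard_error):
    """Experimental effect divided by the standard error."""
    return (mean1 - mean2) / standard_error


# Same difference between Ms, two different standard errors.
t_low_error = t_ratio(4.0, 2.5, 0.75)
t_high_error = t_ratio(4.0, 2.5, 1.75)

# Share of a standard normal distribution lying below each t (an approximation).
below_low_error = NormalDist().cdf(t_low_error)
below_high_error = NormalDist().cdf(t_high_error)

print(f"t = {t_low_error:.2f}, share of scores below: {below_low_error:.2f}")
print(f"t = {t_high_error:.2f}, share of scores below: {below_high_error:.2f}")
```

The effect (1.5) is identical in both cases; only the error term changes, and with it the position of t on the sampling distribution.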
Back Home Page Next Psychology 242 Introduction to Research 61 Clicker!!! Are these effects statistically significant? # 1: t = .88 #2: t = 2.0 A = both B = neither C = #1 D = #2 Sampling distribution of t scores Dr. David McKirnan, [email protected] Statistics Introduction 2. Back Home Page Next Psychology 242 Introduction to Research 62 Statistics # 2 Plato’s Allegory of the Cave The critical ratio and the logic of the t-test. The central limit theorem and sampling distributions Correlations and assessing shared variance Dr. David McKirnan, [email protected] Statistics Introduction 2. Abraham de Moivre, French Hugenot refugee in London, originator of the Central Limit Theorem Back Home Page Next Psychology 242 Introduction to Research 63 Central limit theorem The Central Limit Theorem Our evaluation of a t score for statistical significance depends on sample size: Larger samples yield more “normal”, tighter distributions (less error variance…). With smaller samples we use more conservative assumptions about the sampling distribution. Dr. David McKirnan, [email protected] Statistics Introduction 2. Back Home Page Next Psychology 242 Introduction to Research 64 The normal distribution Here is the Sampling Distribution. This is the normal distribution, segmented into t units (similar to Z units or Standard Deviations). Each t unit (e.g., between t = 0 and t = 1) represents a fixed percentage of cases. 34.13% 34.13% of of cases cases Central Limit Theorem: our assumptions about t values have to change, depending upon the size 2.25% of our sample. of 13.59% of cases 13.59% of cases 2.25% of cases cases -3 -2 -1 0 +1 +2 +3 t Scores Dr. David McKirnan, [email protected] Psychology 242, Dr. McKirnan Back Home Page Next Psychology 242 Introduction to Research 65 The Central Limit Theorem; small samples Central Limit Theorem: How well does a sample of individual scores represent the “true” population? True Population M <-- smaller Dr. 
David McKirnan, [email protected] Statistics Introduction 2. “True” normal distribution M larger ---> Back Home Page Next Psychology 242 Introduction to Research 66 The Central Limit Theorem; small samples Central Limit Theorem True Population M “True” normal distribution With few scores in the sample a few extreme or “deviant” values have a large effect. The distribution is “flat” or has high variance. Score Score Score Score Score Score <-- smaller Dr. David McKirnan, [email protected] Statistics Introduction 2. M Score Score Score Score Score larger ---> Back Home Page Next Psychology 242 Introduction to Research 67 The Central Limit Theorem; larger samples Central Limit Theorem True Population M “True” normal distribution With more scores the effect of extreme or “deviant” values is offset by other values. The distribution has less variance & is more normal. Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score <-- smaller Dr. David McKirnan, [email protected] Statistics Introduction 2. M larger ---> Back Home Page Next Psychology 242 Introduction to Research 68 The Central Limit Theorem; large samples Central Limit Theorem With many scores “deviant” values are completely offset by other values. The distribution is normal, with low(er) variance. The sampling distribution better approximates the population distribution True Population M Score Score Score “True” normal distribution Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score Score <-- smaller Score M Score Score Score larger ---> Pascal’s quincunx demonstration is at http://www.mathsisfun.com/data/quincunx.htm l Dr. 
David McKirnan, [email protected] Statistics Introduction 2. Back Home Page Next Psychology 242 Introduction to Research 69 Central limit theorem & evaluating t scores The same logic applies with samples we use to test hypotheses. 1. If the groups are small, the M score for each group reflects a lot of error variance. 2. This increases the likelihood that error variance, not an experimental effect, led to differences between Ms. 3. Since smaller samples (lower df) = more variance, t must be larger for us to consider it statistically significant (< 5% likely to have occurred by chance alone). 4. We evaluate t vis-à-vis a sampling distribution based on the df for the experiment. 5. Critical value for t with p <.05 thus goes up or down depending upon sample size (df) Dr. David McKirnan, [email protected] Statistics Introduction 2. Back Home Page Next Psychology 242 Introduction to Research 70 The Central Limit Theorem; small samples Central Limit Theorem applied to a sampling distribution: How well do small samples reflect the “true” population? M of sample Ms (approximates population M) Imagine we calculate the M for each of 50 samples, each n=10 M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) Many sample Ms may be far from the M of sample Ms M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) Since small samples have a lot of error, a distribution of small samples is relatively “flat” (lot of variance)… M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) M(n=10) Dr. David McKirnan, [email protected] M(n=10) M(n=10) <-- smaller M 2. 
The Central Limit Theorem; larger samples
Central Limit Theorem & sampling distributions, larger samples
Now we collect another 50 samples, but each n = 25.
The M for each sample has less error (since it has a larger n), so the distribution will be “cleaner” and more normal.
It is less likely that any individual sample M would be far from the “true” M of sample Ms.
[Figure: 50 sample Ms, each labeled M(n=25), clustered more tightly around the M of sample Ms; x-axis runs from smaller M to larger M.]

The Central Limit Theorem; large samples
Central Limit Theorem & sampling distributions, large samples
Our third set of samples are each fairly large, say n = 50.
Since each individual sample has low error, a distribution of large-sample Ms will have low variance.
It is unlikely for a sample M to far exceed the “true” M of the sample Ms by chance alone.
[Figure: 50 sample Ms, each labeled M(n=50), tightly clustered around the M of sample Ms; x-axis runs from smaller M to larger M.]
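The three sets of 50 samples above (n = 10, 25, 50) can be imitated with a short simulation. This is a sketch under stated assumptions: the population (mean 100, SD 15, normally distributed), the seed, and the function names are all hypothetical choices made for illustration. It draws 50 samples at each size and reports how spread out the sample Ms are.

```python
import math
import random
import statistics

# Hypothetical population, chosen only for illustration.
POP_MEAN, POP_SD = 100.0, 15.0


def sample_means(n, n_samples=50, seed=242):
    """Draw `n_samples` samples of size `n` and return the M of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(n)])
        for _ in range(n_samples)
    ]


def spread_of_means(n, n_samples=50, seed=242):
    """Standard deviation of the sample Ms: the 'flatness' of the distribution."""
    return statistics.stdev(sample_means(n, n_samples, seed))


if __name__ == "__main__":
    for n in (10, 25, 50):
        observed = spread_of_means(n)
        theory = POP_SD / math.sqrt(n)  # standard error of M
        print(f"n = {n:2d}: SD of 50 sample Ms = {observed:5.2f} (theory {theory:5.2f})")
```

The spread of the sample Ms should shrink roughly in proportion to 1 / sqrt(n), which is why a distribution of small-sample Ms looks "flat" and a distribution of large-sample Ms looks tight.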
Critical values
Central limit theorem: When df > 120 we assume a perfectly normal distribution. (Here Z ≈ t; no compensation for sample size.)
Central limit theorem: When df < 120 we use t to estimate a sampling distribution based on the total df (i.e., the ns of the groups being sampled). With smaller samples, we assume more error in each group.
Alpha [α]: Probability criterion for “statistical significance,” typically p < .05.
Critical value: Cut-off point for alpha on the distribution:
• With df > 120 the critical value for p < .05 is about ±1.98 (the t value at df = 120; it approaches Z = ±1.96 as df grows).
• With df < 120 we adjust the critical value based on the sampling distribution we use.
• As df goes down we assume a more conservative sampling distribution, and use a larger critical value for p < .05.

Sampling Distributions and Critical Values
This sampling distribution assumes n > 120; a distribution with n > 120 is “normal”. Other graphs will show what happens as sample size decreases.
Critical value for p < .05 = 1.98; 95% of cases (critical ratios, differences between Ms) are < +1.98 and > -1.98.
Z or t(120) beyond ±1.98 will occur by chance < 5% of the time.
[Figure: normal curve; x-axis is Z score (standard deviation units) from -2 to +2; 2.5% of cases fall below -1.98 and 2.5% above +1.98.]

Sampling distributions: Critical Values when df = 18
Here group sizes are small: Group1 n = 10, Group2 n = 10; df = (10-1) + (10-1) = 18.
With a smaller df we estimate a flatter, more “errorful” curve.
At df = 18 the critical value for p < .05 = 2.10, a more conservative test.
[Figure: flatter curve; x-axis is Z score (standard deviation units) from -2 to +2; 2.5% of cases fall below -2.10 and 2.5% above +2.10.]
Critical Values, n = 10
This sampling distribution assumes 10 participants: Group1 n = 5, Group2 n = 5; df = (5-1) + (5-1) = 8.
With only 8 df we estimate a flat, conservative curve. Here the critical value for p < .05 = 2.31.
[Figure: flat curve; x-axis is Z score (standard deviation units) from -2 to +2; 2.5% of cases fall below -2.31 and 2.5% above +2.31.]

Central Limit Theorem; variations in sampling distributions
As sample sizes (df) go down, the estimated sampling distributions of t scores based on them have more variance, giving a “flatter” distribution. This increases the critical value for p < .05:
• N > 120: t beyond ±1.98, p < .05
• df = 18: t beyond ±2.10, p < .05
• df = 8: t beyond ±2.31, p < .05
[Figure: the three sampling distributions overlaid; x-axis is Z score (standard deviation units) from -2 to +2; 2.5% of cases fall below the lower critical value and 2.5% above the upper one.]

A t-table contains:
• Degrees of freedom (df): size of the research samples, (n group1 - 1) + (n group2 - 1).
• Alpha levels: % likelihood of a t occurring by chance.
• Critical values: the value t must exceed to be statistically significant [not occurring by chance] at a given alpha.

Critical values of t, by alpha level
df      0.10    0.05    0.02    0.01    0.001
8       1.860   2.306   2.896   3.355   5.041
9       1.833   2.262   2.821   3.250   4.781
10      1.812   2.228   2.764   3.169   4.587
11      1.796   2.201   2.718   3.106   4.437
12      1.782   2.179   2.681   3.055   4.318
13      1.771   2.160   2.650   3.012   4.221
14      1.761   2.145   2.624   2.977   4.140
15      1.753   2.131   2.602   2.947   4.073
20      1.725   2.086   2.528   2.845   3.850
25      1.708   2.060   2.485   2.787   3.725
30      1.697   2.042   2.457   2.750   3.646
40      1.684   2.021   2.423   2.704   3.551
60      1.671   2.000   2.390   2.660   3.460
120     1.658   1.980   2.358   2.617   3.373
∞       1.645   1.960   2.326   2.576   3.291
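Reading the t table can be expressed as a small lookup. The sketch below copies the two-tailed critical values from the table above; the function names are hypothetical, and the rule for a df that has no row of its own (fall back to the next smaller tabled df, which gives a slightly larger, more conservative critical value) is a common convention assumed here, not something the slides specify.

```python
# Two-tailed critical values of t, copied from the t table above.
ALPHAS = (0.10, 0.05, 0.02, 0.01, 0.001)
T_TABLE = {
    8: (1.860, 2.306, 2.896, 3.355, 5.041),
    9: (1.833, 2.262, 2.821, 3.250, 4.781),
    10: (1.812, 2.228, 2.764, 3.169, 4.587),
    11: (1.796, 2.201, 2.718, 3.106, 4.437),
    12: (1.782, 2.179, 2.681, 3.055, 4.318),
    13: (1.771, 2.160, 2.650, 3.012, 4.221),
    14: (1.761, 2.145, 2.624, 2.977, 4.140),
    15: (1.753, 2.131, 2.602, 2.947, 4.073),
    20: (1.725, 2.086, 2.528, 2.845, 3.850),
    25: (1.708, 2.060, 2.485, 2.787, 3.725),
    30: (1.697, 2.042, 2.457, 2.750, 3.646),
    40: (1.684, 2.021, 2.423, 2.704, 3.551),
    60: (1.671, 2.000, 2.390, 2.660, 3.460),
    120: (1.658, 1.980, 2.358, 2.617, 3.373),
}


def df_two_groups(n_group1, n_group2):
    """Degrees of freedom for a two-group comparison."""
    return (n_group1 - 1) + (n_group2 - 1)


def critical_t(df, alpha=0.05):
    """Critical value from the row for `df` (or the next smaller tabled df)."""
    if alpha not in ALPHAS:
        raise ValueError(f"alpha must be one of {ALPHAS}")
    usable_rows = [row_df for row_df in T_TABLE if row_df <= df]
    if not usable_rows:
        raise ValueError("df is smaller than the smallest row in this table")
    return T_TABLE[max(usable_rows)][ALPHAS.index(alpha)]


def is_significant(t, df, alpha=0.05):
    """Compare the absolute value of t to the critical value."""
    return abs(t) > critical_t(df, alpha)


if __name__ == "__main__":
    df = df_two_groups(5, 5)
    print(f"df = {df}, critical t at p < .05 = {critical_t(df)}")
    print("t = -2.5 significant?", is_significant(-2.5, df))
```

For example, two groups of 5 give df = 8, so an obtained t must exceed 2.306 in absolute value to be called significant at p < .05.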
Critical values of t (2 tailed test)
The critical value of t is read across the row for the df in your study, to the column for your alpha.
p < .05 is the most typical alpha.
A lower alpha (.02 … .001, a more conservative test) requires a higher critical value.
Examples read from the t table:
• Alpha = .05, df = 120: critical t = 1.980
• Alpha = .05, df = 10: critical t = 2.228
• Alpha = .02, df = 13: critical t = 2.650

Determining If A Result Is "Statistically Significant"
Assumptions:
Null hypothesis: any difference between Ms [or correlation, chi square, etc.] above or below 0 is due to chance alone.
Statistical question: is the effect in your experiment different from 0 by more than chance alone?
"More than chance alone" means a result this large would occur by chance < 5% of the time [p < .05].
Steps:
1. Derive the t value for the difference between groups.
2. Figure out what distribution to compare your t value to ...
• Use the degrees of freedom (df) for this.
• df = (n group1 - 1) + (n group2 - 1).
• The Central Limit Theorem tells us to assume there is more error (a “flatter” distribution) as df go down.
3. Use the usual criterion [alpha value] for “statistical significance” of p < .05 (unless you have good reason to use another…).
4. Find the value on the t table that corresponds to your df, at your alpha. This is the critical value that your t must exceed to be considered “statistically significant”.
5. Compare your t to the critical value, using the absolute value of t.

Testing t
[t table as on the earlier slides, with a row added for df = 18: 1.734 at alpha .10, 2.101 at .05, 2.552 at .02, 2.878 at .01, 3.922 at .001.]
• Use p < .05 (unless you want to be more conservative by using a lower alpha, e.g., p < .01).
• Look up your df to see what sampling distribution to compare your results to.
• With n = 10 per group, df = (10-1) + (10-1) = 18.
• Compare your t to the critical value from the table.
• If the absolute value of t > the critical value, your effect is statistically significant at p < .05.

Statistics # 2
Plato’s Allegory of the Cave
The critical ratio and the logic of the t-test.
The central limit theorem and sampling distributions
Correlations and assessing shared variance

Testing a hypothesis: t-test versus correlation
How much do you fear and loathe statistics?
A = Completely
B = A lot
C = Pretty much
D = Just a little
E = Not at all

t-test versus correlation
How much do you think your fear and loathing of statistics will affect your grade?
A = Completely
B = A lot
C = Pretty much
D = Just a little
E = Not at all

t-test versus correlation
How could we test the hypothesis that fear and loathing of statistics affects grades…
Using an experimental design?
Using a correlational or measurement design?

Testing a hypothesis: t-test versus correlation
t-tests: Used for experiments
• Manipulate the independent variable: self-efficacy training to decrease fear and loathing of statistics (experimental group). Control group: “placebo” intervention.
• Measure differences in the Dependent Variable: Psychology 242 grade.
Correlations: Used for measurement studies
• Measure the Predictor variable: how much do people fear and loathe statistics
• …and the Outcome variable: Psychology 242 grade.
Evaluating the results of an experimental approach
Hypothesis: Self-efficacy training to lower fear & loathing of statistics will lead to higher grades.
Statistical question: Did the experimental group get statistically significantly higher grades than the control group?
t-test:
• How much variance is there between the groups? (Between-group variance)
• What would we expect by chance given the amount of variance within groups? (Within-group variance)
$t = \frac{\text{between-group variance}}{\text{within-group variance}} = \frac{M_{group1} - M_{group2}}{\text{standard error of } M}$
[Figure: two overlapping distributions of grades; y-axis is number of students, x-axis is grade on the stat. exam (E- to A+). One curve is the control group (normal F & L) with its M grade marked, the other the experimental group (low fear & loathing) with its M marked. The distance between the two Ms is the between-group variance; the width of each curve is the within-group variance.]

Evaluating the results of an experimental approach (continued)
Statistical question: Did the experimental group get statistically significantly higher grades than the control group?
Hypothesis: Self-efficacy training to lower fear & loathing of
statistics will lead to higher grades.
t-test: $t = \frac{\text{between-group variance}}{\text{within-group variance}} = \frac{M_{group1} - M_{group2}}{\text{standard error of } M}$
[Figure: the same two grade distributions, control group (normal F & L) and experimental group (low fear & loathing), with between-group and within-group variance marked.]

Taking a correlation approach
t-test:
• We create group differences on the Independent Variable (e.g., high versus low fear and loathing of statistics).
• How much does this make them differ on the Dependent variable (ψ 242 grade)?
• Difference between groups / standard error of M
Correlation:
• Do “natural” individual differences on the Predictor Variable (how much they fear and loathe statistics)…
• …correspond to individual differences on the Outcome (ψ 242 grade)?
• Z score for Predictor * Z score for Outcome

Correlations; larger patterns of Zs
Hypothesis: Students who are “naturally” high in Fearlessness & Love of Statistics will have higher grades.
Statistical question: Do students who are higher on a measure of F & L have significantly higher grades?
Correlation: As students go up in F & L, do they go up by a similar amount on grades? How much of the variance on the predictor variable is shared with the outcome variable?
[Figure: scatter plot; y-axis is Z score on grades, x-axis is Z score on Fearlessness, both from -2 to +2; each point is an individual participant.]

Correlation formula
Pearson Correlation: measures how similar the variance is in two variables, a.k.a. “shared variance”.
Are people who are above or below the mean on one variable similarly above or below the M on the second variable?
If everyone who is a certain amount over the M on one variable (say, Z = +1.5) is the same amount above M on the other variable (Z also = +1.5), the correlation would be +1.0.
Assess shared variance by multiplying each person’s Z scores for the two variables, summing, and dividing by df: $r = \frac{\sum (Z_X \cdot Z_Y)}{n - 1}$

The Pearson Correlation coefficient:
Measures the linear relation of one variable to another within participants, e.g., Wisdom & Age:
• Cross-sectional: how much are participants’ wisdom scores related to their different ages?
• Longitudinal: how much does wisdom increase (or decrease) as people age?
Positive correlation: among older participants wisdom is higher…
Negative correlation: older participants actually show lower wisdom…
No (or low) correlation: higher / lower age makes no difference for wisdom.
[Figure: three example scatter plots of Wisdom (0 to 1.4) against Age (15 to 45).]

Correlations & Z scores, 1
We are going to plot each participant’s wisdom score (on a scale of ‘0’ to ‘1.4’) against their age (ranges from 15 to 50).
These variables have different scales [‘0’ to ‘1.4’ versus 15 to 50]. How can we make them comparable?
We can standardize the scores by turning them into Z scores:
• Calculate the M and S for each variable.
• Express each person’s score as their Z score on that variable.
[Figure: empty plot with Wisdom (0 to 1.4) on the y-axis and Age (15 to 50) on the x-axis.]

Correlations & Z scores, 2
Imagine the M age = 30, Standard Deviation [S] = 10.
Question 1: What age would be one standard deviation above the Mean (Z = 1)?
Question 2: Participant 14 is age 25. What is her Z score on age?
Answer choices for the age one standard deviation above the Mean (Z = 1): A = 20, B = 32, C = 40, D = 45, E = 25.
Answer choices for the Z score of a participant aged 25: A = 0, B = +.5, C = -.5, D = +1.0, E = +1.5.
[Figure: plot of Wisdom (0 to 1.4) against Age (15 to 50).]

Correlations & Z scores, 3
Age: Mean [M] = 30, Standard Deviation [S] = 10
Wisdom: M = 0.7, S = 0.4
We can show the Means for each variable…
…and the corresponding Z scores.
[Figure: plot of Wisdom (0 to 1.4) against Age (15 to 50), with a second pair of axes in Z-score units (-2 to +2). M wisdom = 0.7 (Z = 0) and M age = 30 (Z = 0) are marked.]

Correlations & Z scores, 4
For each participant we use Z scores to show how far above or below the Mean they are on the two variables. The Z scores allow us to standardize the variables.
Points marked on the plot:
• This is a wisdom score of 0.6 (Z score = -.5) and age 18 (Z = -1.25).
• This participant had a wisdom score of 1.1 (Z score = +1) and age 45 (Z = +1.5).
• This participant had a wisdom score of 0.66 (Z score = -.5) and age 37 (Z = 1.1).
Eventually we can see a pattern of scores: as age scores get higher so does Wisdom. This would represent a positive correlation.
[Figure: scatter plot of Wisdom against Age, with raw-score and Z-score axes; M wisdom = 0.7 (Z = 0), S = 0.4; M age = 30 (Z = 0), S = 10.]
Correlations: Z score basis
Correlations (r) range from +1.0 to -1.0:
• r = +1.0: for every unit increase in age there is exactly one unit increase in wisdom.
• r = -1.0: every unit increase in age corresponds to one unit decrease in wisdom.
A “unit” represents a Z or Standard Deviation unit.
r reflects how each person’s Z score on the “Y” axis (Wisdom) corresponds to his/her Z score for the “X” axis (Age).
[Figure: Wisdom & age transformed to Z scores; both axes run from -2 to +2, with M wisdom (Z = 0) and M age (Z = 0) marked.]

Correlations; larger patterns of Zs
The larger pattern of Z scores for the “Y” variable (Wisdom) and the “X” variable (age) determines how strong the correlation is…
And whether it is positive or negative…
[Figure: scatter plot of Wisdom against Age in raw-score and Z-score units, with M wisdom (Z = 0) and M age (Z = 0) marked.]

Dr. David McKirnan, davidmck@uic.edu
Correlations; larger patterns of Zs
Two variables are positively correlated if each score above / below the M on the first variable is about the same amount above / below M on Variable 2.
Example products of Z scores from the plot:
• (Z = +1.7) * (Z = +1.5) = +2.55
• (Z = +1) * (Z = -.5) = -.5
• (Z = +.4) * (Z = +.5) = +.2
• (Z = -1.5) * (Z = -1) = +1.5
• etc., to get r = .8
[Figure: scatter plot of Wisdom against Age in raw-score and Z-score units, points rising from lower left to upper right; M wisdom (Z = 0) and M age (Z = 0) marked.]

Correlations; larger patterns of Zs
The same (but inverse) logic applies for a negative correlation: each score above the M on age is below the M on Wisdom, and vice versa.
Example products of Z scores from the plot:
• (Z = +.4) * (Z = +.5) = +.2
• (Z = -1.5) * (Z = +.7) = -1.05
• (Z = +1.9) * (Z = -1) = -1.9
• (Z = +.3) * (Z = -.5) = -.15
• etc., to get r = -.8
[Figure: scatter plot of Wisdom against Age in raw-score and Z-score units, points falling from upper left to lower right; M wisdom (Z = 0) and M age (Z = 0) marked.]

Data patterns and correlations
When two variables are strongly related (for each person ZY ≈ ZX), the Scatter Plot shows a nice “tight” distribution and r is high…
As the variables are less related, the plot gets more random or “scattered”, and r goes down.
[Figure: example scatter plots, from tight to scattered.]

Correlation Example 1
Research question: what are the ψ consequences of negative attitudes toward statistics?
Hypothesis: fear and loathing of statistics leads to social isolation among students.
Method: Measurement study rather than experiment
• Measure each student’s scores on 2 variables: the Fear and Loathing of Statistics Index and the Social Isolation Scale.
• Test whether the two variables are significantly related; if a student is high on one, is s/he also high on the other?
Data: Scores on 2 measured variables for 7 students

Correlation example 1; data & plot
The Scatter plot provides a graphic display of the data set, showing how strongly the variables are related.
Example data set 1
Participant    Social Isolation score (“Y”)    Fear & loathing of statistics index (“X”)
#1             1                               5
#2             2                               3
#3             4                               1
#4             7                               6
#5             3                               6
#6             5                               2
#7             4                               7
[Figure: simple scatter plot of two variables for 7 participants; y-axis is Social Isolation, x-axis is Fear & Loathing of statistics, both 1.00 to 7.00; R Sq Linear = 0.003.]

Correlation example 1; scatter plot
Results for example 1: these (made up) data show a basically random relation between the variables.
[Figure: simple scatter plot of 2 variables for 7 participants, each point labeled #1 to #7 (participant scores on each of the two variables), with an idealized regression line and the best fitting (actual) regression line; R Sq Linear = 0.003.]
The low correlation coefficient confirms a lack of relation between these variables.
SPSS correlation results, Social Isolation with Fear & Loathing of statistics: Pearson Correlation = .058, Sig. (2-tailed) = .902, N = 7.
With N = 7, an r this size or larger will occur about 90% of the time by chance alone (p = .902).

Critical values of Pearson Correlation (r)
Like t, we test the statistical significance of a correlation by comparing it to a critical value of r.
We use N - 2 (N = number of pairs)… and the significance level (alpha)… to find our critical value.
Examples highlighted on the table: N = 27 (N - 2 = 25), alpha = .01; > 90 participants, alpha (p value) = .05.

Level of significance for two-tailed test
N - 2    p<.10    p<.05    p<.02    p<.01
1        .988     .997     .9995    .9999
2        .900     .950     .980     .990
3        .805     .878     .934     .959
4        .729     .811     .882     .917
5        .669     .754     .833     .874
10       .497     .576     .658     .708
15       .412     .482     .558     .606
25       .323     .381     .445     .487
30       .296     .349     .409     .449
60       .211     .250     .295     .325
70       .195     .232     .274     .302
80       .183     .217     .256     .284
90       .173     .205     .242     .267
100      .164     .195     .230     .254

Critical values of Pearson Correlation (r)
We test the effect at p < .05.
Critical values of r are read at N pairs - 2; with 7 pairs, N - 2 = 5.
Critical value of r is .754.
Since the r we observed in our sample = .058 (< .754), we assume it is not significantly different from 0 (we accept the null hypothesis).

Correlation example 2
Second correlation example. Scatter plot shows strong linear relation; nearly ideal regression line (R² = .66).
Same variables, data changed to reflect stronger relation
Example data set 2
Participant    Social Isolation (“Y”)    Fear & loathing of statistics (“X”)
#1             4                         5
#2             1                         2
#3             3                         4
#4             7                         5
#5             2                         4
#6             5                         6
#7             2                         3
[Figure: scatter plot; y-axis is Social Isolation (1 to 7), x-axis is Fear & Loathing of Statistics (2 to 6); R Sq Linear = 0.66.]

Correlation example: Z score correspondence
Participants above the M on isolation are also > M on F&L.
Participants below M on isolation are also < M on F&L.
This high correlation will be reflected via their mutual Z scores.
[Figure: the same scatter plot with the regression line; M social isolation = 3.4 (Z = 0) and M fear & loathing = 4.1 (Z = 0) marked; R Sq Linear = 0.66.]

Calculating r from Z scores
1. Calculate Z scores for each participant on each variable, using the basic Z computations:
• Calculate squared deviations from the M, $(X - M)^2$, to get the Standard deviation [S].
• Compute Z scores: $Z = \frac{X - M}{S}$
Social isolation (variable “Y”)
Participant    Score (X)    M       (X - M)²                Z
1              4            3.43    (4 - 3.43)² = .32       (4 - 3.43) / 2.05 = .278
2              1            3.43    (1 - 3.43)² = 5.9       (1 - 3.43) / 2.05 = -1.17
3              3            3.43    (3 - 3.43)² = .185      -.207
4              7            3.43    12.4                    1.72
5              2            3.43    2.04                    -.69
6              5            3.43    2.46                    .76
7              2            3.43    2.04                    -.69
(N = 7)        Σ = 24               Σ (X - M)² = 25.34
$M = \frac{\sum X}{n} = \frac{24}{7} = 3.43$
$S = \sqrt{\frac{\sum (X - M)^2}{df}} = \sqrt{\frac{25.34}{6}} = \sqrt{4.22} = 2.05$

Calculating r from Z scores, 2
2. Enter Z scores for the two variables for each participant.
3.
Multiply the Zs from each variable for each participant, sum, and divide by df:
$r = \frac{\sum (Z_{var1} \cdot Z_{var2})}{N_{pairs} - 1} = \frac{4.86}{6} = .81$
               Social isolation (variable “Y”)    Fear & loathing of statistics (variable “X”)
Participant    Score    Z score                   Score    Z score                                ZY * ZX
1              4        .278                      5        .64                                    .278 * .64 = .18
2              1        -1.17                     2        -1.59                                  -1.17 * -1.59 = 1.87
3              3        -.207                     4        -.106                                  .02
4              7        1.72                      5        .637                                   1.09
5              2        -.69                      4        -.106                                  .07
6              5        .76                       6        1.381                                  1.05
7              2        -.69                      3        -.845                                  .58
(n = 7)                                                                                           Σ ZY * ZX = 4.86

Critical values of Pearson Correlation (r)
Find the critical value on a Pearson table (level of significance for a two-tailed test; n = number of participants).
Find the critical value at n - 2: the critical value of r at p < .05 for 5 df (n [7] - 2) is .754.
r in this sample = .81. Since it is > .754, we assume it significantly differs from 0 (we reject the null hypothesis).

Summary: Statistical tests
t-test: Compare one group to another
• Experimental v. control (Experiment)
• Men v. women, etc. (Measurement)
Calculate M for each group, compare them to determine how much variance is due to differences between groups.
Calculate standard error to determine how much variance is due to individual differences within each group.
Calculate the critical ratio (t): $t = \frac{\text{difference between groups}}{\text{standard error of } M}$
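Returning to the worked correlation example above, the Z-score route to r can be sketched directly in code. This is an illustration only: the function names are hypothetical, the data are the two example data sets from the slides, and the critical value .754 is the table value for N - 2 = 5 at p < .05. Because the code does not round intermediate values, individual Z scores may differ slightly from the hand-rounded ones in the slides.

```python
import math


def z_scores(values):
    """Express each score as (X - M) / S, with S computed using df = n - 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / sd for x in values]


def pearson_r(xs, ys):
    """Sum of the products of paired Z scores, divided by df (N pairs - 1)."""
    products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return sum(products) / (len(xs) - 1)


# Example data set 1 (basically random relation)
fear_1 = [5, 3, 1, 6, 6, 2, 7]
isolation_1 = [1, 2, 4, 7, 3, 5, 4]

# Example data set 2 (strong relation)
fear_2 = [5, 2, 4, 5, 4, 6, 3]
isolation_2 = [4, 1, 3, 7, 2, 5, 2]

CRITICAL_R = 0.754  # table value for N - 2 = 5, two-tailed, p < .05

if __name__ == "__main__":
    for label, xs, ys in (("set 1", fear_1, isolation_1), ("set 2", fear_2, isolation_2)):
        r = pearson_r(xs, ys)
        verdict = "significant" if abs(r) > CRITICAL_R else "not significant"
        print(f"{label}: r = {r:.3f}, r squared = {r * r:.2f}, {verdict} at p < .05")
```

Data set 1 gives an r far below the critical value, while data set 2 exceeds it, matching the two conclusions reached in the slides.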
Statistics summary: correlation
Pearson Correlation: measures how similar the variance is between two variables (a.k.a. “shared variance”) within one group of participants.
Are people who are above or below the mean on one variable similarly above or below the M on the second variable?
If everyone who is a certain amount over the M on one variable (say, Z = +1.5) is the same amount above M on the other variable (Z also = +1.5), the correlation would be +1.0.
Assess shared variance by multiplying each person’s Z scores for the two variables, summing, and dividing by df: $r = \frac{\sum (Z_X \cdot Z_Y)}{n - 1}$

Type I v. Type II errors
                                “Reality”
Decision        Ho true                            Ho false
                [effect due to chance alone]       [real experimental effect]
Accept Ho       Correct decision                   Type II error
Reject Ho       Type I error                       Correct decision

Statistical Decision Making: Errors
Type I error: Reject the null hypothesis [Ho] when it is actually true.
• Accept as ‘real’ an effect that is due to chance only.
• Type I error rate is determined by Alpha (.05 / .01 / .001…).
• “Worst” error; statistical conventions are designed to prevent Type I errors.
Statistical Decision Making: Errors
Type II error: Accept Ho when it is actually false.
  Assume as chance an effect that is actually real.
  Type II error rate is most strongly affected by statistical power.
  Central Limit Theorem: with smaller samples we assume more variance, so we use a more conservative critical value*.
  *within a given alpha…

Inferential statistics: summary
Core assumptions of inferential statistics
  Compare observed results (from an experiment or measurement study) to a distribution of possible results.
  Estimate the probability that the results are due to a "real" experimental effect, or are due to chance or error.
  The distribution of possible results is imputed based on the variance and the Degrees of Freedom [df] for that experiment.
  Central Limit Theorem: data from smaller samples (fewer df) will be more "errorful".

Type I v. Type II errors

                               "Reality"
              Ho true                         Ho false
Decision      [effect due to chance alone]    [real experimental effect]
Accept Ho     Correct decision                Type II error
Reject Ho     Type I error                    Correct decision

Type II error: decreased by more statistical power (↑ participants).
Type I error: decreased by more conservative Alpha (.05 → .01 → .001).
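The claim that more participants reduce Type II errors can also be illustrated by simulation. The sketch below assumes a real population correlation of .5 (a hypothetical value chosen only for illustration), normally distributed scores, and the tabled p<.05 critical values for 5 df (n = 7) and 25 df (n = 27). The function names, seed, and trial count are likewise assumptions of this sketch, not part of the lecture.

```python
import math
import random
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson r as the sum of paired Z-score products divided by n - 1."""
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)


def power(n, critical_r, true_r, trials, rng):
    """Share of samples that correctly reject Ho when the effect is real."""
    noise_weight = math.sqrt(1 - true_r ** 2)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # Build ys so that the population correlation with xs equals true_r.
        ys = [true_r * x + noise_weight * rng.gauss(0, 1) for x in xs]
        if abs(pearson_r(xs, ys)) > critical_r:
            hits += 1
    return hits / trials


rng = random.Random(242)  # arbitrary seed so the run is repeatable
true_r = 0.5              # hypothetical "real" effect

power_small = power(n=7, critical_r=0.754, true_r=true_r, trials=5000, rng=rng)
power_large = power(n=27, critical_r=0.381, true_r=true_r, trials=5000, rng=rng)

# The Type II error rate is the share of real effects we miss: 1 - power.
print(f"n = 7:  power = {power_small:.2f}, Type II rate = {1 - power_small:.2f}")
print(f"n = 27: power = {power_large:.2f}, Type II rate = {1 - power_large:.2f}")
```

With the same alpha, the larger sample has a less conservative critical value and less sampling error, so it misses the real effect far less often.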
Inferential statistics: summary, 2
Plato's cave and the estimation of "reality"
Inferences about our observations: We use controlled data, statistics, and inferential logic to help us deduce the true forms of the world by minimizing the Type I and Type II errors that plague human judgment.

Inferential statistics: summary, Key terms
  Deductive v. Inductive link of theory / hypothetical constructs & data
  Generalizing results beyond the experiment
  Critical ratio (you will be asked to produce and describe this)
  Variance, variability in different distributions
  Degrees of Freedom [df]
  t-test, between- versus within-group variance
  Sampling distribution, M of the sampling distribution
  Alpha (α), critical value
  t table, general logic of calculating a t-test
  "Shared variance", positive / negative correlation
  General logic of calculating a correlation (mutual Z scores)
  Null hypothesis, Type I & Type II errors

Calculating variances
For each group:
  Add up the scores & divide by n to calculate the Mean.
  Express each score as a deviation from the Mean.
  Square all the deviation scores.
  Sum them (to get the "sum of squares").
  Divide by Degrees of Freedom [df]; n - 1.

X        M     X - M    (X - M)^2
4        4      0        0
3        4     -1        1
5        4      1        1
5        4      1        1
4        4      0        0
2        4     -2        4
4        4      0        0
4        4      0        0
3        4     -1        1
6        4      2        4
Σ = 40         Σ = 0    Σ = 12
n = 10, M = 4

Variance formula:

$S^2 = \frac{\sum (X - M)^2}{df}$
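The variance steps above can be traced in a few lines of Python. This is a minimal sketch using the ten scores from the table; the variable names are chosen for this illustration, and the final comparison with the standard library's `statistics.variance` is an added cross-check rather than part of the slide.

```python
import math
from statistics import variance as library_variance

# The ten scores from the table (n = 10).
scores = [4, 3, 5, 5, 4, 2, 4, 4, 3, 6]

n = len(scores)
m = sum(scores) / n                        # add up the scores, divide by n
deviations = [x - m for x in scores]       # each score as a deviation from M
squared = [d ** 2 for d in deviations]     # square the deviation scores
sum_of_squares = sum(squared)              # the "sum of squares"
df = n - 1                                 # degrees of freedom
s_squared = sum_of_squares / df            # variance

print(f"M = {m}, sum of deviations = {sum(deviations)}")
print(f"sum of squares = {sum_of_squares}, df = {df}")
print(f"variance = {s_squared:.2f}")

# Cross-check: the library's sample variance also divides by n - 1.
assert math.isclose(s_squared, library_variance(scores))
```

The deviations always sum to zero, which is why they are squared before being summed; here the sum of squares is 12 and the variance is 12 / 9.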