
Information for parents on the effectiveness of English secondary schools for different types of pupil

Lorraine Dearden, John Micklewright and Anna Vignoles
Institute of Education, University of London

Motivation
• DfE provides information on schools and pupil achievement in a number of ways, including raw scores
• DCSF also measures school performance with a contextualised value added (CVA) model, which takes account of the different pupil intakes of schools (Ray, 2006)
– a better guide to school effectiveness than raw GCSE scores, which capture differences in school intake characteristics
• But there is evidence that parents look more at raw scores than at CVA (Hansen and Machin, 2010)
• Our first objective is to find a simple measure that is easy for parents to understand

Motivation
• CVA also assumes that a school's average score is meaningful as a summary statistic of the school's performance
• Yet the literature has shown schools to be differentially effective (Jesson and Gray, 1991; Teddlie and Reynolds, 2000; Thomas et al., 1997; Wilson and Piebalga, 2008)
• Our second objective is to provide a simple measure which allows for differential effectiveness

Research aims
• If schools are differentially effective, then parents need to know the value added by a school for children with prior attainment similar to their own child's
• We propose a measure that would do this
• It abstracts from issues of sorting into schools and pupil mobility

Key research questions
• To what extent do summary measures of school performance, such as CVA, hide the differential performance of schools for different types of children?
• Are simple descriptive measures of the differential effectiveness of a school good enough approximations?
Literature
• We contribute to the following literatures:
– technical limitations of published school performance measures (Goldstein and Spiegelhalter, 1996; Leckie and Goldstein, 2009)
– measurement of differentially effective schools (Jesson and Gray, 1991; Teddlie and Reynolds, 2000; Thomas et al., 1997; Wilson and Piebalga, 2008)
– incentives for schools when performance measures are used to improve school accountability (Ladd and Walsh, 2000)

Methodology
• Divide pupils into prior attainment groups on the basis of KS2 scores (parents are only given group information)
• Calculate various measures of individual performance at GCSE for pupils in each of the KS2 prior attainment groups
• For each school, average across the values for its pupils in each prior attainment group, giving 8 summary statistics of pupil performance
• If these group averages vary significantly, the school is differentially effective

Data
• Integrated National Pupil Database (NPD)/Pupil Level Annual School Census (PLASC)
• Two cohorts of pupils in year 11 (age 16) in 2006/7 and 2007/8
• State school pupils for whom we have KS2 test scores

Prior attainment groups
• Key Stage 2 (KS2) English and mathematics attainment (age 10/11, year 6)
• The expected level of achievement is level 4
• The 5 x 5 combinations of mathematics and English levels are collapsed into 8 groups
• The eight groups are: below level 3; level 3-3; level 4-3; level 3-4; level 4-4; level 4-5; level 5-4; and level 5-5

KS2 prior attainment groups for year 11 children in state secondary schools in 2006/7 and 2007/8:
KS2 group        Frequency       %   Cumul. %
Below level 3       73,922     6.6        6.6
3-3                102,591     9.1       15.7
3-4                 73,063     6.5       22.2
4-3                 96,762     8.7       30.9
4-4                339,519    30.4       61.3
4-5                119,474    10.7       72.0
5-4                113,325    10.2       82.2
5-5                198,326    17.8      100.0
Total            1,116,982   100.0

Outcomes
• Capped GCSE scores, based on each pupil's 8 best GCSE results
• Points achieved in English and mathematics GCSE are added to the capped score
• This ensures that essential academic skills in mathematics and English are included
– if already present in the capped score, maths and English enter our measure twice
• This augmented capped score has recently been adopted in the official CVA model

Adjusted raw score measure
• An individual's KS4 score minus the mean for the other individuals in the same KS2 prior attainment group
• Similar to the value-added (VA) measure used by DCSF in 2002-5, except that:
– we use the mean group score rather than the median
– we use prior attainment groups rather than a univariate score
– we do not include science
– our KS4 measure is the capped 8 score augmented by English and maths rather than the straight capped 8 score
• DCSF summarised school performance by taking the average of these individual-level differences across all pupils in the school
• We calculate 8 separate averages for each school, one for each prior attainment group

VA and Adjusted VA measures
• The VA measure allows fully for prior attainment by estimating the following equation by group to predict expected KS4:
KS4_ig = a_g + b_g * KS2_ig + u_ig,   g = 1, ..., 8
• The CVA measure then allows for contextual factors by adding controls: gender, month of birth, IDACI, FSM, EAL, SEN, ethnicity

The six measures (absolute versions measured in KS4 points, relative versions in group KS4 standard deviations):
1. Group adjusted raw score, absolute (crudely allows for prior attainment group): diff = KS4 - KS4_mean; metric: KS4 points
2. Group adjusted raw score, relative: Z_KS4 = (KS4 - KS4_mean) / KS4_SD; metric: group KS4 SDs
3. VA, absolute (value added controlling for prior KS2 score): residual of a regression of KS4 on KS2; metric: KS4 points
4. VA, relative: residual of a regression of Z_KS4 on Z_KS2, where the latter is defined analogously [equivalent to measure 3 divided by KS4_SD]; metric: group KS4 SDs
5. Adjusted VA, absolute (value added with covariates): as for measure 3 but with controls in the regression; metric: KS4 points
6. Adjusted VA, relative: as for measure 4 but with controls in the regression; metric: group KS4 SDs

Group-level averages for an illustrative school (standard errors in brackets):

Group                Group adj. raw score   VA                 Cov. adjusted VA   No. Obs   % total
Whole school           15.8                   15.8               13.2               666
Group 22               34.0 [12.270]          28.5 [12.352]      35.2 [11.981]       46        6.9
Group 33               19.5 [12.695]          20.1 [12.603]      14.8 [11.899]       62        9.3
Group 34               28.4 [14.580]          27.6 [14.360]      20.9 [13.312]       34        5.1
Group 43               33.0 [12.970]          31.4 [11.968]      25.3 [11.721]       48        7.2
Group 44               21.4 [ 4.621]          21.1 [ 4.409]      17.5 [ 4.267]      225       33.8
Group 45               14.0 [ 9.064]          14.2 [ 8.544]      10.6 [ 8.238]       75       11.3
Group 54               15.4 [ 7.055]          13.9 [ 6.491]      12.2 [ 6.778]       78       11.7
Group 55              -11.6 [ 5.783]          -7.8 [ 5.231]      -5.8 [ 4.857]       98       14.7
P-value (groups same)   0.005                  0.02               0.039

Group-level averages for a second illustrative school (standard errors in brackets):

Group                Group adj. raw score   VA                  Cov. adjusted VA   No. Obs
Whole school             5.287 [ 4.478]        4.285 [ 3.517]      0.163 [ 3.363]    540
Group 22                45.606 [22.419]       42.111 [22.059]     33.712 [19.209]     23
Group 33                 2.720 [15.359]        4.538 [14.887]     -1.695 [13.988]     46
Group 34               -24.973 [19.419]      -25.092 [20.035]    -25.650 [19.396]     36
Group 43                23.643 [ 8.902]       27.316 [ 8.012]     24.274 [ 8.131]     60
Group 44                15.457 [ 5.652]       15.315 [ 5.256]      9.323 [ 4.933]    174
Group 45                 3.189 [ 9.956]        5.476 [ 9.043]      3.945 [ 8.297]     62
Group 54               -11.597 [11.686]      -12.362 [11.157]    -15.720 [11.487]     66
Group 55               -16.698 [ 6.374]      -11.367 [ 6.241]    -13.980 [ 6.066]     73
Groups average           4.808 [ 3.612]        6.058 [ 3.452]      1.924 [ 3.307]    540
P-value (groups same)    0.001                 0.001               0.002

How common is differential effectiveness?
This slide shows the % of schools that are differentially effective, as measured by a significant difference (at the 5% level) in the means of the measures across the prior attainment groups.
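A note on computation: the measures defined above are straightforward to calculate. The following is a minimal Python sketch, not the authors' actual code; the pupil records, group labels and scores are entirely made up for illustration. It computes measure 1 (group adjusted raw score), measure 2 (its standardised version) and measure 3 (the within-group VA residual).

```python
from statistics import mean, pstdev

# Hypothetical pupil records: (school, ks2_group, ks2_score, ks4_score).
pupils = [
    ("A", "44", 27.0, 340.0), ("A", "44", 28.0, 355.0),
    ("B", "44", 27.5, 360.0), ("B", "44", 26.5, 320.0),
    ("A", "55", 33.0, 420.0), ("B", "55", 34.0, 450.0),
]

def group_adjusted_scores(pupils, group):
    """Measures 1 and 2: each pupil's KS4 score minus the mean KS4 score
    of the KS2 prior attainment group, in KS4 points and in group SDs.
    Returns {pupil index: (diff, z)}."""
    ks4 = [p[3] for p in pupils if p[1] == group]
    m, sd = mean(ks4), pstdev(ks4)
    return {i: (p[3] - m, (p[3] - m) / sd)
            for i, p in enumerate(pupils) if p[1] == group}

def va_residuals(pupils, group):
    """Measure 3: residuals of an OLS regression of KS4 on KS2,
    estimated separately within the prior attainment group
    (KS4_ig = a_g + b_g * KS2_ig + u_ig)."""
    obs = [(p[2], p[3]) for p in pupils if p[1] == group]
    mx = mean(x for x, _ in obs)
    my = mean(y for _, y in obs)
    b = sum((x - mx) * (y - my) for x, y in obs) / \
        sum((x - mx) ** 2 for x, _ in obs)
    a = my - b * mx
    return [y - (a + b * x) for x, y in obs]
```

A school's group-level summary statistic is then simply the mean of these pupil-level values over its own pupils in the group; measures 5 and 6 would add the contextual controls to the regression.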
% of schools differentially effective at the 5% level:

Dependent variable    Absolute   Relative
Raw score             40.0%      35.2%
VA                    37.9%      31.7%
CVA                   31.7%      25.0%
Number of schools     3096

Differential effectiveness and selective schools
This slide shows the % of schools that are differentially effective, including and excluding selective schools (absolute measures).

Dependent variable    Incl. selective   Excl. selective
Raw score             40.0%             37.0%
VA                    37.9%             35.2%
CVA                   31.7%             29.8%
Number of schools     3096              2932

Robustness test
This slide shows the % of schools that are differentially effective as measured by a significant difference at both the 5% level and the 1% level in the means of the measures across the prior attainment groups.

Dependent variable    5% significance   1% significance
Raw score             37.0%             23.4%
VA                    35.2%             21.6%
CVA                   29.8%             17.0%
Number of schools     2932

Rank correlations within group

Group      Raw/VA   Raw/CVA   VA/CVA
Group 22    0.99     0.92      0.93
Group 33    0.99     0.91      0.92
Group 34    0.99     0.88      0.89
Group 43    0.99     0.91      0.92
Group 44    0.99     0.87      0.89
Group 45    0.98     0.86      0.89
Group 54    0.98     0.89      0.91
Group 55    0.97     0.86      0.90

Value added rank correlations excluding selective schools

            Group 22   Group 33   Group 44   Group 55
Group 22      1.00
Group 33      0.68       1.00
Group 44      0.58       0.71       1.00
Group 55      0.37       0.49       0.71       1.00
Average       0.70       0.82       0.94       0.74

Robustness checks
• Sample size issues, so results were re-estimated for schools with n > 10 in each prior attainment group
• Robustness to missing data problems, using teacher predictions

Things to do...
• Multiple comparisons with the best / comparison statistics
• Noise in rank correlations

Conclusions
• Schools are differentially effective, but estimates are sensitive to how this is measured
– 30-40% of schools are differentially effective at the 5% level of significance
– around 20% of schools are differentially effective at the 1% level of significance
– estimates vary somewhat across measures (raw scores, VA, adjusted VA), though the correlations between measures are high (0.86-0.99)
• Even the most conservative estimate suggests one in six schools is differentially effective

Conclusions
• For school league tables (and hence parents) this differential effectiveness would seem to matter
– the rank of schools varies substantially across prior attainment groups (correlations across groups of 0.3-0.7)
– this of course abstracts from the statistical significance of the differences
• The results suggest that for a non-trivial proportion of schools, parents need information on the value added by the school for a particular prior attainment group

Implications
• Simple measures also suggest significant amounts of differential effectiveness, but as estimates vary by measure we need to specify a preferred measure
• Results indicate different rankings of schools for different ability groups, but further work is needed on multiple comparisons and on identifying significant differences in rank correlations
• Implications for policy: a sizeable minority of schools add different value for pupils with different prior attainment, and there are simple measures that can communicate this to parents

References
Goldstein, H. and Spiegelhalter, D. J. (1996) 'League tables and their limitations: statistical issues in comparisons of institutional performance', Journal of the Royal Statistical Society: Series A, 159, 385-443.
Goldstein, H., Rasbash, J., Yang, M., Woodhouse, G., Pan, H., Nuttall, D. and Thomas, S. (1993) 'A multilevel analysis of school examination results', Oxford Review of Education, 19, 425-433.
Gorard, S. (2010) 'All evidence is equal: the flaw in statistical reasoning', Oxford Review of Education (forthcoming).
Jesson, D. and Gray, J. (1991) 'Slants on slopes: using multi-level models to investigate differential school effectiveness and its impact on pupils' examination results', School Effectiveness and School Improvement, 2(3), 230-247.
Ladd, H. and Walsh, R. (2000) 'Implementing value-added measures of school effectiveness: getting the incentives right', Economics of Education Review, 2(1), 1-17.
Leckie, G. and Goldstein, H. (2009) 'The limitations of using school league tables to inform school choice', Journal of the Royal Statistical Society: Series A, 172(4), 835-852.
Ray, A. (2006) School Value Added Measures in England. Paper for the OECD Project on the Development of Value-Added Models in Education Systems. London: Department for Education and Skills. http://www.dcsf.gov.uk/research/data/uploadfiles/RW85.pdf
Teddlie, C. and Reynolds, D. (2000) The International Handbook of School Effectiveness Research. London and New York: Falmer Press.
Thomas, S., Sammons, P., Mortimore, P. and Smees, R. (1997) 'Differential secondary school effectiveness: examining the size, extent and consistency of school and departmental effects on GCSE outcomes for different groups of students over three years', British Educational Research Journal, 23(4), 451-469.
Wilson, D. and Piebalga, A. (2008) 'Performance measures, ranking and parental choice: an analysis of the English school league tables', International Public Management Journal, 11, 233-266.
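A closing sketch, in the spirit of a backup slide: the rank correlations reported earlier compare how schools are ordered on a value added measure within different prior attainment groups. Below is a minimal Python illustration (not the authors' code); the school-level VA averages are made-up numbers, and the simple d-squared formula assumes there are no tied ranks.

```python
def rank(values):
    """Rank values from 1 (highest) to n (lowest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman(x, y):
    """Spearman rank correlation via the classic
    1 - 6 * sum(d^2) / (n * (n^2 - 1)) formula (valid without ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank(x), rank(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical school-level VA averages for two prior attainment groups
# (five schools; numbers are invented for illustration).
va_group_44 = [21.1, 15.3, -4.0, 9.8, 30.2]
va_group_55 = [-7.8, -11.4, 2.1, 12.0, 25.5]
rho = spearman(va_group_44, va_group_55)  # low rho: rankings differ by group
```

A low cross-group correlation, as in the 0.3-0.7 range reported above, indicates that a school's league-table position depends on which prior attainment group it is judged on.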