OCCASIONAL PAPER

VET program completion rates: an evaluation of the current method

National Centre for Vocational Education Research

Publisher’s note

The views and opinions expressed in this document are those of the author/project team and do not necessarily reflect the views of the Australian Government, or state and territory governments. Any interpretation of data is the responsibility of the author/project team.

To find other material of interest, search VOCEDplus (the UNESCO/NCVER international database <http://www.voced.edu.au>) using the following keywords: completion; data analysis; evaluation; outcomes; participation; qualifications; statistical method; vocational education and training.

© National Centre for Vocational Education Research, 2016

With the exception of cover design, artwork, photographs, all logos, and any other material where copyright is owned by a third party, all material presented in this document is provided under a Creative Commons Attribution 3.0 Australia <http://creativecommons.org/licenses/by/3.0/au>.

This document should be attributed as NCVER 2016, VET program completion rates: an evaluation of the current method, NCVER, Adelaide.

NCVER is an independent body responsible for collecting, managing and analysing, evaluating and communicating research and statistics about vocational education and training (VET).

NCVER’s in-house research and evaluation program undertakes projects which are strategic to the VET sector. These projects are developed and conducted by NCVER’s research staff and are funded by NCVER. This research aims to improve policy and practice in the VET sector.
COVER IMAGE: GETTY IMAGES/iStock

ISBN 978 1 925173 71 0
TD/TNC 126.07

Published by NCVER, ABN 87 007 967 311
Level 11, 33 King William Street, Adelaide SA 5000
PO Box 8288 Station Arcade, Adelaide SA 5000, Australia
Phone +61 8 8230 8400 Fax +61 8 8212 3436
Email [email protected]
Web <http://www.ncver.edu.au> <http://www.lsay.edu.au>

About the research

VET program completion rates: an evaluation of the current method
National Centre for Vocational Education Research

The premise of this work is a simple question: ‘how reliable is the method used by NCVER to estimate projected rates of VET program completion?’ In other words, how well do early projections align with actual completion rates some years later?

Completion rates are simple to calculate for a cohort of students who start together in a very short program with a defined end date. The context in vocational education and training (VET) is, however, far more complex. Program lengths vary and may span several years, students commence at different times and many study part-time. Waiting for all students to complete or ‘drop out’ of their training before calculating an actual completion rate gives a reliable answer, but is impractical.

This paper summarises the key findings from a technical review of the validity of the method long used by NCVER in estimating projected completion rates for government-funded VET programs. This analysis required the interrogation of large longitudinal data sets with tens of millions of enrolments over multiple years. Whilst the underlying work is complex, the outcomes are revealing, given the consistently high interest in completion rates as measures of the efficiency and effectiveness of the VET sector.
Key findings

The method long used by NCVER for estimating VET program completion rates using data from the National VET Provider Collection is shown to be reliable and aligns well with actual rates of completion for historical estimates. One of the advantages of the methodology is that it can be readily applied to subsets of the data based on student demographics or attributes of the training.

Given that it takes a number of years for actual rates of completion to stabilise, the method is well suited for inclusion as part of any approach to assessing completion rates, where the projected completion rate method is used to estimate rates for the most recent years and actual rates are used for prior years.

The technical review has also shown that the current predictive method can be improved by defining a program’s commencing year as the year it first appears in the National VET Provider Collection rather than using the commencing flag variable.

It is anticipated that the incorporation of unique student identifiers into any preferred methodology, and its extension to total VET activity, can be phased in from the collection of 2017 training activity.

Dr Craig Fowler
Managing Director, NCVER

Contents

About the research
Tables and figures
Introduction
How does NCVER currently derive VET completion rates?
    How the projected rates are currently calculated
How accurate are the current estimates of completion rates?
Reviewing the methodology
Conclusion
References
Appendix
    The current method for projecting rates of completion: a working example

Tables and figures

Tables
1 Breakdown of where commencing flag = ‘Y(es)’ within three-year matched datasets centred around years of interest (% of total)
2 Comparison of Mean Absolute and Mean Squared errors based on projected completion rates estimated using the current and revised approach, 2008–13
3 Actual rates of completion by collection year (%)
A1 Breakdown of student program enrolments within the three-year matched dataset centred around 2014
A2 2014 program enrolment status and their transitioning 2015 equivalents (number)
A3 Proportion of 2014 program enrolments transitioning to 2015 statuses

Figures
1 Comparison of current projected and actual program completion rates, 2008–15 (%)
2 Comparison of projected program completion rates (current and revised) against actual rates of completions, 2008–15 (%)

Introduction

The Australian vocational education and training (VET) system provides training across a wide range of subject areas for students of all ages and backgrounds. The training is delivered through a variety of training institutions and enterprises (including to apprentices and trainees), and students may study individual subjects or full programs that lead to formal program completions. This diversity presents a challenge for the VET sector in devising indicators of efficiency and effectiveness, such as VET completion rates, the focus of this paper.

There are two fundamental concepts associated with deriving completion rates. The first concerns subject-completion¹ rates, which are straightforward and are routinely published in the Productivity Commission’s Report on government services (2016).
This rate is simply the proportion of subjects undertaken that are successfully completed, based on hours of training. The second concept, the rate at which programs or qualifications are completed, is more problematic.

The difficulties arise in two areas. The first is technical: deriving the rate is far from straightforward because the VET system has only recently introduced a unique student identifier (USI), which can be used to track a student’s training activity from commencement through to completion, and because the date on which a student commenced a qualification is not well defined. The second issue concerns the interpretation of a program-completion rate, as many individuals undertake particular VET subjects with a view to obtaining particular skills rather than obtaining a complete qualification. Because some of these students are reported to the National VET Provider Collection as enrolled in qualifications, the enrolment data overestimate the actual number of qualifications being undertaken, while completion rates underestimate the number of qualifications being completed. Notwithstanding, it is widely agreed that the sector needs information on the rate of program completion and a methodology with which to derive it.

In an occasional paper published by the National Centre for Vocational Education Research (NCVER) in 2012, Bednarz examined completion rates and explained how they are defined and calculated. In terms of a definition for completion rates, Bednarz (2012, p.7) notes that:

    The most intuitive definition of a completion rate is that it is simply the proportion of students who finish the course they started. For example, if 100 students started a course in 2005, and 27 of those students went on to complete their course, we’d say that the completion rate for 2005 is 27%.
As Bednarz (2012) explains, in an ideal world we would wait for all courses to finish before calculating the actual rate of completion, noting that some courses can take several years to complete and many students undertake part-time study, both of which extend completion dates. Thus, because ‘we potentially have to wait many years to ensure all students have had the opportunity to complete’ (Bednarz 2012, p.7), determination of actual completion rates can be delayed significantly, reducing the usefulness of the data.

To overcome this issue, NCVER has derived a methodology for estimating projected program completion rates. The methodology is presented in Mark and Karmel (2010). It applies probability theory to the National VET Provider Collection data, specifically to the status of program enrolments across successive years, to derive the probability that a commencing VET program enrolment will eventually be completed.

NCVER has long published completion rates of government-funded² VET programs in Australia for a number of VET sub-populations using this technique, including those relating to states and territories, program level and broad fields of education. Rates are also published for the sub-population of full-time students aged 25 years and under with no prior post-school program completion.

Ongoing interest in completion rates as measures of the efficiency and effectiveness of the VET sector has prompted NCVER to undertake a review of the long-used methodology to examine its validity. This paper summarises the findings of this technical review and makes some recommendations for its improvement and the future publication of completion rates.

¹ Load pass rate in the terminology of the VET sector.
² Government-funded VET is broadly defined as all programs delivered by government providers and government-funded programs delivered by community and other registered providers.
How does NCVER currently derive VET completion rates?

To explain NCVER’s current approach to deriving completion rates, we again borrow from Bednarz (2012). NCVER reports completion rates at several different levels; that is, for courses, subjects, apprentices and trainees, and specific sub-groups of students. To estimate completion rates, we need to track particular components, or entities, of these (for example, courses, subjects, contracts of training, or individual students) from their commencement. A group of entities that started at the same time is referred to as a ‘commencing cohort’. This paper is concerned with completion rates for VET qualifications and the methodology used to derive them.

NCVER currently publishes two sets of completion rates: program completion rates and subject completion rates. Bednarz (2012, p.7) offers a useful starting point for our definitions of program completion rate and subject completion rate, noting that the terms ‘program’, ‘qualification’ and ‘course’ are used interchangeably throughout this paper.

    A program completion rate is the proportion of VET programs started in a given year that will eventually be completed. It is also referred to as a qualification or course completion rate.

Subject completion rates

A VET program comprises a number of subjects, also referred to as ‘modules’ or ‘units of competency’. NCVER also reports subject completion rates, termed ‘load pass rates’. Unlike the program completion rate, the subject load pass rate needs to be weighted because subjects are of different lengths. Determination of the subject completion rate is therefore based on the annual hours (or full year training equivalent, FYTE) for each assessable module or unit of competency.
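The effect of weighting by hours can be illustrated with a minimal sketch. It assumes the load pass rate is the hours attached to passed subjects divided by the hours attached to all subjects passed, failed or withdrawn, following the Bednarz (2012) definition. The record layout, the outcome labels and the hours are invented for illustration and are not the fields of the National VET Provider Collection.

```python
from dataclasses import dataclass


@dataclass
class SubjectEnrolment:
    hours: float   # annual hours attached to the subject (hypothetical field)
    outcome: str   # "passed", "failed" or "withdrew" (labels assumed here)


ASSESSED = {"passed", "failed", "withdrew"}


def load_pass_rate(enrolments):
    """Hours passed divided by hours for subjects passed, failed or withdrawn."""
    assessed = [e for e in enrolments if e.outcome in ASSESSED]
    total_hours = sum(e.hours for e in assessed)
    if total_hours == 0:
        raise ValueError("no assessed hours to divide by")
    passed_hours = sum(e.hours for e in assessed if e.outcome == "passed")
    return passed_hours / total_hours


# Illustrative data only: two short subjects passed, one long subject failed.
example = [
    SubjectEnrolment(hours=40, outcome="passed"),
    SubjectEnrolment(hours=120, outcome="failed"),
    SubjectEnrolment(hours=40, outcome="passed"),
]

weighted = load_pass_rate(example)   # 80 hours passed out of 200 assessed
unweighted = sum(e.outcome == "passed" for e in example) / len(example)
```

In this invented example two of three subjects are passed, yet the hours-weighted rate is lower because the failed subject is the longest one, which is the reason the weighting matters.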
A subject load pass rate is defined by Bednarz (2012, p.8) as follows:

    A subject load pass rate is the ratio of hours studied by students who passed their subject(s) to the total hours committed to by all students who passed, failed or withdrew from the corresponding subject(s).

How the projected rates are currently calculated

As highlighted in this paper’s introduction, to calculate the true program completion rate, we need to wait for all students who started a program in a given period to either complete or drop out of the program; that is, we need to track each program from start to finish. Only when all programs are accounted for will we know the final program completion rate. Unfortunately, this can take years, as some programs are scheduled for two or three years and take even longer if undertaken on a part-time basis.

There is a further problem: even if we wait for the programs to finish (either completed or withdrawn), completions are not always reported immediately to the National VET Provider Collection. This delay in reporting means that completions occurring in a given year or quarter might take another year or longer to be reported. Not surprisingly, the longer we wait, the more accurate the completion rate becomes, although, as time goes by, the data become less relevant, making the information less useful for performance evaluation.

While the direct approach of tracking programs from start to finish is adequate for tracking historic rates of completion, the need remains to derive projected completion rates for the most recent years. As a result, NCVER has developed a methodology for estimating projected program completion rates using data from the National VET Provider Collection. The data used provide information on the status of program enrolments across successive years.
While the National VET Provider Collection is essentially a cross-sectional database by year, it contains enough information to match data across years for individual VET students and the programs they undertake. The resulting matched longitudinal dataset allows the use of mathematical techniques that rely on conditional probabilities to calculate the anticipated rates of completion.

The current methodology, which has been used by NCVER for some time, is presented in Mark and Karmel (2010). This approach uses information about program enrolments over a three-year window (centred on the year of interest), together with the theory of absorbing Markov chains, to derive the probability that a commencing VET program enrolment will eventually be completed. The advantage of Markov chain theory lies in its defining property: the probability of an entity ‘transitioning’ from one status to another in successive time periods depends only on its current status, not on past transitions. This means we can use knowledge of the ‘status’ of program enrolments across successive years to predict the long-term program completion rate without having the full history of all program enrolments. Another advantage of the methodology is that it can be readily applied to subsets of the data based on student demographics or attributes of the training.

To obtain these statuses, student and program information is matched across a three-year window, centred on the year of interest. Here, the year of interest is year n, the prior year is year n–1 and the following year is year n+1. The first two years of data (years n–1 and n) are used to determine the status of program enrolments for the year of interest. The last two years (years n and n+1) are used to determine the status of program enrolments for the following year.
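The absorbing Markov chain idea can be sketched in a few lines. This is an illustration under stated assumptions, not NCVER’s implementation: the four status names and the year-to-year counts are hypothetical, and the statuses actually used by Mark and Karmel (2010) are set out in the appendix rather than in this section. With Q the transition probabilities among the non-final statuses and r the one-step probabilities of completing, the eventual completion probabilities x satisfy (I - Q)x = r.

```python
def row_proportions(counts):
    """Turn year-n to year-n+1 status counts into transition proportions."""
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }


def absorption_probabilities(transitions, transient, target):
    """Probability of eventually reaching `target` from each transient status.

    Solves (I - Q) x = r by Gauss-Jordan elimination, where Q holds the
    transient-to-transient proportions and r the one-step proportions of
    moving straight to `target`.
    """
    n = len(transient)
    rows = []
    for i, s in enumerate(transient):
        row = [(1.0 if i == j else 0.0) - transitions[s].get(t, 0.0)
               for j, t in enumerate(transient)]
        row.append(transitions[s].get(target, 0.0))   # augmented column r
        rows.append(row)
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(rows[k][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("a transient status never leaves the system")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        for k in range(n):
            if k != col:
                factor = rows[k][col]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[col])]
    return {s: rows[i][n] for i, s in enumerate(transient)}


# Hypothetical counts of enrolments by status in year n (rows) and n+1 (columns).
counts = {
    "commencing": {"continuing": 500, "completed": 200, "discontinued": 300},
    "continuing": {"continuing": 400, "completed": 300, "discontinued": 300},
}
probs = absorption_probabilities(
    row_proportions(counts), ["commencing", "continuing"], "completed"
)
projected_rate = probs["commencing"]   # projected completion rate for commencers
```

Only one year of transitions is needed because the Markov property lets the same proportions be reapplied to every later year; the projected rate is the value for the commencing status.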
Once this is done, we can cross-tabulate the status of program enrolments for the year of interest with those of the following year to calculate the proportions transitioning from one status to another, and use these to determine the likelihood that any program enrolment commencing in the year of interest will eventually be completed. To illustrate this process in more detail, a working example is presented in the appendix.

How accurate are the current estimates of completion rates?

The title of this chapter asks a very important question, but it is by no means an easy one to answer, as it requires tracking every student enrolment from start to finish. While there is enough information to match data across years, a number of inherent data issues limit the accuracy of the tracking process. Foremost amongst these is the fact that NCVER holds an encrypted identifier for each student rather than actual names and addresses. This means we cannot be 100% certain we are following the same student over time. For example, if a student gets married and changes their name, they will get a different encrypted ID based on their new name. Also, if a student starts a course with one training provider and completes it with another, relating this activity to the same individual may not be possible. It is anticipated that the recent introduction of the unique student identifier (USI) will overcome this issue, although it will take some years before all program enrolments in the system have an associated USI. Additional complications arise from the lack of reliable information on the actual start date of the program enrolment, an issue discussed further in the next chapter.

Notwithstanding these inherent data issues, it is possible to assess the accuracy of the completion rates derived using the Mark and Karmel (2010) method by matching, as best we can, student program enrolment information across the collection years from commencement to completion.
By taking the year in which a program enrolment first appears as a pseudo starting year and matching records across collection years by unique encrypted ID, sex, date of birth and course identifier, we can derive estimates of actual qualification completion rates for enrolments flagged as commencing in a particular year.³ The derived ‘actual’ rates of completion are shown in figure 1, together with the latest projected rates based on the Mark and Karmel (2010) method.

Figure 1 Comparison of current projected and actual program completion rates, 2008–15 (%)

[Figure: line chart by collection year, 2008 to 2015; vertical axis from 20 to 40 per cent. Series: NCVER actual (as at 2015 collection, based on commencing year = first year appears in the collection); current NCVER rates projection method.]

Source: National VET Provider Collection, 2015.

³ As some encrypted IDs have multiple client IDs connected to them, the ‘actual’ rates have been based on unique NCVER encrypted IDs comprising only a single client identifier.
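The matching step behind the ‘actual’ rates can be sketched as follows. This is a simplified illustration: the tuple layout, the field values and the completion flag are hypothetical stand-ins rather than the collection’s real schema, and the sketch ignores the restriction to encrypted IDs with a single client identifier. Records are grouped on the four matching fields named above, the first year a group appears is treated as its pseudo starting year, and a group counts as completed if any of its records reports a completion.

```python
from collections import defaultdict


def actual_completion_rates(records):
    """Completion rate by pseudo starting year.

    Each record is a tuple (hypothetical layout):
    (collection_year, encrypted_id, sex, date_of_birth, course_id, completed)
    """
    first_year = {}
    completed = set()
    for year, enc_id, sex, dob, course_id, done in records:
        key = (enc_id, sex, dob, course_id)        # the four matching fields
        first_year[key] = min(year, first_year.get(key, year))
        if done:
            completed.add(key)

    started = defaultdict(int)
    finished = defaultdict(int)
    for key, year in first_year.items():
        started[year] += 1
        if key in completed:
            finished[year] += 1
    return {year: finished[year] / started[year] for year in sorted(started)}


# Invented records for illustration only.
records = [
    (2008, "a1", "F", "1990-02-01", "C1", False),
    (2009, "a1", "F", "1990-02-01", "C1", True),
    (2008, "b2", "M", "1985-07-15", "C1", False),
    (2009, "c3", "F", "1992-11-30", "C2", False),
    (2010, "c3", "F", "1992-11-30", "C2", True),
    (2009, "d4", "M", "1988-03-09", "C2", True),
    (2009, "e5", "M", "1991-01-20", "C3", False),
]
rates = actual_completion_rates(records)
```

Because a completion can be reported in any later collection year, rates computed this way for recent starting years keep rising as new collections arrive, which is why the most recent years rely on projection instead.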