UNEMPLOYMENT INSURANCE DATA VALIDATION HANDBOOK

Benefits

OFFICE OF UNEMPLOYMENT INSURANCE
DEPARTMENT OF LABOR
NOVEMBER 2009

OMB No: 1205-0431
OMB Expiration Date: July 31, 2011
Estimated Average Response Time: 550 hours

OMB Approval. The reporting requirements for ETA Handbook 361 are approved by OMB according to the Paperwork Reduction Act of 1995 under OMB No. 1205-0431, to expire July 31, 2011. The respondents' obligation to comply with the reporting requirements is required to obtain or retain benefits (Section 303(a)(6), SSA). Persons are not required to respond to this collection of information unless it displays a currently valid OMB control number.

Burden Disclosure. SWA response time for this collection of information is estimated to average 550 hours per response (the average of a full validation every third year, with an estimated burden of 900 hours, and partial validations in the two intervening years), including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to the U.S. Department of Labor, Employment and Training Administration, Office of Workforce Security (Attn: Burman Skrable), 200 Constitution Avenue, NW, Room S-4522, Washington, D.C. 20210 (Paperwork Reduction Project 1205-0431).

TABLE OF CONTENTS

INTRODUCTION
 A. PURPOSE
 B. DATA ERRORS IDENTIFIED THROUGH VALIDATION
 C. DATA SOURCES FOR FEDERAL REPORTING AND VALIDATION
 D. BASIC VALIDATION APPROACH
 E. RECONSTRUCTING FEDERAL REPORT ITEMS
 F. VALIDATION TECHNIQUES AND SOURCES
 G. HANDBOOK OVERVIEW
 H. OVERVIEW OF THE DATA VALIDATION METHODOLOGY

MODULE 1 – REPORT VALIDATION
 A. PURPOSE
 B. METHODOLOGY
 C. OVERVIEW OF MODULE 1

MODULE 2 – DATA ELEMENT VALIDATION
 A. PURPOSE
 B. METHODOLOGY
 C. OVERVIEW OF MODULE 2
MODULE 3 – DATA ELEMENT VALIDATION STATE SPECIFIC INSTRUCTIONS
 A. PURPOSE
 B. METHODOLOGY

MODULE 4 – QUALITY SAMPLE VALIDATION
 A. PURPOSE
 B. SAMPLE SIZE
 C. SAMPLE SELECTION
 D. SAMPLE UNIVERSE
 E. SAMPLE VALIDATION
 F. RESULTS

APPENDIX A – SUBPOPULATION SPECIFICATIONS
APPENDIX B – SAMPLE SPECIFICATIONS
APPENDIX C – INTERSTATE FILED FROM AGENT RECORDS
APPENDIX D – COMBINED WAGE CLAIMS AND PAYMENTS
APPENDIX E – INDEPENDENT COUNT VALIDATION

INTRODUCTION

A. Purpose

States report Unemployment Insurance (UI) data to the U.S. Department of Labor (DOL) on a monthly and quarterly basis under the Unemployment Insurance Required Reports (UIRR) system. The UIRR data are used for gathering economic statistics, allocating UI administrative funding, measuring state performance, and accounting for fund utilization. It is therefore important that states report UIRR data accurately and uniformly.

The purpose of the Data Validation (DV) program is to verify the accuracy of the UIRR data; this handbook covers the part of the program that validates benefits data. In the DV program, states validate their own data and report the results of the validation to the Employment and Training Administration (ETA). This handbook provides general instructions on how to validate the data as well as individual instructions for each state (referred to as Module 3). States use the DV software provided by DOL to conduct the validation and submit results. Table A shows the general types of UIRR data to be validated, the federal ETA reports on which the data appear, and the areas where the data are used.

States are required to validate reported data every third year, except for data elements used to calculate Government Performance and Results Act (GPRA) measures, which must be validated annually. Items that do not pass validation must be revalidated the following year. The "validation year" coincides with the State Quality Service Plan (SQSP) performance year and covers data for any reporting period during the twelve months beginning April 1 and ending March 31. Results must be submitted to the National Office by June 10, which allows sufficient time for the data validation results to be included in the SQSP process.
States that fail DV or do not submit their DV results by the established deadline must address these deficiencies through the SQSP.

Table A
General Types of Data to be Validated

Data Type                       ETA Report(s)           Areas Where the Data Are Used
Weeks Claimed                   5159                    Economic Statistics; Funding Allocation/Workload
Final Payments                  5159, 218               Economic Statistics
Claims and Claims Status (1)    5159, 218, 586          Economic Statistics; Funding Allocation/Workload; Performance; Claimant Eligibility
Payments                        5159, 586, 9050, 9051   Performance; Monitor Trust Fund Activity
Nonmonetary Determinations/
  Redeterminations              207, 9052               Funding Allocation/Workload; Performance; Claimant Eligibility
Appeals                         5130, 9054, 9055        Funding Allocation/Workload; Performance; Claimant Eligibility
Overpayments                    227                     Claimant Eligibility; Monitor Trust Fund Activity

(1) The ETA 539, Weekly Claims Activity Report, is not validated. However, states are strongly encouraged to compare the total claims reported on the 539 to those on the 5159 report for the same period to determine whether the counts reported are the same or within ±2%. If the counts differ by more than ±2%, the state should investigate and advise its federal regional office of its findings and what it is doing to reconcile the differences.

B. Data Errors Identified Through Validation

Systematic and random errors are the two major types of data errors that can occur in federal UIRR reports. Systematic errors involve faulty design or execution of reporting programs; random errors involve input and judgment errors. Reporting system errors are always systematic, while errors stemming from human judgment can be either systematic or random. The DV program attempts to identify both types of errors.

Systematic errors are addressed through validation of the reporting programs that states use to create federal reports. These errors tend to be constant and fall into one of three categories: 1) too many transactions (overcounts), 2) too few transactions (undercounts), or 3) transactions that are misclassified. Systematic human errors occur when staff use incorrect definitions or procedures. For example, a reporting unit may establish its own definition for a data element that deliberately or inadvertently conflicts with the federal definition. Systematic errors are the most serious because they occur repeatedly, but they are also the easiest to detect and correct. In most cases, systematic errors do not need to be assessed very frequently.

Random errors are addressed through validation of a random sample of transactions, evaluating the accuracy of the data elements stored in the database. Random errors tend to be sporadic and are caused by human input or judgment. They fall into one of three categories: 1) input errors, 2) judgment errors (as in nonmonetary determinations, status determinations, and appeals), or 3) inconsistent application of state definitions or procedures.

Consistent and accurate reporting requires both good systems and good data; hence, the validation objective has not been achieved unless both systems and data have been validated.

C. Data Sources for Federal Reporting and Validation

Some states produce the federal reports directly from the state database: computer programs scan the entire database to select, classify, and count transactions. Other states produce a database extract or flat file as transactions are processed, essentially keeping a running count of items to be tabulated for the federal reports. Still other states use a combination of these methods.
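To make the two production methods concrete, the following is a minimal sketch in Python. It is illustrative only: the table name benefit_payments, the columns mail_date and payment_type, and the pipe-delimited flat-file layout are hypothetical stand-ins, not features of any actual state system.

    import sqlite3

    def count_by_database_scan(db_path, month_start, month_end):
        # Method 1: scan the benefits database at the close of the
        # reporting period, selecting and counting reportable transactions.
        con = sqlite3.connect(db_path)
        (count,) = con.execute(
            "SELECT COUNT(*) FROM benefit_payments "   # hypothetical table
            "WHERE mail_date BETWEEN ? AND ? "
            "AND payment_type = 'first'",               # hypothetical column and code
            (month_start, month_end),
        ).fetchone()
        con.close()
        return count

    def count_by_running_tally(daily_flat_files):
        # Method 2: accumulate a running count from flat-file records
        # written as each transaction was processed.
        tally = 0
        for path in daily_flat_files:
            with open(path) as f:
                for line in f:
                    fields = line.rstrip("\n").split("|")  # hypothetical pipe-delimited layout
                    if fields[2] == "first":               # third field holds the payment type here
                        tally += 1
        return tally

Under the first method the count reflects the database as of the snapshot date; under the second, it reflects transactions as they were originally processed, even if they were later modified or deleted. This difference drives the validation variations discussed below.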
Although states use different methods to prepare federal reports, the validation approach is the same in all cases: states support their reported figures by reconstructing the reported transactions. The validation methodology is flexible in accommodating the different systems that states use. However, validation is most effective when validation data are extracted directly from the state benefits database. For cost reasons and to minimize changes in data over time, some states prefer to use daily, weekly, or monthly flat files instead. When flat files are used, system errors may occur: reportable transactions may be improperly excluded from the master file, or the flat file may contain corrupt data. The only way to identify these problems is to independently reconstruct or query the master database.

States that prepare validation files from the same files used to produce the UIRR, rather than directly from the database, must ensure that these files contain all the appropriate transactions by recreating the logic used to produce the federal reports. This handbook includes a validation tool, "independent count validation," specifically for this purpose. The state programming staff must determine the specific type of independent count (simple query, multiple queries, cross tabulation); a sketch of the simple-query form appears after Table B. There is, however, no way to accurately reconstruct the report count when the flat file contains transactions that are no longer present in the database (e.g., when it includes a claim or type-of-claim designation deleted from the main database after a corrected determination is made for the same claimant).

Table B outlines variations in the validation methodology, based on typical state approaches to reporting and data validation. To determine the specific validation methodology to be implemented, the validator should identify the state's reporting and validation sources for each population.

Table B
Variations in Validation Methodologies Based on State Approaches to Reporting and Reconstruction

Scenario 1. Transactions overwritten on database: No.
  UIRR reports: count program, run against the database, snapshot* timing.
  Data validation: extract file** program, run against the database, snapshot timing.
  Independent count required: No. Documentation review required: No.
  Comments: Best scenario, because comparing snapshots eliminates timing discrepancies.

Scenario 2. Transactions overwritten on database: No.
  UIRR reports: count program, run against a flat file***, daily timing.
  Data validation: extract file program, run against the database, snapshot timing.
  Independent count required: No. Documentation review required: No.
  Comments: The database is the only reconstruction source. There could be changes in transaction characteristics (but the extract will find all transactions).

Scenario 3. Transactions overwritten on database: No.
  UIRR reports: extract file program, run against the database, snapshot timing.
  Data validation: extract file program, run against the database, snapshot timing.
  Independent count required: Yes. Documentation review required: No.
  Comments: Reporting and validation are the same program; the independent count may mirror that program.

Scenario 4. Transactions overwritten on database: No.
  UIRR reports: extract file program, run against a flat file, daily timing.
  Data validation: extract file program, run against a flat file, daily timing.
  Independent count required: Yes. Documentation review required: Yes.
  Comments: Since transactions are not overwritten, states should be able to do Scenario 2 instead.

Scenario 5. Transactions overwritten on database: Yes.
  UIRR reports: extract file program, run against a flat file, daily timing.
  Data validation: extract file program, run against a flat file, daily timing.
  Independent count required: NA. Documentation review required: NA.
  Comments: No alternative validation source; cannot reconstruct from the database. Not a thorough validation.

Scenario 6. Transactions overwritten on database: Yes.
  UIRR reports: count program, run against a flat file, daily timing.
  Data validation: must create a daily extract file.
  Independent count required: NA. Documentation review required: NA.
  Comments: Cannot reconstruct from the database; must change the reporting process to Scenario 5.

* All records in the database on the last day of the reporting period.
** File constructed directly from the database.
*** File with accumulated records used for ETA reports.
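The following is a minimal sketch of the simple-query form of an independent count, assuming a hypothetical weeks_claimed table and a pipe-delimited reconstruction extract; the table, column, and file names are invented for illustration and would differ in any real state system.

    import sqlite3

    def independent_count(db_path, period_start, period_end):
        # Simple-query independent count: ask the master database directly
        # how many reportable transactions fall in the reporting period,
        # bypassing the logic that built the reconstruction extract.
        con = sqlite3.connect(db_path)
        (n,) = con.execute(
            "SELECT COUNT(*) FROM weeks_claimed "   # hypothetical table
            "WHERE week_ending BETWEEN ? AND ?",
            (period_start, period_end),
        ).fetchone()
        con.close()
        return n

    def extract_count(extract_path):
        # Count the records in the validation extract file.
        with open(extract_path) as f:
            return sum(1 for line in f if line.strip())

    db_n = independent_count("benefits.db", "2009-01-04", "2009-03-28")
    ext_n = extract_count("weeks_claimed_extract.txt")
    if db_n != ext_n:
        print(f"Counts differ: database {db_n}, extract {ext_n}; "
              f"look for missing or extra transactions")

A multiple-query or cross-tabulation form would break the same count down by key characteristics (for example, program type and claim type) so that each subtotal, not just the grand total, can be checked against the reconstruction file.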
D. Basic Validation Approach

The data validation methodology outlined in this handbook minimizes validation time and burden. The methodology is highly automated and complements existing quality components (such as the nonmonetary determinations quality review). It involves reconstructing the count of transactions reported during a specific period for each federal report item to be validated. The validation specifications for reconstructing reported transactions provide a blueprint of the criteria that states should use in their federal reporting. This handbook therefore has two uses:

1. To provide technical assistance with federal reporting requirements;
2. To guide states through the data validation process.

The reconstruction files provide an audit trail to support the counts and classifications of reported transactions. Validation of reported counts (referred to as report validation, or RV) is accomplished when all the transactions reported for a federal report item have been reconstructed. For example, if a state reports 5,000 first payments during a month, then the state must produce a file containing those 5,000 first payments, including relevant characteristics of each transaction such as the Social Security Number (SSN), the program type code, and the mail date. The DV software then sorts the payments into groups that are used to reconstruct the counts for the appropriate items of the ETA 5159 and 9050 reports. Report validation is discussed in detail in Module 1.

The DV software also draws samples of transactions from the reconstruction file and displays them on worksheets to facilitate their validation. Validators then subject the sampled transactions to a series of logic tests (state-specific "rules" described in Module 3), using the most definitive source documentation (such as database screens) to test the accuracy of the data. This validation of the characteristics of reported transactions is known as Data Element Validation (DEV) and is described in detail in Module 2. Data that pass RV and DEV are considered accurate.

E. Reconstructing Federal Report Items

Given that there are 11 UIRR benefits reports to validate, with over 700 report items, validation could be a laborious process to both design and implement. A single UI benefits transaction (for example, a payment, a nonmonetary determination, or an appeal) can be reported in numerous federal report items. As an example, a first payment for a week of total unemployment, for an interstate claim with both UI and Unemployment Compensation for Federal Employees (UCFE) wages, is reported in eight items of the ETA 5159 report as well as in one item of the ETA 9050 report.

A general principle of the validation design is to streamline the validation process as much as possible. Transactions are analyzed only once, even if they appear in multiple items. The streamlining is accomplished by classifying the transactions into mutually exclusive groups (referred to as populations), which match to one or more items on the federal reports. Specifically, there are fifteen benefits populations, which are composed of 347 mutually exclusive groups (referred to as subpopulations). Each subpopulation represents a unique set of data elements or characteristics.
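The sketch below illustrates this classification principle with a simplified, hypothetical payment record; the field names, subpopulation labels, and report-item lists are invented and do not correspond to the actual subpopulation specifications in Appendix A.

    from dataclasses import dataclass

    @dataclass
    class Payment:
        ssn: str
        program_type: str    # hypothetical codes, e.g. "UI" or "UI-UCFE"
        claim_type: str      # e.g. "intrastate" or "interstate"
        first_payment: bool

    def subpopulation(p: Payment) -> str:
        # Assign each payment to exactly one mutually exclusive group.
        # These labels are invented, not the Appendix A subpopulation codes.
        if p.first_payment and p.claim_type == "interstate" and p.program_type == "UI-UCFE":
            return "P1"
        if p.first_payment and p.claim_type == "intrastate" and p.program_type == "UI":
            return "P2"
        return "P-OTHER"

    # One subpopulation can feed several report items, but the transaction
    # itself is analyzed only once. The item lists below are placeholders.
    REPORT_ITEMS = {
        "P1": ["eight ETA 5159 items", "one ETA 9050 item"],
        "P2": ["ETA 5159 first payment items", "one ETA 9050 item"],
        "P-OTHER": ["remaining items"],
    }

    pay = Payment(ssn="000-00-0000", program_type="UI-UCFE",
                  claim_type="interstate", first_payment=True)
    print(subpopulation(pay), REPORT_ITEMS[subpopulation(pay)])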
The first column of Table C lists each population, and the second column identifies the ETA reports on which the transactions in each population are reported. The Reconstruction Period (third column) describes the time parameter that the programmer uses to select the transactions to be extracted. When the reports are monthly, the reconstruction can be for a single month, to match the reported counts. When the reports are quarterly, or when both monthly and quarterly reports are produced for the same type of transaction (for example, claims are reported on both the ETA 5159, a monthly report, and the ETA 218, a quarterly report), the reconstruction is for a quarter. The Number of Report Items (fourth column) indicates the total number of items on each ETA report that is validated by each transaction population. The Number of Subpopulations (fifth column) refers to the number of subpopulations into which the population is divided for validation purposes.
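As a small illustration of the reconstruction-period rule, the sketch below chooses the extraction window for a population. The population flags are hypothetical, and the calendar-quarter logic shown captures only the general monthly-versus-quarterly principle, not the per-population specifications of Appendix A.

    import datetime

    # Hypothetical flags: True if any report carrying this population is quarterly.
    HAS_QUARTERLY_REPORT = {"weeks_claimed": False, "claims": True}

    def reconstruction_period(population: str, month: int, year: int):
        # Return the (start, end) dates of the reconstruction window:
        # a single month for monthly-only populations, or the containing
        # calendar quarter when a quarterly report is also produced.
        if HAS_QUARTERLY_REPORT[population]:
            start_month = 3 * ((month - 1) // 3) + 1
            end_month = start_month + 2
        else:
            start_month = end_month = month
        start = datetime.date(year, start_month, 1)
        if end_month == 12:
            end = datetime.date(year, 12, 31)
        else:
            end = datetime.date(year, end_month + 1, 1) - datetime.timedelta(days=1)
        return start, end

    # Claims appear on both the monthly ETA 5159 and the quarterly ETA 218,
    # so a May 2009 validation reconstructs the full second quarter:
    print(reconstruction_period("claims", 5, 2009))          # 2009-04-01 to 2009-06-30
    print(reconstruction_period("weeks_claimed", 5, 2009))   # 2009-05-01 to 2009-05-31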