GUIDE TO USING THE U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
NATIONAL EVALUATION OF WELFARE-TO-WORK STRATEGIES (NEWWS)
CHILD OUTCOMES STUDY FIVE-YEAR IMPACTS RESTRICTED ACCESS FILE

I. INTRODUCTION

This archive includes a restricted access analysis file and documentation for research on the five-year experimental impacts of six welfare-to-work programs on outcomes for children. These programs were operated in three of the seven sites in the National Evaluation of Welfare-to-Work Strategies (NEWWS): Atlanta, Georgia; Grand Rapids, Michigan; and Riverside, California. This and all other restricted access files from the NEWWS Evaluation are stored at the National Center for Health Statistics (NCHS).

The data file, N5RC1719.TXT, contains the sample, original survey responses, and additional outcome measures calculated with survey data that were analyzed in Chapters 10 and 12 of the final report of the NEWWS Evaluation: U.S. Department of Health and Human Services and U.S. Department of Education, Evaluating Alternative Welfare-to-Work Approaches: Five-Year Impacts for Eleven Programs (2001).

Chapters 10 and 12 of the report and the restricted access file were prepared by Child Trends, which is conducting the Child Outcomes Study under subcontract to the Manpower Demonstration Research Corporation (MDRC). MDRC is conducting the NEWWS Evaluation under a contract with the U.S. Department of Health and Human Services (HHS), funded by HHS under a competitive award, Contract No. HHS-100-89-0030. HHS is also receiving funding for the evaluation from the U.S. Department of Education. The study of one of the sites in the evaluation, Riverside County (California), is also conducted under a contract from the California Department of Social Services (CDSS). CDSS, in turn, is receiving funding from the California State Job Training Coordinating Council, the California Department of Education, HHS, and the Ford Foundation.

II. DESCRIPTION OF THE DATA

N5RC1719.TXT contains, in ASCII format, survey-based measures of developmental outcomes, child care, and child activities for the "focal child" of the Child Outcomes Study (COS). Achievement tests in reading and math were also administered to the focal child, and the results are included in this data set. Sample weights used in statistical models and in subgroup impact analyses are included as well.

N5RC1719.TXT includes a total of 2,332 sample members; 1,472 of these respondents also have survey data available from the teachers of their focal child. (See the Table of Sample Sizes in Section III for the sample sizes of individual sites.) Sample member identifiers and other information that could be used to identify individuals have been deleted from this file.

The file contains 14 records of data for each COS sample member:

RECORD 1   OUTCOMES FOR FOCAL CHILD: SUMMARY MEASURES
RECORD 2   FOCAL CHILD'S ACTIVITIES AND CHILD CARE ARRANGEMENTS: SUMMARY MEASURES
RECORD 3   FOCAL CHILD'S ACTIVITIES AND CHILD CARE ARRANGEMENTS: SOURCE DATA (SECTION A)
RECORD 4   FATHER OF FOCAL CHILD'S INVOLVEMENT AND CHILD SUPPORT (SECTION B)
RECORD 5   RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: NEIGHBORHOOD (SECTION AA)
RECORD 6   RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: DEPRESSIVE SYMPTOMS (SECTION BB)
RECORD 7   RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: PARENTING (SECTION CC)
RECORD 8   RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: FOCAL CHILD'S HEALTH AND HEALTH CARE (SECTION DD)
RECORD 9   RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: FATHER OF FOCAL CHILD'S INVOLVEMENT (SECTION FF)
RECORD 10  RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: RESPONDENT'S EXPERIENCE OF BARRIERS TO WORK FROM OR ABUSE BY INTIMATE PARTNERS AND OTHERS (SECTION GG)
RECORD 11  FOCAL CHILD'S SELF-ADMINISTERED QUESTIONNAIRE
RECORD 12  INTERVIEWER ASSESSMENT (SECTION IA)
RECORD 13  FOCAL CHILD'S TEACHER SURVEY
           Record 13 is blank (except for IDNUMBER and record number) for COS families who were not included in the Teacher Survey.
RECORD 14  BACKGROUND INFORMATION AND SAMPLE WEIGHTS
           Background information includes: focal child gender, survey screener variables, sample weight variables for use in analyses of the Riverside LFA program (see Section VI.), and dummy variables indicating whether key parts of the COS Survey were completed and whether data from the Teacher Survey are available.

Each respondent has a unique randomly-generated IDNUMBER, which appears in columns 1-5 of each record. IDNUMBER should be used to merge the COS and Teacher's Survey data with data from other NEWWS Evaluation restricted access files.

III. RESEARCH SAMPLES

Table of Sample Sizes

Site            COS Sample    Teacher Survey Sample
Atlanta              967              603
Grand Rapids         624              408
Riverside            741              461
TOTAL              2,332            1,472

COS sample members were, at baseline, single-mother recipients of AFDC with at least one 3- to 5-year-old child (the focal child). In most cases, this was the mother's only or youngest child, except in Grand Rapids, where about one-third of families also had a 1- to 2-year-old at baseline.

All COS respondents were chosen from the Full Impact Sample, which numbers 44,569 and includes sample members from the 4 other NEWWS Evaluation sites: Columbus, Ohio; Detroit, Michigan; Oklahoma City, Oklahoma; and Portland, Oregon. COS respondents also answered the Five-Year Client Survey and are included in the survey sample (N=5,463 in 4 sites: Atlanta, Grand Rapids, Riverside, and Portland).

A total of 2,163 Five-Year COS Survey respondents (93 percent) also completed a Two-Year COS survey interview. The other 169 COS respondents were interviewed at five years but not at two years. They were selected for an interview at two years but could not be located, were living too far away to be interviewed in person, or refused to be surveyed.

IV. THE RESEARCH DESIGN

The NEWWS Evaluation tests the effectiveness of different welfare-to-work strategies with a random assignment experiment.
In each research site, individuals who showed up to enroll in the program were assigned, by chance, either to a program group, whose members had access to employment and training services and were required to participate or risk a reduction in their monthly welfare grant, or to a control group, whose members received no services and were not subject to the program's mandatory participation requirement. Control group members could seek out alternative employment-related services from their community and receive child care assistance from the welfare department.

This random assignment design assures that there are no systematic differences between the background characteristics of people in the program and control groups when they enter the study. Thus, any subsequent differences in outcomes between the groups can be attributed with confidence to the effects of the program. These differences are called impacts.

In the three Child Outcomes Study sites (Atlanta, Grand Rapids, and Riverside), two different types of welfare-to-work programs were operated side by side: a strongly employment-focused approach, called Labor Force Attachment (or LFA), and a strongly education-focused approach, called Human Capital Development (or HCD). Outcomes for LFAs and HCDs may be compared to the control group or to each other. Comparisons of average outcomes for LFAs or HCDs to the control group measure the added benefit of each approach above what the individuals would achieve in the absence of a welfare-to-work program. The difference in average outcomes for LFAs and HCDs represents the relative benefit of one welfare-to-work strategy over the other.

All sample members in Atlanta and Grand Rapids were randomly assigned to an LFA group, an HCD group, or a control group. Riverside implemented a different random assignment design to study the effects of its LFA and HCD programs.
Following program intake procedures established by California's welfare department, Riverside determined each sample member's "need for basic education" just prior to random assignment. Those who had a high school diploma or GED certificate, and scored above minimum levels on both the math and the literacy sections of the GAIN Appraisal test, and were proficient in English, were determined not to need basic education. This group was randomly assigned only to the LFA or control group. Those without a high school diploma or GED certificate, or who scored below minimum levels on either section of the GAIN Appraisal test, or who did not speak English, were determined by the program to be in need of basic education. Individuals in this status were randomly assigned to the LFA, HCD, or control group. Thus, the effects of the LFA approach were tested on the entire sample, but the effects of the HCD approach were tested only on sample members determined to need basic education.

In the Child Outcomes Study sites, random assignment took place when sample members showed up at a welfare-to-work office to attend a program orientation. Random assignment for the different sites took place as indicated by the table below:

Site            Full Impact Sample    COS Sample
Atlanta         01/92-01/94           03/92-06/93
Grand Rapids    09/91-01/94           03/92-01/94
Riverside       06/91-06/93           09/91-05/93

The proportion of LFAs, HCDs, and control group members is roughly equal in Atlanta and Grand Rapids, but not in Riverside. In Riverside, a sample member determined not to need basic education had a 50-50 chance of becoming an LFA (because those "not in need" were not assigned to the HCD group), whereas a sample member determined to need basic education had only a 1 in 3 chance of becoming an LFA. Therefore, those not in need of basic education are overrepresented among the LFAs and control group members, and outcomes for those determined not to need basic education unduly influence unweighted LFA-control group comparisons.
Researchers should therefore select HCDs, LFAs, and control group members determined to be in need of basic education when estimating the impacts of Riverside's HCD program, or when comparing the relative effectiveness of the LFA versus the HCD approach in Riverside. Moreover, when comparing impact results to those of the HCD approach in Riverside, researchers should select Atlanta and Grand Rapids HCDs and control group members who had received neither a high school diploma nor a GED certificate before random assignment.

V. USING THE DATA

The outcome measures contained in this data file should be merged (by IDNUMBER) with data from other restricted access data files from the NEWWS Evaluation. These files contain information on COS sample members and on additional members of the NEWWS research samples:

N5RS1621.TXT (Five-Year Client Survey): contains outcome measures for adult respondents in 4 sites and a limited number of outcomes for children (including the COS focal child) in Section L of the survey. See N5RS_CBK.TXT (the codebook for N5RS1621.TXT) for more information on child outcome measures on this data file.
To identify the focal child in each family using Section L, we used the following SAS code:

   array focal{10} L5FC01 L5FC02 L5FC03 L5FC04 L5FC05
                   L5FC06 L5FC07 L5FC08 L5FC09 L5FC10;

   /* count the children flagged as the focal child */
   foclchld = 0;
   do a = 1 to 10;
      if focal{a} = 1 then foclchld = foclchld + 1;
   end;
   label foclchld = 'Checks for Focal Child at 5yr';

   /* set the valid-data flag */
   if foclchld = 0 then do;
      if snatchld = 1 then flglfc = 5;   /*missing should have answered*/
      if snatchld = 2 then flglfc = 2;   /*skipped appropriately*/
      if schlive = 3 then flglfc = 2;    /*skipped appropriately, is deceased*/
   end;
   if foclchld = 1 then do;
      if snatchld = 1 then flglfc = 1;   /*valid answer -- good result */
      if snatchld = 2 then flglfc = 3;   /*answered should have skipped */
   end;
   if foclchld gt 1 then flglfc = 4;     /* out of range*/
   label flglfc = 'Valid-data flag for foclchld';

Sixteen individuals were originally coded as focal children but did not have any data from the Parent Self-Administered Questionnaire, the Child Self-Administered Questionnaire, or the Woodcock-Johnson Tests of Achievement--Revised. Because of this, we dropped them from our focal child sample. In addition, 4 individuals who were focal children (as evidenced by available data on the Parent Self-Administered Questionnaire, the Child Self-Administered Questionnaire, or the Woodcock-Johnson Tests of Achievement--Revised) were not marked as focal children in Section L. We recoded these cases to mark them as focal children.

Users can also identify the focal child and can calculate additional outcomes from Section L for this child.

Note that Section L contains a limited amount of information for 262 additional children who were 3 to 5 years old at baseline. These children were not part of the Child Outcomes Study sample for the following reasons: Of the 262 families, 203 families had moved out of the survey area by the time of the five-year survey. In an additional 57 of these families, the named focal child was not the mother's biological child.
One duplicate case was dropped and one family was dropped because the focal child was deceased at the five-year follow-up point.

Data from N5RC1719.TXT should also be merged with data from:

N5RI1515.TXT (Five-Year Full Impact Sample): contains measures of employment, earnings, and receipt of welfare and Food Stamps for follow-up years 3 to 5, calculated from administrative records for the Full Impact Sample in 7 sites.

N2RC1326.TXT (Two-Year Child Outcomes Study Survey): contains outcome measures for COS focal children recorded after 2 years of follow-up.

NOTE: The Two-Year COS Survey sample includes 3,018 respondents. Of these, 855 were not interviewed for the Five-Year COS Survey. To summarize:

Number of respondents in Two-Year and Five-Year COS Surveys

                             Five-Year COS Respondent
Two-Year COS Respondent      Yes       No       Total
Yes                          2,163     855      3,018
No                             169     --
Total                        2,332

N2RS1221.TXT (Two-Year Client Survey): contains outcome measures for adult respondents in 7 sites and a limited number of outcomes for children (including the COS focal child) recorded after 2 years of follow-up.

N2RI1213.TXT (Two-Year Full Impact Sample): contains measures of employment, earnings, and receipt of welfare and Food Stamps for follow-up years 1 to 2, calculated from administrative records for the Full Impact Sample in 7 sites.

We strongly suggest that users of this file do the following before conducting any further analyses.

1. Read the C5README file, which gives a brief description of the data and documentation.
2. Read the report, particularly Chapter 2, which describes the research design, samples, and data sources; Chapter 10, which summarizes the impacts of each program on child care and child activities; and Chapter 12, which summarizes the impacts of each program on children's developmental outcomes for the Child Outcomes Study sample.
3. Review the tables included as documentation. All of the tables are annotated with the appropriate variable names.
4. Review the rest of the documentation on the Five-Year COS Survey restricted access file, N5RC1719.TXT, including the Child Outcomes Study codebook (N5RC_CBK.TXT), the background memo on source and created measures (C5VARMEM.TXT), file layouts, and output.
5. After reading the data into SAS or another statistical or econometric software package, replicate the sample sizes and means.

VI. ESTIMATING PROGRAM IMPACTS

A. IMPORTANT First Steps: Modify Values of Focal Child Age and Gender

As discussed in Section VII., Child Trends controlled for the focal child's gender and his/her age at random assignment (and other background characteristics) when estimating program impacts on child outcomes.

FCGENDER (focal child's gender) was saved to BOTH the Two-Year and Five-Year data files. However, the value for female DIFFERS on each file (Two-Year: 2=FEMALE; Five-Year: 0=FEMALE). Researchers should recode FCGENDER from 2 to 0 if they merge the Two-Year and Five-Year files.

CHAGERAD (focal child's age at random assignment in months) was saved to the Two-Year COS data file only. Therefore, CHAGERAD has missing values for the 169 respondents who only answered the Five-Year COS Survey. The values of CHAGERAD for these 169 respondents are stored in a file called CHAGE169.TXT. This file has 2 measures (IDNUMBER and CHAGERAD), 169 observations, and the following record structure:

   @00001  IDNUMBER  5.0
   @00007  CHAGERAD  2.0

Researchers should therefore obtain the values of CHAGERAD from 1) N2RC1326.TXT (the Two-Year COS Survey restricted access file) for the 2,163 respondents to both the Two-Year and Five-Year COS Surveys AND 2) CHAGE169.TXT for the remaining 169 respondents who only answered the Five-Year COS Survey.

B. Weighting

For research purposes, certain populations were oversampled when choosing respondents for the Five-Year Client Survey and COS Survey samples.
Therefore, analyses must be weighted in order to yield results that are representative of the (site-specific) populations from which the samples were drawn.

For all programs except Riverside LFA, researchers should use FLD5WGT, the weight variable for respondents to the Five-Year Client Survey, for estimating impacts for COS sample members (aggregate and subgroup). FLD5WGT is stored on N5RS1621.TXT, the Five-Year Client Survey restricted access file.

A series of weight variables has been calculated for estimating impacts of the Riverside LFA program for the COS sample and subgroups. These measures are stored on N5RC1719.TXT. Researchers should use RILFAWT in impact calculations that include Riverside COS LFAs and control group members and RILFAWTT in analyses for Riverside COS LFAs and control group members for whom teacher data are also available.

Analyses of subgroups assigned to Riverside's LFA or control group also require special weights. The weight to use when running impacts of Riverside's LFA program for COS female focal children is RILFAWTF, and the corresponding weight for analyses of COS male focal children is RILFAWTM. The corresponding weights for female and male focal children in the teacher survey sample are RILFWTFT and RILFWTMT, respectively. Similarly, the weights to use when running impacts of Riverside's LFA program for the "least disadvantaged," "moderately disadvantaged," and "most disadvantaged" subgroups in the parent-child and teacher survey samples are shown below:

Parent-child survey sample
   Least disadvantaged         RILFWTLS
   Moderately disadvantaged    RILFWTMD
   Most disadvantaged          RILFWTMS

Teacher survey sample
   Least disadvantaged         RILWTLST
   Moderately disadvantaged    RILWTMDT
   Most disadvantaged          RILWTMST

Finally, though impacts by race for the COS sample(s) were not presented in the five-year report, appropriate weights have been created for use in these analyses of Riverside's LFA program.
See below:

Parent-child survey sample
   Black       RILFWTBL
   White       RILFWTWH
   Hispanic    RILFWTHI

Teacher survey sample
   Black       RILWTBLT
   White       RILWTWHT
   Hispanic    RILWTHIT

C. Missing Values for Outcome Measures

The user should note that, to retain the experimental design, as many of the 2,332 cases as possible should be included in impact analyses. Most importantly, respondents who were appropriately skipped out of questions because they did not apply to them should be assigned a "0" on the skipped items, thereby retaining these cases in impact analyses. Note that missing values on outcome measures were neither "hard-coded" to 0 in the data file nor treated as 0 in analyses. Instead, cases with missing values on a given outcome were effectively listwise deleted from the analyses (using PROC GLM; see below).

VII. TUTORIAL

All impact analyses were run separately within site, selecting only the applicable program group (b = HCD group; j = LFA group) and the applicable control group (all Cs in Atlanta, all Cs in Grand Rapids, all Cs when assessing the impacts of Riverside's LFA program, and only "in-need" Cs when assessing the impacts of Riverside's HCD program). Impact analyses used OLS regression methods (PROC GLM, in SAS).

The general form of the SAS code used to run all impact analyses is shown below. It is written with "b", which was used in models testing the impact of HCD programs; "j" was used in its place in models testing the impact of LFA programs. "outcome" stands for the outcome measure being analyzed.

   PROC GLM;
      CLASS b;                /* use j in place of b for LFA models */
      WHERE alphsite = 1;     /* choose 1 site each time: 1 (Atlanta), 4 (Grand Rapids), 7 (Riverside) */
      MODEL outcome = b marstat twochild thrchild BLACK5 BLACKM NOTBW5 agep hsdip
                      yremp emppq1 yrearn yrearnsq pearn1 recpc1 yrrec gyradc yrkrec
                      yrrfs gyrfs yrkrfs CHAGERAD5 FCGENDER5 FCGENDERM
                      LOWREADR LOWMATHR LOWREADM LOWMATHM / solution;
      LSMEANS b / PDIFF STDERR;
      WEIGHT FLD5WGT;         /* or the appropriate weight for Riverside LFA */
   run;

NOTE:

1) ALPHSITE and most of the covariates listed above are stored in N2RI1213.TXT (the Two-Year Full Impact Sample file).
However, as will be explained below, some measures in N2RI1213.TXT have slightly different names: BLACK, LOWREAD, and LOWMATH.

2) As noted above, FCGENDER and CHAGERAD are stored in N2RC1326.TXT, the Two-Year COS Survey file. CHAGE169.TXT stores values of CHAGERAD for the 169 respondents who did not answer the Two-Year COS Survey.

3) The following baseline covariates contained some missing data: BLACK, FCGENDER, LOWREAD, LOWMATH. To estimate program impacts for the Final Report, missing values in the COS sample were assigned the value of 0; these new variables were saved (temporarily) as BLACK5, FCGENDER5, LOWREADR, LOWMATHR and were used in all impact models. Missing value dummy variables were also created (BLACKM, FCGENDERM, LOWREADM, LOWMATHM) to note cases whose values were missing (and thus changed to 0); these missing dummy variables were also included in the impact models. Researchers may, if they wish, address the problem of missing values in a different way (e.g., by imputing values) when estimating program impacts. However, they should expect to get slightly different results from those published in the Final Report.

4) None of the following measures were saved to this or any other NEWWS restricted access file: BLACK5, BLACKM, NOTBW5, CHAGERAD5, FCGENDER5, FCGENDERM, LOWREADR, LOWMATHR, LOWREADM, LOWMATHM. Researchers wishing to calculate impacts as described above will need to create these measures.

5) Child Trends used SAS, Version 8 to estimate program impacts. This version allows programmers to create variable names with more than 8 characters. Researchers using software without this capability will need to create shorter names for some measures (e.g., CHAGERAD5, FCGENDER5, FCGENDERM) if they choose to follow the procedure outlined above.

6) When analyzing impacts for race or gender subgroups, only cases with non-missing values on these subgroup variables should be used.
Consequently, FCGENDER (and not FCGENDER5) should be used when analyzing impacts by focal child gender, and BLACK (not BLACK5) should be used when analyzing impacts for blacks in the COS sample(s). (Impacts for whites should be run using WHITE, and impacts for Hispanics should be run using HISPANIC, which contained no missing data.)
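
Notes 3) and 4) describe covariates that are not stored on any file and must be rebuilt by the user. The SAS sketch below shows one way this could be done for the four covariates whose construction is spelled out in note 3); it is not the code Child Trends ran. It assumes (none of this is stated in the documentation) that the relevant files have already been merged by IDNUMBER into a data set called WORK.COS5, that FCGENDER carries the Five-Year coding (0 = female), and that missing values are held as SAS numeric missing values. The data set names WORK.COS5 and WORK.COS5_MODEL are hypothetical. NOTBW5 and CHAGERAD5 are left out because their construction is not described in this guide.

```sas
/* Sketch only, under the assumptions stated above.                 */
/* WORK.COS5 and WORK.COS5_MODEL are hypothetical data set names.   */
data work.cos5_model;
   set work.cos5;

   /* Missing-value dummies: 1 if the baseline value is missing, else 0 */
   BLACKM    = missing(BLACK);
   FCGENDERM = missing(FCGENDER);
   LOWREADM  = missing(LOWREAD);
   LOWMATHM  = missing(LOWMATH);

   /* Zero-filled copies for use as covariates in the impact models.   */
   /* The original variables are left unchanged.                       */
   if missing(BLACK)    then BLACK5    = 0; else BLACK5    = BLACK;
   if missing(FCGENDER) then FCGENDER5 = 0; else FCGENDER5 = FCGENDER;
   if missing(LOWREAD)  then LOWREADR  = 0; else LOWREADR  = LOWREAD;
   if missing(LOWMATH)  then LOWMATHR  = 0; else LOWMATHR  = LOWMATH;
run;

/* Check: each dummy should equal 1 only on rows where the source is missing */
proc freq data=work.cos5_model;
   tables BLACK*BLACKM FCGENDER*FCGENDERM
          LOWREAD*LOWREADM LOWMATH*LOWMATHM / missing list;
run;
```

Because BLACK and FCGENDER are kept as they were, they remain available for selecting cases in the subgroup analyses described in note 6).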