        GUIDE TO USING THE U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES

           NATIONAL EVALUATION OF WELFARE-TO-WORK STRATEGIES (NEWWS)

                FIVE-YEAR CHILD OUTCOMES STUDY PUBLIC USE FILE



I. INTRODUCTION

This CD-ROM contains a public use analysis file and documentation for research
on the five-year experimental impacts of six welfare-to-work programs on
outcomes for children.  These programs were operated in three of the seven
sites in the National Evaluation of Welfare-to-Work Strategies (NEWWS):
Atlanta, Georgia; Grand Rapids, Michigan; and Riverside, California.

The data file, N5PC1730.TXT, contains the sample, original survey responses,
and additional outcome measures calculated with survey data that were analyzed
in Chapters 10 and 12 of the Final Report of the NEWWS Evaluation: U.S.
Department of Health and Human Services and U.S. Department of Education,
The National Evaluation of Welfare-to-Work Strategies: How Effective Are
Different Welfare-to-Work Approaches? Five-Year Adult and Child Impacts for
Eleven Programs, 2001.

Chapters 10 and 12 of the report and the public use data file were prepared
by Child Trends, which is conducting the Child Outcomes Study under subcontract
to the Manpower Demonstration Research Corporation (MDRC).  MDRC is conducting
the NEWWS Evaluation under a contract with the U.S. Department of Health and
Human Services (HHS), funded by HHS under a competitive award, Contract No.
HHS-100-89-0030. HHS is also receiving funding for the evaluation from the U.S.
Department of Education.  The study of one of the sites in the evaluation,
Riverside County (California), is also conducted under a contract from the
California Department of Social Services (CDSS).  CDSS, in turn, is receiving
funding from the California State Job Training Coordinating Council, the
California Department of Education, HHS, and the Ford Foundation.


II. DESCRIPTION OF THE DATA

N5PC1730.TXT contains, in ASCII format, survey-based measures of developmental
outcomes, child care, and child activities for the "focal child" of the Child
Outcomes Study (COS).  Achievement tests in reading and math were also
administered to the focal child, and the results are included in this data set.
Sample weights used in statistical models and in subgroup impact analyses are
included as well.

N5PC1730.TXT includes a total of 2,332 sample members;  1,472 of these
respondents also have survey data available from the teachers of their focal
child.  (See the table in Section III for sample sizes by site.)  Sample
member identifiers and other information that could be used to
identify individuals have been deleted from this file.

The file contains 14 records of data for each COS sample member:

RECORD 1  OUTCOMES FOR FOCAL CHILD: SUMMARY MEASURES

RECORD 2  FOCAL CHILD'S ACTIVITIES AND CHILD CARE ARRANGEMENTS: SUMMARY MEASURES

RECORD 3  FOCAL CHILD'S ACTIVITIES AND CHILD CARE ARRANGEMENTS: SOURCE DATA
          (SECTION A)

RECORD 4  FATHER OF FOCAL CHILD'S INVOLVEMENT AND CHILD SUPPORT
          (SECTION B)

RECORD 5  RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: NEIGHBORHOOD (SECTION
          AA)

RECORD 6  RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: DEPRESSIVE
          SYMPTOMS (SECTION BB)

RECORD 7  RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: PARENTING (SECTION CC)

RECORD 8  RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: FOCAL CHILD'S HEALTH
          AND HEALTH CARE (SECTION DD)

RECORD 9  RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: FATHER OF FOCAL
          CHILD'S INVOLVEMENT (SECTION FF)

RECORD 10 RESPONDENT'S SELF-ADMINISTERED QUESTIONNAIRE: RESPONDENT'S EXPERIENCE
          OF BARRIERS TO WORK FROM OR ABUSE BY INTIMATE PARTNERS AND OTHERS
          (SECTION GG)

RECORD 11 FOCAL CHILD'S SELF-ADMINISTERED QUESTIONNAIRE

RECORD 12 INTERVIEWER ASSESSMENT (SECTION IA)

RECORD 13 FOCAL CHILD'S TEACHER SURVEY

Record 13 is blank (except for IDNUMBER and record number) for COS families who
were not included in the Teacher Survey.

RECORD 14 BACKGROUND INFORMATION AND SAMPLE WEIGHTS

Background information includes:  focal child gender, survey screener
variables, sample weight variables for use in analyses of the Riverside LFA
program (see Section VI.), and dummy variables indicating whether key parts of
the COS Survey were completed and whether data from the Teacher Survey are
available.


Each respondent has a unique randomly-generated IDNUMBER, which appears in
columns 1-5 of each record.  IDNUMBER should be used to merge the COS and
Teacher's Survey data with data from other NEWWS Evaluation public use data
files.
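
The layout described above can be sketched in Python as follows.  This is only
an illustration, not the procedure used to build the file: it assumes that each
record occupies one line of N5PC1730.TXT, and the function names are
hypothetical.  Only the position of IDNUMBER (columns 1-5) and the count of 14
records per sample member come from this guide; all other column positions must
be taken from the file layouts.

```python
from collections import defaultdict

RECORDS_PER_MEMBER = 14  # stated in this guide


def group_records_by_id(lines):
    """Group fixed-width records by IDNUMBER (columns 1-5 of every record).

    Assumes one record per line; this is an assumption, not a documented fact.
    """
    grouped = defaultdict(list)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        idnumber = line[0:5]  # columns 1-5 (1-based) are slice 0:5
        grouped[idnumber].append(line)
    return dict(grouped)


def incomplete_ids(grouped):
    """Return the IDNUMBERs that do not have exactly 14 records."""
    return sorted(
        idnumber
        for idnumber, records in grouped.items()
        if len(records) != RECORDS_PER_MEMBER
    )
```

If the one-record-per-line assumption holds, grouping the full file should
yield 2,332 IDNUMBERs, and incomplete_ids should return an empty list.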


III. RESEARCH SAMPLES

     Table of Sample Sizes

                      Teacher
                COS    Survey
Site         Sample    Sample

Atlanta         967     603

Grand Rapids    624     408

Riverside       741     461

TOTAL         2,332   1,472


COS sample members were, at baseline, single-mother recipients of AFDC with at
least one 3- to 5-year-old child (the focal child).  In most cases, this was the
mother's only or youngest child, except in Grand Rapids, where about one-third
of families also had a 1- to 2-year-old at baseline.

All COS respondents were chosen from the Full Impact Sample, which numbers
44,569 and includes sample members from the 4 other NEWWS Evaluation sites:
Columbus, Ohio; Detroit, Michigan; Oklahoma City, Oklahoma; and Portland,
Oregon.  COS respondents also answered the Five-Year Client Survey and are
included in the survey sample (N=5,463 in 4 sites: Atlanta, Grand Rapids,
Riverside, and Portland).

A total of 2,163 Five-Year COS Survey respondents (93 percent) also completed a
Two-Year COS Survey interview.  The other 169 COS respondents were interviewed
at five years but not at two years.  They were selected for an interview at
two years but could not be located, were living too far away to be interviewed
in person, or refused to be surveyed.


IV. THE RESEARCH DESIGN

The NEWWS Evaluation tests the effectiveness of different welfare-to-work
strategies with a random assignment experiment.  In each research site,
individuals who showed up to enroll in the program were assigned, by
chance, either to a program group, whose members had access to employment and
training services and were required to participate or risk a reduction in their
monthly welfare grant, or to a control group, whose members received no services
and were not subject to the program's mandatory participation requirement.
Control group
members could seek out alternative employment-related services from their
community and receive child care assistance from the welfare department. This
random assignment design assures that there are no systematic differences
between the background characteristics of people in the program and control
groups when they enter the study. Thus, any subsequent differences in outcomes
between the groups can be attributed with confidence to the effects of the
program. These differences are called impacts.

In the three Child Outcomes Study sites (Atlanta, Grand Rapids, and Riverside),
two different types of welfare-to-work programs were operated side by side: a
strongly employment-focused approach, called Labor Force Attachment (or LFA),
and a strongly education-focused approach, called Human Capital Development (or
HCD). Outcomes for LFAs and HCDs may be compared to the control group or to each
other.  Comparisons of average outcomes for LFAs or HCDs to the control group
measure the added benefit of each approach above what the individuals would
achieve in the absence of a welfare-to-work program.  The difference in average
outcomes for LFAs and HCDs represents the relative benefit of one
welfare-to-work strategy over the other.

All sample members in Atlanta and Grand Rapids were randomly assigned to an LFA
group, an HCD group, or to a control group. Riverside implemented a different
random assignment design to study the effects of its LFA and HCD programs.
Following program intake procedures established by California's welfare
department, Riverside determined each sample member's "need for basic education"
just prior to random assignment.  Those who had a high school diploma or GED
certificate, and scored above minimum levels on both the math and the literacy
sections of the GAIN Appraisal test, and were proficient in English, were
determined not to need basic education.  This group was randomly assigned only
to the LFA or control group.  Those without a high school diploma or GED
certificate, or who scored below minimum levels on either section of the GAIN
Appraisal test, or who did not speak English, were determined by the program
to be in need of basic education.  Individuals in this status were randomly
assigned to the LFA, HCD, or control group.  Thus, the effects of the LFA
approach were tested on the entire sample, but the effects of the HCD approach
were tested only on sample members determined to need basic education.
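
The Riverside intake rule described above can be written as a small Python
sketch.  The function and argument names are hypothetical; the public use files
may record "need for basic education" in a single variable, so consult the
codebooks before relying on this logic.

```python
def needs_basic_education(has_diploma_or_ged, passed_math, passed_literacy,
                          proficient_in_english):
    """Apply the rule in this guide: a sample member is 'not in need' only if
    all four conditions hold; failing any one of them means 'in need'."""
    not_in_need = (has_diploma_or_ged and passed_math and passed_literacy
                   and proficient_in_english)
    return not not_in_need


def eligible_groups(in_need):
    """Research groups to which a Riverside sample member could be assigned."""
    if in_need:
        return ("LFA", "HCD", "control")
    return ("LFA", "control")
```

For example, a sample member with a GED who scored below the minimum on the
math section alone is classified as in need and could be assigned to any of
the three groups.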

In the Child Outcomes Study sites, random assignment took place when sample
members showed up at a welfare-to-work office to attend a program orientation.
Random assignment for the different sites took place as indicated by the table
below:

Site          Full Impact Sample      COS Sample

Atlanta           01/92-01/94         03/92-06/93

Grand Rapids      09/91-01/94         03/92-01/94

Riverside         06/91-06/93         09/91-05/93


The proportion of LFAs, HCDs, and control group members is roughly equal
in Atlanta and Grand Rapids, but not in Riverside.  In Riverside, a sample
member determined not to need basic education had a 50-50 chance of becoming
an LFA (because those "not in need" were not assigned to the HCD group),
whereas a sample member determined to need basic education had only a 1 in 3
chance of becoming an LFA. Therefore, those not in need of basic education are
overrepresented among the LFAs and control group members, and outcomes for
those determined not to need basic education unduly influence unweighted
LFA-control group comparisons. Researchers should therefore select HCDs, LFAs,
and control group members determined to be in need of basic education when
estimating the impacts of Riverside's HCD program, or when comparing the
relative effectiveness of the LFA versus the HCD approach in Riverside.
Moreover, researchers should select Atlanta and Grand Rapids HCDs and
control group members who had neither a high school diploma nor a GED
certificate before random assignment when comparing impact results to those
of the HCD approach in Riverside.
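
The overrepresentation described above follows from the assignment
probabilities alone.  The Python sketch below works through the arithmetic for
hypothetical intake counts; only the 1-in-2 and 1-in-3 probabilities come from
this guide, and the function name is illustrative.

```python
from fractions import Fraction

# Probability of assignment to the LFA group in Riverside (from this guide).
P_LFA = {"not_in_need": Fraction(1, 2), "in_need": Fraction(1, 3)}


def expected_lfa_share_not_in_need(n_not_in_need, n_in_need):
    """Expected share of LFAs who were 'not in need', given intake counts."""
    lfa_not_in_need = n_not_in_need * P_LFA["not_in_need"]
    lfa_in_need = n_in_need * P_LFA["in_need"]
    return lfa_not_in_need / (lfa_not_in_need + lfa_in_need)
```

With a hypothetical intake of 600 sample members in each status, those not in
need make up half of the intake population but an expected 3/5 of the LFA
group (300 of 500), which is why unweighted LFA-control comparisons lean
toward their outcomes.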



V. USING THE DATA

The outcome measures contained in this data file should be merged (by IDNUMBER)
with data from other public use data files from the NEWWS Evaluation. These
files contain information on COS sample members and on additional members
of the NEWWS research samples:

Five-Year Client Survey (CD #5): contains outcome measures for adult
respondents in 4 sites and a limited number of outcomes for children (including
the COS focal child) in Section L of the survey.

See N5PS_CBK.TXT (the codebook for the 5-Year Client Survey) for more
information on child outcome measures on this data file. To identify the focal
child in each family using Section L, we used the following SAS code:

array focal{10} L5FC01 L5FC02 L5FC03 L5FC04 L5FC05
                L5FC06 L5FC07 L5FC08 L5FC09 L5FC10;
foclchld = 0;
do a = 1 to 10;
   if focal{a} = 1 then foclchld = foclchld + 1;
end;
label foclchld = 'Checks for Focal Child at 5yr';

if foclchld = 0 then do;
   if snatchld = 1 then flglfc = 5;    /* missing, should have answered */
   if snatchld = 2 then flglfc = 2;    /* skipped appropriately */
   if schlive = 3 then flglfc = 2;     /* skipped appropriately, is deceased */
end;

if foclchld = 1 then do;
   if snatchld = 1 then flglfc = 1;    /* valid answer, good result */
   if snatchld = 2 then flglfc = 3;    /* answered, should have skipped */
end;

if foclchld gt 1 then flglfc = 4;      /* out of range */
label flglfc = 'Valid-data flag for foclchld';

There were 16 individuals originally coded as focal children, but they did not
have any data from the Parent Self-Administered Questionnaire, the Child
Self-Administered Questionnaire, or the Woodcock-Johnson Tests of
Achievement--Revised.  Because of this, we dropped them from our focal child
sample.  In addition, 4 individuals who were focal children (as evidenced by
available data on the Parent Self-Administered Questionnaire, the Child
Self-Administered Questionnaire, or the Woodcock-Johnson Tests of
Achievement--Revised), were not marked as focal children in Section L.  We
recoded to mark them as focal children.

Users can also identify the focal child and can calculate additional outcomes
from Section L for this child.
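
For users working outside SAS, the focal child check shown earlier in this
section might be translated into Python roughly as follows.  This is a sketch
that mirrors the SAS statements one for one; the function name is hypothetical,
None stands in for a SAS missing value, and the meanings of SNATCHLD and SCHLIVE
should be confirmed in N5PS_CBK.TXT.

```python
def focal_child_flag(l5fc_values, snatchld, schlive=None):
    """Return (foclchld, flglfc) for one family.

    l5fc_values: the ten Section L indicators L5FC01-L5FC10.
    """
    foclchld = sum(1 for value in l5fc_values if value == 1)
    flglfc = None  # stays missing if no condition below applies

    if foclchld == 0:
        if snatchld == 1:
            flglfc = 5  # missing, should have answered
        if snatchld == 2:
            flglfc = 2  # skipped appropriately
        if schlive == 3:
            flglfc = 2  # skipped appropriately, is deceased
    elif foclchld == 1:
        if snatchld == 1:
            flglfc = 1  # valid answer
        if snatchld == 2:
            flglfc = 3  # answered, should have skipped
    else:
        flglfc = 4  # more than one child marked as focal: out of range

    return foclchld, flglfc
```

Note that this reproduces only the flagging logic; it does not apply the
manual corrections described above (the 16 dropped cases and the 4 recoded
cases).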

Note that Section L contains a limited amount of information for 262
additional children who were 3 to 5 years old at baseline.  These children were
not part of the Child Outcomes Study sample for the following reasons:  Of the 262
families, 203 families had moved out of the survey area by the time of the
five-year survey.  In an additional 57 of these families, the named focal child
was not the mother's biological child.  One duplicate case was dropped and one
family was dropped because the focal child was deceased at the five-year
follow-up point.


Data from N5PC1730.TXT should also be merged with data from:


Five-Year Full Impact Sample (CD #1), which contains (for all 44,569 members of
the Full Impact Sample) background characteristics needed to calculate program
impacts and define key subgroups, measures that indicate membership in the
key samples, and outcome measures calculated from administrative records.

Two-Year Client Survey (CD #2):  contains outcome measures for adult respondents
in 7 sites and a limited number of outcomes for children (including the COS
focal child) recorded after 2 years of follow-up.

Two-Year Child Outcomes Study Survey (CD #3): contains outcome measures for COS
focal children recorded after 2 years of follow-up.


NOTE: The Two-Year COS Survey sample includes 3,018 respondents. Of these, 855
were not interviewed for the Five-Year COS Survey.

To summarize:

Number of respondents in Two-Year and Five-Year COS Surveys

                   Five-Year COS Respondent
Two-Year COS
Respondent          Yes       No    Total

Yes                2,163      855   3,018

No                   169      --

Total              2,332
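
After merging, the overlap in the table above can be checked with a short
Python sketch such as the one below.  It assumes the IDNUMBER values from the
two COS files have already been read into lists; the function name is
hypothetical.

```python
def cross_wave_counts(two_year_ids, five_year_ids):
    """Count respondents by participation in the two COS survey waves."""
    two_year = set(two_year_ids)
    five_year = set(five_year_ids)
    return {
        "both": len(two_year & five_year),
        "two_year_only": len(two_year - five_year),
        "five_year_only": len(five_year - two_year),
    }
```

Applied to the full files, the three counts should match the table: 2,163
respondents in both surveys, 855 in the Two-Year COS Survey only, and 169 in
the Five-Year COS Survey only.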


Two-Year Literacy and Math Test Scores (CD #4): Two-Year COS Survey respondents
took the literacy test but not the math test.


We strongly suggest that users of this file do the following before conducting
any further analyses:

1.  Read the C5README file, which gives a brief description of all files
included on the CD-ROM.

2.  Read the report, particularly Chapter 2, which describes the research
design, samples, and data sources; Chapter 10, which summarizes the impacts of
each program on child care and child activities; and Chapter 12, which
summarizes the impacts of each program on children's developmental outcomes
for the Child Outcomes Study sample.

3.  Review the tables located within the \TABLES subdirectory. All of the tables
are annotated with the appropriate variable names.

4.  Review the rest of the documentation on the Five-Year COS Survey public use
file, N5PC1730.TXT, including the Child Outcomes Study codebook (N5PC_CBK.TXT),
background memo on source and created measures (N5PCVARS.TXT), file layouts and
output.

5. After reading the data into SAS or another statistical or econometric
software package, replicate the sample sizes and means.
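
As one way to carry out step 5, the Python sketch below compares site counts
with the COS sample sizes in Section III.  It assumes the user has already
merged in ALPHSITE from the Five-Year Full Impact Sample file (CD #1) and uses
the site codes shown in Section VII (1 = Atlanta, 4 = Grand Rapids,
7 = Riverside); the function name is hypothetical.

```python
from collections import Counter

# COS sample sizes from Section III, keyed by ALPHSITE code.
EXPECTED_COS = {1: 967, 4: 624, 7: 741}


def site_count_mismatches(alphsite_values, expected=EXPECTED_COS):
    """Return {site: (observed, expected)} for sites whose counts differ."""
    counts = Counter(alphsite_values)
    return {
        site: (counts.get(site, 0), n_expected)
        for site, n_expected in expected.items()
        if counts.get(site, 0) != n_expected
    }
```

An empty dictionary means all three site counts were replicated.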


VI. ESTIMATING PROGRAM IMPACTS

A. IMPORTANT First Step: Modify Values of Focal Child Gender

As discussed in Section VII., Child Trends controlled for the focal child's
gender (and other background characteristics) when estimating program impacts
on child outcomes. FCGENDER (focal child's gender) was saved to BOTH the
Two-Year and Five-Year data files.  However, the value for female DIFFERS on
each file (Two-Year: 2=FEMALE; Five-Year: 0=FEMALE).  Researchers who merge the
Two-Year and Five-Year files should recode the Two-Year FCGENDER from 2 to 0.
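
A minimal Python sketch of this recode is shown below.  The function name is
hypothetical, and it changes only the female code named in this guide; every
other value, including missing values (represented here as None), is passed
through unchanged.

```python
def harmonize_two_year_fcgender(value):
    """Recode the Two-Year female code (2) to the Five-Year female code (0)."""
    return 0 if value == 2 else value
```

The recode should be applied to the Two-Year values before the files are
combined, so that FCGENDER carries one coding scheme in the merged data.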

B. Weighting

For research purposes, certain populations were oversampled when choosing
respondents to the Five-Year Client- and COS Survey samples.  Therefore,
analyses must be weighted in order to yield results that are representative of
the (site-specific) populations from which the samples were drawn.  For all
programs except Riverside LFA, researchers should use FLD5WGT, the weight
variable for respondents to the Five-Year Client Survey, for estimating
impacts for COS sample members (aggregate and subgroup).  FLD5WGT is stored on
the Five-Year Client Survey public use data file (CD #5).

A series of weight variables has been calculated for estimating impacts of
the Riverside LFA program for the COS sample and subgroups.  These measures are
stored on N5PC1730.TXT.  Researchers should use RILFAWT in impact
calculations that include Riverside COS LFAs and control group members and
RILFAWTT in analyses for Riverside COS LFAs and control group members for whom
teacher data are also available.

Analyses of subgroups assigned to Riverside's LFA or control group also
require special weights.  The weight to use when running impacts of Riverside's
LFA program for COS female focal children is RILFAWTF, and the corresponding
weight for analyses of COS male focal children is RILFAWTM.  The corresponding
weights for female and male focal children in the teacher survey sample are
RILFWTFT and RILFWTMT, respectively.

Similarly, the weights to use when running impacts of Riverside's LFA program
for the "least disadvantaged," "moderately disadvantaged," and "most
disadvantaged" subgroups in the parent-child and teacher survey samples are
shown below:

Parent-child survey sample

Least disadvantaged       RILFWTLS
Moderately disadvantaged  RILFWTMD
Most disadvantaged        RILFWTMS

Teacher survey sample

Least disadvantaged       RILWTLST
Moderately disadvantaged  RILWTMDT
Most disadvantaged        RILWTMST

Finally, though impacts by race for the COS sample(s) were not presented in
the five-year report, appropriate weights have been created for use in these
analyses of Riverside's LFA program.  See below:

Parent-child survey sample

Black     RILFWTBL
White     RILFWTWH
Hispanic  RILFWTHI

Teacher survey sample

Black     RILWTBLT
White     RILWTWHT
Hispanic  RILWTHIT
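
The weighting rules in this section can be collected into a lookup table, as in
the Python sketch below.  The weight variable names are taken from this guide;
the function name and the sample and subgroup labels are hypothetical.  The
sketch returns FLD5WGT for every program other than Riverside LFA, which is how
this guide's instruction for COS sample members is read here.

```python
RIVERSIDE_LFA_WEIGHTS = {
    "parent_child": {
        "all": "RILFAWT", "female": "RILFAWTF", "male": "RILFAWTM",
        "least": "RILFWTLS", "moderate": "RILFWTMD", "most": "RILFWTMS",
        "black": "RILFWTBL", "white": "RILFWTWH", "hispanic": "RILFWTHI",
    },
    "teacher": {
        "all": "RILFAWTT", "female": "RILFWTFT", "male": "RILFWTMT",
        "least": "RILWTLST", "moderate": "RILWTMDT", "most": "RILWTMST",
        "black": "RILWTBLT", "white": "RILWTWHT", "hispanic": "RILWTHIT",
    },
}


def weight_variable(program, sample="parent_child", subgroup="all"):
    """Name of the weight variable to use for a given impact analysis."""
    if program != "Riverside LFA":
        return "FLD5WGT"  # stored on the Five-Year Client Survey file (CD #5)
    return RIVERSIDE_LFA_WEIGHTS[sample][subgroup]
```

For example, weight_variable("Riverside LFA", "teacher", "most") returns
RILWTMST, while weight_variable("Atlanta HCD") returns FLD5WGT.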


C. Missing Values for Outcome Measures

The user should note that, to retain the experimental design, as many of the
2,332 cases as possible should be included in impacts analyses.  Most
importantly, respondents who were appropriately skipped out of questions
because they did not apply to them should be assigned a "0" on the skipped
items, thereby retaining these cases in impact analyses.  Note that "truly"
missing values on outcome measures (showing no response when one was expected)
were neither "hard-coded" to 0 in the data file nor treated as 0 in analyses.
Instead, cases with missing values for a given outcome were effectively
listwise deleted from the analyses (using PROC GLM; see below).
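
The two kinds of missing data described above might be handled as in the Python
sketch below.  The markers used for "appropriately skipped" and "truly missing"
are hypothetical stand-ins; the actual codes in N5PC1730.TXT are documented in
the codebook (N5PC_CBK.TXT), and the function names are illustrative.

```python
SKIPPED = "skipped"  # hypothetical marker for an appropriate skip
MISSING = None       # hypothetical marker for a truly missing response


def recode_skips(values):
    """Assign 0 to appropriately skipped items; leave other values unchanged."""
    return [0 if value == SKIPPED else value for value in values]


def complete_cases(ids, values):
    """Drop cases whose outcome is truly missing, as listwise deletion would."""
    return [
        (idnumber, value)
        for idnumber, value in zip(ids, recode_skips(values))
        if value is not MISSING
    ]
```

Skipped cases stay in the analysis with a value of 0, while truly missing
cases drop out of the model for that outcome only.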


VII. TUTORIAL

All impacts analyses were run separately within site, selecting only the
applicable program group (b = HCD group; j = LFA group) and the applicable
control group (all Cs in Atlanta, all Cs in Grand Rapids, all Cs when assessing
the impacts of Riverside's LFA program, and only "in-need" Cs when assessing
the impacts of Riverside's HCD program).

Impacts analyses used OLS regression methods (PROC GLM, in SAS).  The SAS
language used to run all impacts analyses, with "b" used in models testing the
impact of HCD programs and "j" used in models testing the impact of LFA
programs, is shown below.

PROC GLM;
   CLASS b;                   /* use j in place of b for LFA models */
   WHERE alphsite = 1;        /* choose 1 site each time:
                                 1 (Atlanta)
                                 4 (Grand Rapids)
                                 7 (Riverside) */
   MODEL <outcome> = b        /* or j */
      marstat twochild thrchild BLACK5 BLACKM NOTBW5 agep hsdip yremp emppq1
      yrearn yrearnsq pearn1 recpc1 yrrec gyradc yrkrec yrrfs gyrfs yrkrfs
      FCGENDER5 FCGENDERM LOWREADR LOWMATHR LOWREADM LOWMATHM /
      solution;
   LSMEANS b / PDIFF STDERR;  /* or j */
   WEIGHT FLD5WGT;            /* or the appropriate Riverside LFA weight */
run;
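
Before fitting the full model, users may find it helpful to compute a simple
weighted difference in means as a plausibility check.  The Python sketch below
does this; it is NOT the regression-adjusted estimate produced by the PROC GLM
model above and will not match the report's impacts.  The function names and
the 0/1 group indicator "in_program" are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean of values; raises ValueError if the weights sum to zero."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def unadjusted_impact(rows, outcome, weight, group="in_program"):
    """Weighted program-group mean minus weighted control-group mean.

    rows: list of dicts for one site, already restricted to the applicable
    program and control groups.  Cases with a missing outcome (None) are
    dropped, mirroring listwise deletion.
    """
    def group_mean(flag):
        kept = [r for r in rows if r[group] == flag and r[outcome] is not None]
        return weighted_mean([r[outcome] for r in kept],
                             [r[weight] for r in kept])

    return group_mean(1) - group_mean(0)
```

A large gap between this unadjusted difference and the regression-adjusted
impact may signal a problem with sample selection, weights, or covariates.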

NOTE:

1) ALPHSITE, B, J, C, and most of the covariates listed above are stored in the
Five-Year Full Impact Sample data file (CD #1). However, as will be explained
below, some measures have slightly different names:  BLACK, NOTBW, LOWREAD, and
LOWMATH.

2) The original impact model for the Final Report included the age (in months)
of the focal child (CHAGERAD5) as a covariate.  This measure is not available in
this public use file.  Moreover, values of most other covariates have been
modified (rounded or collapsed into fewer categories) to protect sample members'
confidentiality.  See the codebook for the Five-Year Full Impact Sample data
file (CD #1) for details.  For these reasons, researchers will obtain slightly
different impact results from those which appear in tables of the Final Report.


3) The following baseline covariates contained some missing data:  BLACK,
FCGENDER, LOWREAD, LOWMATH.  Prior to estimating impacts, missing values in the
COS sample were temporarily assigned the value of 0; these values were copied
to BLACK5, FCGENDER5, LOWREADR, LOWMATHR and were used in all impact models.
Missing value dummy variables were also created (BLACKM, FCGENDERM, LOWREADM,
LOWMATHM) to note cases whose values were missing (and thus changed to 0);
these missing dummy variables were also included in the impact models.

(NOTE: BLACK has no missing values in the public use data file.)

Researchers may, if they wish, address the problem of missing values in a
different way (e.g., by imputing values) when estimating program impacts.
Changing Child Trends' strategy for dealing with missing values will
introduce additional small changes to impact results displayed in the Final
Report.


4) None of the following measures were saved to this or any other NEWWS
public use file: BLACK5, BLACKM, NOTBW5, FCGENDER5, FCGENDERM, LOWREADR,
LOWMATHR.  Researchers wishing to calculate impacts as described above will
need to create these measures.

5) Child Trends used SAS, Version 8 to estimate program impacts.  This
version allows programmers to create variable names with more than 8 characters.
Researchers using software without this capability will need to create shorter
names for FCGENDER5 and FCGENDERM if they choose to follow the procedure
outlined above.

6) When analyzing impacts for subgroups, only cases with non-missing values on
these subgroup variables should be used.  For instance, FCGENDER (and not
FCGENDER5) should be used when analyzing impacts by focal child gender.
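
The missing-value strategy in notes 3 and 4 can be sketched in Python as
follows.  The variable names are those given in this guide; the function name
is hypothetical, and None stands in for whatever missing-value code the files
actually use.  Only the four covariates named in note 3 are handled.

```python
# Source covariate -> (zero-filled copy, missing-value dummy), per note 3.
MISSING_DUMMY_NAMES = {
    "BLACK": ("BLACK5", "BLACKM"),
    "FCGENDER": ("FCGENDER5", "FCGENDERM"),
    "LOWREAD": ("LOWREADR", "LOWREADM"),
    "LOWMATH": ("LOWMATHR", "LOWMATHM"),
}


def add_missing_dummies(row):
    """Return a copy of row with zero-filled covariates and missing dummies."""
    out = dict(row)
    for source, (filled, dummy) in MISSING_DUMMY_NAMES.items():
        value = row.get(source)
        out[filled] = 0 if value is None else value
        out[dummy] = 1 if value is None else 0
    return out
```

The original covariates are left untouched in the returned row, so FCGENDER
remains available for the subgroup analyses described in note 6.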
