*---SAS data step (template) to read the pl94-2000 files, filtering by summary level---;

filename geos '';
filename data1 '';
filename data2 '';
libname sasout '';
*--replace "sasout" with "work" if you do not want to save output data sets for future use;

data sasout.counties sasout.mcds;   *--you can put it all on one data set as well--;

  infile geos lrecl=512 obs=max;
  input sumlev $9-11 recidgeo 19-25 @;
    *--read summary level from geo record and hold the record with trailing @;
  if sumlev in ('040','050','060') then keep=1;
  else keep=0;
  if keep then input <....the current geos file record with geocodes, etc.....>;
  else input;   *--to release trailing @ and effectively ignore the current record-;

  infile data1 lrecl=1024 obs=max missover dsd;   *<---------------Read input file 1----------------;
  if keep then input _first14 $char14. recid 7. +1 PL1i1-PL1i71 PL2i1-PL2i73;
    *-or some such to read the data-;
  else input;   *--this "null" input statement will cause the record to be effectively ignored;

  infile data2 lrecl=1024 obs=max missover dsd;   *<---------------Read input file 2----------------;
  if keep then input _first14 $char14. recid2 7. +1 PL3i1-PL3i71 PL4i1-PL4i73;
    *-or some such to read the data-;
  else input;   *--this "null" input statement will cause the record to be effectively ignored;

  if keep=0 then delete;
    *--no data (except sumlev and recidgeo) has been read - go to next records.
       The delete statement tells SAS not to output an observation-;

  *--at this point keep=1 and you have read all the data for 1 geographic entity.
     Verify that all the recids are in synch and then output to the data set(s)--;
  if recid ne recidgeo or recid2 ne recidgeo then do;   *<---------------Verify file synchronization------;
    file log;
    put //'************Data Problem: The recid fields on the 3 input files do not match '/
          'each other as expected and required***** ' recidgeo= recid= recid2= _n_= ;
    list;
    stop;
  end;

  *--this statement illustrates ease of sending different geo level to different data sets.
     If you are not doing this - if you are only creating a single set - then no output
     statement is required;
  if sumlev in ('040','050') then output sasout.counties;
  else output sasout.mcds;
run;
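
*--Optional check (a sketch added for illustration, not part of the original template).
   It assumes the data step above ran cleanly and created sasout.counties and sasout.mcds.
   Counting observations by sumlev is one quick way to confirm that the summary level
   filter and the output routing behaved as intended - counties should hold only
   levels 040 and 050 and mcds only level 060;
proc freq data=sasout.counties;
  tables sumlev;
run;

proc freq data=sasout.mcds;
  tables sumlev;
run;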