StackAllDataSets.sas - Stacks (appends) all data sets in a library into a single data set. This macro determines the maximum length of each variable. These lengths are used in each variable's definition in the output data set. As a result, the macro does not truncate data. If the data sets being stacked have variables with different type definitions (char/numeric), the macro will instead generate a report data set listing each variable's type incompatibilities.

RemoveUnusedColumns.sas - Analyzes an input data set for columns that have no (null) data. It then outputs a second data set in which the columns without data have been removed. By using the optional “AlwaysKeep” parameter, certain columns can always be kept, even when they do not contain data.

StripDownAllCharVariables.sas - Applies the SAS function strip() to every character variable in a dataset. This results in all leading and trailing spaces being removed from all character variables.

StandardizeNames.sas - Applies a set of business rules to standardize the first, middle and last names in a dataset. For example, the macro converts all names to upper case, strips out generational suffixes such as JR or III from the name fields and places these in a “Suffix” field, etc.

DenullifyDataset.sas - Sets to empty all text fields in a dataset that are equal to a given value. For example, this macro can be used to set all instances of "NULL" to the empty string.

ConvertFlags.sas - Converts all flags in all character fields in a dataset from one defined set of two values to another defined set of two values. Can be used, for example, to convert all Y/N flags to 1/0 flags, or vice versa.

DeleteSuperfluousRecords.py - A Python implementation of DeleteSuperfluousRecords.sas. The zipped folder also contains a set of Pytest unit tests for delete_superfluous_records.py - (Version 0.1) April 11, 2022

A superfluous record here means a record that contains information that is a strict subset of another record. An example is with name data, where one record has a first name, last name and a complete middle name, and other records have the same first and last names, but they have only the middle initial or no initial at all. In this case, the record with the complete middle name has more complete information than the other records, so this macro will delete the other records.
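As a rough illustration of the strict-subset rule behind DeleteSuperfluousRecords, here is a minimal Python sketch. It is not the code from DeleteSuperfluousRecords.py. The function names (is_covered, is_strict_subset, delete_superfluous), the initial_fields parameter, and the rule that a one-letter initial counts as a subset of a full name starting with that letter are all assumptions made for this example.

```python
def is_covered(value, other, allow_initial=False):
    """Return True if `value` adds nothing beyond `other` for one field."""
    value, other = value.strip(), other.strip()
    if value == "" or value == other:
        return True
    # Assumed rule: a single-letter initial is covered by a longer
    # value that starts with the same letter (e.g. "Q" vs "Quincy").
    return allow_initial and len(value) == 1 and other.startswith(value)


def is_strict_subset(rec, other, fields, initial_fields=()):
    """True if `rec` is covered by `other` in every field and differs in at least one."""
    if all(rec[f].strip() == other[f].strip() for f in fields):
        return False  # identical records are not *strict* subsets of each other
    return all(
        is_covered(rec[f], other[f], allow_initial=f in initial_fields)
        for f in fields
    )


def delete_superfluous(records, fields, initial_fields=()):
    """Keep only the records that are not a strict subset of some other record."""
    kept = []
    for i, rec in enumerate(records):
        superfluous = any(
            j != i and is_strict_subset(rec, other, fields, initial_fields)
            for j, other in enumerate(records)
        )
        if not superfluous:
            kept.append(rec)
    return kept


# Hypothetical sample data mirroring the name example in the text.
people = [
    {"first": "John", "middle": "", "last": "Smith"},
    {"first": "John", "middle": "Q", "last": "Smith"},
    {"first": "John", "middle": "Quincy", "last": "Smith"},
    {"first": "Jane", "middle": "", "last": "Doe"},
]
kept = delete_superfluous(
    people, fields=["first", "middle", "last"], initial_fields=["middle"]
)
```

With this sample, only the John Quincy Smith and Jane Doe records remain in kept. Exact duplicates are deliberately left alone in this sketch, since neither copy is a strict subset of the other. Whether the real macro handles duplicates that way is not stated in the text.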