Data transformation
The data are transformed into forms appropriate for mining.
Data transformation tasks:
- Normalization
- Attribute construction
- Aggregation
- Attribute Subset Selection
- Discretization
- Generalization
- Normalization: the attribute data are scaled so as to fall within a small specified range, such as -1.0 to 1.0 or 0.0 to 1.0.
- Attribute construction (or feature construction): new attributes are constructed and added from the given set of attributes to help the mining process.
- Aggregation: summary or aggregation operations are applied to the data. For example, the daily sales data may be aggregated so as to compute monthly and annual total amounts.
- Discretization: the range of a continuous attribute is divided into intervals. For example, values for numerical attributes, like age, may be mapped to higher-level concepts, like youth, middle-aged, and senior.
- Generalization: low-level or “primitive” (raw) data are replaced by higher-level concepts through the use of concept hierarchies. For example, categorical attributes, like street, can be generalized to higher-level concepts, like city or country.
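
The tasks defined above can be illustrated with a minimal Python sketch. It is not a reference implementation. All function names are hypothetical, min-max scaling is only one possible normalization method, the width/height attributes are an assumed example of attribute construction, the age cut points (30 and 60) are assumptions because the notes do not specify them, and the street-to-city table is made-up sample data standing in for a concept hierarchy.

```python
from collections import defaultdict


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Normalization: linearly rescale values into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to the lower bound.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def construct_area(records):
    """Attribute construction: add an 'area' attribute from width and height."""
    return [{**r, "area": r["width"] * r["height"]} for r in records]


def aggregate_monthly(daily_sales):
    """Aggregation: sum (date, amount) pairs per month ('YYYY-MM' prefix)."""
    totals = defaultdict(float)
    for date, amount in daily_sales:
        totals[date[:7]] += amount
    return dict(totals)


def discretize_age(age):
    """Discretization: map a numeric age to a label (cut points are assumed)."""
    if age < 30:
        return "youth"
    if age < 60:
        return "middle-aged"
    return "senior"


# Hypothetical one-level concept hierarchy: street -> city.
STREET_TO_CITY = {"Main St": "Springfield", "Oak Ave": "Shelbyville"}


def generalize_street(street, hierarchy=STREET_TO_CITY):
    """Generalization: replace a street with its higher-level concept (city)."""
    return hierarchy.get(street, "unknown")
```

For example, `min_max_normalize(values, -1.0, 1.0)` scales into the -1.0 to 1.0 range mentioned above, and `aggregate_monthly` expects dates as ISO strings such as `"2024-01-05"`. Annual totals could be computed the same way by grouping on the first four characters of the date instead.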