Pre-Screen - Data Pre-Screening Toolbox
- Designed and developed for practising chemometricians, engineers and data scientists who wish to pre-process and pre-screen process data prior to multivariate data analysis, process data modelling or building predictive and inferential models.
- The initial data cleaning and conditioning tasks can consume up to 75% of the modelling time.
- The core feature of Pre-Screen is that it makes the analysis of large data sets as visual and automated as possible without taking away the need for engineering science understanding.
A Multivariate Statistical Data Pre-screening Toolbox (Pre-Screen) has been designed and developed for process engineers who wish to pre-process and pre-screen process data prior to multivariate data analysis, process data modelling or building predictive and inferential models. The toolbox was developed specifically for pre-screening large industrial data sets, but it is also applicable to other large analytical data sets.
In today’s industrial environment, where a large number of highly collinear and noisy process variables are collected for use in process modelling or performance monitoring, the initial data cleaning and conditioning task can consume up to 75% of the modelling time. The core function of Pre-Screen is that it makes the analysis of large data sets as automated as possible without taking away the need for engineering science understanding.
The toolbox is built on the MATLAB programming environment, with powerful user interface procedures providing user-friendly, mouse- and menu-driven software. The main features of Pre-Screen include:
- Data tags
- Load and save facilities
- Data plotting
- Normality (univariate and multivariate)
- Summary statistics (means, standard deviations, covariances, correlations, skewness and kurtosis)
- Missing data analysis and rectification
- Spurious (outlier) data elimination
- Data transformations (mathematical and time shifting)
- Data filtering
- Cross-correlation
- Scatter plots to observe possible relationships
- Loadings and contribution plots
- Histogram plots
- Normal probability plots
- Action tracking
- Plot copying to Word files
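
To make a few of the features above concrete, the following is a minimal sketch of three typical pre-screening steps on a single process variable: counting missing values, computing summary statistics, and flagging spurious readings. It is written in plain Python rather than MATLAB and is not taken from Pre-Screen. The function names (`summarize`, `flag_outliers`), the use of NaN to mark missing samples, and the median-based outlier rule with a threshold of 3.5 are all assumptions made for illustration; the toolbox may use different methods.

```python
import math
import statistics


def summarize(values):
    """Summary statistics for one variable; NaN entries count as missing."""
    present = [v for v in values if not math.isnan(v)]
    n = len(present)
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)  # population standard deviation
    if std == 0:
        skewness = kurtosis = 0.0
    else:
        # Third and fourth standardized moments (kurtosis reported as excess).
        skewness = sum(((v - mean) / std) ** 3 for v in present) / n
        kurtosis = sum(((v - mean) / std) ** 4 for v in present) / n - 3.0
    return {
        "n": n,
        "missing": len(values) - n,
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


def flag_outliers(values, threshold=3.5):
    """Flag spurious points using a modified z-score (median and MAD).

    Assumed rule for illustration only. Missing (NaN) entries are never
    flagged here; they are handled by the missing-data step instead.
    """
    present = [v for v in values if not math.isnan(v)]
    median = statistics.median(present)
    mad = statistics.median([abs(v - median) for v in present])
    if mad == 0:
        return [False] * len(values)
    return [
        (not math.isnan(v)) and abs(0.6745 * (v - median) / mad) > threshold
        for v in values
    ]


# Hypothetical temperature readings with one gap and one spike.
readings = [10.1, 9.9, 10.0, float("nan"), 10.2, 55.0, 9.8]
stats = summarize(readings)
flags = flag_outliers(readings)
print(stats["n"], stats["missing"])
print(flags)
```

A median-based rule is used in this sketch because a single large spike inflates the mean and standard deviation, which can hide the spike from a conventional z-score test. Whatever rule is applied, the flagged points should still be reviewed against process knowledge before they are removed.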