Chemometrics: Interpretation of real-world data tables
Chemometrics concerns the mathematical and statistical modelling of real-world data tables in chemistry, e.g. in analytical chemistry, organic chemistry, biochemistry and chemical engineering. Over the last two decades, the data-analytical culture and method toolbox of chemometrics have been introduced to many other fields as well – food and agricultural sciences, sensory science, consumer science, quality evaluation and process analytical technology (PAT). Currently, bio-chemometrics is rapidly finding new uses in the integrative bio-sciences, such as bio-spectroscopy, metabolomics, functional genomics and systems biology. Data modelling in chemometrics serves various purposes, ranging from enhancing the selectivity of multi-channel measuring instruments by multivariate calibration, through cost/benefit optimisation of research by experimental design, to cross-disciplinary interpretation of large data tables by “multivariate soft modelling”.

Common to all these applications is the attempt to balance data-driven modelling with theory-driven modelling: pragmatic, graphically oriented multivariate data analysis, validated by easily understood re-sampling, is combined with domain-specific background knowledge to allow the user to discover, interpret and utilise patterns in data tables.

Chemometricians employ multivariate data modelling methods from many disciplines – theoretical and computational statistics, psychometrics and sensometrics, econometrics and machine learning. Some methods have been developed within chemometrics, such as Partial Least Squares (PLS) regression and various data pre-processing methods.

Near-infrared (NIR) instruments represent a great success story for chemometrics: fast, cheap multi-channel light-measuring instruments operating in the near-infrared range (750–2500 nm) are calibrated mathematically to provide precise determinations of chemical and physical qualities that would be prohibitively expensive to measure by traditional chemical or physical means. A wide range of NIR instruments is now used in many petrochemical and pharmaceutical process industries, as well as in food and agricultural quality assessment. The same approach is now being extended into molecular biology, production biology and medicine, based on other types of multivariate spectrometers.
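
The idea of calibrating a multi-channel instrument mathematically can be illustrated with a minimal sketch of multivariate calibration by PLS regression. Everything in it is an assumption made for illustration: the "spectra" are synthetic mixtures of two overlapping Gaussian bands (a hypothetical analyte and a hypothetical interferent), the noise level and band positions are arbitrary, and the function names are invented here. The model is a textbook PLS1 fitted by the NIPALS algorithm, not the procedure of any particular instrument or software package.

```python
import math
import random


def gaussian_band(n_channels, centre, width):
    """Hypothetical pure-component spectrum: one Gaussian absorption band."""
    return [math.exp(-((j - centre) / width) ** 2) for j in range(n_channels)]


def simulate_spectra(n_samples, rng, noise_sd=0.005, n_channels=40):
    """Synthetic mixtures: analyte + overlapping interferent + random noise."""
    s_analyte = gaussian_band(n_channels, 15, 5)
    s_interf = gaussian_band(n_channels, 22, 6)
    X, y = [], []
    for _ in range(n_samples):
        c_a, c_i = rng.random(), rng.random()  # unknown concentrations
        X.append([c_a * a + c_i * b + rng.gauss(0.0, noise_sd)
                  for a, b in zip(s_analyte, s_interf)])
        y.append(c_a)  # reference value of the analyte only
    return X, y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_components):
    """PLS1 regression by NIPALS, deflating X and y after each component."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]                 # unit-length loading weights
        t = [dot(row, w) for row in E]            # scores
        tt = dot(t, t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = dot(f, t) / tt                        # y-loading
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict y for one new spectrum by repeating the deflation steps."""
    x_mean, y_mean, components = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


def rmsep(model, X, y):
    """Root mean square error of prediction on samples not used for fitting."""
    errs = [(pls1_predict(model, x) - v) ** 2 for x, v in zip(X, y)]
    return math.sqrt(sum(errs) / len(errs))


rng = random.Random(0)
X_cal, y_cal = simulate_spectra(30, rng)    # calibration set
X_test, y_test = simulate_spectra(15, rng)  # independent test set
model = pls1_fit(X_cal, y_cal, n_components=2)
print("RMSEP with 2 PLS components:", round(rmsep(model, X_test, y_test), 4))
```

In this sketch no single channel is selective for the analyte, because the interferent band overlaps it; the selectivity comes from combining all channels in the calibration model. Assessing the error on an independent test set is a simple stand-in for the re-sampling validation mentioned earlier.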