Statistics and chemometrics for analytical chemistry PDF


Summary

Statistics and Chemometrics for Analytical Chemistry. This popular textbook gives a clear account of the principles of the main statistical methods used in modern analytical laboratories. Such methods underpin high quality analyses i...



Statistics and Chemometrics for Analytical Chemistry James N Miller & Jane C Miller Fifth edition


We work with leading authors to develop the strongest educational materials in chemistry, bringing cutting-edge thinking and best learning practice to a global market. Under a range of well-known imprints, including Prentice Hall, we craft high quality print and electronic publications which help readers to understand and apply their content, whether studying or at work. To find out more about the complete range of our publishing please visit us on the World Wide Web at: www.pearsoned.co.uk


Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England

and Associated Companies throughout the world

Visit us on the World Wide Web at: www.pearsoned.co.uk

Third edition published under the Ellis Horwood imprint 1993
Fourth edition 2000
Fifth edition 2005

© Ellis Horwood Limited 1993
© Pearson Education Limited 2000, 2005

The rights of J. N. Miller and J. C. Miller to be identified as authors of this Work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written permission of the Publishers or a licence permitting restricted copying in the United Kingdom issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP.

All trademarks used herein are the property of their respective owners. The use of any trademark in this text does not vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such trademarks imply any affiliation with or endorsement of this book by such owners. Microsoft is a trademark of Microsoft Corporation; MINITAB is a registered trademark of Minitab, Inc.; The Unscrambler is a trademark of Camo ASA.

ISBN 0 131 29192 0

British Library Cataloguing-in-Publication Data
A catalogue record for this book can be obtained from the British Library

Library of Congress Cataloging-in-Publication Data
Miller, J. N. (James N.), 1943–
Statistics and chemometrics for analytical chemistry / James N. Miller and Jane C. Miller. -- 5th ed.
p. cm.
Includes bibliographical references.
ISBN 0-13-129192-0 (pbk.)
1. Chemistry, Analytic -- Statistical methods -- Textbooks. I. Miller, J. C. (Jane Charlotte) II. Title.
QD75.4.S8M54 2005
543'.072 -- dc22
2004060168

10 9 8 7 6 5 4 3 2 1
09 08 07 06 05

Typeset by 68 in 9.25/12pt Stone Serif
Printed in Great Britain by Ashford Colour Press, Gosport, Hants

Contents

Preface to the fifth edition
Preface to the first edition
Acknowledgements
Glossary of symbols

1 Introduction
1.1 Analytical problems
1.2 Errors in quantitative analysis
1.3 Types of error
1.4 Random and systematic errors in titrimetric analysis
1.5 Handling systematic errors
1.6 Planning and design of experiments
1.7 Calculators and computers in statistical calculations
Bibliography
Exercises

2 Statistics of repeated measurements
2.1 Mean and standard deviation
2.2 The distribution of repeated measurements
2.3 Log-normal distribution
2.4 Definition of a ‘sample’
2.5 The sampling distribution of the mean
2.6 Confidence limits of the mean for large samples
2.7 Confidence limits of the mean for small samples
2.8 Presentation of results
2.9 Other uses of confidence limits
2.10 Confidence limits of the geometric mean for a log-normal distribution
2.11 Propagation of random errors
2.12 Propagation of systematic errors
Bibliography
Exercises

3 Significance tests
3.1 Introduction
3.2 Comparison of an experimental mean with a known value
3.3 Comparison of two experimental means
3.4 Paired t-test
3.5 One-sided and two-sided tests
3.6 F-test for the comparison of standard deviations
3.7 Outliers
3.8 Analysis of variance
3.9 Comparison of several means
3.10 The arithmetic of ANOVA calculations
3.11 The chi-squared test
3.12 Testing for normality of distribution
3.13 Conclusions from significance tests
Bibliography
Exercises

4 The quality of analytical measurements
4.1 Introduction
4.2 Sampling
4.3 Separation and estimation of variances using ANOVA
4.4 Sampling strategy
4.5 Quality control methods – Introduction
4.6 Shewhart charts for mean values
4.7 Shewhart charts for ranges
4.8 Establishing the process capability
4.9 Average run length: cusum charts
4.10 Zone control charts (J-charts)
4.11 Proficiency testing schemes
4.12 Collaborative trials
4.13 Uncertainty
4.14 Acceptance sampling
Bibliography
Exercises

5 Calibration methods: regression and correlation
5.1 Introduction: instrumental analysis
5.2 Calibration graphs in instrumental analysis
5.3 The product–moment correlation coefficient
5.4 The line of regression of y on x
5.5 Errors in the slope and intercept of the regression line
5.6 Calculation of a concentration and its random error
5.7 Limits of detection
5.8 The method of standard additions
5.9 Use of regression lines for comparing analytical methods
5.10 Weighted regression lines
5.11 Intersection of two straight lines
5.12 ANOVA and regression calculations
5.13 Curvilinear regression methods – Introduction
5.14 Curve fitting
5.15 Outliers in regression
Bibliography
Exercises

6 Non-parametric and robust methods
6.1 Introduction
6.2 The median: initial data analysis
6.3 The sign test
6.4 The Wald–Wolfowitz runs test
6.5 The Wilcoxon signed rank test
6.6 Simple tests for two independent samples
6.7 Non-parametric tests for more than two samples
6.8 Rank correlation
6.9 Non-parametric regression methods
6.10 Robust methods – Introduction
6.11 Robust estimates of location and spread
6.12 Robust regression methods
6.13 Re-sampling statistics
6.14 Conclusions
Bibliography
Exercises

7 Experimental design and optimization
7.1 Introduction
7.2 Randomization and blocking
7.3 Two-way ANOVA
7.4 Latin squares and other designs
7.5 Interactions
7.6 Factorial versus one-at-a-time design
7.7 Factorial design and optimization
7.8 Optimization: basic principles and univariate methods
7.9 Optimization using the alternating variable search method
7.10 The method of steepest ascent
7.11 Simplex optimization
7.12 Simulated annealing
Bibliography
Exercises

8 Multivariate analysis
8.1 Introduction
8.2 Initial analysis
8.3 Principal component analysis
8.4 Cluster analysis
8.5 Discriminant analysis
8.6 K-nearest neighbour method
8.7 Disjoint class modelling
8.8 Regression methods
8.9 Multiple linear regression (MLR)
8.10 Principal components regression (PCR)
8.11 Partial least squares (PLS) regression
8.12 Artificial neural networks
8.13 Conclusions
Bibliography
Exercises

Solutions to exercises
Appendix 1: Commonly used statistical significance tests
Appendix 2: Statistical tables
Index

Supporting resources

Visit www.pearsoned.co.uk/miller to find valuable online resources

For instructors

• Complete, downloadable Instructor’s Manual
• PowerPoint slides that can be downloaded and used as OHTs

For more information please contact your local Pearson Education sales representative or visit www.pearsoned.co.uk/miller

Preface to the fifth edition

Since the publication of the fourth edition of this book in 2000 the use of statistics and chemometrics methods in the teaching and practice of the analytical sciences has continued to expand rapidly. Once again this new edition seeks to reflect these developments while retaining the basic aims of previous editions, which adopted a pragmatic and as far as possible non-mathematical approach to statistical calculations. We have also continued to include in the text examples solved with the aid of Microsoft Excel® and the familiar statistics package Minitab®, both widely available to teachers, students and researchers, and both in a welcome process of continuing development. Extras and macros for both programs are available through the Internet, in many cases without charge. The graphic and other facilities offered by these programs are further exploited in the Instructors’ Manual, which is again available to accompany this edition, and still further updates and examples for teachers and students are provided through the associated website. As in the last edition the solutions to the exercises given in this book are only outline ones – full solutions are given in the Instructors’ Manual. The main areas where new material has appeared in this edition are in Chapters 4–8. Chapter 4 includes an expanded treatment of control charts and additional material on uncertainty and on proficiency testing schemes. In Chapter 5 there is more material on the use of regression lines for method comparisons. Chapter 6 reflects the continuing growth of importance of robust methods, and Chapter 7 provides extra sections on factorial designs and on the simplex optimization method. The use of multivariate methods is now very common, so Chapter 8 includes an extended discussion of the principal components and partial least squares regression methods, and more on neural networks. 
In the earlier chapters on basic statistics the main changes are the greater emphasis on the Grubbs outlier test and a move of the section on Kolmogorov methods to Chapter 3. The bibliographies for each chapter have been updated, with rather more annotations than in the past, and with more emphasis on publications from standards organizations. As always we are very grateful to colleagues and correspondents who have pointed out minor errors (we remain responsible for any that are still there) and made other constructive suggestions. Once again we thank the Royal Society of Chemistry for permission to use examples taken from papers published in The Analyst, one of the world’s leading journals in this field. And we thank Simon Lake and his colleagues at Pearson Education for their patience, enthusiasm and professional expertise in a perfect mix.

James N. Miller
Jane C. Miller
October 2004

Preface to the first edition

To add yet another volume to the already numerous texts on statistics might seem to be an unwarranted exercise, yet the fact remains that many highly competent scientists are woefully ignorant of even the most elementary statistical methods. It is even more astonishing that analytical chemists, who practise one of the most quantitative of all sciences, are no more immune than others to this dangerous, but entirely curable, affliction. It is hoped, therefore, that this book will benefit analytical scientists who wish to design and conduct their experiments properly, and extract as much information from the results as they legitimately can. It is intended to be of value to the rapidly growing number of students specializing in analytical chemistry, and to those who use analytical methods routinely in everyday laboratory work. There are two further and related reasons that have encouraged us to write this book. One is the enormous impact of microelectronics, in the form of microcomputers and hand-held calculators, on statistics: these devices have brought lengthy or difficult statistical procedures within the reach of all practising scientists. The second is the rapid development of new ‘chemometric’ procedures, including pattern recognition, optimization, numerical filter techniques, simulations and so on, all of them made practicable by improved computing facilities. The last chapter of this book attempts to give the reader at least a flavour of the potential of some of these newer statistical methods. We have not, however, included any computer programs in the book – partly because of the difficulties of presenting programs that would run on all the popular types of microcomputer, and partly because there is a substantial range of suitable and commercially available books and software. The availability of this tremendous computing power naturally makes it all the more important that the scientist applies statistical methods rationally and correctly. 
To limit the length of the book, and to emphasize its practical bias, we have made no attempt to describe in detail the theoretical background of the statistical tests described. But we have tried to make it clear to the practising analyst which tests are appropriate to the types of problem likely to be encountered in the laboratory. There are worked examples in the text, and exercises for the reader at the end of each chapter. Many of these are based on the data provided by research papers published in The Analyst. We are deeply grateful to Mr. Phil Weston, the Editor, for allowing us thus to make use of his distinguished journal. We also thank our colleagues, friends and family for their forbearance during the preparation of the book; the sources of the statistical tables, individually acknowledged in the appendices; the Series Editor, Dr. Bob Chalmers; and our publishers for their efficient cooperation and advice.

J. C. Miller
J. N. Miller
April 1984

Acknowledgements

The publishers wish to thank the following for permission to reproduce copyright material: Tables A.2, A.3, A.4, A.7, A.8, A.11, A.12, A.13 and A.14 reproduced with the permission of Routledge; Table A.5 reproduced by permission of John Wiley & Sons, Limited; Table A.6 reprinted with permission from the Journal of the American Statistical Association, copyright 1958 by the American Statistical Association. All rights reserved; Table A.10 adapted with the permission of the Institute of Mathematical Statistics; data from articles published in The Analyst used with the permission of the Royal Society of Chemistry; examples of Minitab input and output used with the permission of Minitab, Inc.

Glossary of symbols

a – intercept of regression line
b – gradient of regression line
c – number of columns in two-way ANOVA
C – correction term in two-way ANOVA
C – used in Cochran’s test for homogeneity of variance
F – the ratio of two variances
G – used in Grubbs’ test for outliers
h – number of samples in one-way ANOVA
µ – arithmetic mean of a population
M – number of minus signs in Wald–Wolfowitz runs test
n – sample size
N – number of plus signs in Wald–Wolfowitz runs test
N – total number of measurements in two-way ANOVA
ν – number of degrees of freedom
P(r) – probability of r
Q – Dixon’s Q, used to test for outliers
r – product–moment correlation coefficient
r – number of rows in two-way ANOVA
r – number of smallest and largest observations omitted in trimmed mean calculations
R² – coefficient of determination
R′² – adjusted coefficient of determination
r_s – Spearman rank correlation coefficient
s – standard deviation of a sample
s_y/x – standard deviation of y-residuals
s_b – standard deviation of slope of regression line
s_a – standard deviation of intercept of regression line
s_(y/x)w – standard deviation of y-residuals of weighted regression line
s_x0 – standard deviation of x-value estimated using regression line
s_B – standard deviation of blank
s_xE – standard deviation of extrapolated x-value
s_x0w – standard deviation of x-value estimated by using weighted regression line
σ – standard deviation of a population
σ₀² – measurement variance
σ₁² – sampling variance
t – quantity used in the calculation of confidence limits and in significance testing of mean (see Section 2.4)
T – grand total in ANOVA
T1 and T2 – test statistics used in the Wilcoxon rank sum test
w – range
w_i – weight given to point on regression line
x̄ – arithmetic mean of a sample
x_0 – x-value estimated by using regression line
x_0 – outlier value of x
x̃_i – pseudo-value in robust statistics
x_E – extrapolated x-value
x̄_w – arithmetic mean of weighted x-values
χ² – quantity used to test for goodness-of-fit
ŷ – y-values predicted by regression line
ȳ_w – arithmetic mean of weighted y-values
y_B – signal from blank
z – standard normal variable

1 Introduction

1.1 Analytical problems

Practising analytical chemists face both qualitative and quantitative problems. As an example of the former, the presence of boron in distilled water is very damaging in the manufacture of microelectronic components – ‘Does this distilled water sample contain any boron?’. Again, the comparison of soil samples is a common problem in forensic science – ‘Could these two soil samples have come from the same site?’.

In other cases the problems posed are quantitative. ‘How much albumin is there in this sample of blood serum?’, ‘How much lead in this sample of tapwater?’, ‘This steel sample contains small quantities of chromium, tungsten and manganese – how much of each?’: these are typical examples of single-component or multiple-component quantitative analyses.

Modern analytical chemistry is overwhelmingly a quantitative science. In many cases a quantitative answer will be much more valuable than a qualitative one. It may be useful for an analyst to claim to have detected some boron in a distilled water sample, but it is much more useful to be able to say how much boron is present. The person who requested the analysis could, armed with this quantitative answer, judge whether or not the level of boron was of concern or consider how it might be reduced. But if it was known only that some boron was present it would be hard to judge the significance of the result. In other cases, it is only a quantitative result that has any value at all. For example, almost all samples of (human) blood serum contain albumin; the only question is, how much?

Even where a qualitative answer is required, quantitative methods are used to obtain it. In reality, an analyst would never simply report ‘I can/cannot detect boron in this water sample’. A quantitative method capable of detecting boron at, say, levels of 1 µg ml⁻¹ would be used. If the test gave a negative result, it would then be described in the form ‘This sample contains less than 1 µg ml⁻¹ boron’. If the test gave a positive result the sample would be reported to contain at least 1 µg ml⁻¹ boron (with other information too – see below).

Quantitative approaches might also be used to compare two soil samples. For example, they might be subjected to a ...

