Table of Contents

Title page

Preface to third edition

Preface to second edition

Preface to first edition

About the companion website

1: The whys and wherefores of statistics

1.1 Learning objectives

1.2 Aims of the book

1.3 What is statistics?

1.4 Statistics in veterinary and animal science

1.5 Evidence-based veterinary medicine

1.6 Types of variable

1.7 Variations in measurements

1.8 Terms relating to measurement quality

1.9 Populations and samples

1.10 Types of statistical procedures

1.11 Conclusion

2: Descriptive statistics

2.1 Learning objectives

2.2 Summarizing data

2.3 Empirical frequency distributions

2.4 Tables

2.5 Diagrams

2.6 Numerical measures

2.7 Reference interval

3: Probability and probability distributions

3.1 Learning objectives

3.2 Probability

3.3 Probability distributions

3.4 Discrete probability distributions

3.5 Continuous probability distributions

3.6 Relationships between distributions

4: Sampling and sampling distributions

4.1 Learning objectives

4.2 Distinction between the sample and the population

4.3 Statistical inference

4.4 Sampling distribution of the mean

4.5 Confidence interval for a mean

4.6 Sampling distribution of the proportion

4.7 Confidence interval for a proportion

4.8 Bootstrapping and jackknifing

5: Experimental design and clinical trials

5.1 Learning objectives

5.2 Types of study

5.3 Introducing clinical trials

5.4 Importance of design in the clinical trial

5.5 Control group

5.6 Assignment of animals to the treatment groups

5.7 Avoidance of bias in the assessment procedure

5.8 Increasing the precision of the estimates

5.9 Further considerations

6: An introduction to hypothesis testing

6.1 Learning objectives

6.2 Introduction

6.3 Basic concepts of hypothesis testing

6.4 Type I and Type II errors

6.5 Distinction between statistical and biological significance

6.6 Confidence interval approach to hypothesis testing

6.7 Collecting our thoughts on confidence intervals

6.8 Equivalence and non-inferiority studies

7: Hypothesis tests 1 – the t-test: comparing one or two means

7.1 Learning objectives

7.2 Requirements for hypothesis tests for comparing means

7.3 One-sample t-test

7.4 Two-sample t-test

7.5 Paired t-test

8: Hypothesis tests 2 – the F-test: comparing two variances or more than two means

8.1 Learning objectives

8.2 Introduction

8.3 The F-test for the equality of two variances

8.4 Levene’s test for the equality of two or more variances

8.5 Analysis of variance (ANOVA) for the equality of means

8.6 One-way analysis of variance

9: Hypothesis tests 3 – the Chi-squared test: comparing proportions

9.1 Learning objectives

9.2 Introduction

9.3 Testing a hypothesis about a single proportion

9.4 Comparing two proportions: independent groups

9.5 Testing associations in an r × c contingency table

9.6 Comparing two proportions: paired observations

9.7 Chi-squared goodness-of-fit test

10: Linear correlation and regression

10.1 Learning objectives

10.2 Introducing linear correlation and regression

10.3 Linear correlation

10.4 Simple (univariable) linear regression

10.5 Regression to the mean

11: Further regression analyses

11.1 Learning objectives

11.2 Introduction

11.3 Multiple (multivariable) linear regression

11.4 Multiple logistic regression: a binary response variable

11.5 Poisson regression

11.6 Regression methods for clustered data

12: Non-parametric statistical methods

12.1 Learning objectives

12.2 Parametric and non-parametric tests

12.3 Sign test

12.4 Wilcoxon signed rank test

12.5 Wilcoxon rank sum test

12.6 Non-parametric analyses of variance

12.7 Spearman’s rank correlation coefficient

13: Further aspects of design and analysis

13.1 Learning objectives

13.2 Transformations

13.3 Sample size

13.4 Sequential and interim analysis

13.5 Meta-analysis

13.6 Methods of sampling

14: Additional techniques

14.1 Learning objectives

14.2 Diagnostic tests

14.3 Bayesian analysis

14.4 Measuring agreement

14.5 Measurements at successive points in time

14.6 Survival analysis

14.7 Multivariate analysis

15: Some specialized issues and procedures

15.1 Learning objectives

15.2 Introduction

15.3 Ethical and legal issues

15.4 Spatial statistics and geospatial information systems

15.5 Veterinary surveillance

15.6 Molecular and quantitative genetics

16: Evidence-based veterinary medicine

16.1 Learning objectives

16.2 Introduction

16.3 What is evidence-based veterinary medicine?

16.4 Why has evidence-based veterinary medicine developed?

16.5 What is involved in practising evidence-based veterinary medicine?

16.6 Integrating evidence-based veterinary medicine into clinical practice

16.7 Example

17: Reporting guidelines

17.1 Learning objectives

17.2 Introduction to reporting guidelines (EQUATOR network)

17.3 REFLECT statement (livestock and food safety RCTs)

17.4 ARRIVE guidelines (research using laboratory animals)

17.5 STROBE statement (observational studies)

17.6 STARD statement (diagnostic accuracy)

17.7 PRISMA statement (systematic reviews and meta-analysis)

18: Critical appraisal of reported studies

18.1 Learning objectives

18.2 Introduction

18.3 A template for critical appraisal of published research involving animals

18.4 Paper 1

18.5 Critical appraisal of paper 1

18.6 Paper 2

18.7 Critical appraisal of paper 2

18.8 General conclusion

Solutions to exercises

Appendix A: Statistical tables

Acknowledgements

The Standard Normal distribution (two-tailed P-values from values of z, the SND)

The Standard Normal distribution (values of z, the SND, from P-values)

The t-distribution

The Chi-squared (χ²) distribution

The F-distribution

Pearson’s correlation coefficient (r)

Spearman’s rank correlation coefficient (r_s)

The sign test

The Wilcoxon signed rank test

The Wilcoxon rank sum test

The table of random numbers

Appendix B: Tables of confidence intervals

Appendix C: Glossary of notation

Mathematical symbols and transformations

Common notation

Abbreviations

Appendix D: Glossary of terms

Appendix E: Flowcharts for selection of appropriate tests

References

Supplemental Images

Index

Title page

Wiley-Blackwell is an imprint of John Wiley & Sons, formed by the merger of Wiley’s global Scientific, Technical and Medical business with Blackwell Publishing.

Registered office: John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial offices: 9600 Garsington Road, Oxford, OX4 2DQ, UK

The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

111 River Street, Hoboken, NJ 07030-5774, USA

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.

The right of the author to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting a specific method, diagnosis, or treatment by health science practitioners for any particular patient. The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. Readers should consult with a specialist where appropriate. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising herefrom.

Library of Congress Cataloging-in-Publication Data is available for this title.

A catalogue record for this book is available from the British Library.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Cover image: Horse illustration: all-silhouettes.com

Pig and dog illustration: Neubau Welt
Cover design by www.hisandhersdesign.co.uk

Preface to third edition

The continuing interest in our textbook together with the ongoing development of statistical applications in veterinary and animal science has encouraged us to prepare this third edition of Statistics for Veterinary and Animal Science. We have introduced some new material but we want to reassure all readers that our original intention of this being an introductory text still stands. Again, you will find everything that you need to begin to understand statistics and its application to your scientific and clinical endeavours; it remains an introduction for the novice with emphasis on understanding the application, rather than exhibiting mathematical competence in the calculations. Readily available statistical software packages, which provide the mechanics of the calculations, have become more extensive in the range of procedures they offer. Accordingly, we have augmented our text, within the bounds of an introductory exposition, to match their development.

As in previous editions, we use two commonly employed statistical software packages, SPSS and Stata, to analyse the data in our examples. We believe that by presenting you with different forms of computer output, you will have the confidence and proficiency to interpret output from other statistical packages. The previous edition of the book had an accompanying CD which contained the data sets (in ASCII, Excel, SPSS and Stata) used as examples in the text. These data sets are now available at www.wiley.com/go/petrie/statisticsforvets, and will be helpful if you wish to get to grips with various statistical techniques by attempting the analyses yourselves. You will find a website icon next to the examples for which the data are available on the website. Please note that, although we have provided details of a considerable number of websites that you may find useful, we cannot guarantee that these website addresses will remain correct over the course of time because of the mutability of the internet.

Some sections of the book are, as in previous editions, in small print and are accompanied by a jumping horse symbol. These sections contain information that relates to more advanced or obscure topics, and you may skip (jump over) them without loss of continuity. Our teaching experience has demonstrated that one of the hardest tasks for the novice when analysing his or her own data set is deciding which test or procedure is most appropriate. To overcome this difficulty, we provide two flow charts (Figure E.2 for binary data and Figure E.3 for numerical data) which lead you through the various questions that need to be asked to aid that decision. Another flow chart (Figure E.1) organizes the tests and procedures into relevant groups and indicates the particular section of the book where each is located: you can find these flow charts in the Appendix as well as on the inside back/front covers for easy reference.

Many of the chapters in this third edition are similar to those in the second edition, apart from some minor modifications and additional exercises. However, Chapter 5 has been extended to include techniques for recognizing and dealing with confounding, and this chapter now provides a description of the different types of missing data that might be encountered. We have added a section on checking the assumptions underlying a logistic regression model to Chapter 11, and have included modifications of the sample size estimation process to take account of different group sizes and losses to follow-up in Chapter 13. Chapter 14 has been expanded considerably by extending the sections on diagnostic tests, measuring agreement and survival analysis as well as Bayesian analysis. Chapter 15 is entirely new, bringing together a group of specialist topics – ethical issues of animal investigation (some of which was in Chapter 5 of the second edition), spatial statistics, surveillance and its importance, and statistics in molecular and quantitative genetics. While none of these is intended as more than an introduction, you will find references to help you explore the topics more fully should you so desire. The section on evidence-based veterinary medicine (EBVM) in Chapter 16 is unchanged from that in the second edition’s Chapter 15, but in the third edition this chapter no longer provides guidelines for reporting results. Instead, we have devoted the new Chapter 17 to this topic by presenting different published guidelines relevant to veterinary medicine (i.e. for reporting of livestock trials, research using laboratory animals, diagnostic accuracy studies, observational studies in epidemiology, and systematic reviews and meta-analyses) as a ready reference for those wanting to follow best practice both in planning and in writing up their research. 
Lastly, in Chapter 18, which is entirely new, we bring together the concepts of EBVM and the guidelines provided in Chapter 17 by proffering a template for the critical appraisal of randomized controlled trials and observational studies. We use this template to critically appraise two published papers, both of which are reproduced in full, and hope that by providing these examples, we will help you develop your own skills in what is an essential, but frequently overlooked, component of statistics.

We are indebted as always to those who, for earlier editions of this book, have offered their data to us to use for examples or exercises, have assisted with the presentation of the illustrations and tables, and have provided critical advice on the text. These colleagues are all identified in the prefaces to the first and second editions. As in earlier editions, we have occasionally taken summary data or abstracts from published papers and have used them to develop exercises or to illustrate techniques: we extend our thanks to the authors and the publishers for the use of this material. For this third edition, we are most grateful to Dr Geoff Pollott and Professor Dirk Pfeiffer (both of the Royal Veterinary College, University of London) for their critical reading and suggestions for sections of Chapter 15. We wish to record our particular thanks to Professor Garry Anderson (University of Melbourne) for his critique of much of the new text. His suggestions have drawn our attention to errors and have considerably improved the presentation. Nonetheless, we remain responsible for all contained herein, and offer it, with all its shortcomings, to our readership.

This preface would not be complete without acknowledging our marriage partners, Gerald and Rosie, and our children, Nina, Andrew and Karen, and Oliver and Anna, who have allowed us once again to engage with this task to their inevitable exclusion, and offer them our most grateful thanks.

Aviva Petrie

Paul Watson

2013

Preface to second edition

It is six years since this book was first available, and we are glad to acknowledge the positive responses we have received to the first edition and the evident uptake of the text for a number of courses around the world. In the intervening period much has happened to encourage us to update and expand our initial text. However, many of the chapters which were in the first edition of the book are changed only slightly, if at all, in this second edition. To these chapters, we have added some exercises and further explanations (for example, on equivalence studies, confounding, interactions and bias, Bayesian analysis and Cox survival analysis) to make the book more comprehensive. We have nevertheless retained our original intent of this being an introductory text starting with very basic concepts for the complete novice in statistics. You will still find sections marked for skipping unless you have a particular need to explore them, and these include the newer more complex analysis methods. This edition also contains the glossaries of notation and of terms, but we have expanded them to reflect the enhanced content of the text. For easy reference, the flow charts for choosing the correct statistical analyses in different situations are now found immediately before the index, and we hope these will serve to guide you to the appropriate procedures and text relating to their use.

Computer software to deal with increasingly sophisticated analytical tools has been developed in recent years in such a way that the associated methodology is more readily accessible to those who previously believed such techniques were out of their reach. As a consequence, we have substantially enhanced the material relating to regression analysis and created a new chapter (Chapter 11) to describe some advanced regression techniques. The latter incorporates the sections on multiple regression and an expanded section on logistic regression from Chapter 10 of the first edition, and introduces Poisson regression, different regression methods which can be used to analyse clustered data, maximum likelihood estimation and the concept of the generalized linear model. Because we have inserted this new Chapter 11, the numbering of the chapters which follow does not accord with that of the corresponding chapters in the first edition.

Chapter 15 is an entirely new chapter which is devoted in large part to introducing the concepts of evidence-based veterinary medicine (EBVM), stressing the role of statistical knowledge as a basis for its practice. The methodology of EBVM describes the processes for integrating, in a systematic way, the results of scientifically conducted studies into day-to-day clinical practice with the aim of improving clinical outcome. This requires the practitioner to develop the skills to evaluate critically the efforts of others in respect of the design of studies, and of the presentation, analysis and interpretation of results. The recognition of the value of the evidence-based approach to veterinary medicine has followed a similar emphasis in human clinical medicine, and is influencing the whole veterinary profession. Accordingly, it is also very much a part of the mainstream veterinary curriculum. Whether you are a practitioner of veterinary medicine or of one of the allied sciences, you will now more than ever need to be conversant with modern biostatistical analysis. Knowing how best to report your own results is also vital if you are to impart knowledge correctly, and so, to this end, we include in Chapter 15 a section on the CONSORT Statement, designed to standardize clinical trial reporting.

Although we refer only to two common statistical packages in the text, SPSS and Stata, sufficient information is given to interpret output from other packages, even though the layout and content may differ to some degree. We have also mentioned a number of websites containing useful information; their addresses were correct at the time of printing. Given the mutability of the internet, we cannot guarantee that such sites will stay available.

Also included with this edition is a CD containing the data sets used as examples in the text. You can use these data sets to consolidate the learning process. It is only when you attempt the analyses yourself that you are fully able to get to grips with the techniques. Each data set is presented in four different formats (ASCII, Excel, SPSS and Stata), so you should be able to access the data and use the software that is available to you.

We would like to acknowledge the generosity of the late Dr Penny Barber, Mark Corbett, Dr J. E. Edwards, Professor Jonathan Elliott, Professor Gary England, Dr Oliver Garden, Dr Ilke Klaas, Dr Teresa Martinez, Dr Anne Pearson, Dr P. D. Warriss, Professor Avril Waterman-Pearson and Dr Susannah Williams who shared their original data with us, and to others who have allowed us to use their published data. In places, we have taken published summary data and constructed a primary data set to suit our own purposes; if we have misrepresented our colleagues’ data, we accept full responsibility. We are particularly grateful to Alex Hunte who lent us his skills in refining the illustrations in the first edition, and to Dr David Moles who assisted with the preparation of the statistical tables. We especially thank Dr Ben Armstrong, Professor Caroline Sabin and Dr Ian Martin who kindly gave us their critical advice as the text of the first edition was developed, and Professor John Smith who was instrumental in getting us to consider writing the book in the first place. In addition, we acknowledge our debt to a host of other colleagues who have helped with discussions over the telephone, with their expertise in areas we are lacking, and in their encouragement to complete what we hope will be a useful contribution to the field of veterinary and animal science. We are particularly indebted to those of our colleagues who have graciously pointed us to our errors, which we hope are now corrected.

Lastly, we again acknowledge with gratitude the patience and encouragement of our marriage partners, Gerald and Rosie, and our children, Nina, Andrew and Karen, and Oliver and Anna, who have once more graciously allowed us to become absorbed in the book and have had to suffer neglect in the process. We trust that they still appreciate the worthiness of the cause!

Aviva Petrie

Paul Watson

Preface to first edition

Although statistics is anathema to many, it is, unquestionably, an essential tool for those involved in animal health and veterinary science. It is imperative that practitioners and research workers alike keep abreast of reports on animal production, new and emerging diseases, risk factors for disease and the efficacy of the ever-increasing number of innovations in veterinary care and of developments in training methods and performance. The most cogent information is usually contained in the appropriate journals; however, the usefulness of these journals relies on the reader having a proper understanding of the statistical methodology underlying study design and data analysis. The modern animal scientist and veterinary surgeon therefore need to be able to handle numerical data confidently and properly. Often, for us, as teachers, there is little time in busy curricula to introduce the subject slowly and systematically; students find they are left bewildered and dejected because the concepts seem too difficult to grasp. While there are many excellent introductory books on medical statistics and on statistics in other disciplines such as economics, business studies and engineering, these books are unrelated to the world of animal science and health, and students soon lose heart. It is our intention to provide a guide to statistics relevant to the study of animal health and disease. In order to illustrate the principles and methods, the reader will find that the text is well endowed with real examples drawn from companion and agricultural animals. Although veterinary epidemiology is closely allied to statistics, we have concentrated only on statistical issues as we feel that this is an area which, until now, has been neglected in veterinary and animal health sciences.

Our book is an introductory text on statistics. We start from very simple concepts, assuming no previous knowledge of statistics, and endeavour to build up an understanding in such a way that progression on to advanced texts is possible. We intend the book to be useful for those without mathematical expertise but with the ability to utilize simple formulae. We recognize the influence of the computer and so we avoid the description of complex hand calculations. Instead, emphasis is placed on understanding of concepts and interpretation of results, often in the context of computer output. In addition to acquiring an ability to perform simple statistical techniques on original data, the reader will be able critically to evaluate the efforts of others in respect of the design of studies, and of the presentation, analysis and interpretation of results. The book can be used either as a self-instructional text or as a basis for courses in statistics. In addition, those who are further on in their studies will be able to use the text as a reference guide to the analysis of their data, whether they be postgraduate students, veterinary practitioners or animal scientists in various other settings. Every section contains sufficient cross referencing for the reader to find the relevant background to the topic.

We would like to acknowledge the generosity of Penny Barber, Mark Corbett, Dr J. E. Edwards, Dr Jonathan Elliott, Dr Gary England, Dr Oliver Garden, Dr Anne Pearson, Dr P. D. Warriss, Professor Avril Waterman-Pearson and Susannah Williams, who shared their original data with us. In places, we have taken published summary data and constructed a primary data set to suit our own purposes; if we have misrepresented our colleagues’ data, we accept full responsibility. We are particularly grateful to Alex Hunte who lent us his skills in preparing the illustrations, and to Dr David Moles who assisted with the preparation of the statistical tables. We especially thank Dr Ben Armstrong, Dr Caroline Sabin and Dr Ian Martin who kindly gave us their critical advice as the text was developed. Professor John Smith was instrumental in getting us to consider writing the text in the first place, and we thank him for his continual encouragement. In addition, we acknowledge our debt to a host of other colleagues who have helped with discussions over the telephone, with their expertise in areas we are lacking, and in general encouragement to complete what we hope will be a useful contribution to the field of veterinary and animal science.

Lastly, we acknowledge with gratitude the patience and encouragement of our families. Our marriage partners, Gerald and Rosie, have endured with fortitude our neglect of them while this work was in preparation. In particular, our children, Nina, Andrew and Karen, and Oliver and Anna, have had to cope with our absorption with the project and lack of involvement in their activities. We trust they will recognize that it was in a good cause.

Aviva Petrie

Paul Watson

About the companion website

This book is accompanied by a companion website:

www.wiley.com/go/petrie/statisticsforvets

The website includes:

Data files which relate to some of the examples in the text. Each data file is provided for download in four different formats: ASCII, Excel, SPSS and Stata.
Examples relating to the data files are indicated in the text by a website icon.

1

The whys and wherefores of statistics

1.1 Learning objectives

By the end of this chapter, you should be able to:

State what is meant by the term ‘statistics’.
Explain the importance of a statistical understanding to the animal scientist.
Distinguish between a qualitative/categorical and a quantitative/numerical variable.
List the types of scales on which variables are measured.
Explain what is meant by the term ‘biological variation’.
Define the terms ‘systematic error’ and ‘random error’, and give examples of circumstances in which they may occur.
Distinguish between precision and accuracy.
Define the terms ‘population’ and ‘sample’, and provide examples of real (finite) and hypothetical (infinite) populations.
Summarize the differences between descriptive and inferential statistics.

1.2 Aims of the book

1.2.1 What will you get from this book?

All the biological sciences have moved on from simple qualitative description to concepts founded on numerical measurements and counts. The proper handling of these values, leading to a correct understanding of the phenomena, is encompassed by statistics. This book will help you appreciate how the theory of statistics can be useful to you in veterinary and animal science. Statistical techniques are an essential part of communicating information about the health and disease of animals, and about their agricultural productivity or their value as pets or in the sporting or working environment. We, the authors, aim to introduce you to the subject of statistics, giving you a sound basis for managing straightforward study design and analysis. Where necessary, we recommend that you extend your knowledge by reference to more specialized texts. Occasionally, we advocate that you seek expert statistical advice to guide you through particularly tricky aspects.

You can use this book in two ways:

1. The chapter sequence is designed to develop your understanding systematically and we therefore recommend that, initially, you work through the chapters in order. You will find certain sections marked in small type with a jumping horse symbol, which indicates that you can skip these, at a first read through, without subsequent loss of continuity. These marked sections contain information you will find useful as your knowledge develops. Chapters 11, 14 and 15 deal with particular types of analyses which, depending on your areas of interest, you may rarely need.

2. When you are more familiar with the concepts, you can use the book as a reference manual; you will find sufficient cross-referenced information in any section to answer specific queries.

1.2.2 What are learning objectives?

Each chapter has a set of learning objectives at the beginning. These set out in task-oriented terms what you should be able to ‘do’ when you have mastered the concepts in the chapter. You can therefore test your growing understanding; if you are able to perform the tasks in the learning objectives, you have understood the concepts.

1.2.3 Should you use a computer statistics package?

We encourage you to use available computer statistics packages, and therefore we do not dwell on the development of the equations on which the analyses are based. We do, however, present the equations (apart from when they are very complex) for completeness, but you will normally not need to become familiar with them since computer packages will provide an automatic solution. We provide computer output, produced when we analyse the data in the examples, from two statistical packages, mostly from SPSS (IBM SPSS Version 20 (www-01.ibm.com/software/analytics/spss, accessed 9 October 2012)) and occasionally from Stata (Stata 12, StataCorp, 2011, Stata Statistical Software: Release 12. College Station, TX: StataCorp LP (www.stata.com/products, accessed 9 October 2012)). Although the layout of the output is particular to each individual package, from our description you should be able to make sense of the output from any other major statistical package.

1.2.4 Will you be able to decide when and how to use a particular procedure?

Our main concern is with the understanding that underlies statistical analyses. This understanding will help you avoid the pitfalls of misuse that surround the unwitting user of statistical packages. We present the subject in a form that we hope is accessible, using examples showing the application of the subject to veterinary and animal science. A brief set of exercises is provided at the end of each chapter, based on the ideas presented within. These exercises should be used to check your understanding of the concepts and procedures; solutions to the exercises are given at the back of the book. The two exceptions are Chapter 17, which provides reporting guidelines, and Chapter 18, in which we ask you to critically appraise two published articles, preferably before looking at the ‘model answers’ provided in the chapter.

1.2.5 Use of the glossaries of notation and terms

Statistical nomenclature is often difficult to remember. We have gathered the most common symbols and equations used throughout this book into a Glossary of notation in Appendix C. This gives you a readily accessible reminder of the meaning of the terminology.

You will find a Glossary of terms in Appendix D. In this glossary, we define common statistical terms which are used in this book. They are also defined at the appropriate places in relevant chapters, but the glossary provides you with a ready reference if you forget the meaning of a term. Terms that are in the glossary are introduced in the text in bold type. Note, however, that there are some instances where bold is purely used for extra emphasis.

1.3 What is statistics?

The number of introductory or elementary texts on the subject of statistics indicates how important the subject has become for everyone in the biological sciences. However, the fact that there are many texts might also suggest that we have yet to discover a foolproof method of presenting what is required.

The problem confronted in biological statistics is as follows. When you make a set of numerical observations in biology, you will usually find that the values are scattered. You need to know whether the values differ because of factors you are interested in (e.g. treatments) or because they are part of a ‘background’ natural variation. You need to evaluate what the numbers actually mean, and to represent them in a way that readily communicates their meaning to others.

The subject of statistics embraces:

The design of the study in order that it will reveal the most information efficiently.
The collection of the data.
The analysis of the data.
The presentation of suitably summarized information, often in a graphical or tabular form.
The interpretation of the analyses in a manner that communicates the findings accurately.

Strictly, this broad numerical approach to biology is termed ‘biometry’, but we shall adopt the more generally used term ‘statistics’ to cover all aspects. Statistics (meaning this entire process) has become one of the essential tools in modern biology.

1.4 Statistics in veterinary and animal science

One of the common initial responses of both veterinary students and animal science students is: Why do I need to study statistics? The mathematical basis of the subject causes much uncertainty, and the analytical approach is alien. However, in professional life, there are many instances of the relevance of statistics:

The published scientific literature is full of studies in which statistical procedures are employed. Look in any of the relevant scientific journals and notice the number of times reference is made to mean ± SEM (standard error of the mean), to statistical significance, to P-values or to t-tests or Chi-squared analysis or analysis of variance or multiple regression analysis. The information is presented in the usual brief form and, without a working knowledge of statistics, you are left to accept the conclusions of the author, unable to examine the strength of the supporting data. Indeed, with the advent of computer-assisted data handling, many practitioners can now collect their own observations and summarize them for the advantage of their colleagues; to do this, they need the benefit of statistical insights.
The subject of epidemiology (see Section 5.2) is gaining prominence in veterinary and animal science, and the concepts of evidence-based veterinary medicine (see Section 1.5 and Chapter 16) are being explicitly introduced into clinical practice. As never before, there is an essential need for you to understand the types of trials and investigations that are carried out and to know the meaning of the terms associated with them.
In the animal health sciences, there is an increasing number of independent diagnostic services that will analyse samples for the benefit of health monitoring and maintenance. Those running such laboratory services must always be concerned about quality control and accuracy in measurements made for diagnostic purposes, and must be able to supply clear guidelines for the interpretation of results obtained in their laboratories.
The pharmaceutical and agrochemical industries are required to demonstrate both the safety and the efficacy of their products in an indisputable manner. Such data invariably require a statistical approach to establish and illustrate the basis of the claim for both these aspects. Those involved in pharmaceutical product development need to understand the importance of study design and to ensure the adequacy of the numbers of animals used in treatment groups in order to perform meaningful experiments. Veterinary product licensing committees require a thorough understanding of statistical science so that they can appreciate the data presented to substantiate the claims for a novel therapeutic substance. Finally, practitioners and animal carers are faced with the blandishments of sales representatives with competing claims, and must evaluate the literature which is offered in support of specific agents, from licensed drugs to animal nutrition supplements.
Increasingly, there is concern about the regulation of safety and quality of food for human consumption. Where products of animal origin are involved, the animal scientist and the veterinary profession are at the forefront. Examples are: pharmaceutical product withdrawal times before slaughter based on the pharmacokinetics and pharmacodynamics of the products, the withholding times for milk after therapeutic treatment of the animal, tissue residues of herbicides and insecticides, and the possible contamination of carcasses by antibiotic-resistant bacteria. In every case, advice and appropriate regulations are established by experimental studies and statistical evaluation. The experts need to be aware of the appropriate statistical procedures in order to play their proper roles.

In all these areas, a common basic vocabulary and understanding of biometrical concepts is assumed to enable scientists to communicate accurately with one another. It is important that you gain mastery of these concepts if you are to play a full part in your chosen profession.

1.5 Evidence-based veterinary medicine

The veterinary profession is following the medical profession in introducing a more objective basis to its practice. Under the term evidence-based veterinary medicine (EBVM) – by which we mean the conscientious, explicit and judicious use of current best evidence to inform clinical judgements and decision-making in veterinary care (see Cockcroft and Holmes, 2003) – we are now seeing a move towards dependence upon good scientific studies to underpin clinical decisions. In many ways, practice has implicitly been about using clinical experience to make the best decisions, but what has changed is the explicit use of the accessible information. No longer do clinicians have to depend on their own clinical experience and judgement alone; now they can benefit from other studies in a formalized manner to assist their work. The clinician has to know what information is relevant and how to access this evidence, and be able to use rigorous methods to assess it. Generally, this requires a familiarity with the terminology used and an understanding of the principles of statistical analysis. Moreover, the wider world of animal science is finding a need to understand these ideas as the evidence-based concepts are being applied not only in the treatment of clinical disease but also in aspects of production and performance.

One of the differences between the application of EBVM in veterinary science and in human medicine is that in the latter the body of literature is now very large, and this makes finding relevant information easier. In the veterinary field, EBVM is still hampered by the relatively small amount and variable quality of the evidence available. Nevertheless, EBVM is gaining momentum, and we have devoted Chapter 16 to its concepts. One of the key requirements of EBVM is reliably reported information and, as in the human medical field, the veterinary publishing field is in the process of consolidating a set of guidelines for good reporting. We have addressed this in Chapter 17, outlining the information that is available at the time of writing. As critical appraisal of the published literature is invariably an essential component of evaluating evidence, we have devoted Chapter 18 to it. In this chapter, we provide templates for critically appraising randomized controlled trials and observational studies, and invite you to develop your skills by critically appraising two published articles.

1.6 Types of variable

A variable is a characteristic that can take values which vary from individual to individual or group to group, e.g. height, weight, litter size, blood count, enzyme activity, coat colour, percentage of the flock which are pregnant, etc. Clearly some of these are more readily quantifiable than others. For some variables, we can assign a number to a category and so create the appearance of a numerical scale, but others have a true numerical scale on which the values lie. We take readings of the variable, which are measurements of a biological characteristic, and these become the values which we use for the statistical procedures. Both terms, ‘readings’ and ‘values’, are in general use, and both refer to the original measurements, the raw data.

Numerical data take various forms; a proper understanding of the nature of the data and the classification of variables is an important first step in choosing an appropriate statistical approach. The flow charts shown in Appendix E, and on the inside front and back covers, illustrate this train of thought, which culminates in a suitable choice of statistical procedure to analyse a particular data set.

We distinguish the main types of variable in a systematic manner by determining whether the variable can take ‘one of two distinct values’, ‘one of several distinct values’ or ‘any value’ within the given range. In particular, the variable may be one of the following:

1. Categorical (qualitative) variable – an individual belongs to any one of two or more distinct categories for this variable. A binary or dichotomous variable is a particular type of categorical variable defined by only two categories; for example, pregnant or non-pregnant, male or female. We customarily summarize the information for the categorical variable by determining the number and percentage (or proportion) of individuals in each category in the sample or population. Particular scales of a categorical variable are:
Nominal scale – the distinct categories that define the variable are unordered and each can be assigned a name, e.g. coat colours (piebald, roan or grey).
Ordinal scale – the categories that constitute the variable have some intrinsic order; for example, body condition scores, subjective intensity of fluorescence of cells under the fluorescence microscope, degree of vigour of motility of a semen sample. These ‘scales’ are often given numerical values 1 to n.

2. Numerical (quantitative) variable – consisting of numerical values on a well-defined scale, which may be:
Discrete (discontinuous) scale, i.e. data can take only particular integer values, typically counts, e.g. litter size, clutch size, parity (number of pregnancies within an animal).
Continuous scale, for which all values are theoretically possible (perhaps limited by an upper and/or lower boundary), e.g. height, weight, speed, concentration of a chemical constituent of the blood or urine. Theoretically, the number of values that the continuous variable can take is infinite since the scale is a continuum. In practice, continuous data are restricted by the degree of accuracy of the measurement process. By definition, the interval between two adjacent points on the scale is of the same magnitude as the interval between two other adjacent points, e.g. the interval on a temperature scale between 37°C and 38°C is the same as the interval between 39°C and 40°C.
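
The classification above can be illustrated with a short sketch in Python. It is only an illustration under stated assumptions: the six animal records, the variable names and the helper function summarize_categorical are all hypothetical and invented for this example; they are not taken from any data set in the book. The sketch labels one variable of each type and then summarizes the nominal variable in the customary way, by the number and percentage of individuals in each category.

```python
from collections import Counter

# Hypothetical records for six animals; every value is invented for illustration.
animals = [
    {"coat_colour": "roan",    "body_condition": 3, "litter_size": 4, "weight_kg": 21.4},
    {"coat_colour": "grey",    "body_condition": 2, "litter_size": 6, "weight_kg": 19.8},
    {"coat_colour": "roan",    "body_condition": 4, "litter_size": 5, "weight_kg": 23.1},
    {"coat_colour": "piebald", "body_condition": 3, "litter_size": 3, "weight_kg": 20.5},
    {"coat_colour": "grey",    "body_condition": 5, "litter_size": 4, "weight_kg": 24.9},
    {"coat_colour": "roan",    "body_condition": 1, "litter_size": 7, "weight_kg": 18.2},
]

# One example of each type of variable described in the text.
# Body condition scores are stored as integers, but they remain
# categorical (ordinal): the numbers only label ordered categories.
variable_types = {
    "coat_colour": "categorical, nominal",
    "body_condition": "categorical, ordinal",
    "litter_size": "numerical, discrete",
    "weight_kg": "numerical, continuous",
}


def summarize_categorical(values):
    """Return {category: (count, percentage)} for a categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: (n, 100.0 * n / total) for category, n in counts.items()}


coat_summary = summarize_categorical(a["coat_colour"] for a in animals)

for category, (n, pct) in sorted(coat_summary.items()):
    print(f"{category}: {n} ({pct:.1f}%)")
```

The same function could be applied to the ordinal body condition scores, since they too are categories. For the numerical variables (litter size and weight), counts per category are usually less informative than the numerical summary measures described in Chapter 2.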