Unlike descriptive statistics, which focuses on describing the features of a given sample, inferential statistics aims at making a generalisation about the characteristics of a population after studying a sample picked from that population. The word inferential comes from the word infer, which means to draw a conclusion. Inferential statistics is inductive in nature, and it is widely used across disciplines because of the advantages of dealing with a sample rather than the whole population. A simple form of inferential statistics would be used when a researcher wants to compare the performance of two classes on a single measure to see whether they differ; here we could apply the t-test. Other inferential tests include the Analysis of Variance (ANOVA), regression analysis and the Analysis of Covariance (ANCOVA), among others.
A descriptive measure of any population under study is known as a parameter. However, these descriptive measures have unknown values, since it is rarely feasible to consider the entire population. Take a situation in which, to obtain the value of a given parameter, the researcher would have to destroy the population elements. In that case, we cannot study the entire population. Instead, the researcher picks a subset of the population and calculates parameter estimates, also known as sample statistics. The estimate represents the approximate numerical value of the parameter in question. An estimate is derived from an estimator: an estimator is the formula applied in calculating the estimate. Thus, in statistics, parameter estimation is simply the process of calculating the value of a given population parameter using some measured empirical data randomly drawn from the population (Asadoorian & Kantarelis, 2011). Take, for example, the population mean, denoted as μ. To approximate it, we pick a sample from the population and calculate its mean, denoted as x̄. The value of x̄ is the estimate of the population mean, and the process of obtaining x̄ is known as parameter estimation.
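The process above can be sketched in a few lines of Python. The sample weights below are hypothetical illustrative values, not data from the text; the estimator is the sample-mean formula, and applying it to the data yields the estimate x̄.

```python
import statistics

# Hypothetical sample of cow weights in kilograms, assumed to be
# drawn at random from a herd whose true mean weight mu is unknown.
sample = [118, 132, 127, 141, 125, 136, 129, 122, 138, 131]

# The estimator is the sample-mean formula; applying it to the
# measured data produces the estimate x-bar of the population mean.
x_bar = statistics.mean(sample)
print(x_bar)  # 129.9
```

Here 129.9 kg is the estimate of μ produced by the sample-mean estimator; a different random sample would yield a different estimate.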
Parameter estimates can be either point estimates or interval estimates. A point estimate is a specific numeric value obtained from the sample which represents the population parameter; for example, a sample mean of 10 is a point estimate. On the other hand, an interval estimate is a range of all possible values within which the unknown population parameter could lie with some defined level of accuracy or confidence. In that case, the interval estimate is referred to as a confidence interval (Siegmund, 2013). For instance, if we estimate the population mean and obtain a range of values, such as 110 ≤ μ ≤ 150, we can refer to this as an interval estimate or a confidence interval.
A point estimate is specific in nature. It is therefore taken to be 100 percent accurate. However, an interval estimate can only be accurate up to a certain level or percentage. For instance, if we say that the mean weight of particular cows is between 110 and 150 kilograms, we need to show our level of certainty or confidence. Thus we could say that we are 95 percent confident that, on average, the cows weigh between 110 and 150 kilograms. The confidence level can, therefore, be defined as the probability that the unknown population parameter lies within the particular interval. Therefore, if we are 95% confident that the mean weight is between 110 and 150 kilograms, then the likelihood that the actual average weight lies within this range is 0.95.
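A confidence interval of this kind can be computed directly. The sketch below assumes hypothetical values (a sample mean of 130 kg, a known population standard deviation of 40 kg and a sample of 16 cows) chosen so that the resulting interval is close to the 110-to-150 example in the text.

```python
import math
from statistics import NormalDist

# Hypothetical inputs: sample mean, known sigma, and sample size.
x_bar, sigma, n = 130.0, 40.0, 16
confidence = 0.95
alpha = 1 - confidence  # probability of error

# z value that leaves alpha/2 = 0.025 in each tail of the normal curve
z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
margin = z * sigma / math.sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 1), round(upper, 1))  # 110.4 149.6
```

The interval (110.4, 149.6) kg is read exactly as in the prose: we are 95% confident that the true mean weight lies within it.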
The level of significance is also known as the probability of error. Take the example of the mean weight of cows. We are 95% confident that the actual mean weight of the cows in question lies between 110 and 150 kilograms. This implies that there is a 5% chance that the actual average weight does not lie within the stated range or interval. Therefore, the likelihood that the specified range does not contain the unknown parameter is what we refer to as the level of significance, or merely the probability of error (Siegmund, 2013). Statisticians denote it as alpha (α), which in our case is 5%. Thus, we can conclude that the level of confidence is given by (1 − α).
A normal probability curve can represent the distinction between the probability of error and the confidence interval. We shall use the same example on the mean weight of cows.
The shaded region represents the probability that the unknown parameter lies within the stated range (110 to 150 kilograms), which is the confidence level, while the unshaded region is the chance that the actual weight lies outside the range. The actual weight could be below 110 or above 150 kilograms; thus, to capture both sides, we divide the level of significance by 2.
Hypothesis testing for a population mean when the standard deviation is known
A statistical hypothesis is an educated guess or a postulated value of a parameter that, on the basis of the observed data, can be tested and proven either right or wrong. The process of hypothesis testing takes several steps, as outlined below.
Step 1: State the hypotheses
H0: The hypothesised mean is equal to the actual mean
H1: The hypothesised mean is not equal to the actual mean
Step 2: Determine the distribution or the test statistic
Since the standard deviation is known, the sample mean follows a normal distribution and the applicable test statistic is the Z-test, given as Z = (x̄ − μ) / (σ / √n), where x̄ is the sample mean, μ the hypothesised population mean, σ the known population standard deviation and n the sample size.
Step 3: Determine the level of significance (α)
The level of significance can be obtained by subtracting the confidence level from 100 percent. In cases where it is not stated, a 5% level of significance is the most common choice.
Step 4: Obtain the critical value of Z
The critical value of Z is obtained from the Z-tables, which are normally provided.
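Where printed Z-tables are not at hand, the same critical value can be computed numerically. This short sketch assumes a two-tailed test at the 5% level of significance, so each tail holds α/2 = 0.025, exactly as described for the normal curve above.

```python
from statistics import NormalDist

# Two-tailed test at alpha = 0.05: each tail of the standard normal
# curve holds alpha/2 = 0.025, so we look up the 97.5th percentile.
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z_critical, 2))  # 1.96
```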
Step 5: Conclusion and decision making
If the absolute value of the calculated Z is greater than the critical Z, we reject the null hypothesis, and vice versa (Anderson & Finn, 2013). Take, for instance, a situation where the calculated value of Z is greater than the tabulated value of Z. In that case, the conclusion would be that, at the 5% level of significance, the hypothesised value of the population mean is not equal to the actual population mean.
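The five steps above can be collected into one function. The numbers used here are hypothetical: H0 claims a mean of 120 kg, while a sample of 25 cows averages 130 kg with a known σ of 20 kg.

```python
import math
from statistics import NormalDist

def z_test_mean(x_bar, mu_0, sigma, n, alpha=0.05):
    """Two-tailed Z-test for a mean with known sigma (steps 1-5 above)."""
    z_calc = (x_bar - mu_0) / (sigma / math.sqrt(n))  # step 2: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # step 4: critical value
    reject = abs(z_calc) > z_crit                     # step 5: decision rule
    return z_calc, z_crit, reject

# Hypothetical example: H0 says mu = 120 kg; the sample of 25 cows
# averages 130 kg and the known sigma is 20 kg.
z_calc, z_crit, reject = z_test_mean(130, 120, 20, 25)
print(round(z_calc, 2), round(z_crit, 2), reject)  # 2.5 1.96 True
```

Since the calculated Z (2.5) exceeds the critical Z (1.96), the null hypothesis is rejected at the 5% level of significance, matching the decision rule stated above.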
Type I and Type II errors
A Type I error occurs as a result of rejecting the null hypothesis when it is true, while a Type II error occurs as a result of failing to reject the null hypothesis when it is false (Cowan, 2011). These errors have various causes, such as the use of small samples, statistical errors and wrongly stated hypotheses, among others.
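The link between the Type I error rate and α can be illustrated with a small simulation. This is a sketch under assumed values: the null hypothesis is made true by construction (samples are drawn from a population whose mean really is 120 kg), so any rejection is a Type I error, and the long-run rejection rate should be close to α = 5%.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the simulation is repeatable

# H0 is true by construction: the population mean really is mu_0.
mu_0, sigma, n, alpha = 120.0, 20.0, 25, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

trials = 20_000
rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    z = (mean(sample) - mu_0) / (sigma / math.sqrt(n))
    if abs(z) > z_crit:  # every rejection here is a Type I error
        rejections += 1

print(rejections / trials)  # close to 0.05
```

The observed false-rejection rate hovers around 0.05, confirming that α is precisely the probability of committing a Type I error when the null hypothesis holds.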
Anderson, T. & Finn, J. (2013). The new statistical analysis of data. New York: Springer.
Asadoorian, M. & Kantarelis, D. (2011). Essentials of inferential statistics. Lanham: University Press of America.
Cowan, G. (2011). Statistical data analysis. Oxford: Clarendon Press.
Siegmund, D. (2013). Sequential analysis: Tests and confidence intervals. New York: Springer-Verlag.