#### Essay > Words: 3025 > Rating: Excellent > Buy full access at $1

**EXECUTIVE SUMMARY**:

Statistics is a major mathematical instrument used for the quantitative analysis of numerical data, and it plays an integral role in extracting important information from a specific data set. In this project, we study two distinct research methods: empirical and theoretical data analysis. Empirical research often produces more than one outcome from the same data because it relies on replicated measurements, which are prone to error. Simple statistical data analysis can be used to summarize the observations by estimating the average value, also referred to as the true mean. Another important statistical analysis is the determination of the variance, which quantifies the uncertainty in a measured variable (Peters, 1). Statistical data analysis is also crucial for propagating measurement error through a mathematical model in order to estimate the error in a derived quantity. These methods are only a few of the many ways of analysing an empirical or theoretical data set.
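As a brief sketch of the quantities named above, the sample mean and sample variance of $n$ replicate measurements $x_1, \dots, x_n$ are commonly written as follows. The symbols $\bar{x}$, $s^2$, $f$ and $\sigma$ are notation assumed here for illustration; they do not come from the project workbook.

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2
$$

For a derived quantity $f(x_1, \dots, x_k)$ computed from measured variables with standard deviations $\sigma_{x_1}, \dots, \sigma_{x_k}$, a first-order approximation of the propagated error, assuming the measurement errors are independent, is:

$$
\sigma_f^2 \approx \sum_{j=1}^{k} \left(\frac{\partial f}{\partial x_j}\right)^2 \sigma_{x_j}^2
$$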

**INTRODUCTION**

Scientific research spans a wide range of interrelated disciplines. Within each discipline, a researcher can choose among several different methods of conducting research. Simple data analysis is divided into two broad categories, namely theoretical and empirical research (SAS, 2003). Theoretical research uses mathematics and logic to prove that certain propositions are true. Empirical research draws its conclusions from observations. A good example comes from biology: after comparing the genetic strands obtained from two species, one might conclude that the species are likely to share an ancestry.

Testing the null hypothesis is one common language that social science researchers use (SAS, 2003). Theoretical data analysis assumes that the data are drawn from a normally distributed population, although experiments show that this assumption of normality does not always hold (SAS, 2003). If the investigated data are normally distributed, however, theoretical data analysis becomes a powerful research tool because it is better able to detect significant effects in the data (SAS, 2003). Empirical research, on the other hand, uses information about the relative size of the observations without making any assumptions about the mean and variance of the population under survey, and can therefore be used to analyse any type of dataset (SAS, 2003).
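One common way to work with the "relative size" of observations is to replace each value with its rank. The sketch below is an illustration only, not the procedure used in the project; the function name `ranks` and the sample values are hypothetical. It shows that ranks do not change when an extreme value becomes even more extreme, which is why rank-based methods need no assumption about the population mean or variance.

```python
def ranks(values):
    """Return the rank of each value (1 = smallest); ties share the average rank."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Positions i..j (0-based) correspond to ranks i+1..j+1; take their average.
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


# Hypothetical scores: the last value is moderate in one list and extreme in the other.
moderate = [10, 20, 20, 30]
extreme = [10, 20, 20, 1000]
print(ranks(moderate))  # [1.0, 2.5, 2.5, 4.0]
print(ranks(extreme))   # [1.0, 2.5, 2.5, 4.0]
```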

Although scientific methods are diverse, many of the data analysis methods used by researchers share common characteristics. Regardless of the field of study, most research involves a researcher gathering data and analysing it in order to determine what the data mean (SAS, 2003). Additionally, researchers in social sciences such as sociology and psychology use one common language for conducting and reporting their research.

**PROJECT WORK**

**Definition**

Any discussion of how a simple and comprehensive data analysis and comparison works relies on several concepts that carry specific meanings. The concepts that were important in the project are defined first.

**Error**

In statistics, error is the random and unpredictable deviation between replicate data, quantified by the standard deviation (Peters, 6). Error in data analysis can also take the form of a regular, predictable deviation from the true value, quantified as a mean difference: that is, the difference between the mean of the replicate determinations and the true value (Peters, 6). Analytical error can also arise from a constant contribution that is unrelated to the data being analysed.
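The two kinds of error described above are often summarized with a simple measurement model. The following is a sketch under assumed notation ($\tau$ for the true value, $\beta$ for the systematic deviation, $\varepsilon_i$ for the random deviation of the $i$-th replicate); these symbols are not taken from the cited sources.

$$
x_i = \tau + \beta + \varepsilon_i, \qquad \mathrm{E}[\varepsilon_i] = 0, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2
$$

Under this model, the standard deviation of the replicates estimates $\sigma$ (the random error), while the difference between the mean of the replicates and the true value, $\bar{x} - \tau$, estimates $\beta$ (the systematic error).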

**Accuracy**

Accuracy is defined as the closeness of an analytical result to the true value. Accuracy reflects a combination of systematic and random errors and therefore cannot be quantified directly (Novikova, 1). The test result may be the mean of several values, so an accurate determination yields a value that is both precise and quantifiable (Novikova, 1).

**Precision**

Precision means that the results of replicate analyses of a given sample are close to, and tend to agree with, each other (SAS, 2003). Precision is a measure of dispersion around the mean value and is usually expressed as a standard deviation or a range. The range is the difference between the lowest value and the highest value within a given data set (SAS, 2003).
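The two precision measures named above can be computed in a few lines. This is a minimal sketch using Python's standard `statistics` module; the function name `precision_summary` and the replicate values are hypothetical and are not taken from the project data.

```python
import statistics


def precision_summary(replicates):
    """Return the sample standard deviation and the range of replicate results."""
    return {
        # Sample standard deviation (n - 1 in the denominator).
        "stdev": statistics.stdev(replicates),
        # Range: highest value minus lowest value.
        "range": max(replicates) - min(replicates),
    }


# Hypothetical replicate measurements of one sample.
replicates = [10.1, 10.3, 9.9, 10.2, 10.0]
summary = precision_summary(replicates)
print(round(summary["stdev"], 4), round(summary["range"], 4))
```

A smaller standard deviation or range indicates that the replicates agree more closely, that is, higher precision.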

**Bias**

Bias is the most commonly used measure of trueness, although it is its opposite. Trueness is the agreement of the mean of the analytical results with the true value, after excluding the contribution of randomness as represented by precision (Peters, 2). Bias is therefore the consistent deviation of the results from the true value, brought about by systematic errors within a given procedure (Peters, 17). Several factors contribute to bias in analytical data:

- Method bias is the difference between the mean test results obtained from various data sets. The method bias depends on the level of analysis conducted.
- Sample bias is the difference between the mean value of the duplicate test results and the true value of the target dataset from which the sample data were taken.
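The idea of bias as a consistent deviation from the true value can be sketched as follows. The function name `bias` and all numbers are hypothetical, chosen only to show that a set of results can be precise (tightly clustered) and still biased.

```python
import statistics


def bias(replicates, true_value):
    """Bias: mean of the replicate results minus the true (reference) value."""
    return statistics.mean(replicates) - true_value


# Hypothetical replicates that agree closely with each other (precise)
# but sit consistently above the assumed true value of 10.0 (biased).
replicates = [10.4, 10.6, 10.5, 10.5]
true_value = 10.0

print(round(bias(replicates, true_value), 4))    # 0.5
print(round(statistics.stdev(replicates), 4))    # small spread despite the bias
```

Repeating the measurement more often would shrink the uncertainty of the mean but would not remove this offset, which is what distinguishes bias from random error.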

**Data collection**

Data collection is a very important part of any statistical research. Accurate data have a large impact on the results of a study and can ultimately determine the validity of those results. Data can be collected in a wide range of ways. The two types of data collection methods are quantitative and qualitative (SAS, 2003), and each is suited to a particular type of study.

**Tools of data analysis and representation**

**Measures of Central Tendency**

**The Mean**

The mean of each dataset is calculated by adding up all the scores in the dataset and then dividing the total by the number of participants in that condition. Suppose we are to compare the three datasets D1, D2 and D3 in the workbook, where Sheet 1 holds D1, Sheet 2 holds D2 and Sheet 3 holds D3. The means of D1, D2 and D3 are 287.5533, 252.6467 and 259.2 respectively. According to the spreadsheet, the means of D2 and D3 differ only slightly from each other, whereas D1, whose values are quite extreme, stands apart. This is illustrated in Figure 1 below.

The main advantage of the mean is that it takes into account all the scores in the dataset. This makes it an important measure of central tendency, especially if the scores resemble a normal distribution (Wike, 14). The normal distribution is bell-shaped, and in that case most scores are clustered closely around the mean.

On the other hand, the mean can be extremely misleading if the distribution of scores departs markedly from the normal, with one or more extreme scores in one direction (SAS, 2003).

**Figure 1**
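The calculation of the mean, and its sensitivity to extreme scores, can be sketched with a small example. The scores below are hypothetical and are not taken from D1, D2 or D3; the function name `mean` is chosen here for illustration.

```python
def mean(scores):
    """Add up all the scores and divide by how many there are."""
    return sum(scores) / len(scores)


# Hypothetical scores clustered around 255.
typical = [250, 252, 255, 258, 260]
# The same scores with one extreme value added.
with_extreme = typical + [600]

print(mean(typical))       # 255.0
print(mean(with_extreme))  # 312.5
```

A single extreme score moves the mean well above every typical score, which is the sense in which the mean can mislead when the distribution is far from normal.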

**The Median**

The median is used to describe the general performance level within each statistical condition (SAS, 2003). In other words, the median is the middle score once the scores are placed in order, with an equal number of scores on either side. If the number of scores is odd, the median is the single value that stands alone with equal numbers of values on either side. If the number of scores is even, the median is obtained by taking the mean of the two middle values (SAS, 2003). Each of the provided datasets D1, D2 and D3 comprises one hundred and fifty values, so the median falls between the seventy-fifth and the seventy-sixth values. The medians of D1, D2 and D3 are 287, 252 and 255.5 respectively...
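The odd and even cases described above can be sketched as a short function. The name `median` and the example lists are hypothetical; the last example uses the numbers 1 to 150 only to mirror the size of the project datasets, not their contents.

```python
def median(scores):
    """Return the middle score, or the mean of the two middle scores if the count is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: one value stands alone in the middle.
        return ordered[mid]
    # Even count: average the two middle values.
    # For n = 150 these are the 75th and 76th values (indices 74 and 75).
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))              # 2
print(median([4, 1, 3, 2]))           # 2.5
print(median(list(range(1, 151))))    # 75.5
```

Unlike the mean, the median depends only on the middle of the ordered scores, so a single extreme value at either end leaves it unchanged.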
