INTRO TO RESEARCH METHOD

DATA ANALYSIS



¨ Statistics
is a set of procedures for describing, synthesizing, analyzing, and interpreting
quantitative data. One thousand scores, for example, can be represented with a
single number.

¨ Choice of appropriate statistical techniques is determined to a great extent by the
design of the study and by the kind of data to be collected.

¨ The choice of statistical techniques is largely
determined by the research hypothesis to be tested.

¨ A simple statistic is often more appropriate
than a more complicated one.



Types of Descriptive Statistics

The first step in data analysis is to describe, or summarize, the data using descriptive
statistics. Descriptive statistics permit the researcher to meaningfully describe a large number of scores with a small number of indices. If such indices are calculated for a sample drawn from a
population, the resulting values are referred to as statistics; if they are calculated for an entire population, they are referred to as parameters.

Graphing Data

The shape of the distribution may not be self-evident, especially if a large number of
scores are involved. The most common method of graphing research data is to construct a frequency polygon. The first step in constructing a frequency polygon is to list all the scores and tabulate how many subjects received each score. Once the scores are tallied, the steps are as follows: place all the scores on the horizontal axis, at equal intervals, from the lowest score to the highest; place the frequencies at equal intervals on the vertical axis,
starting with zero; for each score, find the point where the score intersects with its frequency of occurrence and make a dot; connect all the dots with straight lines.
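
The tallying step can be sketched in a few lines of Python. This is only an illustration: the scores are made-up data, and the text rendering stands in for a hand-drawn polygon. Each (score, frequency) pair is one dot of the polygon, listed from the lowest score to the highest.

```python
from collections import Counter

# Hypothetical test scores for 12 subjects (illustrative data only).
scores = [85, 90, 85, 70, 90, 85, 75, 80, 85, 90, 80, 75]

# Tally how many subjects received each score.
tally = Counter(scores)

# Points of the frequency polygon: (score, frequency) pairs ordered from
# the lowest score to the highest, as they would sit along the horizontal axis.
polygon_points = sorted(tally.items())

# Text rendering: one row per score, one asterisk per subject.
for score, frequency in polygon_points:
    print(f"{score:>3} | {'*' * frequency} ({frequency})")
```

Connecting the dots with straight lines is then a matter of plotting these pairs with any graphing tool.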


Measures of Central Tendency

Measures of central tendency give the researcher a convenient way of describing a set of data with a single number. The number resulting from computation of a measure of central
tendency represents the average or typical score attained by a group of subjects. Each index of central tendency is appropriate for a different scale of measurement: the mode is
appropriate for nominal data, the median for ordinal data, and the mean for interval or ratio data.

Ø The Mode

The mode is the score that is attained by more subjects than any other score. The mode is not established through calculation; it is determined by looking at a set of scores or at a graph
of scores and seeing which score occurs most frequently. There are several problems associated with the mode, and it is therefore of limited value and seldom used. For one thing, a set of scores may have two (or more) modes, in which case it is referred to as bimodal. Another
problem is that it is an unstable measure of central tendency; equal-sized samples randomly selected from the same accessible population are likely to have different modes. When nominal data are involved, however, the mode is the only appropriate measure of central tendency.

Ø The Median

The median is that point in a distribution above and below which are 50% of the scores; in
other words, the median is the midpoint. The median doesn’t take into account
each and every score; it ignores, for example, extremely high scores and extremely low scores.

Ø The Mean

The mean is the arithmetic average of the scores, and it is the most frequently used measure of central tendency. By the very nature of the way in which it is computed, the mean takes into account, or is based on, each and every score. It is appropriate when the data represent either an interval or ratio scale, and it is a more precise, stable index than both the median and the mode. In situations in which there are one or more extreme scores, however, the median will be the best index of typical performance.
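
A minimal sketch of the three indices, using Python's standard `statistics` module. The scores are made-up data, chosen so that one extreme low score is present; the point is to show how that one score pulls the mean while leaving the median where it was.

```python
import statistics

# Hypothetical scores (illustrative data only); 20 is an extreme low score.
scores = [20, 78, 80, 82, 82, 85, 88]

mode = statistics.mode(scores)      # score attained by the most subjects
median = statistics.median(scores)  # midpoint of the ordered scores
mean = statistics.mean(scores)      # arithmetic average of every score

print(f"mode={mode}, median={median}, mean={mean:.2f}")

# Dropping the extreme score raises the mean by several points,
# while the median is unchanged for this data set.
trimmed = scores[1:]
print(f"median={statistics.median(trimmed)}, mean={statistics.mean(trimmed):.2f}")
```

With the extreme score included, the mean falls well below most of the scores, which is why the median is the better index of typical performance in such a case.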


Measures of Variability

Two sets of data that are very different can have identical means or medians. Thus, there is a need for a measure that indicates how spread out the scores are, that is, how much variability there is. While the standard deviation is by far the most often used, the range is the only appropriate measure of variability for nominal data, and the quartile deviation is the appropriate index of variability for ordinal data. As with measures of central tendency, measures of variability appropriate for nominal and ordinal data may be used with interval or ratio data, even though the standard deviation is generally the preferred index of variability.


Ø The Range

The range is simply the difference between the highest and the lowest score in a distribution and is determined by subtraction. Like the mode, the range is not a very stable
measure of variability, and its chief advantage is that it gives a quick, rough
estimate of variability.

Ø The Quartile Deviation

The quartile deviation is one-half of the difference between the upper quartile (the 75th percentile) and the lower quartile (the 25th percentile) in a distribution. The quartile deviation is a more stable measure of variability than the range, and it is appropriate whenever the median is appropriate.



Ø The Standard Deviation

Like the mean, its counterpart measure of central tendency, the standard deviation is the most stable measure of variability and takes into account each and every score. If you know the mean and the standard deviation of a set of scores, you have a pretty good picture of what the distribution looks like. If the distribution is relatively normal, then the interval from the mean minus three standard deviations to the mean plus three standard deviations encompasses just about all the scores, over 99% of them.
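
The three measures of variability can be compared on one small data set. This is a sketch under stated assumptions: the scores are made up, the quartiles use the "inclusive" percentile convention of Python's `statistics.quantiles` (other conventions give slightly different quartiles), and the standard deviation divides by N.

```python
import statistics

# Hypothetical scores (illustrative data only).
scores = [60, 65, 70, 75, 80, 85, 90, 95, 100]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Quartile deviation: half the distance between the 75th and 25th percentiles.
q1, _q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
quartile_deviation = (q3 - q1) / 2

# Standard deviation, dividing by N (an assumption);
# statistics.stdev would divide by N - 1 instead.
sd = statistics.pstdev(scores)

print(f"range={score_range}, QD={quartile_deviation}, SD={sd:.2f}")
```

The range depends on only two scores, the quartile deviation on the middle half of the distribution, and the standard deviation on every score.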

The Normal Curve

A great many variables yield a normal curve if a sufficient
number of subjects are measured.

Ø Normal Distributions

If a variable is normally distributed, that is, forms a normal curve, then several things are true. First, 50% of the scores are above the mean and 50% of the scores are below the mean. Second, the mean, the median, and the mode are the same. Third, most scores are near the mean, and the farther from the mean a score is, the fewer the number of subjects who attained that score. Fourth, the same number, or percentage, of scores is between the
mean and plus one standard deviation (X̄ + 1 SD) as is between the mean and minus
one standard deviation (X̄ - 1 SD), and similarly for X̄ ± 2 SD and X̄ ± 3 SD.

Many variables form a normal distribution, including
physical measures, such as height and weight, and psychological measures, such
as intelligence and aptitude. Since research studies deal with a finite number
of subjects, and often not a very large number, research data only more or less
approximate a normal curve.
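
The symmetry of the normal curve can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library with a mean of 0 and a standard deviation of 1; it describes the theoretical curve, not any real data set, which will only approximate these percentages.

```python
from statistics import NormalDist

# Theoretical normal curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)

for k in (1, 2, 3):
    # Proportion of scores between the mean and +k SD ...
    above = curve.cdf(k) - curve.cdf(0)
    # ... and between -k SD and the mean; symmetry makes these equal.
    below = curve.cdf(0) - curve.cdf(-k)
    print(f"within {k} SD of the mean: {100 * (above + below):.1f}% "
          f"({100 * above:.1f}% on each side)")
```

The totals come to roughly 68%, 95%, and 99.7% for one, two, and three standard deviations.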


Ø Skewed Distributions

When a distribution is not normal, it is said to be skewed. A distribution which is skewed is
not symmetrical, and the values of the mean, the median, and the mode are different. In a skewed distribution, there are more extreme scores at one end than the other. If the extreme scores are at the lower end of the distribution, the distribution is said to be negatively
skewed; if the extreme scores are at the upper, or higher, end of the distribution, the distribution is said to be positively skewed. In both cases, the mean is "pulled" in
the direction of the extreme scores. For a negatively skewed distribution the
mean (X̄) is always lower, or smaller, than the median (md); for a positively skewed distribution the mean is always higher, or greater, than the median. Usually, in a negatively skewed distribution the mean and the median are lower, or smaller, than the mode, whereas in a positively skewed distribution the mean and the median are higher, or greater, than the mode.
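
The ordering of the three indices under skew can be seen on two small made-up data sets. This is an illustration only; the scores were picked so that one set has an extreme score at the low end and the other at the high end.

```python
import statistics

# Hypothetical score sets (illustrative data only).
negatively_skewed = [35, 70, 85, 88, 90, 90, 92]  # extreme score at the low end
positively_skewed = [10, 10, 12, 15, 18, 30, 80]  # extreme score at the high end

for label, scores in [("negative", negatively_skewed),
                      ("positive", positively_skewed)]:
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    mode = statistics.mode(scores)
    print(f"{label} skew: mean={mean:.1f}, median={median}, mode={mode}")
```

In the first set the mean is below the median, which is below the mode; in the second set the order is reversed.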

Measures of Relationship

Degree of relationship is expressed as a correlation coefficient, which is computed
from the two sets of scores. If two variables are highly related, a correlation coefficient near +1.00 (or -1.00) will be obtained; if two variables are not related, a coefficient near .00 will be obtained.

Ø The Spearman Rho

If the data for one of the variables are expressed as ranks instead of scores, the Spearman rho is the appropriate measure of correlation. It is thus appropriate when the data represent an ordinal scale (although it may be used with interval data) and is used when the median and quartile deviation are used. If only one of the variables to be correlated is in rank order, for example, class standing at time of graduation, then the other variable to be correlated with it must also be expressed in terms of ranks. The Spearman rho is interpreted in the same way as the Pearson r and produces a coefficient somewhere between -1.00 and +1.00.
If more than one subject receives the same score, then their ranks are averaged.
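
A minimal sketch of the procedure: convert scores to ranks, average the ranks of tied scores, then correlate the ranks. Computing rho as a Pearson r on the ranks is one standard way to do it, not necessarily the hand formula a given textbook uses. The function names and the data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (lowest) upward, averaging the ranks of ties."""
    ordered = sorted(values)
    rank_of = {}
    for value in set(values):
        first = ordered.index(value) + 1         # first position held by the value
        last = first + ordered.count(value) - 1  # last position held by the value
        rank_of[value] = (first + last) / 2      # tied scores share the average rank
    return [rank_of[v] for v in values]


def pearson(xs, ys):
    """Pearson r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / (ss_x * ss_y) ** 0.5


def spearman_rho(xs, ys):
    """Spearman rho: Pearson r applied to the tie-averaged ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: a teacher's ranking (1 = weakest, 5 = strongest)
# and the same five subjects' test scores, two of which are tied.
teacher_rank = [1, 2, 3, 4, 5]
test_score = [40, 55, 55, 70, 90]

print(average_ranks(test_score))
rho = spearman_rho(teacher_rank, test_score)
print(f"rho = {rho:.2f}")
```

The two subjects who scored 55 would have held ranks 2 and 3, so each receives the average rank of 2.5.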


Ø The Pearson r

The Pearson r is the most appropriate measure of correlation when the sets of data to be
correlated represent either interval or ratio scales. Like the mean and the
standard deviation, the Pearson r takes into account each and every score in both distributions; it is also the most stable measure of correlation. Since most educational measures represent
interval scales, the Pearson r is usually the appropriate coefficient for determining relationship. An assumption associated with the application of the Pearson r is that the relationship between the variables being correlated is a linear one.
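
The linearity assumption can be illustrated with a short sketch. The function below computes r from raw-score sums, which is one common computational form; the function name and the data are hypothetical. The second data set is perfectly related to the first, but along a U-shaped curve, and r fails to detect it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r computed from raw-score sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator


# Hypothetical data (illustrative only).
hours = [1, 2, 3, 4, 5]
linear = [52, 54, 56, 58, 60]           # rises steadily with hours
curved = [(h - 3) ** 2 for h in hours]  # U-shaped: 4, 1, 0, 1, 4

r_linear = pearson_r(hours, linear)
r_curved = pearson_r(hours, curved)
print(f"linear relationship: r = {r_linear:.2f}")
print(f"curved relationship: r = {r_curved:.2f}")
```

A coefficient near zero therefore means no linear relationship, not necessarily no relationship at all.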

Measures of Relative Position

Measures of relative position indicate where a score is in relation to all other scores in the
distribution. A major advantage of such measures is that they make it possible
to compare the performance of an individual on two or more different tests.

Ø Percentile Ranks

A percentile rank indicates the percentage of scores that fall below a given score. Percentile
ranks are
appropriate for data representing an ordinal scale, although they are frequently computed for interval data.
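
A percentile rank under the definition given here (percentage of scores falling below a given score) can be sketched as follows. The function name and the scores are hypothetical, and some texts instead count scores at or below the given score, which gives slightly different values.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the distribution that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


# Hypothetical distribution of ten scores (illustrative data only).
scores = [55, 60, 62, 68, 70, 74, 75, 80, 88, 95]

pr = percentile_rank(scores, 80)
print(f"A score of 80 has a percentile rank of {pr:.0f}")
```

Seven of the ten scores fall below 80, so that score has a percentile rank of 70.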

Ø Standard Scores

A standard score is a measure of relative position which is appropriate when the data represent
an interval or ratio scale. A z score expresses how far a score is from the mean in terms of standard deviation units. If a set of scores is transformed into a set of z scores, the new distribution has a mean of 0 and a standard deviation of 1. The major advantage of z scores is that they allow scores from different tests to be compared. The only problem with z scores is that they involve negative numbers and decimals. A simple solution is to transform z scores into Z scores. To do this, you simply multiply the z score by 10 and add 50. Stanines are standard scores that divide a distribution into nine parts.
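
The two transformations can be sketched on a small made-up data set. The standard deviation here divides by N, which is an assumption; with that choice the z scores come out with a mean of 0 and a standard deviation of 1, and the Z scores with a mean of 50 and a standard deviation of 10.

```python
import statistics

# Hypothetical raw scores (illustrative data only).
scores = [60, 70, 80, 90, 100]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # divides by N (an assumption)

# z score: distance from the mean in standard deviation units.
z_scores = [(x - mean) / sd for x in scores]

# Z score: multiply z by 10 and add 50, removing negatives and most decimals.
big_z_scores = [10 * z + 50 for z in z_scores]

for x, z, big_z in zip(scores, z_scores, big_z_scores):
    print(f"raw={x:>3}  z={z:+.2f}  Z={big_z:.1f}")
```

A subject scoring at the mean gets z = 0 and Z = 50 on any test, which is what makes scores from different tests comparable.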

§ The formula for the mean is $\bar{X} = \frac{\sum X}{N}$

§ The formula for the standard deviation is $SD = \sqrt{\frac{SS}{N}}$, where $SS = \sum X^2 - \frac{(\sum X)^2}{N}$

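§ The formula for the Pearson r is $r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{N}\right]}}$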

The formula for degrees of freedom for the Pearson r is df = N - 2.

§ The formula for a z score is $z = \frac{X - \bar{X}}{SD}$

§ The formula for a Z score is Z= 10z + 50

§ Calculation for Interval Data

Symbols commonly used in statistical formulas:

$X$ = any score; $\sum$ = the sum of (add them up); $\sum X$ = the sum of all the scores; $\bar{X}$ = the mean, or arithmetic average; $(\sum X)^2$ = the square of the sum (add up all the scores and square the sum, or total); $\sum X^2$ = the sum of all the squares (square each score and add up all the squares); $N$ = total number of subjects; $n$ = number of subjects in a particular group.
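
A short worked example may help fix the difference between the sum of the squares and the square of the sum. The five scores are made up, and the standard deviation shown divides the sum of squares by N, which is an assumption; dividing by N - 1 instead would give $\sqrt{10} \approx 3.16$ for the same data.

```latex
\begin{align*}
X &: 2,\ 4,\ 6,\ 8,\ 10 \qquad N = 5 \\
\sum X &= 2 + 4 + 6 + 8 + 10 = 30 \\
\sum X^2 &= 4 + 16 + 36 + 64 + 100 = 220 \\
\left(\sum X\right)^2 &= 30^2 = 900 \\
\bar{X} &= \frac{\sum X}{N} = \frac{30}{5} = 6 \\
SS &= \sum X^2 - \frac{\left(\sum X\right)^2}{N} = 220 - \frac{900}{5} = 40 \\
SD &= \sqrt{\frac{SS}{N}} = \sqrt{8} \approx 2.83
\end{align*}
```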





RESEARCH REPORT



WRITING THE REPORT

Everything in the main body of the report up to the results section can actually be written before the experiment is conducted. The rest is merely a matter of writing down what happened, analyzing these happenings, and drawing conclusions.

FORMAT OF THE RESEARCH REPORT

The research report, whether it be a thesis, dissertation, or shorter term paper or report, usually follows a fairly standardized pattern. The usual sequence of topics is as follows:


A. Preliminary Section or Front Matter, consists of (1) Title Page, (2) Acknowledgement (if any), (3) Table of Contents, (4) List of Tables (if any), (5) List of Figures (if any).

B. Main Body of the Report, consists of (1) Introduction: Statement of the problem (specific questions to be answered, hypotheses to be tested); Significance of the problem; Purposes of the study; Assumptions, limitations, and delimitations; Definition of important terms, (2) Review of Related Literature or Analysis of Previous Research, (3) Design of the Study: Procedures used; Sources of data; Methods of gathering data; Description of data-gathering instruments used, (4) Presentation and Analysis of Data: Text; Tables; Figures, (5) Summary and Conclusions: Restatement of the problem; Description of procedures used; Principal findings and conclusions; Recommendations for further research.

C. Reference Section, consists of (1) Bibliography, and (2) Appendix



PRELIMINARY SECTION


The first page of the report is the title page. It usually includes: (1) the name of the topic, (2) the name of the author, (3) the relationship of the report to a course or degree requirement, (4) the name of the institution where the report is to be submitted, and (5) the date of presentation.

The title should communicate as briefly and directly as possible the precise nature of what the report is about. It should contain key words that will be recognized by others who might be interested in the research, because abstracting services index reports by the key words in their titles. The title should be written after the report is written. If a title occurs to the researcher earlier, it need not be thrown away, but it must be examined carefully after the report is written to make sure it conveys what it is expected to convey.

An acknowledgement page is included if the writer has received unusual assistance in the conduct of the study. If used, acknowledgements should be simple and restrained.

The table of contents serves an important purpose in providing an outline of the contents of the
report.

THE ABSTRACT

The abstract should include a brief summary of the key points of the report. It is usually limited to 100 or 150 words. It should contain as its essential ingredients the statement of the hypothesis, the statement of the research prediction, and a brief statement of the results. In addition, a very brief statement of why the research is worth doing may sometimes be important.

THE MAIN BODY OF THE REPORT

This section may be divided into five divisions, that is:

1. An introduction to the area of consideration; a clear statement of the problem with
specific questions to be answered or hypotheses to be tested is presented first. The hypothesis should be clearly stated immediately before the methods section of the report. This statement can be couched in purely conceptual terms at this point; or, if it seems natural to do so, it can employ operationally defined terms. A consideration of the significance of the problem and its historical background is also appropriate. Specific purposes of the study are described, and all assumptions, limitations, and delimitations are recognized. All important terms are carefully defined, so that the reader may understand the concepts underlying the development of the investigation.

2. Review of the important literature; previous research studies are abstracted, and significant writings of authorities in the area under study are reviewed. This part provides a background for the development of the present study and brings the reader up to date. It gives evidence of the investigator’s knowledge of the field. A brief summary, indicating areas of agreement or disagreement in findings, or gaps in existing knowledge, should be included.

3. Design of the study; all the important variables in the study should be operationally defined, including control and moderator variables as well as the dependent and independent variables. The size of the samples and how they were selected are carefully described, as are the sources and methods of gathering data, the reliability of instruments selected or constructed, and the statistical procedures used in the analysis.

4. The presentation and analysis of data; through textual discussion and tabular and graphic
devices, the data are critically analyzed and reported. Tables and figures are used to clarify significant relationships. They are constructed and titled to be self-explanatory and are relatively simple. If complex tables are developed, they should be placed in the appendix.

5. Summary; after a brief statement of the problem and a description of the procedures used
in the investigation, the findings and conclusions are presented. Findings are statements of factual information based upon the data analysis. Conclusions are answers to the questions raised, or the statements of acceptance or rejection of the hypotheses proposed. This is often a very short section, but it can be lengthened considerably by being combined with the conclusions section of the report. It may be appropriate in concluding this part of the report to indicate promising side-problems that have been uncovered and to suggest areas or problems for further investigation. The summary section is the most used part of the research report. Readers who scan research literature to find significant studies examine this section before deciding whether or not further examination of the report is worthwhile.

REFERENCE SECTION

1. Bibliography; located at the end of the main body of the report, lists in alphabetical order the references used by the writer in preparing the report. In a short bibliography, books, pamphlets, monographs, and periodical references may be combined in the same list. If the number of references is large, the bibliography may be divided into sections, for books,
periodicals, and special documents. Ordinarily, a selected bibliography is preferable
to an exhaustive list.

2. Appendix; tables and data that are important, but not essential to the understanding of the report, along with copies of cover letters used and printed forms of questionnaires, tests, and other
data-gathering devices, may be placed in the appendix.

§ Footnotes serve a number of purposes:
enabling writers to substantiate their presentation by citations of other
authorities, giving credit to sources of material that they have quoted or
paraphrased, and providing the reader with specific sources that he or she may use
to verify the authenticity and accuracy of material used. Footnotes are found
at the bottom of the page.

§ Tables and Figures; a table is a systematic method of presenting statistical data in
vertical columns and horizontal rows, according to some classification of
subject matter. Tables enable the reader to comprehend and interpret masses of
data rapidly, and to grasp significant details and relationships at a glance. Good
tables are relatively simple, concentrating on a limited number of ideas. Text references should identify tables by number, rather than by such expressions as “the table above” or “the following table”.

A figure is a device that presents statistical data in graphic
form. The term figure is applied to a wide variety of graphs, charts, maps,
sketches, diagrams, and drawings. When skillfully used, figures present
aspects of data in a visualized form that may be clearly and easily understood. Figures should not
be intended as substitutes for textual description, but
included to emphasize certain significant relationships.

Tables and figures should be used sparingly; too many will
overwhelm the reader.


REFERENCES

1. John W. Best, Research in Education (4th Edition), 1981.

2. Edward L. Vockell, Educational Research, 1983.