Introduction
Statistics is the art and science of collecting, analyzing, presenting and interpreting data to make more effective decisions. A collection of numerical information is also called statistics.
Statistical Data:
According to Horace Secrist, “By statistics we mean aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a pre-determined purpose and placed in relation to each other.”
Features of This Definition Are:
- Statistics are aggregates of facts.
- Statistics are affected to a marked extent by multiplicity of causes.
- Statistics are numerically expressed.
- Statistics are enumerated or estimated according to reasonable standards of accuracy.
- Statistics are collected in a systematic manner.
- Statistics are collected for a pre-determined purpose.
- Statistics should be placed in relation to each other.
Functions of Statistics:
- It presents facts in definite forms.
- It simplifies a mass of figures.
- It facilitates comparison.
- It helps in formulating and testing hypotheses.
- It helps in prediction.
- It helps in the formulation of suitable policies.
Limitations of Statistics:
- It does not deal with isolated measurements.
- It deals only with quantitative characteristics.
- Its results are true only on average.
- It is only a means.
- It can be misused.
- Graphs can be misleading (for example, through the choice of scale).
- Association does not necessarily imply causation (addiction and talented)
About SPSS
SPSS is a statistical software package. Its name stands for “Statistical Package for the Social Sciences”.
SPSS is a very easy-to-use statistical package that runs on Windows and Macintosh platforms. This class is designed for people who are just starting to use SPSS. The students in the class will have hands-on experience using SPSS for statistics, graphics, and data management. The class notes are the scripts for the class and are printed and given to the students in the class. The SPSS class notes do not contain any of the computer output. The class notes are not meant to be an SPSS textbook or a reference manual. However, it is possible for individuals to use the class notes to help them learn SPSS even if they don’t enroll in the class.
Problem:
Heller Company manufactures lawn mowers and related lawn equipment. The managers believe the quantity of lawn mowers sold depends on the price of the mower and the price of a competitor’s mower.
Let
Y = Quantity sold (1000s)
X1 = Price of competitor’s mower ($)
X2 = Price of Heller’s mower ($)
The managers want to develop an estimated regression equation that relates quantity sold to the prices of the Heller mower and the competitor’s mower. The following table lists the prices and quantities sold in 8 cities.
Competitor’s Price (X1) | Heller’s Price (X2) | Quantity sold (Y) |
190 | 90 | 120 |
130 | 150 | 77 |
155 | 210 | 46 |
175 | 150 | 93 |
125 | 250 | 26 |
145 | 270 | 30 |
180 | 300 | 25 |
150 | 250 | 30 |
a) Develop the regression equation and discuss this equation.
b) Explain the values of the coefficients of correlation and determination. Also test the individual regression coefficients and the overall model.
Solution:
The SPSS output for the above problem is given below:
Frequencies:
Statistics
Statistic | Quantity Sold | Competitor’s Price | Heller’s Price |
N (Valid) | 8 | 8 | 8 |
N (Missing) | 0 | 0 | 0 |
Mean | 55.88 | 156.2500 | 208.7500 |
Median | 38.00 | 152.5000 | 230.0000 |
Mode | 30 | 125.00(a) | 150.00(a) |
a Multiple modes exist. The smallest value is shown.
Frequency Table:
Quantity Sold
Value | Frequency | Percent | Valid Percent | Cumulative Percent |
25 | 1 | 12.5 | 12.5 | 12.5 |
26 | 1 | 12.5 | 12.5 | 25.0 |
30 | 2 | 25.0 | 25.0 | 50.0 |
46 | 1 | 12.5 | 12.5 | 62.5 |
77 | 1 | 12.5 | 12.5 | 75.0 |
93 | 1 | 12.5 | 12.5 | 87.5 |
120 | 1 | 12.5 | 12.5 | 100.0 |
Total | 8 | 100.0 | 100.0 | |
Competitor’s Price
Value | Frequency | Percent | Valid Percent | Cumulative Percent |
125.00 | 1 | 12.5 | 12.5 | 12.5 |
130.00 | 1 | 12.5 | 12.5 | 25.0 |
145.00 | 1 | 12.5 | 12.5 | 37.5 |
150.00 | 1 | 12.5 | 12.5 | 50.0 |
155.00 | 1 | 12.5 | 12.5 | 62.5 |
175.00 | 1 | 12.5 | 12.5 | 75.0 |
180.00 | 1 | 12.5 | 12.5 | 87.5 |
190.00 | 1 | 12.5 | 12.5 | 100.0 |
Total | 8 | 100.0 | 100.0 | |
Heller’s Price
Value | Frequency | Percent | Valid Percent | Cumulative Percent |
90.00 | 1 | 12.5 | 12.5 | 12.5 |
150.00 | 2 | 25.0 | 25.0 | 37.5 |
210.00 | 1 | 12.5 | 12.5 | 50.0 |
250.00 | 2 | 25.0 | 25.0 | 75.0 |
270.00 | 1 | 12.5 | 12.5 | 87.5 |
300.00 | 1 | 12.5 | 12.5 | 100.0 |
Total | 8 | 100.0 | 100.0 | |
Descriptives:
Descriptive Statistics
Variable | N | Minimum | Maximum | Mean | Std. Deviation | Variance |
Quantity Sold | 8 | 25 | 120 | 55.88 | 36.290 | 1316.982 |
Competitor’s Price | 8 | 125.00 | 190.00 | 156.2500 | 23.56602 | 555.357 |
Heller’s Price | 8 | 90.00 | 300.00 | 208.7500 | 72.19765 | 5212.500 |
Valid N (listwise) | 8 | | | | | |
Correlations:
Correlations
Variable | Statistic | Quantity Sold | Competitor’s Price | Heller’s Price |
Quantity Sold | Pearson Correlation | 1 | .496 | -.968(**) |
Quantity Sold | Sig. (2-tailed) | . | .211 | .000 |
Quantity Sold | N | 8 | 8 | 8 |
Competitor’s Price | Pearson Correlation | .496 | 1 | -.305 |
Competitor’s Price | Sig. (2-tailed) | .211 | . | .462 |
Competitor’s Price | N | 8 | 8 | 8 |
Heller’s Price | Pearson Correlation | -.968(**) | -.305 | 1 |
Heller’s Price | Sig. (2-tailed) | .000 | .462 | . |
Heller’s Price | N | 8 | 8 | 8 |
** Correlation is significant at the 0.01 level (2-tailed).
Regression:
Variables Entered/Removed(b)
Model | Variables Entered | Variables Removed | Method |
1 | Heller’s Price, Competitor’s Price(a) | . | Enter |
a All requested variables entered.
b Dependent Variable: Quantity Sold
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
1 | .991(a) | .981 | .974 | 5.886 |
a Predictors: (Constant), Heller’s Price, Competitor’s Price
ANOVA(b)
Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
1 | Regression | 9045.637 | 2 | 4522.819 | 130.538 | .000(a) |
1 | Residual | 173.238 | 5 | 34.648 | | |
1 | Total | 9218.875 | 7 | | | |
a Predictors: (Constant), Heller’s Price, Competitor’s Price
b Dependent Variable: Quantity Sold
Coefficients(a)
Model | Term | B (Unstandardized) | Std. Error (Unstandardized) | Beta (Standardized) | t | Sig. |
1 | (Constant) | 97.074 | 18.811 | | 5.160 | .004 |
1 | Competitor’s Price | .341 | .099 | .221 | 3.438 | .018 |
1 | Heller’s Price | -.453 | .032 | -.900 | -13.983 | .000 |
a Dependent Variable: Quantity Sold
From the above output we get the following:
Mean: Quantity sold is 55.88, Competitor’s price is 156.25, Heller’s price is 208.75.
Median: Quantity sold is 38, Competitor’s price is 152.50, Heller’s price is 230.00.
Mode: Quantity sold is 30. Competitor’s price has no repeated value, so every value counts as a mode and SPSS shows the smallest, 125.00. Heller’s price has two modes, 150.00 and 250.00, and SPSS shows the smaller, 150.00.
Minimum quantity sold is 25
Maximum quantity sold is 120
Minimum Competitor’s price is 125.00
Maximum Competitor’s price is 190.00
Minimum Heller’s price is 90.00
Maximum Heller’s price is 300.00
Standard deviation: Quantity sold is 36.290, Competitor’s price is 23.566, Heller’s price is 72.198.
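The descriptive measures above can be checked outside SPSS. The following is a minimal sketch using only Python's standard `statistics` module; the variable names and the `describe` helper are illustrative assumptions, not part of the SPSS output. It uses the sample (n - 1) formulas for standard deviation and variance, which match the figures SPSS reports here.

```python
import statistics

# Data from the problem table (8 cities).
competitor_price = [190, 130, 155, 175, 125, 145, 180, 150]  # X1, dollars
heller_price = [90, 150, 210, 150, 250, 270, 300, 250]       # X2, dollars
quantity_sold = [120, 77, 46, 93, 26, 30, 25, 30]            # Y, thousands


def describe(values):
    """Return the summary measures shown in the SPSS tables (hypothetical helper)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # multimode lists every mode, whereas SPSS prints only the smallest one
        "modes": sorted(statistics.multimode(values)),
        "min": min(values),
        "max": max(values),
        "std_dev": statistics.stdev(values),      # sample standard deviation (n - 1)
        "variance": statistics.variance(values),  # sample variance (n - 1)
    }


summary = {
    "Quantity sold": describe(quantity_sold),
    "Competitor's price": describe(competitor_price),
    "Heller's price": describe(heller_price),
}

for name, stats in summary.items():
    print(name, stats)
```

Listing all modes makes it visible that Competitor's price has eight distinct values and Heller's price has two modes, which is what the SPSS footnote "Multiple modes exist" refers to.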
Correlation:
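There is a moderate positive relationship between quantity sold and competitor’s price (r = 0.496), but it is not statistically significant at the 0.05 level (p = 0.211).
There is a strong negative relationship between quantity sold and Heller’s price (r = -0.968), which is significant at the 0.01 level.
There is a weak negative relationship between Heller’s price and competitor’s price (r = -0.305), which is not statistically significant (p = 0.462).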
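The Pearson correlations in the SPSS Correlations table can be reproduced by hand. The sketch below applies the usual definition (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations); the function name `pearson_r` is an illustrative assumption.

```python
import math

# Data from the problem table (8 cities).
x1 = [190, 130, 155, 175, 125, 145, 180, 150]  # competitor's price ($)
x2 = [90, 150, 210, 150, 250, 270, 300, 250]   # Heller's price ($)
y = [120, 77, 46, 93, 26, 30, 25, 30]          # quantity sold (1000s)


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


r_y_x1 = pearson_r(y, x1)   # quantity sold vs competitor's price
r_y_x2 = pearson_r(y, x2)   # quantity sold vs Heller's price
r_x1_x2 = pearson_r(x1, x2)  # competitor's price vs Heller's price

print(round(r_y_x1, 3), round(r_y_x2, 3), round(r_x1_x2, 3))
```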
Regression:
Here the estimated regression equation is
Y = 97.074 + 0.341 X1 - 0.453 X2
Where, Y = Estimated quantity sold in 1000s (Dependent Variable)
X1 = Competitor’s price (Independent Variable)
X2 = Heller’s price (Independent Variable)
a = 97.074 (Constant)
b1 = 0.341
b2 = -0.453
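For readers who want to see where these coefficients come from, here is a minimal sketch of ordinary least squares for two predictors, solved from the normal equations in deviation form. It is not the procedure SPSS uses internally; the helper names `centered_cross_sum` and `fit_two_predictors` are assumptions made for illustration.

```python
# Data from the problem table (8 cities).
x1 = [190, 130, 155, 175, 125, 145, 180, 150]  # competitor's price ($)
x2 = [90, 150, 210, 150, 250, 270, 300, 250]   # Heller's price ($)
y = [120, 77, 46, 93, 26, 30, 25, 30]          # quantity sold (1000s)


def centered_cross_sum(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v))."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_two_predictors(x1, x2, y):
    """Least-squares estimates (a, b1, b2) for y = a + b1*x1 + b2*x2."""
    s11 = centered_cross_sum(x1, x1)
    s22 = centered_cross_sum(x2, x2)
    s12 = centered_cross_sum(x1, x2)
    s1y = centered_cross_sum(x1, y)
    s2y = centered_cross_sum(x2, y)
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    # The fitted plane passes through the point of means.
    a = sum(y) / len(y) - b1 * sum(x1) / len(x1) - b2 * sum(x2) / len(x2)
    return a, b1, b2


a, b1, b2 = fit_two_predictors(x1, x2, y)
print(f"a = {a:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

Rounded to three decimals, the results agree with the B column of the SPSS Coefficients table.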
Interpretation of coefficient:
If the competitor’s price increases by $1, the estimated quantity sold increases by 0.341 thousand units (about 341 mowers), holding Heller’s price constant.
On the other hand, if Heller’s price increases by $1, the estimated quantity sold decreases by 0.453 thousand units (about 453 mowers), holding the competitor’s price constant.
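The "holding the other price constant" reading can be illustrated numerically. The sketch below plugs prices into the estimated equation using the rounded SPSS coefficients; the function name `predict_quantity` is an assumption, and the prices used are those of the first city in the data table.

```python
# Rounded coefficients from the SPSS Coefficients table.
A, B1, B2 = 97.074, 0.341, -0.453


def predict_quantity(competitor_price, heller_price):
    """Estimated quantity sold (1000s) from the fitted equation."""
    return A + B1 * competitor_price + B2 * heller_price


# First city in the data table: competitor $190, Heller $90 (observed Y = 120).
base = predict_quantity(190, 90)

# Raise one price by $1 while the other stays fixed.
effect_competitor = predict_quantity(191, 90) - base
effect_heller = predict_quantity(190, 91) - base

print(round(base, 3), round(effect_competitor, 3), round(effect_heller, 3))
# prints: 121.094 0.341 -0.453
```

Each $1 change moves the prediction by exactly the corresponding coefficient, regardless of the starting prices, because the model is linear.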
R:
R is the correlation between the observed and predicted values of the dependent variable (the multiple correlation coefficient). Here R is 0.991, which means there is a very strong positive relationship between the observed and predicted values.
R-Square:
R-Square is the proportion of variance in the dependent variable (Quantity) which can be predicted from the independent variables (Competitor’s price and Heller’s price). This value indicates that 98.1% of the variance in Quantity sold can be predicted from the variables Competitor’s price and Heller’s price. Note that this is an overall measure of the strength of association, and does not reflect the extent to which any particular independent variable is associated with the dependent variable. R-Square is also called the coefficient of determination.
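R-Square and R can be recovered directly from the sums of squares in the ANOVA table, since R-Square is the regression sum of squares divided by the total sum of squares. A short sketch, with variable names chosen for illustration:

```python
import math

# Sums of squares copied from the SPSS ANOVA table.
ss_regression = 9045.637  # variation explained by the two prices
ss_total = 9218.875       # total variation in quantity sold

r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)

print(f"R Square = {r_squared:.3f}, R = {multiple_r:.3f}")
```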
Adjusted R2:
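Many analysts prefer Adjusted R2 because it adjusts for the number of independent variables in the model and so avoids overestimating the impact of adding an independent variable to the equation. Here the adjusted R2 is 0.974, only slightly below R-Square (0.981).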
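The adjustment is commonly written as 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of independent variables. The sketch below applies that formula to this problem (n = 8, p = 2); the function name is an illustrative assumption.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R-squared from the ANOVA table: regression SS / total SS.
r_squared = 9045.637 / 9218.875

adj_r2 = adjusted_r_squared(r_squared, n_obs=8, n_predictors=2)
print(f"Adjusted R Square = {adj_r2:.3f}")
```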
Testing for significance:
In multiple regression the t test and F test have different purposes.
- The F test is used to determine whether a significant relationship exists between the dependent variable and the set of all the independent variables.
- If the F test shows overall significance, the t test is used to determine whether each individual independent variable is significant.
Here F = 130.538 and its p value (reported as .000, i.e. less than 0.0005) is smaller than the alpha value (say 0.05), so there is a significant relationship between the dependent variable and the set of all the independent variables.
Again, from the t tests we see that the p values for the constant (0.004), Competitor’s price (0.018) and Heller’s price (0.000) are each smaller than alpha (say 0.05), so each independent variable has a significant relationship with the dependent variable.
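The same conclusions can be reached with the critical-value approach instead of p values. The sketch below recomputes F from the ANOVA table and compares it, and the reported t values, against critical values at alpha = 0.05. The critical values (F with 2 and 5 degrees of freedom, two-tailed t with 5 degrees of freedom) are assumptions taken from standard statistical tables, not from the SPSS output.

```python
# Values copied from the SPSS ANOVA and Coefficients tables.
ss_regression, df_regression = 9045.637, 2
ss_residual, df_residual = 173.238, 5
t_values = {
    "(Constant)": 5.160,
    "Competitor's Price": 3.438,
    "Heller's Price": -13.983,
}

# Critical values at alpha = 0.05 (assumed, from standard tables).
F_CRITICAL = 5.79   # F distribution, 2 and 5 degrees of freedom
T_CRITICAL = 2.571  # two-tailed t distribution, 5 degrees of freedom

# Overall test: F = mean square regression / mean square residual.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)
overall_significant = f_stat > F_CRITICAL

# Individual tests: a term is significant when |t| exceeds the critical t.
individually_significant = {
    term: abs(t) > T_CRITICAL for term, t in t_values.items()
}

print(f"F = {f_stat:.3f}, overall significant: {overall_significant}")
for term, significant in individually_significant.items():
    print(f"{term}: significant = {significant}")
```

The residual degrees of freedom are n - p - 1 = 8 - 2 - 1 = 5, which is why both critical values use 5.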