Professor Salaries Investigation
This report analyzes professor salaries at an unnamed college for the nine-month 2008-2009 academic year. The data set is an Excel file taken from the book An R Companion to Applied Regression, Second Edition (Sage), by John Fox and Sanford Weisberg. The data set contains salary statistics for all full-time professors; therefore, it is a population rather than a sample. The conclusions reached in this analysis apply to the entire population of full-time professors at this one unnamed college. The data set contains 255 observations and five variables. The variables are discipline of the professor as theoretical or applied, years of ...
and the proportion of females in the population can be found using the same equation:
$\hat{p} = \frac{\text{number of females}}{\text{total number}} = \frac{18}{255} \approx 0.07$.
The proportions can then be input into a frequency table:
Gender - Frequency Table |
Response | Relative Frequency |
Male | 0.93 |
Female | 0.07 |
Total | 1.00 |
Males make up 93 percent of the population, a substantial majority.
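The relative frequencies above can be reproduced with a short calculation. This is a minimal sketch: the counts 18 and 255 come from the report, the male count of 237 is derived by subtraction, and the function name `relative_frequencies` is a hypothetical helper chosen for illustration.

```python
# Counts taken from the report: 18 females among 255 professors.
total = 255
females = 18
males = total - females  # 237, derived from the two counts above


def relative_frequencies(counts):
    """Return each category's share of the total, rounded to two decimals."""
    n = sum(counts.values())
    return {label: round(count / n, 2) for label, count in counts.items()}


gender_table = relative_frequencies({"Male": males, "Female": females})

# Print the table in the same layout as the report.
for label, share in gender_table.items():
    print(f"{label:<7}| {share:.2f}")
print(f"{'Total':<7}| {sum(gender_table.values()):.2f}")
```

Rounding to two decimals matches the precision of the frequency table; the unrounded shares are about 0.929 and 0.071.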
We now compare the mean and median salaries of males and females. The mean and median both measure the center of a single quantitative variable. The mean is calculated using the equation:
$\bar{x} = \frac{\sum x}{n}$
This equation is applied to the population, the males, and the females. The resulting means, together with the medians, are summarized in the following table:
Group | Mean salary ($) | Median salary ($) |
Population | 126,606 | 123,683 |
Male | 126,958 | 124,309 |
Female | 121,968 | 120,258 |
The results show that the mean female salary is about four percent lower than the mean male salary, and the median female salary is about three percent lower than the median male salary. The population figures sit close to the male figures because males make up 93 percent of the population. This is not a substantial difference, and we attribute the small discrepancy to that imbalance. We would need more information about each observation to discern why some males have higher salaries than the females in the population.
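The size of the gap can be checked directly from the table values. The sketch below is illustrative only: the salary figures are copied from the table, the group sizes (237 males, 18 females) follow from the counts given earlier in the report, and `percent_below` is a hypothetical helper name.

```python
# Mean and median salaries copied from the table in this report.
salaries = {
    "population": {"mean": 126_606, "median": 123_683},
    "male": {"mean": 126_958, "median": 124_309},
    "female": {"mean": 121_968, "median": 120_258},
}


def percent_below(value, reference):
    """Percentage by which `value` falls short of `reference`."""
    return 100 * (reference - value) / reference


mean_gap = percent_below(salaries["female"]["mean"], salaries["male"]["mean"])
median_gap = percent_below(salaries["female"]["median"], salaries["male"]["median"])
print(f"Female mean is {mean_gap:.1f}% below the male mean")
print(f"Female median is {median_gap:.1f}% below the male median")

# Consistency check: the population mean should equal the
# count-weighted average of the male and female means.
n_male, n_female = 237, 18  # 255 professors in total, 18 of them female
weighted_mean = (
    n_male * salaries["male"]["mean"] + n_female * salaries["female"]["mean"]
) / (n_male + n_female)
print(f"Weighted mean of group means: {weighted_mean:,.0f}")
```

The weighted average of the two group means reproduces the population mean to within rounding, which shows why the population figures track the male figures so closely.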
To examine the dispersion about the mean salary, we create a histogram for the population and for the male and female salaries. A histogram first requires dividing the salary data into ranges. About five to ten ranges are needed to show the shape of the data clearly; for the salaries in this population, a range width of $20,000 works well. The histograms for the population, male, and female salaries are below:
The histograms for the population, males, and females each show an approximately normal distribution about the center of the salary data.
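The binning step behind such a histogram can be sketched as follows. The $20,000 range width comes from the report; the salary values in `example` are hypothetical and stand in for the real data, and `bin_salaries` is an illustrative helper, not the method actually used to produce the original figures.

```python
from collections import Counter

BIN_WIDTH = 20_000  # salary range width used in the report


def bin_salaries(salaries, width=BIN_WIDTH):
    """Count salaries in half-open ranges [k * width, (k + 1) * width)."""
    counts = Counter((s // width) * width for s in salaries)
    return dict(sorted(counts.items()))


# Hypothetical salaries for illustration only; NOT the report's data.
example = [
    78_000, 95_500, 101_200, 118_000, 119_900,
    123_700, 126_600, 141_000, 158_300, 176_500,
]

# Text histogram: one '#' per salary in each range.
for lower, count in bin_salaries(example).items():
    upper = lower + BIN_WIDTH - 1
    print(f"${lower:>7,} to ${upper:>7,} | {'#' * count}")
```

With these made-up values the data fall into six ranges, which is within the five to ten ranges suggested above.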
The standard deviation is also used to determine the spread of data. To calculate the standard deviation the following equation is used:
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
The standard deviations of the population, male, and female salaries are summarized below:
Group | Standard deviation ($) |
Population | 27,617 |
Male | 28,129 |
Female | 19,067 |
The standard deviation of the female salaries shows that they are less spread out than the population or male salaries. This result supports the comparison of the histograms, where the female graph looks more compact than the others. The means of male and female salaries are comparable, but the female salaries are less dispersed. As stated before, this can be explained by the fact that females make up only 7 percent of the population.
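The standard deviation formula with the n - 1 divisor can be written out step by step. This is a minimal sketch under stated assumptions: the two salary lists are hypothetical and chosen only to contrast a compact group with a spread-out one, and `std_dev` is an illustrative name. Python's `statistics.stdev` uses the same n - 1 divisor, while `statistics.pstdev` divides by n instead.

```python
import math
import statistics


def std_dev(values):
    """Standard deviation with the n - 1 divisor, as in the report's formula."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical salaries for illustration only; NOT the report's data.
# Both groups have the same mean (122,000) but different spread.
tight = [118_000, 120_000, 122_000, 124_000, 126_000]
wide = [90_000, 105_000, 122_000, 139_000, 154_000]

print(f"Compact group:    {std_dev(tight):,.0f}")
print(f"Spread-out group: {std_dev(wide):,.0f}")

# The hand-written version agrees with the standard library.
assert math.isclose(std_dev(tight), statistics.stdev(tight))
```

Two groups with the same mean can have very different standard deviations, which is the pattern the report describes for male and female salaries.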
Based on the results of the mean and median comparison and the...