The dataset we analyze covers college majors and comes from the American Community Survey 2010-2012 Public Use Microdata Series. It records the employment rates and salaries of a wide variety of majors in the US. Alongside it, we also analyze the gender distribution of different majors. With these datasets, we can break down the incomes, popularity, and many other relevant properties of different college majors. This analysis may also help colleges and academic supervisors understand which majors are best suited to a particular student.
As college students, we are aware that many of our peers are still in the process of choosing a major, which could be one of the biggest turning points of their lives. Many fear choosing the wrong major and regretting the decision later in life. This analysis provides information to help students decide on a major and understand the potential of their future careers.
Throughout this analysis, we use data from the American Community Survey 2010-2012 Public Use Microdata Series, which contains survey data (employment status, gender, salary, major) on recently graduated students and graduates of all ages. Further documentation regarding this data is available here.
Rank | Major_code | Major | Total | Men | Women | Major_category | ShareWomen | Sample_size | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2419 | PETROLEUM ENGINEERING | 2339 | 2057 | 282 | Engineering | 0.1205643 | 36 | 1976 | 1849 | 270 | 1207 | 37 | 0.0183805 | 110000 | 95000 | 125000 | 1534 | 364 | 193 |
2 | 2416 | MINING AND MINERAL ENGINEERING | 756 | 679 | 77 | Engineering | 0.1018519 | 7 | 640 | 556 | 170 | 388 | 85 | 0.1172414 | 75000 | 55000 | 90000 | 350 | 257 | 50 |
3 | 2415 | METALLURGICAL ENGINEERING | 856 | 725 | 131 | Engineering | 0.1530374 | 3 | 648 | 558 | 133 | 340 | 16 | 0.0240964 | 73000 | 50000 | 105000 | 456 | 176 | 0 |
4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | 1258 | 1123 | 135 | Engineering | 0.1073132 | 16 | 758 | 1069 | 150 | 692 | 40 | 0.0501253 | 70000 | 43000 | 80000 | 529 | 102 | 0 |
5 | 2405 | CHEMICAL ENGINEERING | 32260 | 21239 | 11021 | Engineering | 0.3416305 | 289 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.0610977 | 65000 | 50000 | 75000 | 18314 | 4440 | 972 |
6 | 2418 | NUCLEAR ENGINEERING | 2573 | 2200 | 373 | Engineering | 0.1449670 | 17 | 1857 | 2038 | 264 | 1449 | 400 | 0.1772264 | 65000 | 50000 | 102000 | 1142 | 657 | 244 |
7 | 6202 | ACTUARIAL SCIENCE | 3777 | 2110 | 1667 | Business | 0.4413556 | 51 | 2912 | 2924 | 296 | 2482 | 308 | 0.0956522 | 62000 | 53000 | 72000 | 1768 | 314 | 259 |
8 | 5001 | ASTRONOMY AND ASTROPHYSICS | 1792 | 832 | 960 | Physical Sciences | 0.5357143 | 10 | 1526 | 1085 | 553 | 827 | 33 | 0.0211674 | 62000 | 31500 | 109000 | 972 | 500 | 220 |
9 | 2414 | MECHANICAL ENGINEERING | 91227 | 80320 | 10907 | Engineering | 0.1195589 | 1029 | 76442 | 71298 | 13101 | 54639 | 4650 | 0.0573423 | 60000 | 48000 | 70000 | 52844 | 16384 | 3253 |
10 | 2408 | ELECTRICAL ENGINEERING | 81527 | 65511 | 16016 | Engineering | 0.1964503 | 631 | 61928 | 55450 | 12695 | 41413 | 3895 | 0.0591738 | 60000 | 45000 | 72000 | 45829 | 10874 | 3170 |
Major_code | Major | Major_category | Total | Employed | Employed_full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th |
---|---|---|---|---|---|---|---|---|---|---|
1100 | GENERAL AGRICULTURE | Agriculture & Natural Resources | 128148 | 90245 | 74078 | 2423 | 0.0261471 | 50000 | 34000 | 80000 |
1101 | AGRICULTURE PRODUCTION AND MANAGEMENT | Agriculture & Natural Resources | 95326 | 76865 | 64240 | 2266 | 0.0286361 | 54000 | 36000 | 80000 |
1102 | AGRICULTURAL ECONOMICS | Agriculture & Natural Resources | 33955 | 26321 | 22810 | 821 | 0.0302483 | 63000 | 40000 | 98000 |
1103 | ANIMAL SCIENCES | Agriculture & Natural Resources | 103549 | 81177 | 64937 | 3619 | 0.0426789 | 46000 | 30000 | 72000 |
1104 | FOOD SCIENCE | Agriculture & Natural Resources | 24280 | 17281 | 12722 | 894 | 0.0491884 | 62000 | 38500 | 90000 |
1105 | PLANT SCIENCE AND AGRONOMY | Agriculture & Natural Resources | 79409 | 63043 | 51077 | 2070 | 0.0317909 | 50000 | 35000 | 75000 |
1106 | SOIL SCIENCE | Agriculture & Natural Resources | 6586 | 4926 | 4042 | 264 | 0.0508671 | 63000 | 39400 | 88000 |
1199 | MISCELLANEOUS AGRICULTURE | Agriculture & Natural Resources | 8549 | 6392 | 5074 | 261 | 0.0392304 | 52000 | 35000 | 75000 |
1301 | ENVIRONMENTAL SCIENCE | Biology & Life Science | 106106 | 87602 | 65238 | 4736 | 0.0512898 | 52000 | 38000 | 75000 |
1302 | FORESTRY | Agriculture & Natural Resources | 69447 | 48228 | 39613 | 2144 | 0.0425633 | 58000 | 40500 | 80000 |
Most popular major for all ages : Business with 9858741 people in that major
Least popular major for all ages : Interdisciplinary with 45199 people in that major
Most popular major for recent graduate students : Business with 1302376 people in that major
Least popular major for recent graduate students : Interdisciplinary with 12296 people in that major
The average median salary for recent grads is: 40151.4450867
The standard deviation for median salary of recent grads is: 11470.1818021
Outlier analysis does not suit these two datasets because all three graphs show trends in the mean values for each topic. Outliers may be identified inside each major category, but they remain within that category, so they do not affect the trend or the results of the analysis.
In this section, we analyze whether gender is correlated with choice of major and, if it is, whether women opt for lower-paying majors than men. Throughout this question, we use the dataset of recently graduated students from a wide variety of universities.
Next, we compare the share of women among recent graduates across major categories, that is, the categories each major falls in, e.g. Computers & Mathematics. To obtain this data, we calculated the percentage of women in each major by dividing the number of women by the total for that major and multiplying by one hundred. We then used dplyr’s group_by() to group the majors by category and dplyr’s summarize() to compute, for each category, the means of the male ratio, female ratio, median salary, 25th percentile, and 75th percentile. We also plotted the mean median salary, 25th percentile, and 75th percentile for each major category to identify which kinds of majors correspond to higher salaries.
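The two steps just described (a per-major percentage, then a per-category mean) can be sketched as follows. The report's analysis was done in R with dplyr, so this is only a plain-Python illustration of the same logic, not the original code. It uses four rows copied from the recent-graduates excerpt shown earlier, and the names `rows`, `by_category`, and `summary` are illustrative. The 25th and 75th percentiles are omitted for brevity; they would be averaged the same way as the median.

```python
from collections import defaultdict
from statistics import mean

# (major, major category, total, women, median salary)
# Values copied from the recent-graduates excerpt earlier in this report.
rows = [
    ("PETROLEUM ENGINEERING", "Engineering", 2339, 282, 110000),
    ("CHEMICAL ENGINEERING", "Engineering", 32260, 11021, 65000),
    ("ACTUARIAL SCIENCE", "Business", 3777, 1667, 62000),
    ("ASTRONOMY AND ASTROPHYSICS", "Physical Sciences", 1792, 960, 62000),
]

# Step 1: percentage of women per major, collected by category
# (the role played by group_by() in the R analysis).
by_category = defaultdict(list)
for major, category, total, women, median in rows:
    pct_women = women / total * 100
    by_category[category].append((pct_women, median))

# Step 2: one summary row per category (the role played by summarize()).
# Assumption: Men + Women = Total, so the male share is 100 minus the female share.
summary = {}
for category, values in by_category.items():
    mean_pct_women = mean([pct for pct, _ in values])
    summary[category] = {
        "pct_women": mean_pct_women,
        "pct_men": 100 - mean_pct_women,
        "median_salary": mean([median for _, median in values]),
    }

for category, stats in summary.items():
    print(category, round(stats["pct_women"], 2), stats["median_salary"])
```

Because only four majors are included, the category means printed here describe this small sample, not the full dataset.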
Then, we present a scatterplot of median salary against the ratios of men and women to examine whether women opt for lower-paying majors, using gather() (which is provided by tidyr rather than dplyr) to reshape the data so that both ratios can be plotted in the same scatterplot.
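The reshaping step turns one row per major (with separate men and women columns) into one row per major and gender, so both ratios can share one axis. The following plain-Python sketch illustrates that wide-to-long idea under stated assumptions. It is not the report's R code. The field names `gender`, `ratio`, and `median_salary` are chosen here for illustration, and the two rows come from the recent-graduates excerpt shown earlier.

```python
# (major, total, men, women, median salary)
# Values copied from the recent-graduates excerpt earlier in this report.
wide_rows = [
    ("PETROLEUM ENGINEERING", 2339, 2057, 282, 110000),
    ("ACTUARIAL SCIENCE", 3777, 2110, 1667, 62000),
]

# Reshape from wide (one row per major) to long (one row per major and
# gender). Each long row carries the gender label, that gender's share of
# the major, and the major's median salary.
long_rows = []
for major, total, men, women, median in wide_rows:
    for gender, count in (("Men", men), ("Women", women)):
        long_rows.append(
            {
                "major": major,
                "gender": gender,
                "ratio": count / total,
                "median_salary": median,
            }
        )

for row in long_rows:
    print(row["major"], row["gender"], round(row["ratio"], 4), row["median_salary"])
```

Each major now contributes two points to the scatterplot, one per gender, both at the same median salary.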
PETROLEUM ENGINEERING
Throughout the analysis, we found one outlier in the plot of major median salary against gender ratio: PETROLEUM ENGINEERING, which has a median salary of 110000.
The results of our analysis indicate that gender is indeed a factor in the choice of major. Furthermore, we found that more women than men opt for lower-paying majors. Engineering, Business, and Computers & Mathematics are the highest-paying major categories, yet they have the lowest ratios of women.
At the beginning of college, much of a student's thinking goes into choosing a major. Since many students feel lost and have a hard time deciding, this question gives them a general idea of where to start, based on the income and unemployment rate of the different major categories.
We grouped all 173 majors by major category, then summed the number of people in each category and calculated each category's average income and unemployment rate.
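As a rough illustration of this grouping step, the plain-Python sketch below (not the R code used for the report) sums the totals and averages the income and unemployment rate per category, then picks out the extremes. It uses three rows copied from the all-ages excerpt shown earlier. It assumes that "average income" means the unweighted mean of the per-major median salaries, and the names `stats`, `most_popular`, and so on are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (major category, total, median salary, unemployment rate)
# Values copied from the all-ages excerpt earlier in this report.
rows = [
    ("Agriculture & Natural Resources", 128148, 50000, 0.0261471),  # GENERAL AGRICULTURE
    ("Agriculture & Natural Resources", 24280, 62000, 0.0491884),   # FOOD SCIENCE
    ("Biology & Life Science", 106106, 52000, 0.0512898),           # ENVIRONMENTAL SCIENCE
]

# Group the per-major rows by category.
grouped = defaultdict(list)
for category, total, median, rate in rows:
    grouped[category].append((total, median, rate))

# Assumption: averages are unweighted means over the majors in a category.
stats = {
    category: {
        "total": sum(t for t, _, _ in values),
        "avg_income": mean([m for _, m, _ in values]),
        "avg_unemployment": mean([r for _, _, r in values]),
    }
    for category, values in grouped.items()
}

# Pick the extremes, as in the summary lines of this section.
most_popular = max(stats, key=lambda c: stats[c]["total"])
highest_income = max(stats, key=lambda c: stats[c]["avg_income"])
highest_unemployment = max(stats, key=lambda c: stats[c]["avg_unemployment"])

print(most_popular, highest_income, highest_unemployment)
```

The values computed here cover only the three rows used, so they will not match the full-dataset figures reported in this section.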
Major with the highest income : Engineering with an average income of $77758.6206897/year
Major with the lowest income : Interdisciplinary with an average income of $43000/year
Major with the highest unemployment rate : Arts with an average unemployment rate of 0.0876005
Major with the lowest unemployment rate : Agriculture & Natural Resources with an average unemployment rate of 0.0395692
We saw that Business is the most popular major category by number of people, and Interdisciplinary is the least popular. If students should choose a popular major, we would expect Business to have the highest income and the lowest unemployment rate; otherwise, there is little reason to choose a popular major, since it is hard to stand out among so many people.
However, the results show some return from choosing popular majors. In the first graph, which shows the income of all major categories, Business, the most popular category, does not have the highest average income, and Interdisciplinary is not necessarily the lowest. Still, we see a trend of higher incomes at the two ends and relatively lower incomes in the middle. One possible explanation is that companies pay more to graduates of minority majors because there are fewer of them to hire, and to graduates of majority majors because they are in large demand and necessary to the company. Therefore, people seeking higher incomes have two choices: a minority major or a popular major.
In the second graph, which shows the unemployment rate of all major categories, the rates follow a roughly normal distribution, except that Interdisciplinary has a relatively high unemployment rate. We therefore reach a conclusion similar to that for the income graph: graduates of minority majors and popular majors are more likely to find a stable job, for the same reasons as above.
Therefore, based on our interpretation, we suggest that students who have not yet decided on a major choose either a popular major or a minority major.
For this question, we compare the unemployment rate and median salary of recent grads with those of graduates of all ages.
To analyze the data, we selected the columns needed from the two datasets and renamed them where necessary. We then combined the two data frames, grouped the data by major category, and summarized the data for each category. This process produced the following data frame:
Major.Category | Recent.Grads.Unemployment.Rate | Recent.Grads.Median.Salary | Overall.Unemployment.Rate | Overall.Median.Salary |
---|---|---|---|---|
Agriculture & Natural Resources | 0.0563283 | 36900.00 | 0.0395692 | 55000.00 |
Arts | 0.0901727 | 33062.50 | 0.0876005 | 43525.00 |
Biology & Life Science | 0.0609178 | 36421.43 | 0.0499360 | 50821.43 |
Business | 0.0710635 | 43538.46 | 0.0544960 | 60615.38 |
Communications & Journalism | 0.0755378 | 34500.00 | 0.0691245 | 49500.00 |
Computers & Mathematics | 0.0842560 | 42745.45 | 0.0594370 | 66272.73 |
Education | 0.0517020 | 32350.00 | 0.0467620 | 43831.25 |
Engineering | 0.0633339 | 57382.76 | 0.0506300 | 77758.62 |
Health | 0.0659202 | 36825.00 | 0.0472093 | 56458.33 |
Humanities & Liberal Arts | 0.0810076 | 31913.33 | 0.0694287 | 46080.00 |
Industrial Arts & Consumer Services | 0.0480713 | 36342.86 | 0.0585457 | 52642.86 |
Interdisciplinary | 0.0708609 | 35000.00 | 0.0772690 | 43000.00 |
Law & Public Policy | 0.0908048 | 42200.00 | 0.0678536 | 52800.00 |
Physical Sciences | 0.0465111 | 41890.00 | 0.0545406 | 62400.00 |
Psychology & Social Work | 0.0720648 | 30100.00 | 0.0778670 | 44555.56 |
Social Science | 0.0957288 | 37344.44 | 0.0656857 | 53222.22 |
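
The select, rename, combine, group, and summarize steps that produced the table above can be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not the R code behind the report. The rows are copied from the two excerpts shown earlier, a prefix tag stands in for the column renaming, and the summary is assumed to be an unweighted mean per category.

```python
from collections import defaultdict
from statistics import mean

# Select: keep (major category, unemployment rate, median salary) from
# each dataset. Values copied from the two excerpts earlier in this report.
recent_grads = [
    ("Engineering", 0.0183805, 110000),  # PETROLEUM ENGINEERING
    ("Engineering", 0.0610977, 65000),   # CHEMICAL ENGINEERING
    ("Business", 0.0956522, 62000),      # ACTUARIAL SCIENCE
]
all_ages = [
    ("Agriculture & Natural Resources", 0.0261471, 50000),  # GENERAL AGRICULTURE
    ("Agriculture & Natural Resources", 0.0491884, 62000),  # FOOD SCIENCE
]

# Rename and combine: stack both datasets, tagging each row with the
# column prefix it should receive in the final table.
combined = [("Recent.Grads", *row) for row in recent_grads]
combined += [("Overall", *row) for row in all_ages]

# Group by major category and by source dataset.
grouped = defaultdict(lambda: defaultdict(list))
for prefix, category, rate, median in combined:
    grouped[category][prefix].append((rate, median))

# Summarize: one row per category, one pair of columns per source.
table = {}
for category in sorted(grouped):
    table[category] = {}
    for prefix, values in grouped[category].items():
        table[category][prefix + ".Unemployment.Rate"] = mean([r for r, _ in values])
        table[category][prefix + ".Median.Salary"] = mean([m for _, m in values])

for category, columns in table.items():
    print(category, columns)
```

The two excerpts happen to cover different categories, so each category in this sketch has columns from only one source. On the full datasets every category would have both pairs of columns, as in the table above.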
Other quantitative results we found are displayed below:
The average difference in median salary for all majors is: -28.97%.
The standard deviation is: 0.097239.
This plot displays the overall unemployment rate (dark green dots) and the unemployment rate for recent grads (yellow-green dots), along with their differences shown as a bar graph. From this visualization, we can conclude that in most major categories, recent graduates have a higher unemployment rate than the overall average.
This plot illustrates the difference in median salary for each major category. It shows that the median salary is lower for recent grads in every single major category.
This histogram is created using the percent difference in median salary for every single major. It portrays a symmetric, unimodal distribution with an average difference of -28.97% and a standard deviation of 0.097239. That is, on average, recent grads receive a median salary 28.97% lower than the overall level.
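The comparisons in this section can be checked against the category-level table with a short plain-Python sketch (again an illustration, not the report's R code). It assumes the percent difference is defined as (recent - overall) / overall. Note that the -28.97% average is computed over individual majors, so it cannot be reproduced from the category-level values used here.

```python
# (category, recent unemployment rate, recent median salary,
#  overall unemployment rate, overall median salary)
# Values copied from the category-level table in this section.
table = [
    ("Agriculture & Natural Resources", 0.0563283, 36900.00, 0.0395692, 55000.00),
    ("Arts", 0.0901727, 33062.50, 0.0876005, 43525.00),
    ("Biology & Life Science", 0.0609178, 36421.43, 0.0499360, 50821.43),
    ("Business", 0.0710635, 43538.46, 0.0544960, 60615.38),
    ("Communications & Journalism", 0.0755378, 34500.00, 0.0691245, 49500.00),
    ("Computers & Mathematics", 0.0842560, 42745.45, 0.0594370, 66272.73),
    ("Education", 0.0517020, 32350.00, 0.0467620, 43831.25),
    ("Engineering", 0.0633339, 57382.76, 0.0506300, 77758.62),
    ("Health", 0.0659202, 36825.00, 0.0472093, 56458.33),
    ("Humanities & Liberal Arts", 0.0810076, 31913.33, 0.0694287, 46080.00),
    ("Industrial Arts & Consumer Services", 0.0480713, 36342.86, 0.0585457, 52642.86),
    ("Interdisciplinary", 0.0708609, 35000.00, 0.0772690, 43000.00),
    ("Law & Public Policy", 0.0908048, 42200.00, 0.0678536, 52800.00),
    ("Physical Sciences", 0.0465111, 41890.00, 0.0545406, 62400.00),
    ("Psychology & Social Work", 0.0720648, 30100.00, 0.0778670, 44555.56),
    ("Social Science", 0.0957288, 37344.44, 0.0656857, 53222.22),
]

pct_diff = {}             # fractional salary difference per category
higher_unemployment = []  # categories where recent grads fare worse

for category, r_rate, r_median, o_rate, o_median in table:
    # Assumed definition: difference relative to the overall median.
    pct_diff[category] = (r_median - o_median) / o_median
    if r_rate > o_rate:
        higher_unemployment.append(category)

lower_salary = [c for c, d in pct_diff.items() if d < 0]

print(len(higher_unemployment), "of", len(table), "categories: higher recent unemployment")
print(len(lower_salary), "of", len(table), "categories: lower recent median salary")
```

On the table's values, recent graduates have the higher unemployment rate in 12 of the 16 categories and the lower median salary in all 16, which matches the two plots described above.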