The dataset we analyze covers college majors. It is drawn from the American Community Survey 2010-2012 Public Use Microdata Series and describes the employment rates and salaries of a wide variety of majors in the US. Alongside it, we also analyze the gender distribution of different majors. With these datasets, we can break down the incomes, popularity, and many other relevant properties of different college majors. Furthermore, this analysis can help colleges and academic supervisors understand which majors are best suited to a particular student.

As college students, we are aware that many students are still in the process of choosing a major, which could be one of the biggest turning points of their lives. Many of them fear choosing the wrong major and regretting the decision later in life. This analysis provides information to help students decide on a major and understand the potential of their future careers.

Throughout this analysis, we use data from the American Community Survey 2010-2012 Public Use Microdata Series, which contains survey data (employment status, gender, salary, major) on recent graduates and graduates of all ages. Further documentation regarding this data is available here.

Recent graduates (first 10 rows):

| Rank | Major_code | Major | Total | Men | Women | Major_category | ShareWomen | Sample_size | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2419 | PETROLEUM ENGINEERING | 2339 | 2057 | 282 | Engineering | 0.1205643 | 36 | 1976 | 1849 | 270 | 1207 | 37 | 0.0183805 | 110000 | 95000 | 125000 | 1534 | 364 | 193 |
| 2 | 2416 | MINING AND MINERAL ENGINEERING | 756 | 679 | 77 | Engineering | 0.1018519 | 7 | 640 | 556 | 170 | 388 | 85 | 0.1172414 | 75000 | 55000 | 90000 | 350 | 257 | 50 |
| 3 | 2415 | METALLURGICAL ENGINEERING | 856 | 725 | 131 | Engineering | 0.1530374 | 3 | 648 | 558 | 133 | 340 | 16 | 0.0240964 | 73000 | 50000 | 105000 | 456 | 176 | 0 |
| 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | 1258 | 1123 | 135 | Engineering | 0.1073132 | 16 | 758 | 1069 | 150 | 692 | 40 | 0.0501253 | 70000 | 43000 | 80000 | 529 | 102 | 0 |
| 5 | 2405 | CHEMICAL ENGINEERING | 32260 | 21239 | 11021 | Engineering | 0.3416305 | 289 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.0610977 | 65000 | 50000 | 75000 | 18314 | 4440 | 972 |
| 6 | 2418 | NUCLEAR ENGINEERING | 2573 | 2200 | 373 | Engineering | 0.1449670 | 17 | 1857 | 2038 | 264 | 1449 | 400 | 0.1772264 | 65000 | 50000 | 102000 | 1142 | 657 | 244 |
| 7 | 6202 | ACTUARIAL SCIENCE | 3777 | 2110 | 1667 | Business | 0.4413556 | 51 | 2912 | 2924 | 296 | 2482 | 308 | 0.0956522 | 62000 | 53000 | 72000 | 1768 | 314 | 259 |
| 8 | 5001 | ASTRONOMY AND ASTROPHYSICS | 1792 | 832 | 960 | Physical Sciences | 0.5357143 | 10 | 1526 | 1085 | 553 | 827 | 33 | 0.0211674 | 62000 | 31500 | 109000 | 972 | 500 | 220 |
| 9 | 2414 | MECHANICAL ENGINEERING | 91227 | 80320 | 10907 | Engineering | 0.1195589 | 1029 | 76442 | 71298 | 13101 | 54639 | 4650 | 0.0573423 | 60000 | 48000 | 70000 | 52844 | 16384 | 3253 |
| 10 | 2408 | ELECTRICAL ENGINEERING | 81527 | 65511 | 16016 | Engineering | 0.1964503 | 631 | 61928 | 55450 | 12695 | 41413 | 3895 | 0.0591738 | 60000 | 45000 | 72000 | 45829 | 10874 | 3170 |

All ages (first 10 rows):

| Major_code | Major | Major_category | Total | Employed | Employed_full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th |
|---|---|---|---|---|---|---|---|---|---|---|
| 1100 | GENERAL AGRICULTURE | Agriculture & Natural Resources | 128148 | 90245 | 74078 | 2423 | 0.0261471 | 50000 | 34000 | 80000 |
| 1101 | AGRICULTURE PRODUCTION AND MANAGEMENT | Agriculture & Natural Resources | 95326 | 76865 | 64240 | 2266 | 0.0286361 | 54000 | 36000 | 80000 |
| 1102 | AGRICULTURAL ECONOMICS | Agriculture & Natural Resources | 33955 | 26321 | 22810 | 821 | 0.0302483 | 63000 | 40000 | 98000 |
| 1103 | ANIMAL SCIENCES | Agriculture & Natural Resources | 103549 | 81177 | 64937 | 3619 | 0.0426789 | 46000 | 30000 | 72000 |
| 1104 | FOOD SCIENCE | Agriculture & Natural Resources | 24280 | 17281 | 12722 | 894 | 0.0491884 | 62000 | 38500 | 90000 |
| 1105 | PLANT SCIENCE AND AGRONOMY | Agriculture & Natural Resources | 79409 | 63043 | 51077 | 2070 | 0.0317909 | 50000 | 35000 | 75000 |
| 1106 | SOIL SCIENCE | Agriculture & Natural Resources | 6586 | 4926 | 4042 | 264 | 0.0508671 | 63000 | 39400 | 88000 |
| 1199 | MISCELLANEOUS AGRICULTURE | Agriculture & Natural Resources | 8549 | 6392 | 5074 | 261 | 0.0392304 | 52000 | 35000 | 75000 |
| 1301 | ENVIRONMENTAL SCIENCE | Biology & Life Science | 106106 | 87602 | 65238 | 4736 | 0.0512898 | 52000 | 38000 | 75000 |
| 1302 | FORESTRY | Agriculture & Natural Resources | 69447 | 48228 | 39613 | 2144 | 0.0425633 | 58000 | 40500 | 80000 |

Overview of data:

Most popular major category for all ages: Business, with 9858741 people

Least popular major category for all ages: Interdisciplinary, with 45199 people

Most popular major category for recent graduates: Business, with 1302376 people

Least popular major category for recent graduates: Interdisciplinary, with 12296 people

The average median salary for recent grads is 40151.4450867.

The standard deviation of the median salary for recent grads is 11470.1818021.
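
The two summary statistics above are a plain mean and standard deviation taken over the per-major median salaries. As a minimal sketch in Python (the report itself relies on R with dplyr), the calculation below uses only the ten recent-graduate rows shown in the sample table, so its results will differ from the full-dataset figures. Using the sample standard deviation is an assumption, since the report does not say which form was used.

```python
from statistics import mean, stdev

# Median salaries of the ten recent-graduate majors listed in the sample table.
sample_medians = [110000, 75000, 73000, 70000, 65000,
                  65000, 62000, 62000, 60000, 60000]

avg_median = mean(sample_medians)
# Assumption: sample standard deviation (n - 1 denominator), as R's sd() uses.
sd_median = stdev(sample_medians)

print(f"Average median salary: {avg_median:.2f}")
print(f"Standard deviation:    {sd_median:.2f}")
```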

Outlier analysis is not a good fit for these two datasets because all three graphs show trends in the mean values for each major category. Outliers may exist within each major category, but they still belong to that category, so they do not affect the trends or the conclusions of the analysis.

Does gender affect major decisions?

In this section, we analyze whether gender is correlated with choice of major and, if it is, whether women opt for lower-paying majors than men. Throughout this question, we use the dataset on recent graduates from a wide variety of universities.

Next, we compare the ratio of women among recent graduates across major categories, that is, the category each major falls in (e.g. Computers & Mathematics). To obtain this data, we calculated the percentage of women in each major by dividing the number of women by the total for that major and multiplying by one hundred. We then grouped the data by major category with dplyr’s group_by() and used dplyr’s summarize() to compute the means of the male ratio, female ratio, median salary, 25th percentile, and 75th percentile. We also plotted the mean median salary, 25th percentile, and 75th percentile for each major category to identify which kinds of majors correspond to higher salaries.
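
The percentage and group-then-summarize steps described above can be sketched in Python as follows. This is an illustration only, not the report's dplyr code. The helper `percent_women`, the output key names, and the choice of four sample rows are assumptions made for the example. The male ratio is taken as 100 minus the female percentage, which holds when Total equals Men plus Women, as it does in the sample rows.

```python
from collections import defaultdict
from statistics import mean

# Four rows copied from the recent-graduates sample table:
# (major, major_category, total, women, median_salary)
rows = [
    ("PETROLEUM ENGINEERING", "Engineering", 2339, 282, 110000),
    ("CHEMICAL ENGINEERING", "Engineering", 32260, 11021, 65000),
    ("ACTUARIAL SCIENCE", "Business", 3777, 1667, 62000),
    ("ASTRONOMY AND ASTROPHYSICS", "Physical Sciences", 1792, 960, 62000),
]


def percent_women(women, total):
    """Share of women in a major, as a percentage (hypothetical helper)."""
    return women / total * 100


# Group rows by major category (the role group_by() plays in dplyr).
by_category = defaultdict(list)
for major, category, total, women, median in rows:
    by_category[category].append((percent_women(women, total), median))

# Summarize each group with means (the role summarize() plays in dplyr).
summary = {
    category: {
        "mean_pct_women": mean(p for p, _ in values),
        "mean_pct_men": mean(100 - p for p, _ in values),
        "mean_median_salary": mean(m for _, m in values),
    }
    for category, values in by_category.items()
}

for category, stats in summary.items():
    print(category, stats)
```

The 25th and 75th percentile columns would be averaged the same way by adding them to each tuple.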

Then, we present a scatterplot of median salary against the ratios of men and women to examine whether women opt for lower-paying majors, using tidyr’s gather() to plot both the men and women ratios in the same scatterplot.
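
Plotting both ratios in one scatterplot requires reshaping the data from wide form (one column per gender) to long form (one row per major and gender). A rough Python sketch of that reshaping is shown below. The function `gather` is a hypothetical stand-in written for this example, not the R function, and the column names `gender` and `ratio` are likewise assumptions.

```python
# Wide rows: one record per major with both ratios side by side.
# Ratios are Men / Total and Women / Total from the sample table.
wide = [
    {"major": "PETROLEUM ENGINEERING", "median": 110000,
     "men": 2057 / 2339, "women": 282 / 2339},
    {"major": "ACTUARIAL SCIENCE", "median": 62000,
     "men": 2110 / 3777, "women": 1667 / 3777},
]


def gather(records, key_columns, key_name="gender", value_name="ratio"):
    """Turn the listed columns into (key, value) rows, keeping the others."""
    long_rows = []
    for record in records:
        kept = {k: v for k, v in record.items() if k not in key_columns}
        for column in key_columns:
            long_rows.append(
                {**kept, key_name: column, value_name: record[column]}
            )
    return long_rows


long_rows = gather(wide, ["men", "women"])
for row in long_rows:
    print(row)
```

Each long row then maps directly to one point in the scatterplot: `ratio` on one axis, `median` on the other, and `gender` as the color.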

Throughout the analysis, we found one outlier in the plot of major median salary against gender ratio: PETROLEUM ENGINEERING, which has a median salary of 110000.

The results of our analysis indicate that gender is indeed a factor in choice of major. Furthermore, we discovered that more women than men opt for lower-paying majors. We found that Engineering, Business, and Computers & Mathematics are the highest-paying major categories, yet they have the lowest ratios of women.

Do recent graduates have higher or lower salaries and employment rates compared to the average?

For this question, we compare the unemployment rate and median salary of recent grads with those of graduates of all ages.

To analyze the data, we selected the columns needed from the two datasets and renamed them where necessary. We then combined the two data frames, grouped the data by major category, and summarized the data for each category. This process produced the following data frame:

| Major.Category | Recent.Grads.Unemployment.Rate | Recent.Grads.Median.Salary | Overall.Unemployment.Rate | Overall.Median.Salary |
|---|---|---|---|---|
| Agriculture & Natural Resources | 0.0563283 | 36900.00 | 0.0395692 | 55000.00 |
| Arts | 0.0901727 | 33062.50 | 0.0876005 | 43525.00 |
| Biology & Life Science | 0.0609178 | 36421.43 | 0.0499360 | 50821.43 |
| Business | 0.0710635 | 43538.46 | 0.0544960 | 60615.38 |
| Communications & Journalism | 0.0755378 | 34500.00 | 0.0691245 | 49500.00 |
| Computers & Mathematics | 0.0842560 | 42745.45 | 0.0594370 | 66272.73 |
| Education | 0.0517020 | 32350.00 | 0.0467620 | 43831.25 |
| Engineering | 0.0633339 | 57382.76 | 0.0506300 | 77758.62 |
| Health | 0.0659202 | 36825.00 | 0.0472093 | 56458.33 |
| Humanities & Liberal Arts | 0.0810076 | 31913.33 | 0.0694287 | 46080.00 |
| Industrial Arts & Consumer Services | 0.0480713 | 36342.86 | 0.0585457 | 52642.86 |
| Interdisciplinary | 0.0708609 | 35000.00 | 0.0772690 | 43000.00 |
| Law & Public Policy | 0.0908048 | 42200.00 | 0.0678536 | 52800.00 |
| Physical Sciences | 0.0465111 | 41890.00 | 0.0545406 | 62400.00 |
| Psychology & Social Work | 0.0720648 | 30100.00 | 0.0778670 | 44555.56 |
| Social Science | 0.0957288 | 37344.44 | 0.0656857 | 53222.22 |
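
The select, combine, group, and summarize steps that produced this data frame can be approximated with the Python sketch below. It is not the report's R code: it summarizes each dataset by category first and then joins the two summaries, and the helper `summarize_by_category` is a hypothetical name. The input rows are taken from the two ten-row excerpts, which happen to share no categories, so each output row has one side missing (`None`); on the full datasets both sides would be filled.

```python
from collections import defaultdict
from statistics import mean

# (major_category, unemployment_rate, median) rows from the sample excerpts.
recent_grads = [
    ("Engineering", 0.0183805, 110000),  # Petroleum Engineering
    ("Engineering", 0.0610977, 65000),   # Chemical Engineering
    ("Business", 0.0956522, 62000),      # Actuarial Science
]
all_ages = [
    ("Agriculture & Natural Resources", 0.0261471, 50000),  # General Agriculture
    ("Agriculture & Natural Resources", 0.0508671, 63000),  # Soil Science
    ("Biology & Life Science", 0.0512898, 52000),           # Environmental Science
]


def summarize_by_category(rows):
    """Mean unemployment rate and mean median salary per major category."""
    groups = defaultdict(list)
    for category, rate, median in rows:
        groups[category].append((rate, median))
    return {
        category: (mean(r for r, _ in vals), mean(m for _, m in vals))
        for category, vals in groups.items()
    }


recent_summary = summarize_by_category(recent_grads)
overall_summary = summarize_by_category(all_ages)

# Join on category; a category absent from one dataset gets (None, None).
missing = (None, None)
combined = {
    category: recent_summary.get(category, missing)
    + overall_summary.get(category, missing)
    for category in sorted(set(recent_summary) | set(overall_summary))
}

# Columns: recent rate, recent median, overall rate, overall median.
for category, values in combined.items():
    print(category, values)
```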

Other quantitative results we found are displayed below:

The average difference in median salary across all majors is -28.97%.

The standard deviation is 0.097239 (about 9.72 percentage points when expressed on the same scale as the average).

This plot displays the overall unemployment rate (dark green dots) and the unemployment rate for recent grads (yellow-green dots), along with their differences shown as a bar graph. From this visualization, we can conclude that in most major categories (12 of the 16 in the summary table), recent graduates have a higher unemployment rate than the overall average.

This plot illustrates the difference in median salary for each major category: the median salary is lower for recent grads in every single major category.

This histogram is created using the percent difference in median salary for every single major. It portrays a symmetric, unimodal distribution with an average difference of -28.97% and a standard deviation of 0.097239. That is, on average, recent grads receive a median salary 28.97% lower than the overall level.
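
The report does not state the formula behind the percent difference; a natural reading, assumed here, is (recent - overall) / overall. The Python sketch below applies that formula to the category-level salaries from the summary table rather than to individual majors, so its mean and standard deviation will not reproduce the per-major figures quoted above. The helper name `relative_difference` is hypothetical.

```python
from statistics import mean, stdev

# (category, recent-grad median salary, overall median salary),
# copied from the category summary table.
salaries = [
    ("Agriculture & Natural Resources", 36900.00, 55000.00),
    ("Arts", 33062.50, 43525.00),
    ("Biology & Life Science", 36421.43, 50821.43),
    ("Business", 43538.46, 60615.38),
    ("Communications & Journalism", 34500.00, 49500.00),
    ("Computers & Mathematics", 42745.45, 66272.73),
    ("Education", 32350.00, 43831.25),
    ("Engineering", 57382.76, 77758.62),
    ("Health", 36825.00, 56458.33),
    ("Humanities & Liberal Arts", 31913.33, 46080.00),
    ("Industrial Arts & Consumer Services", 36342.86, 52642.86),
    ("Interdisciplinary", 35000.00, 43000.00),
    ("Law & Public Policy", 42200.00, 52800.00),
    ("Physical Sciences", 41890.00, 62400.00),
    ("Psychology & Social Work", 30100.00, 44555.56),
    ("Social Science", 37344.44, 53222.22),
]


def relative_difference(recent, overall):
    """Assumed formula: difference from overall, as a fraction of overall."""
    return (recent - overall) / overall


diffs = {cat: relative_difference(r, o) for cat, r, o in salaries}
values = list(diffs.values())

avg_diff = mean(values)
sd_diff = stdev(values)  # sample standard deviation, as a fraction

print(f"Average difference: {avg_diff:.2%}")
print(f"Standard deviation: {sd_diff:.6f}")
print("Lower in every category:", all(d < 0 for d in values))
```

Keeping the differences as fractions and formatting only the mean as a percentage mirrors how the two figures are reported in the text.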