Data Analysis: Correlation Between Students’ Interests and Departments

Fatimatuz Zahra
Mar 21, 2024


Project preview: the correlation between departments (Information Systems, Information Technology, Informatics) at the Faculty of Computer Science, University of Jember, and students’ domicile origin, favorite foods, hobbies, and gender.

Data Collecting

Survey data was collected through Google Forms from 83 respondents who met the criterion “UNEJ ComSci students from the 2020 to 2023 cohorts”. This cohort range was chosen so that respondents are, on the whole, still active students (as of March 2024).

Departments of UNEJ Computer Science students

For the full respondent file, click here.

Data Cleaning

The data was cleaned in Python using the csv module, pandas, and NumPy.

Import the CSV file, apply case folding (lowercasing every cell), and write the result to a new file.

import csv

def case_fold_csv(input_file, output_file):
    # Read every row, keeping the header as-is and lowercasing all other cells.
    with open(input_file, 'r', newline='', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)
        data = [[cell.lower() for cell in row] for row in reader]

    # Write the header and the case-folded rows to the new file.
    with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(header)
        writer.writerows(data)

case_fold_csv('korelasi_data_mahasiswa.csv', 'cf_korelasi_data_mahasiswa.csv')
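The same case-folding step could also be done directly in pandas, which would avoid the intermediate file. The sketch below is an alternative under stated assumptions, not the approach used in this project: the in-memory CSV text and the column name "Prodi" are invented for illustration.

```python
import io
import pandas as pd

# Invented sample standing in for the survey file; "Prodi" is a hypothetical column name.
raw = io.StringIO('Prodi,Makanan\nSistem Informasi,"Nasi Goreng, Ayam"\n')

# Read every column as text so the string methods apply to all of them.
df = pd.read_csv(raw, dtype=str)

# Lowercase each cell; the header row (column names) is left unchanged.
folded = df.apply(lambda col: col.str.lower())

print(folded)
```

Unlike the csv version, missing cells stay as missing values here instead of becoming empty strings.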

Load the case-folded CSV file into a DataFrame.

import pandas as pd

data_makanan = pd.read_csv("cf_korelasi_data_mahasiswa.csv", sep=",")
data_makanan.head(10)

Split cells that hold more than one comma-separated value into separate rows.

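df = pd.DataFrame(data_makanan)

def split_column(df, column_name, separator=','):
    # Split each cell on the separator, stack the pieces into one long Series,
    # and strip the spaces left behind after each comma.
    s = df[column_name].str.split(separator, expand=True).stack().str.strip()
    # Drop the inner index level so each piece keeps its original row label.
    s.index = s.index.droplevel(-1)
    s.name = column_name
    # Joining on the row label repeats the other columns once per piece.
    return df.drop(column_name, axis=1).join(s)

df_new = split_column(df, 'Makanan')

print(df_new)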
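To make the effect of this step concrete, here is a minimal sketch on two invented rows. It uses pandas' built-in `explode` rather than the stack-and-join approach above, so treat it as an equivalent illustration, not the project's code. The column name "Prodi" and the helper name `split_to_rows` are hypothetical.

```python
import pandas as pd

# Toy rows invented for illustration; they are not taken from the survey file.
toy = pd.DataFrame({
    "Prodi": ["sistem informasi", "informatika"],
    "Makanan": ["nasi goreng, mie ayam", "ayam"],
})

def split_to_rows(df, column_name, separator=","):
    """Return a copy with one row per separated value in column_name."""
    out = df.copy()
    # Turn each cell into a list of pieces, then give each piece its own row.
    out[column_name] = out[column_name].str.split(separator)
    out = out.explode(column_name, ignore_index=True)
    # Remove the spaces that followed each separator.
    out[column_name] = out[column_name].str.strip()
    return out

toy_long = split_to_rows(toy, "Makanan")
print(toy_long)
```

The first respondent, who listed two foods, now occupies two rows, so each food can be counted on its own.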

Save the result to a new CSV file.

output_csv_path = "makanan_cf_korelasi.csv"
df_new.to_csv(output_csv_path, index=False)

Data Visualization

The cleaned data was visualized in Tableau.

Tableau Application Projection in Data Visualization
Departments and Domicile Origin
Departments and Favorite Foods
Departments and Hobbies
Departments and Gender

Result: Data Correlation

Correlation between students’ departments and domicile origin: “Jember” is the most common domicile across the three study programs. Out of 84 samples, 33 ComSci students come from Jember, which suggests that more than one third of ComSci students were influenced by region when choosing a university. This is also supported by the 7 students from “Banyuwangi”, the second most common domicile origin. Geographically, Banyuwangi is one of the cities close to Jember.
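The shares behind this paragraph can be checked with a few lines of arithmetic. This is a small sketch that uses only the counts quoted above (33 and 7 out of 84); the variable names are illustrative.

```python
# Counts quoted in the paragraph above.
total_samples = 84
from_jember = 33
from_banyuwangi = 7

share_jember = from_jember / total_samples
share_banyuwangi = from_banyuwangi / total_samples
share_both = (from_jember + from_banyuwangi) / total_samples

print(f"Jember: {share_jember:.1%}")          # Jember: 39.3%
print(f"Banyuwangi: {share_banyuwangi:.1%}")  # Banyuwangi: 8.3%
print(f"Combined: {share_both:.1%}")          # Combined: 47.6%
```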

Correlation between students’ departments and favorite foods: “fried rice” with 27 respondents, “chicken” with 20 respondents, and “chicken noodle” with 16 respondents are the top three favorite foods across all departments. All three are easy to find around the University of Jember, which may explain why many students, ComSci students in particular, name them as favorites.

Correlation between students’ departments and hobbies: the most common hobby is “reading” with a total of 18 samples, all of which are from the Information Systems department. The assumption is that hobbies are related to the choice of department within ComSci, because Information Systems is the only one of the three departments that combines technology with business.

Correlation between students’ departments and gender: UNEJ ComSci students are predominantly women, with 48 out of 82 samples. Information Systems has the most female students of the three departments, with 37, while the other two departments are dominated by men. This points to a shift in interest in computer science, a field that was originally male-dominated.
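Counts like the ones above were read from the Tableau charts, but a similar table could be produced in pandas with a cross-tabulation. The sketch below runs on made-up rows, and the column names "department" and "gender" are assumptions, since the survey file's actual headers are not shown here.

```python
import pandas as pd

# Made-up rows for illustration only; these are not the survey responses.
toy = pd.DataFrame({
    "department": ["information systems", "information systems", "informatics",
                   "information technology", "informatics", "information systems"],
    "gender": ["female", "female", "male", "male", "female", "male"],
})

# Number of respondents in each department/gender combination.
counts = pd.crosstab(toy["department"], toy["gender"])

# Same table as row-wise shares, so departments of different sizes are comparable.
shares = pd.crosstab(toy["department"], toy["gender"], normalize="index")

print(counts)
print(shares.round(2))
```

The same call works for domicile, favorite food, or hobby by swapping the second column. Comparing row-wise shares instead of raw counts matters when one department has far more respondents than the others.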
