Employee Data Analysis Project in SQL

In this SQL project, we will analyze employees data of various departments in acompany. The data are spread in 4 different tables whose schema are given below:

t_employees table1
emp_no birth_date first_name last_name gender hire_date
10008 1958-02-19 Sania Kalloufi M 1994-09-15
10011 1953-11-07 Mary Sluis F 1990-01-22
---- ---- ---- ---- ---- ----
t_departments table2
dept_no dept_name
d009 Customer Service
d005 Development
---- ----
t_dept_emp table3
emp_no dept_no from_date to_date
10001 d005 1986-06-26 9999-01-01
10002 d007 1996-08-03 9999-01-01
---- ---- ---- ----
t_salaries table4
emp_no Salary from_date to_date
10001 59461 1990-06-25 1991-06-25
10001 63546 1991-06-25 1992-06-24
---- ---- ---- ----

Q1.) Show the data about all the employees containing emp_no, present age, first_name, last_name gender and hire_year.

Here all the asked data is present in t_employees table. But we have asked hiring year and age. So we have to extract year from hiring_date and to get the age, we have to extract year from birth_date and have to subtract it from the current year (2023).

Given below is the SQL query and the resulted data:

SELECT emp_no, 2023 - YEAR(birth_date) as age,
first_name, last_name, gender, YEAR(hire_date) as hire_year
from t_employees;

Query Output:

emp_no age first_name last_name gender hire_year
10008 65 Saniya Kalloufi M 1994
10011 70 Mary Sluis F 1990
10012 63 Patricio Bridgland M 1992
---- ---- ---- ---- ---- ----

Q2.) Show Top 10 data of employee who is taking highest Salary, provided the data should contain emp_no, first_name, last_name, Gender and Salary.

Here we can see that the data we have asked is present in two different tables table1 and table4. So, we have to perform the JOIN in order to get the required data as combined. Also we have to perform order by command in salary to arrange the data in descending order.

Given below is the SQL query and the resulted data:

SELECT e.emp_no, e.first_name, e.last_name, e.gender, s.Salary
from t_employees e
JOIN t_salaries s ON
e.emp_no = s.emp_no order by Salary DESC LIMIT 10;

Query Output:

>
emp_nofirst_name last_name gender Salary
41822 Lihong Brookner M 143600
53197 Khalid Narahari M 142879
39964 Almudena Dratva M 140138
---- ---- ---- ---- ----

Q3.) Give the number of Male and number of Female employees joined after 1995 in all departments.

Here we can see that the data we have asked is present in single table ie. table1. Here we have to perform group by clause on gender column and have to count all the emp_no. Also here we have used where clause to filter ther result after 1995.

Given below is the SQL query and the resulted data:

select gender, count(emp_no) as total
from t_employees
where YEAR(hire_date) > 1995
group by gender;

Query Output:

>
gendertotal
M 13259
F 8706

Q4.) Compare the average salary of female versus male employees in the entire company until year 2002

Here we can see that the data we need is present in the table1, table2 and table4. But we cannot perform JOIN on table2 as there is no common column in between table1/table4 and table2. So we used table3 to connect table1 and table2 as there is a common column between table2 and table3.

Given below is the SQL query and the resulted data:

SELECT
e.gender,
d.dept_name,
ROUND(AVG(s.salary), 2) AS salary,
YEAR(s.from_date) AS calendar_year
FROM
t_salaries s
JOIN
t_employees e ON s.emp_no = e.emp_no
JOIN
t_dept_emp de ON de.emp_no = e.emp_no
JOIN
t_departments d ON d.dept_no = de.dept_no
GROUP BY d.dept_no , e.gender , calendar_year
HAVING calendar_year <= 2002
ORDER BY d.dept_no;

Query Output:

>
genderdept_name salary calendar_year
M Marketing 58895.85 1990
M Marketing 59232.75 1991
---- ---- ---- ----
M Marketing 77525.24 2002
F Marketing 57358.31 1990
F Marketing 57670.20 1991
---- ---- ---- ----