BUS5VA Visual Analytics
Semester 2 2022
Assignment 2: Data visualisation using Tableau
Due: Sunday 25th September 2022 @ 11:59pm
INSTRUCTIONS
For Assignment 2 you are required to submit the Tableau packaged workbooks (save your workbooks as twbx files) and the pdf files printed from Tableau:
o You must submit one twbx file for each task (two files for the two tasks in this assignment). The twbx files must be uploaded to LMS via the submission link. All twbx files must be self-contained and must open in Tableau without any errors (no separate data files should be required to open the twbx file).
o You must submit one pdf file for each task, printed from Tableau using File > Print to PDF (Entire workbook). The pdf files must be uploaded to LMS via the submission link.
o You must submit 4 files in total: 2 twbx files and 2 pdf files.
o You must write a short description for each individual visualisation (worksheet), using a caption below each visualisation, to explain “what” you are trying to show and “how” you have chosen to show it.
Example (not related to the datasets in this assignment): This visualisation shows the change in unemployment rate over time. I have used a line chart to show the change over time. Colour has been used to distinguish between different cities. The city of New York has been highlighted by increasing the thickness of its line.
There are two (2) tasks in this assignment; each is worth 20 marks. Marks will be allocated for:
o Visualisations/charts that are displayed properly without any errors.
o Visualisation descriptions that are clear and concise, and that refer to the relevant visual variables used (e.g. colour, size, shape, etc.) and visualisation techniques (e.g. informative titles, highlighting, annotating, faceting, etc.).
o Visualisations that effectively convey information using appropriate techniques.
o You can generate more than one visualisation for each subtask if you think they are needed to fully address the subtask/question.
This assignment forms 40% of your total assessment for the subject.
The datasets needed for this assignment are located on the LMS in CSV format (in the Assignment 2 Datasets folder). For the second task, you need to unzip the files on LMS to get the csv files.
IMPORTANT INFORMATION
The university's standard plagiarism and collusion policy and its extension and special consideration policy apply to this assignment.
A cover sheet is NOT required. By submitting your work online, the declaration on the university’s
assignment cover sheet is implied and agreed to by you.
TASK 1 [20 marks]: Why do employees leave?
Employees are the backbone of any organization. An organization's performance depends heavily on the quality of its employees and on retaining them. Employee attrition presents organizations with a number of challenges:
o Training new employees is expensive in both money and time
o Loss of experienced employees
o Impact on productivity
o Impact on profit
The employee attrition dataset used for this task and its detailed descriptions can be found at:
https://github.com/IBM/employee-attrition-aif360
https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset
The dataset (emp_attrition.csv) used in this task contains HR analytics data on employees who stay and employees who leave. The data include metrics such as education level, job satisfaction, and commute distance. The goal is to uncover the factors that lead to employee attrition.
Use visualisations to address the questions below:
1.1 Is there a correlation between monthly income and years at the company? (5 marks)
1.2 Is there a difference in attrition between employees with different travel patterns (the Business Travel field)? (5 marks)
1.3 What is the distribution of total working years across different education fields? (5 marks)
1.4 What factors lead to employee attrition? Focus on four columns: Department, OverTime, JobLevel, and JobSatisfaction. (5 marks)
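Before building the charts for questions 1.1 and 1.2 in Tableau, it can help to check the numbers outside the tool. The following is a minimal sketch in Python, not part of the assignment requirements. It assumes the column names MonthlyIncome, YearsAtCompany, BusinessTravel and Attrition from the public Kaggle version of the dataset, and it runs on a handful of made-up rows rather than on emp_attrition.csv itself.

```python
import csv
import io
from collections import defaultdict
from math import sqrt

# Made-up rows for illustration only; the real data is in emp_attrition.csv.
# Column names are assumed to match the public Kaggle version of the dataset.
SAMPLE = """MonthlyIncome,YearsAtCompany,BusinessTravel,Attrition
2000,1,Travel_Frequently,Yes
3000,2,Travel_Rarely,No
4000,3,Travel_Rarely,No
5000,4,Non-Travel,No
6000,5,Travel_Frequently,Yes
7000,6,Travel_Rarely,Yes
"""


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def attrition_rate_by(rows, field):
    """Share of employees with Attrition == 'Yes' for each value of `field`."""
    leavers = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        totals[row[field]] += 1
        if row["Attrition"] == "Yes":
            leavers[row[field]] += 1
    return {key: leavers[key] / totals[key] for key in totals}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
income = [float(r["MonthlyIncome"]) for r in rows]
years = [float(r["YearsAtCompany"]) for r in rows]

r_value = pearson(income, years)                 # question 1.1
rates = attrition_rate_by(rows, "BusinessTravel")  # question 1.2

print(round(r_value, 3))
print(rates)
```

To run the same check on the real file, the `io.StringIO(SAMPLE)` argument could be replaced with an open file handle for emp_attrition.csv, provided the column names match the assumption above.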
TASK 2 [20 marks]: Sport Analytics
The soccer match event dataset is the largest open collection of soccer logs ever released to the public. It was collected by Wyscout (https://wyscout.com/) and contains all the spatio-temporal events (passes, shots, fouls, etc.) that occur during all matches of an entire season of seven competitions (La Liga, Serie A, Bundesliga, Premier League, Ligue 1, FIFA World Cup 2018, UEFA Euro Cup 2016). A match event contains information about its position, time, outcome, player and characteristics.
The two datasets used in this task are:
o Player dataset (players.csv): describes all players of the teams playing in the seven national and international soccer competitions (Italian, Spanish, French, German and English first divisions, World Cup 2018, European Cup 2016). See this link for a detailed description of each attribute.
o Event dataset (events_Spain.csv): describes all the events that occur during each match. Each event refers to a ball touch and contains information such as the players involved, event names, and event times. See this link for a detailed description of each attribute.
Create interactive visualisations to analyse this large dataset:
2.1 Create visualisation(s) to rank players based on the number of shots they made. Highlight the top 5 players with the most shots. Also allow users to see each player’s attributes, such as weight, height, full name, and nationality, when they hover their mouse over a player of interest. (5 marks)
2.2 Based on the events in the dataset, compare the performance (number of shots, number of passes, number of free kicks, number of fouls, number of offsides) of Lionel Messi (wyId: 3359) and Cristiano Ronaldo (wyId: 3322), two of the greatest players of the last decade. (5 marks)
2.3 Create a Tableau dashboard that (1) allows users to rank players based on a selected event (focusing on the 5 events: Pass, Shot, Free Kick, Foul, Offside), and (2) compares the number of passes and the number of shots recorded across all positions (‘Midfielder’, ‘Forward’, ‘Defender’, ‘Goalkeeper’). (10 marks)
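The ranking logic behind 2.1 and the first part of 2.3 (count one selected event per player, sort, then mark the top N) can be sketched in Python as a way to verify what the Tableau view should show. This is only an illustrative sketch: the field names playerId and eventName are assumptions about the CSV layout, the event records are made up, and the player ids 101 and 102 are hypothetical (3359 and 3322 are the wyId values quoted in 2.2).

```python
from collections import Counter

# Hypothetical event records; in events_Spain.csv each row is one event.
# The field names playerId and eventName are assumptions, not confirmed.
events = [
    {"playerId": 3359, "eventName": "Shot"},
    {"playerId": 3359, "eventName": "Shot"},
    {"playerId": 3359, "eventName": "Shot"},
    {"playerId": 3322, "eventName": "Shot"},
    {"playerId": 3322, "eventName": "Shot"},
    {"playerId": 101, "eventName": "Shot"},
    {"playerId": 101, "eventName": "Pass"},
    {"playerId": 102, "eventName": "Pass"},
    {"playerId": 102, "eventName": "Pass"},
]


def rank_players(events, event_name):
    """Return (playerId, count) pairs for one event type, highest count first."""
    counts = Counter(
        e["playerId"] for e in events if e["eventName"] == event_name
    )
    # Sort by count descending, then by player id so ties have a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def top_n(ranking, n=5):
    """Set of player ids to highlight: the first n entries of a ranking."""
    return {player for player, _ in ranking[:n]}


shot_ranking = rank_players(events, "Shot")
pass_ranking = rank_players(events, "Pass")
highlighted = top_n(shot_ranking, n=2)

print(shot_ranking)
print(highlighted)
```

Passing the event name as a parameter mirrors the "selected event" requirement in 2.3. In Tableau, the same idea is usually expressed with a parameter or filter on the event name, a sort on the count, and a Top N set or rank calculation for the highlighting.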
Marking Rubrics

Grade bands: A (80-100%), B (70-79%), C (60-69%), Pass (50-59%)
Assessment criterion: Ability to describe elements of data visualisation
o A (80-100%): All visualisations are described clearly and concisely. Descriptions include information about all relevant visual elements.
o B (70-79%): Some visualisations are described clearly and concisely. Descriptions include some information about the relevant visual elements.
o C (60-69%): Limited ability to describe visualisations. Limited description of visual elements.
o Pass (50-59%): The visualisations have a description. Descriptions briefly refer to visual elements.
Assessment criterion: Ability to create data visualisations in Tableau
o A (80-100%): All visualisations (and all interactivity) are created appropriately and without error. Correct data is selected for visualisations.
o B (70-79%): Some visualisations (and some interactivity) created without error. Correct data is selected for visualisations.
o C (60-69%): Limited ability to create visualisations (or interactivity). Wrong data is selected for visualisations.
o Pass (50-59%): Able to identify data visualisation tools. Able to identify data importing tools.
Assessment criterion: Correct visualisation theory applied to charts and ability to convey information through data visualisation
o A (80-100%): All visualisations are informative, efficient and beautiful. All visualisations clearly convey the appropriate message.
o B (70-79%): Most visualisations are informative, efficient and beautiful. Some visualisations convey the appropriate message.
o C (60-69%): Some charts are informative, efficient and beautiful. Limited ability to convey messages through data visualisation.
o Pass (50-59%): Able to identify appropriate chart types. Able to identify appropriate messages.
