29 Exercise: Data visualisation

We have covered a lot of visualisations in the lab today.

For this exercise you should:

Identify a trend in your data. For instance, higher poll scores for a presidential candidate in a particular state.
Create a visualisation which communicates that trend well. Consider why it is a good visualisation. Could someone else misunderstand the visualisation?
Create another visualisation. The goal of the second visualisation is to be intentionally misleading.

Please work on this in the same groups as the exercises in the previous weeks. The sections below are there to guide you. Feel free to disregard them if you want to work another way.

You can use any dataset from this or previous weeks. Also, feel free to explore new datasets from FiveThirtyEight, the scikit-learn (sklearn) datasets or the UCI machine learning datasets.

You may find the following helpful:

Altair examples. In general, the Altair documentation can be tricky to understand. I tend to follow the examples or search around for solutions.
Seaborn examples.

29.1 Setup

Load any libraries and datasets needed for your visualisation.
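If you are unsure where to start, here is a minimal sketch of a setup cell. It assumes you pick the iris data that ships with scikit-learn; that choice, and the `species` column added for readability, are only for illustration and you should swap in whichever dataset your group is using.

```python
from sklearn.datasets import load_iris

# Load the iris data as a pandas DataFrame (four measurements plus a numeric `target`).
iris = load_iris(as_frame=True)
df = iris.frame

# Add a readable species label alongside the numeric target (illustrative extra column).
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

print(df.shape)
print(df.head())
```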

29.2 Data wrangling

Subset or clean your data.
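As a rough sketch of what this step might look like with pandas, the example below drops rows with missing values and then keeps one candidate in one state. The table, its column names (`state`, `candidate`, `pct`) and its numbers are all made up for illustration; they are not real polling figures.

```python
import pandas as pd

# Made-up poll rows for illustration only; these are not real polling figures.
polls = pd.DataFrame(
    {
        "state": ["Ohio", "Ohio", "Texas", "Ohio", "Texas"],
        "candidate": ["A", "B", "A", "A", "B"],
        "pct": [48.0, 46.5, None, 49.5, 51.0],
    }
)

# Clean: drop rows where the poll percentage is missing.
clean = polls.dropna(subset=["pct"])

# Subset: keep only candidate A's polls in Ohio.
subset = clean[(clean["state"] == "Ohio") & (clean["candidate"] == "A")]

print(subset)
```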

29.3 Analysis

Carry out an analysis if required. E.g., are you running a PCA or other dimension reduction, or a linear regression to plot?
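If you go down the dimension reduction route, a PCA could be sketched roughly as follows. This assumes the iris data from scikit-learn and standardises the measurements first; both choices are illustrative, and your own data may need different preparation.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data  # 150 flowers x 4 measurements

# Standardise so that each measurement contributes on the same scale.
X_scaled = StandardScaler().fit_transform(X)

# Project the four measurements onto the first two principal components.
pca = PCA(n_components=2)
components = pca.fit_transform(X_scaled)

print(components.shape)
print(pca.explained_variance_ratio_)
```

The two columns of `components` can then be used as the x and y values of a scatter plot.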

29.4 Visualisation

Create a visualisation which clearly shows the trend you would like to show in your data.

29.5 Discussion

Why did you choose this visualisation? Do you think others will clearly see the trend you have identified?

Create a visualisation of this trend which you think will mislead the user.

How do you think this will mislead the user?