Module # 11 Professor Edward R. Tufte – modern pioneer in the field of data visualization

I first tried to run the first code sample provided in the assignment, but I had issues with the visualization.

# Data: years and the values to plot
x <- 1967:1977
y <- c(0.5, 1.8, 4.6, 5.3, 5.3, 5.7, 5.4, 5, 5.5, 6, 5)
pdf(width=10, height=6)
# Draw points joined by lines, with the default axes suppressed
plot(y ~ x, axes=FALSE, xlab="", ylab="", pch=16, type="b")
# Add tick-free axes in a serif font
axis(1, at=x, labels=x, tick=FALSE, family="serif")
axis(2, at=seq(1,6,1), labels=sprintf("$%s", seq(300,400,20)), tick=FALSE, las=2, family="serif")
# Dashed reference lines at y = 5 and y = 6
abline(h=6, lty=2)
abline(h=5, lty=2)
text(max(x), min(y)*2.5, "Per capita\nbudget expenditures\nin constant dollars", adj=1,
     family="serif")
text(max(x), max(y)/1.08, labels="5%", family="serif")
dev.off()

This is the code I used and it was not successful.
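One possible reason nothing showed up, which is only an assumption since the exact problem is not recorded here, is that pdf() called without a file argument sends the plot to a file named Rplots.pdf in the working directory instead of the plot pane. A minimal sketch that writes a shortened version of the plot to an explicit file; the file name tufte_line.pdf is made up for illustration:

```r
x <- 1967:1977
y <- c(0.5, 1.8, 4.6, 5.3, 5.3, 5.7, 5.4, 5, 5.5, 6, 5)

# Hypothetical output location; any writable path works
out_file <- file.path(tempdir(), "tufte_line.pdf")

# Open the PDF device with an explicit file name
pdf(file = out_file, width = 10, height = 6)
plot(y ~ x, axes = FALSE, xlab = "", ylab = "", pch = 16, type = "b")
axis(1, at = x, labels = x, tick = FALSE, family = "serif")
# Close the device so the file is actually written
dev.off()

message("Plot written to: ", out_file)
```

To see the plot on screen instead, leave out the pdf() and dev.off() lines and run only the plotting calls.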

I also tried to run this code and received an error message because the ggExtra package had not been loaded.

> p <- ggplot(faithful, aes(waiting, eruptions)) + geom_point() + theme_tufte(ticks=F)
> ggMarginal(p, type = "histogram", fill="transparent")
Error in ggMarginal(p, type = "histogram", fill = "transparent") :
could not find function "ggMarginal"

The command library(ggExtra) did not work for me and I had to search for the package and install it.
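
A "could not find function" error usually means the package that provides the function is not installed or not loaded. A minimal sketch of checking for missing packages before calling library(), assuming the three packages used in this post; the helper name missing_packages is made up for illustration:

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("ggplot2", "ggExtra", "ggthemes")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
  # Uncomment to install from CRAN (needs an internet connection):
  # install.packages(to_install)
}
```

Once the packages are installed, each one still has to be loaded with library() in every new R session.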

I was successful using this code.

library(ggplot2)
library(ggExtra)
library(ggthemes)
p <- ggplot(faithful, aes(waiting, eruptions)) + geom_point() + theme_tufte(ticks=F)
ggMarginal(p, type = "histogram", fill="transparent")

This is the visual I created.

A marginal histogram scatterplot in ggplot2

Screen Shot 2019-03-24 at 12.39.56 PM

Module # 10 Time Series and Visualization

Screen Shot 2019-03-15 at 2.37.13 PM

I think this visual shows the information well and it is easy to interpret. The visual shows a trend in the data: as the years go by, the number of hot dogs and buns eaten increases.
Trends are also important to look at when we want to make predictions about the future.

This visual also shows what can be read as outliers by drawing those bars in different colors. This can be useful for separating out data that is outside the norm.

Since one of the objectives of time series analysis is to interpret the information, it is important for our end users to know what the data means.

Module # 9 Visual Multi Variances Analysis

Screen Shot 2019-03-15 at 1.24.32 PM

Visual 1

Screen Shot 2019-03-15 at 1.37.27 PM

Visual 2

  1. Alignment – I think alignment is very important because it lines up the data so it becomes clearer. If alignment is missing, or if it is hard for the end user to follow, it can be hard to see a connection between the elements.
  2. Repetition – Without repetition you will not have consistency, which makes the visual hard to follow.
  3. Contrast – It is really important because it helps the viewer pick out the information.
  4. Proximity – Unorganized data can be confusing. Elements that are related can be grouped together, and this falls under proximity. Without this grouping it can be hard to know what information to focus on.
  5. Balance – Balance is very important for a visual. A lack of balance can cause confusion, for example if we are trying to display a regression.

 

Visual 1 shows good contrast because it uses only two colors, white and black. The lines also show how the visual is separated.

Visual 2 does a good job at separating the elements and it has good balance because it shows the different sizes of elements.

Module # 8 Correlation Analysis and ggplot2

Screen Shot 2019-02-28 at 9.53.39 PM

My visual was constructed from the mtcars data set in RStudio.

Correlation Analysis has been somewhat troublesome. Creating and interpreting the outcome has been challenging for me.

The visual shows a negative correlation between cylinders and mpg in the mtcars data: as the number of cylinders goes up, mpg goes down.

In my opinion, the visual is easy to read because it shows how the values decrease going to the right. The red line makes it easier for the audience to understand what the image is trying to convey, and the shaded band around the red line also makes it easier to read.
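
A minimal sketch of how a plot like this could be built, assuming a scatterplot of mpg against cyl with a linear fit. This is not necessarily the code behind the screenshot, and the choice of geom_smooth(method = "lm") for the red line and its shaded band is an assumption:

```r
library(ggplot2)

# Pearson correlation between number of cylinders and miles per gallon
r <- cor(mtcars$cyl, mtcars$mpg)
message("Correlation between cyl and mpg: ", round(r, 3))

# Scatterplot with a red linear fit; the grey band is the confidence interval
p <- ggplot(mtcars, aes(x = cyl, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, color = "red") +
  labs(x = "Cylinders", y = "MPG")

print(p)
```

A correlation close to -1 matches the downward slope of the fitted line.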

Module # 7 Visual Distribution Analysis

Screen Shot 2019-02-23 at 1.41.49 PM

 

My visual was constructed from the mtcars data set in RStudio.

I chose to do a histogram of the data to show the cylinders and MPG. Because the data set is small, some may find it misleading.

In my opinion, this gives a clear picture of the data, and the color is clear and easy to read. The distribution looks somewhat skewed, with more of the values bunched toward the left.

This is the code I used to create this visual:

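# Print the data set to inspect it
mtcars

# Note: hist() puts the number of cars in each MPG bin on the y-axis
hist(mtcars$mpg, breaks = 30, main = "Cars", xlab = "MPG", ylab = "Cylinder", col = "yellow")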
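
A histogram of mtcars$mpg only bins the MPG values, so the cylinder counts do not appear in it. One possible alternative for showing cylinders and MPG together, offered as a sketch and not as the code behind the screenshot, is a boxplot of MPG grouped by cylinder count:

```r
# Median MPG for each cylinder count (4, 6, 8)
med_mpg <- tapply(mtcars$mpg, mtcars$cyl, median)
print(med_mpg)

# One box per cylinder count, showing the spread of MPG within each group
boxplot(mpg ~ cyl, data = mtcars,
        main = "Cars", xlab = "Cylinders", ylab = "MPG", col = "yellow")
```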

 

 

Module # 6 Visual Spotting Differences & Deviation Analysis through the eyes of open source R

This is my visual for module 6.

In the visual I plotted time spent watching TV against the number of hours slept.
We can see from the visual that as the time spent watching TV decreases, the hours of sleep increase. By using different colors we can see the difference in the amount of time.

One problem I had while creating this histogram was increasing the value of hours, as well as spreading out the data.
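
In base R, the axis range and the spread of the bars in a histogram can be controlled with the xlim and breaks arguments of hist(). A minimal sketch, using mtcars$mpg as stand-in data because the TV and sleep data are not shown here; the bin width of 5 and the 0 to 40 range are arbitrary choices for illustration:

```r
# Stand-in data (assumption: the real TV/sleep data is not available here)
x <- mtcars$mpg

# Explicit bin edges: wider bins spread the bars out, narrower bins pack them in
bins <- seq(0, 40, by = 5)

# xlim widens the visible axis range beyond the range of the data
h <- hist(x, breaks = bins, xlim = c(0, 40),
          main = "Stand-in example", xlab = "Value", col = "yellow")

# Number of observations that fell into each bin
print(h$counts)
```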

 

 

Screen Shot 2019-02-17 at 4.23.37 PM

Module # 4 Visual Analytics Techniques & Practices Part II

At first I tried to place sum in Columns and average position in Rows, but it was not showing all of the information. The image was only showing one point.

I noticed that after selecting the data and choosing different visuals under the Show Me tab, the application might change the Columns and Rows.

This is my visual after using Average Position for Columns and Time for Rows.

Screen Shot 2019-02-03 at 12.56.18 PM