Visual Analytics

1. TODO Tutorial
2. References

1 TODO Tutorial

1.1 TODO Question 1

Figure 1: Graph to Fix

1.1.1 Possible Improvements

- Minimising the intersection of edges
- Using colours to distinguish between nodes
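
One way to make "minimising the intersection of edges" measurable is to count how many pairs of edges cross in a given layout. The following is only a rough sketch under my own assumptions: edges are drawn as straight lines, and the node coordinates are a hypothetical unit square rather than the layout of figure 1.

```python
from itertools import combinations


def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (cross > 0) - (cross < 0)


def segments_cross(a, b, c, d):
    """True when segments ab and cd cross at a point interior to both."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)


def count_crossings(positions, edges):
    """Count pairs of edges that cross, skipping pairs that share an endpoint."""
    total = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if {u1, v1} & {u2, v2}:
            continue
        if segments_cross(positions[u1], positions[v1],
                          positions[u2], positions[v2]):
            total += 1
    return total


# Hypothetical layout: four nodes on the corners of a unit square.
square = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
diagonals = [("A", "C"), ("B", "D")]
sides = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

print(count_crossings(square, diagonals))  # the two diagonals cross each other
print(count_crossings(square, sides))      # the sides only meet at corners
```

A lower count for the same set of edges suggests a cleaner layout, although a layout tool would also need to weigh spacing and edge length.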

Possible tools that can be used are:

1.1.2 Generated DOT Graph

The easiest way to generate the graph is to use the DOT language with PlantUML:


@startdot
strict digraph graphName {
concentrate=true
fillcolor=green
color=blue
style="filled, rounded"
 A [shape=box, fillcolor="#a31621", style="rounded, filled"]

 edge [
    arrowhead="none"
  ];

 node[
    fontname="Fira Code",
    shape="square",
    fixedsize=false,
    style=rounded
  ];


# A -> B [dir="both"]
A -> B
A -> C
A -> G
B [shape=box, fillcolor="#bfdbf7", style="rounded, filled"]
B -> F
B -> C
B -> H
B -> H
C [shape=box, fillcolor="#053c5e", style="rounded, filled"]
C -> A
C -> B
C -> D
D [shape=box, fillcolor="#1f7a8c", style="rounded, filled"]
D -> G
D -> C
E [shape=box, fillcolor="#eaf4d3", style="rounded, filled"]
E -> H
E -> A
F [shape=box, fillcolor="#0f5257", style="rounded, filled"]
F -> A
F -> B
G [shape=box, fillcolor="#0b3142", style="rounded, filled"]
G -> A
G -> D
H [shape=box, fillcolor="#9c92a3", style="rounded, filled"]
H -> E [arrowhead="dot"]
H -> B
}
@enddot

Figure 2: DOT graph created using PlantUML
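
Writing the edge statements by hand is error-prone (note the repeated B -> H), so the edge list could instead be generated from an adjacency list. This is a minimal sketch, assuming the adjacency list below is a faithful transcription of the edges in the DOT source; the function name to_dot is my own and is not part of PlantUML or Graphviz.

```python
# Adjacency list transcribed from the edge statements in the DOT source.
adjacency = {
    "A": ["B", "C", "G"],
    "B": ["F", "C", "H", "H"],
    "C": ["A", "B", "D"],
    "D": ["G", "C"],
    "E": ["H", "A"],
    "F": ["A", "B"],
    "G": ["A", "D"],
    "H": ["E", "B"],
}


def to_dot(adjacency, name="graphName"):
    """Emit a strict digraph with one statement per distinct (source, target) pair."""
    lines = [f"strict digraph {name} {{"]
    seen = set()
    for source, targets in adjacency.items():
        for target in targets:
            if (source, target) in seen:
                continue  # skip duplicates such as the repeated B -> H
            seen.add((source, target))
            lines.append(f"  {source} -> {target}")
    lines.append("}")
    return "\n".join(lines)


dot_source = to_dot(adjacency)
print(dot_source)
```

The printed text can be pasted between @startdot and @enddot, with the node and edge styling added afterwards.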

1.1.3 Sequence Diagram

PlantUML can also create a sequence diagram, which is a different type of visualisation but still an interesting way to visualise the relationships.

@startuml
    A-->B
    A-->C
    A-->G
    B-->F
    B-->C
    B-->H
    B-->H
    C-->A
    C-->B
    C-->D
    D-->G
    D-->C
    E-->H
    E-->A
    F-->A
    F-->B
    G-->A
    G-->D
    H-->E
    H-->B
@enduml

1.2 Question 2

Explore and find 3 good and 3 bad examples of graphs on the Internet, including a justification for each example.

1.2.1 Good Examples

Solar Winds
The following image, shown in figure 3, is an exemplar graph from a commercial piece of software called Solar Winds¹. It is an effective plot because:
- The nodes are colour coded
- The nodes use symbols as a way to express an independent categorical variable
- The edges do not overlap
- There is adequate spacing throughout the graph
Figure 3: Exemplar plot for Solar Winds product
Stanford Paper Visualisation
The following plot, shown in figure 4, is a graph showing feature construction from a paper written at Stanford (McAuley and Leskovec, n.d.), presumably produced in TikZ. It is an effective graph because:
- The colours used to denote the two main discrete variables are distinct
- The edges don't overlap
- The nodes tessellate in a way that makes the graph easy to read
- Where a quasi-node is used to branch an edge to facilitate node tessellation, it is denoted by a slight bulge, which makes the edges easy to delineate
Figure 4: Feature construction graph from McAuley and Leskovec (n.d.)
Unix Development
The following plot, shown in figure 1, is a graph taken from the Graphviz documentation, created using the DOT language. It is an effective graph because:
- the nodes are sufficiently spaced
- the edges do not overlap unless necessary
  - If the edges do overlap it is to minimise the cost of spacing

1.2.2 Bad Examples

igraph Library Example

The following plot, shown in figure 6, from the R graph gallery² is a bad example of a graph because the edges are hard to discern, the labels are not clearly delineated, the edges cross over unnecessarily and the edges are too thin.

Figure 6: Correlation Network Diagram of mtcars data
Research chronology
The following graph is an exemplar plot from a service known as Research Chronology³.

It is a poor graph because:
- There are too many overlaps between edges
- There are too many colours to uniquely identify the discrete variables
- The plot needs to be further spaced
Figure 7: Graph Depiction of research chronology
Mapping the BlogoSphere

The following plot, shown in figure 7, is an attempt to map the relationships between blogs. It is a poor graph simply because it is too dense; the graph needs to be expanded and trimmed significantly before it could be effective.

1.3 TODO Question 3

What is a tree structure? Provide 3 examples of a tree structure in real life.

A tree structure is a special type of graph: it is connected, contains no cycles, tends to have directed edges, and has a specially designated root vertex.
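
The definition can be turned into a quick check. The sketch below assumes the graph is given as a dictionary mapping each vertex to a list of its children; that representation and the function name is_rooted_tree are my own choices. Walking down from the root, a vertex reached a second time means there is a cycle or a second path, and a vertex never reached means the graph is not connected.

```python
from collections import deque


def is_rooted_tree(children, root):
    """True when every vertex is reached exactly once walking down from root."""
    vertices = set(children)
    for kids in children.values():
        vertices.update(kids)
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child in seen:
                return False  # a cycle or a second path leads to this vertex
            seen.add(child)
            queue.append(child)
    return seen == vertices  # every vertex must be reachable from the root


# Hypothetical examples: a small directory layout and a graph with a cycle.
directory = {".": ["Data", "META-INF"], "Data": ["treeml.dtd"], "META-INF": ["MANIFEST.MF"]}
cyclic = {"A": ["B", "C"], "C": ["A"]}

print(is_rooted_tree(directory, "."))
print(is_rooted_tree(cyclic, "A"))
```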

1.3.1 File System

The most natural type of tree structure is the directory structure of a computer. For example, the tree structure of the provided DOI_Tree directory can be visualised with the tree command:

cd ~/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/DOI_Tree
tree

.
├── Data
│   ├── 2811.xml
│   ├── 3137_filesystem.xml
│   ├── 6814_filesystem.xml
│   ├── CategoryAustralia_6254.txt
│   ├── CategoryAustralia_6254.xml
│   ├── CategoryAustralia.xlsx
│   ├── CategoryUSA_20289.xml
│   ├── CategoryUSA.txt
│   ├── CategoryUSA.xlsx
│   ├── Safety events ontology New.txt
│   ├── Safety events ontology New.xml
│   └── treeml.dtd
├── data.xml
├── DOITreeVis.jar
├── META-INF
│   └── MANIFEST.MF
└── ReadMe.txt

2 directories, 16 files
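
The listing above is just a depth-first walk of the directory tree, printing each entry with a prefix that records its ancestors. Below is a rough sketch of that walk, not the actual implementation of the tree command. It builds a small throwaway directory (the file names are borrowed from the listing, but the directory itself is hypothetical) and sorts entries case-insensitively, which is an assumption about the ordering.

```python
import os
import tempfile


def render_tree(path, prefix=""):
    """Return the lines of a tree-style listing for the directory at path."""
    entries = sorted(os.listdir(path), key=str.lower)
    lines = []
    for index, name in enumerate(entries):
        last = index == len(entries) - 1
        lines.append(prefix + ("└── " if last else "├── ") + name)
        full = os.path.join(path, name)
        # Recurse into real directories only, so symlink loops are not followed.
        if os.path.isdir(full) and not os.path.islink(full):
            lines.extend(render_tree(full, prefix + ("    " if last else "│   ")))
    return lines


# Build a small throwaway directory to walk.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "Data"))
    for parts in [("Data", "treeml.dtd"), ("data.xml",), ("ReadMe.txt",)]:
        open(os.path.join(root, *parts), "w").close()
    listing = render_tree(root)

print(".")
print("\n".join(listing))
```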

1.3.2 Tree Machine Learning Technique

A more nuanced form of a tree structure is the machine learning technique whereby input variables are separated to correspond to binned output variables, in an effort to model the behaviour of a system without using regression. This technique can be used for continuous and discrete data; for example, the following code returns the plot shown in figure 9:

  # Install pacman on first use, then use it to load the required packages
  if (!require('pacman')) {
    install.packages('pacman')
    library('pacman')
  }

  # Only the packages this example actually uses
  pacman::p_load(tidyverse, ISLR, rpart, rpart.plot)

  # The Carseats data ships with ISLR
  CSeat.tb <- as_tibble(Carseats)

  # Label sales as "High" when they exceed the mean plus half a standard deviation
  thresh <- (mean(CSeat.tb$Sales) + 0.5 * sd(CSeat.tb$Sales)) %>% round()
  CSeat.tb$CatSales <- ifelse(CSeat.tb$Sales > thresh, "High", "Low")
  CSeat.tb$CatSales <- factor(x = CSeat.tb$CatSales, levels = c("Low", "High"), ordered = TRUE)

  # Put CatSales first and drop the original Sales column
  CSeat.tb <- CSeat.tb[, c(12, 2:11)]

  # Fit and plot a classification tree
  CatSalesModTree.rpart <- rpart(formula = CatSales ~ ., data = CSeat.tb)
  rpart.plot(CatSalesModTree.rpart, box.palette = "OrGy", shadow.col = "gray", nn = TRUE)

Figure 9: Example of Tree Plot using Built in Data
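
To give a feel for how a single branch of such a tree is chosen, here is a small sketch that searches for the threshold on one input variable that best separates two classes, scored by weighted Gini impurity. It is a simplified illustration and not the rpart implementation; the prices and labels are made up and are not taken from Carseats.

```python
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted impurity) for the best 'value < threshold' split."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # no split at all is the baseline
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value < threshold]
        right = [label for value, label in pairs if value >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up data: cheaper items sell well, expensive ones do not.
prices = [80, 90, 100, 120, 130, 150]
sales = ["High", "High", "High", "Low", "Low", "Low"]

print(best_split(prices, sales))
```

A full tree repeats this search over every input variable and then recurses into each side of the chosen split.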

1.3.3 Genetic Lineage

Genetic lineage is another example of a tree structure graph; for example, the tree of life would look something to the effect of:

Figure 10: Example of Tree of life

1.4 TODO Question 4

Explore the tree visualization techniques at http://treevis.net/ . Identify 10 connection (node-linked diagram) techniques and 10 enclosure techniques.

1.4.1 Techniques using connections or node links are:

Generalized Pythagoras Trees (2014) by Fabian Beck
- https://doi.org/10.5220/0004654500170028
StackTree Layout
Career Tree
Ordered Tree Drawing
GeoReferenced Tree Layout
Rhizome Navigation
Quadratic Programming Layout
Clustergram
Perspective Mapping
Planar Upward Drawing

1.4.2 Techniques using Enclosure Techniques include

1.5 TODO Question 5

Explore the visualizations of Treemap, DOI_Tree and D&CTreemap from the demo systems. In your opinion, give the pros and cons of each technique.

Personal note: thank you so much for making these simple .jar files rather than .exe files. I hate using Wine and I'm glad these are cross-platform applications.

1.5.1 Treemap

file:///home/ryan/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/Treemap-4.0.5/Monitor.jar

The UI is very mouse-heavy, which makes it awkward to operate, and it is also a little busy.

Overall, though, being able to double-click through the enclosed regions to move through the bins, and being able to change the partitioning method from strip to slice-and-dice, makes the data easy to explore.
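
For reference, the slice-and-dice partitioning can be sketched in a few lines: each node's rectangle is divided among its children in proportion to their sizes, with the direction of the cut alternating at each level of the tree. This is my own minimal sketch and not the code used by the Treemap application; the hierarchy, the sizes and the function names are hypothetical, and sizes are assumed to be positive.

```python
def subtree_size(node):
    """Size of a leaf, or the summed size of everything below an inner node."""
    children = node.get("children")
    if children:
        return sum(subtree_size(child) for child in children)
    return node["size"]


def slice_and_dice(node, x, y, width, height, depth=0, layout=None):
    """Map every node name to an (x, y, width, height) rectangle."""
    if layout is None:
        layout = {}
    layout[node["name"]] = (x, y, width, height)
    children = node.get("children", [])
    total = sum(subtree_size(child) for child in children)
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in children:
        share = subtree_size(child) / total
        if depth % 2 == 0:
            # Even depth: cut the rectangle into side-by-side columns.
            slice_and_dice(child, x + offset * width, y,
                           share * width, height, depth + 1, layout)
        else:
            # Odd depth: cut the rectangle into stacked rows.
            slice_and_dice(child, x, y + offset * height,
                           width, share * height, depth + 1, layout)
        offset += share
    return layout


# Hypothetical hierarchy with made-up sizes.
example = {"name": "root", "children": [
    {"name": "a", "size": 2},
    {"name": "b", "children": [{"name": "c", "size": 1}, {"name": "d", "size": 1}]},
]}

rectangles = slice_and_dice(example, 0, 0, 4, 2)
for name, rectangle in rectangles.items():
    print(name, rectangle)
```

The long thin rectangles this tends to produce are, as far as I understand, the reason alternatives such as the strip layout exist.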

1.5.2 DOI_Tree

file:///home/ryan/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/DOI_Tree/DOITreeVis.jar

Unfortunately, I was unable to load the built-in data sets I had created without a crash. This was the output:

prefuse.data.io.DataIOException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
 at prefuse.data.io.TreeMLReader.readGraph(TreeMLReader.java:46)
 at prefuse.data.io.AbstractGraphReader.readGraph(AbstractGraphReader.java:33)
 at visualization.Visualizer.setVisualization(Visualizer.java:423)
 at visualization.Visualizer.openDataFileAction(Visualizer.java:407)
 at visualization.Visualizer.actionPerformed(Visualizer.java:103)
 at java.desktop/javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1967)
 at java.desktop/javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2308)
 at java.desktop/javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:405)
 at java.desktop/javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:262)

So overall I would say that a definite downside to this tool is that it is a little unstable (or, quite possibly, I'm using the wrong Java version; I'm just using whatever my distro shipped with).
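
As far as I know, "Content is not allowed in prolog" generally means the XML parser met something other than markup at the very start of the input, for instance a byte order mark, stray text, or a file that is not XML at all. I have not confirmed which of these applies here, so the following is only a diagnostic sketch; the function name describe_prolog and its messages are my own.

```python
import codecs


def describe_prolog(data):
    """Report what sits in front of the first '<' in the raw bytes of a file."""
    if data.startswith(codecs.BOM_UTF8):
        return "starts with a UTF-8 byte order mark"
    first_tag = data.find(b"<")
    if first_tag == -1:
        return "contains no '<' at all, so it is probably not XML"
    if data[:first_tag].strip():
        return "has non-whitespace bytes before the first tag"
    return "starts with markup"


# Made-up byte strings standing in for the start of different files.
samples = [
    b"<?xml version='1.0'?><tree/>",
    codecs.BOM_UTF8 + b"<tree/>",
    b"Name\tParent\n",
    b"junk<tree/>",
]
for sample in samples:
    print(describe_prolog(sample))
```

To check a real file, read it in binary mode (open(path, "rb")) and pass the bytes to describe_prolog.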

1.5.3 D&C Treemap

file:///home/ryan/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/D&CTreemap/DCTreemap.jar

The aesthetic is very pleasant, as shown in figure 11, but having to mouse over to determine the clusters is very inconvenient (and the font renders incorrectly on my system).

Figure 11: DCTreemap

2 References

Bibliography

McAuley, Julian, and Jure Leskovec. n.d. “Learning to Discover Social Circles in Ego Networks,” 9.

Footnotes:

¹ https://www.solarwinds.com/network-performance-monitor/use-cases/network-visualization

² https://www.r-graph-gallery.com/250-correlation-network-with-igraph.html

³ http://www.datadreamer.com/item/research2