Visual Analytics

1 TODO Tutorial

1.1 TODO Question 1

Figure 1: Graph to Fix

1.1.1 Possible Improvements

  1. Minimising the intersection of edges
  2. Using colours to distinguish the nodes
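
The first improvement can be made measurable. The following is a minimal sketch, assuming straight-line edges and a hypothetical dictionary of node coordinates (neither comes from the figure), that counts how many pairs of edges cross in a given layout:

```python
from itertools import combinations


def orientation(p, q, r):
    """Return the sign (-1, 0 or 1) of the cross product (q - p) x (r - p)."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def segments_cross(a, b, c, d):
    """True when segment ab and segment cd cross at an interior point."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)


def count_crossings(positions, edges):
    """Count pairs of straight-line edges that cross in the given layout."""
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if len({u1, v1, u2, v2}) < 4:
            continue  # edges sharing a node meet at that node; not a crossing
        if segments_cross(positions[u1], positions[v1],
                          positions[u2], positions[v2]):
            crossings += 1
    return crossings


# Hypothetical layouts of the same four-node graph (coordinates are made up).
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C"), ("B", "D")]
square = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
nested = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0.75, 0.25)}

print(count_crossings(square, edges))  # the two diagonals cross
print(count_crossings(nested, edges))  # D sits inside triangle ABC
```

Moving a single node is enough to remove the crossing here, which is the kind of adjustment a layout engine makes automatically.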

Possible tools that can be used are:

1.1.2 Generated DOT Graph

The easiest way to generate the graph is to use the DOT language through PlantUML:

@startdot
strict digraph graphName {
concentrate=true
fillcolor=green
color=blue
style="filled, rounded"
 A [shape=box, fillcolor="#a31621", style="rounded, filled"]

 edge [
    arrowhead="none"
  ];

 node[
    fontname="Fira Code",
    shape="square",
    fixedsize=false,
    style=rounded
  ];


# A -> B [dir="both"]
A -> B
A -> C
A -> G
B [shape=box, fillcolor="#bfdbf7", style="rounded, filled"]
B -> F
B -> C
B -> H
B -> H
C [shape=box, fillcolor="#053c5e", style="rounded, filled"]
C -> A
C -> B
C -> D
D [shape=box, fillcolor="#1f7a8c", style="rounded, filled"]
D -> G
D -> C
E [shape=box, fillcolor="#eaf4d3", style="rounded, filled"]
E -> H
E -> A
F [shape=box, fillcolor="#0f5257", style="rounded, filled"]
F -> A
F -> B
G [shape=box, fillcolor="#0b3142", style="rounded, filled"]
G -> A
G -> D
H [shape=box, fillcolor="#9c92a3", style="rounded, filled"]
H -> E [arrowhead="dot"]
H -> B
}
@enddot

Figure 2: DOT graph created using PlantUML
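
Because the listing disables arrowheads, each pair such as A -> C and C -> A reads as a single line in the picture. As a sketch only, assuming the direction of the edges carries no meaning here, the same edge list can be reduced to distinct undirected pairs and written out as DOT. The function name `to_dot` is hypothetical, and the styling from the listing above is left out:

```python
# Edge list copied from the DOT listing (including the repeated B -> H).
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "G"),
    ("B", "F"), ("B", "C"), ("B", "H"), ("B", "H"),
    ("C", "A"), ("C", "B"), ("C", "D"),
    ("D", "G"), ("D", "C"),
    ("E", "H"), ("E", "A"),
    ("F", "A"), ("F", "B"),
    ("G", "A"), ("G", "D"),
    ("H", "E"), ("H", "B"),
]


def to_dot(edges, name="graphName"):
    """Return DOT source with one undirected edge per distinct node pair."""
    seen = set()
    lines = [f"strict graph {name} {{"]
    for source, target in edges:
        pair = frozenset((source, target))
        if pair in seen:
            continue  # the reverse or a repeat of an edge already written
        seen.add(pair)
        lines.append(f"  {source} -- {target}")
    lines.append("}")
    return "\n".join(lines)


print(to_dot(EDGES))
```

The 20 listed edges collapse to 11 distinct pairs, which gives a quick sense of how much redundancy the drawing has to absorb.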

1.1.3 Sequence Diagram

PlantUML can also create a sequence diagram, which is a different type of visualisation but still an interesting way to show the relationships.

@startuml
    A-->B
    A-->C
    A-->G
    B-->F
    B-->C
    B-->H
    B-->H
    C-->A
    C-->B
    C-->D
    D-->G
    D-->C
    E-->H
    E-->A
    F-->A
    F-->B
    G-->A
    G-->D
    H-->E
    H-->B
@enduml

1.2 Question 2

Explore and find 3 good and 3 bad examples of graphs on the Internet, including a justification for each example.

1.2.1 Good Examples

  1. Solar Winds

    The image shown in figure 3 is an exemplar graph from a commercial piece of software called Solar Winds. It is an effective plot because:

    • The nodes are colour coded
    • The nodes use symbols as a way to express an independent categorical variable
    • The edges do not overlap
    • There is adequate spacing throughout the graph

    Figure 3: Exemplar plot for Solar Winds product

  2. Stanford Paper Visualisation

    The plot shown in figure 4 is a graph showing feature construction, from a paper written at Stanford (McAuley and Leskovec, n.d.) and presumably produced in TikZ. It is an effective graph because:

    • The colours denoting the two main discrete variables are distinct
    • The edges don't overlap
    • The nodes tessellate in a way that makes the graph easy to read
    • Where a quasi-node is used to branch an edge to facilitate node tessellation, it is denoted by a slight bulge, which makes the edges easy to delineate

    Figure 4: Correlation Network Diagram of mtcars data

  3. Unix Development

    The plot shown in figure 1 is a graph taken from the Graphviz documentation, created using the DOT language. It is an effective graph because:

    • The nodes are sufficiently spaced
    • The edges do not overlap unless necessary
      • If the edges do overlap, it is to minimise the cost of spacing

1.2.2 Bad Examples

  1. igraph Library Example

    The plot shown in figure 6, from the R graph gallery, is a bad example of a graph because the edges are hard to discern, the labels are not clearly delineated, the edges cross over unnecessarily and the edges are too thin.

    Figure 6: Correlation Network Diagram of mtcars data

  2. Research chronology

    The following graph is an exemplar plot from a service known as Research Chronology.

    It is a poor graph because:

    • There are too many overlaps between edges
    • There are too many colours to uniquely identify the discrete variables
    • The plot needs to be further spaced

    Figure 7: Graph Depiction of research chronology

  3. Mapping the BlogoSphere

    The plot shown in figure 7 is an attempt to map the relationships between blogs. It is a poor graph simply because it is too dense: the graph would need to be expanded and trimmed significantly before it could be effective.

1.3 TODO Question 3

What is a tree structure? Provide 3 examples of a tree structure in real life.

A tree structure is a special type of graph: it is connected and contains no cycles. Its edges tend to be directed, and one vertex is designated as the root.
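
The definition can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a list of directed (parent, child) pairs; the function name `is_rooted_tree` is hypothetical. It tests that the root has no parent, that every other vertex has exactly one parent, and that every vertex can be reached from the root, which together rule out cycles:

```python
from collections import deque


def is_rooted_tree(edges, root):
    """True when the directed (parent, child) edges form a tree rooted at root."""
    children = {}
    indegree = {}
    nodes = {root}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        indegree[child] = indegree.get(child, 0) + 1
        nodes.update((parent, child))

    # The root has no parent; every other vertex has exactly one.
    if indegree.get(root, 0) != 0:
        return False
    if any(indegree.get(node, 0) != 1 for node in nodes if node != root):
        return False

    # Every vertex must be reachable from the root (rules out detached cycles).
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen == nodes


# Hypothetical examples.
print(is_rooted_tree([("root", "a"), ("root", "b"), ("a", "c")], "root"))
print(is_rooted_tree([("root", "a"), ("a", "c"), ("c", "a")], "root"))
```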

1.3.1 File System

The most natural example of a tree structure is the directory structure of a computer. For example, the tree structure of the provided DOI_Tree directory can be visualised with the tree command:

cd ~/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/DOI_Tree
tree 
.
├── Data
│   ├── 2811.xml
│   ├── 3137_filesystem.xml
│   ├── 6814_filesystem.xml
│   ├── CategoryAustralia_6254.txt
│   ├── CategoryAustralia_6254.xml
│   ├── CategoryAustralia.xlsx
│   ├── CategoryUSA_20289.xml
│   ├── CategoryUSA.txt
│   ├── CategoryUSA.xlsx
│   ├── Safety events ontology New.txt
│   ├── Safety events ontology New.xml
│   └── treeml.dtd
├── data.xml
├── DOITreeVis.jar
├── META-INF
│   └── MANIFEST.MF
└── ReadMe.txt

2 directories, 16 files
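
The same walk can be sketched in Python, which makes the recursive structure explicit: each directory is a vertex whose children are its entries. This is an approximation of the `tree` command rather than a reimplementation; in particular, entries are sorted by code point, so the ordering may differ from the listing above. The function names are hypothetical.

```python
import os


def count_tree(top):
    """Return (directories, files) found below top, not counting top itself."""
    directories = files = 0
    for _root, dirnames, filenames in os.walk(top):
        directories += len(dirnames)
        files += len(filenames)
    return directories, files


def render_tree(top, prefix=""):
    """Yield lines that resemble the output of the tree command."""
    entries = sorted(os.listdir(top))
    for index, name in enumerate(entries):
        last = index == len(entries) - 1
        yield prefix + ("└── " if last else "├── ") + name
        path = os.path.join(top, name)
        if os.path.isdir(path):
            # Children of the last entry are indented without a vertical bar.
            yield from render_tree(path, prefix + ("    " if last else "│   "))
```

Calling `print("\n".join(render_tree(".")))` followed by `print(count_tree("."))` from inside a directory prints the drawing and the summary counts.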

1.3.2 Tree Machine Learning Technique

A more nuanced form of tree structure is the machine learning technique in which input variables are separated to correspond to binned output variables, in an effort to model the behaviour of a system without using regression. This technique can be used for continuous and discrete data; for example, the following code returns the plot shown in figure 9:

  # Install pacman on first use; it then loads (and installs) everything else
  if (!require('pacman')) {
    install.packages('pacman')
    library('pacman')
  }

  pacman::p_load(caret, scales, ggplot2, rmarkdown, shiny, ISLR, class, BiocManager, corrplot, plotly, tidyverse, latex2exp, stringr, reshape2, cowplot, ggpubr, rstudioapi, wesanderson, RColorBrewer, colorspace, gridExtra, grid, car, boot, colourpicker, tree, ggtree, mise, rpart, rpart.plot, knitr, MASS, magrittr)

  # Carseats is a data set shipped with the ISLR package
  CSeat.tb <- as_tibble(Carseats)

  # Label sales above (mean + half a standard deviation) as "High"
  thresh <- (mean(CSeat.tb$Sales) + 0.5 * sd(CSeat.tb$Sales)) %>% round()
  CSeat.tb$CatSales <- ifelse(CSeat.tb$Sales > thresh, "High", "Low")
  CSeat.tb$CatSales <- factor(x = CSeat.tb$CatSales, levels = c("Low", "High"), ordered = TRUE)
  # Move CatSales to the front and drop the raw Sales column (column 1)
  CSeat.tb <- CSeat.tb[, c(12, 2:11)]

  # Fit a classification tree on all remaining predictors and plot it
  CatSalesModTree.rpart <- rpart(formula = CatSales ~ ., data = CSeat.tb)
  rpart.plot(CatSalesModTree.rpart, box.palette = "OrGy", shadow.col = "gray", nn = TRUE)

Figure 9: Example of Tree Plot using Built in Data
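
To show what a single node of such a tree does, here is a minimal sketch in plain Python, independent of rpart. It searches one numeric predictor for the threshold that best separates two classes, scored by weighted Gini impurity (one common splitting criterion). The data are made up for illustration and are not taken from Carseats.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def best_split(values, labels):
    """Return (threshold, impurity) minimising the weighted Gini impurity.

    Observations with value < threshold go left, the rest go right.
    The threshold is None when no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _value, label in pairs[:i]]
        right = [label for _value, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up data: low prices go with high sales.
prices = [80, 90, 100, 120, 130, 150]
sales = ["High", "High", "High", "Low", "Low", "Low"]
print(best_split(prices, sales))  # (110.0, 0.0)
```

A full tree repeats this search over every predictor and then recurses into the left and right groups until a stopping rule is met.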

1.3.3 Genetic Lineage

Genetic lineage is another example of a tree structure; for example, the tree of life would look something like:

Figure 10: Example of Tree of life

1.4 TODO Question 4

Explore the tree visualization techniques at http://treevis.net/ . Identify 10 connection (node-linked diagram) techniques and 10 enclosure techniques.

1.5 TODO Question 5

Explore the visualizations of Treemap, DOITree and D&CTreemap from the demo systems. In your opinion, give the pros and cons of each technique.

Personal note: thank you so much for making these simple .jar files rather than .exe's. I hate using Wine and I'm glad these are cross-platform applications.

1.5.1 Treemap

file:///home/ryan/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/Treemap-4.0.5/Monitor.jar

The UI is very mouse-heavy, which makes it difficult to manipulate; the UI is also a little busy.

Overall, though, being able to double-click through the enclosed regions to move through the bins, and being able to change the partitioning method from strip to slice-and-dice, makes it easy to navigate.
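
For reference, the slice-and-dice partitioning mentioned above can be sketched in a few lines. This is an illustration of the general technique, not the code used by the Treemap application. It assumes a tree given as nested (name, size) and (name, [children]) tuples with positive sizes and unique leaf names, and it alternates the split direction at each level:

```python
def weight(node):
    """Total size of a node: its own size for a leaf, else the sum of its children."""
    _name, payload = node
    if isinstance(payload, list):
        return sum(weight(child) for child in payload)
    return payload


def slice_and_dice(node, x, y, width, height, depth=0, out=None):
    """Map every leaf name to a rectangle (x, y, width, height)."""
    if out is None:
        out = {}
    name, payload = node
    if not isinstance(payload, list):
        out[name] = (x, y, width, height)
        return out
    total = weight(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in payload:
        share = weight(child) / total
        if depth % 2 == 0:
            # Even levels: cut the rectangle into vertical slices.
            slice_and_dice(child, x + offset * width, y,
                           share * width, height, depth + 1, out)
        else:
            # Odd levels: cut the rectangle into horizontal slices.
            slice_and_dice(child, x, y + offset * height,
                           width, share * height, depth + 1, out)
        offset += share
    return out


# Hypothetical tree laid out in a 4 x 2 rectangle.
tree = ("root", [("a", 2), ("b", [("c", 1), ("d", 1)])])
for leaf, rect in slice_and_dice(tree, 0, 0, 4, 2).items():
    print(leaf, rect)
```

Each leaf's area is proportional to its size, but deep or unbalanced trees produce long thin rectangles, which is one reason alternatives such as the strip layout exist.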

1.5.2 DOITree

file:///home/ryan/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/DOI_Tree/DOITreeVis.jar

Unfortunately, I was unable to load the built-in data sets I had created without a crash. This was the output:

prefuse.data.io.DataIOException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
 at prefuse.data.io.TreeMLReader.readGraph(TreeMLReader.java:46)
 at prefuse.data.io.AbstractGraphReader.readGraph(AbstractGraphReader.java:33)
 at visualization.Visualizer.setVisualization(Visualizer.java:423)
 at visualization.Visualizer.openDataFileAction(Visualizer.java:407)
 at visualization.Visualizer.actionPerformed(Visualizer.java:103)
 at java.desktop/javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1967)
 at java.desktop/javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2308)
 at java.desktop/javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:405)
 at java.desktop/javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:262)

So overall I would say that a definite downside to this tool is that it is a little unstable (or, very possibly, I'm using the wrong Java; I'm just using whatever my distro shipped with).
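
The "Content is not allowed in prolog" message at line 1, column 1 generally means the parser met something other than markup at the very start of the input. As a hedged diagnostic sketch (the function name and messages are hypothetical, and this is not part of DOITreeVis), the first bytes of a data file can be inspected before it is loaded:

```python
import codecs

# Byte order marks that sometimes precede the XML declaration.
BOMS = {
    codecs.BOM_UTF8: "UTF-8",
    codecs.BOM_UTF16_LE: "UTF-16 LE",
    codecs.BOM_UTF16_BE: "UTF-16 BE",
}


def diagnose_prolog(path):
    """Describe what the file at path contains before its first markup."""
    with open(path, "rb") as handle:
        head = handle.read(64)
    for bom, label in BOMS.items():
        if head.startswith(bom):
            return f"starts with a {label} byte order mark"
    if not head.lstrip().startswith(b"<"):
        return "first non-whitespace byte is not '<' (probably not XML)"
    return "starts with markup"
```

Running `diagnose_prolog` on the file that triggered the crash would show whether the cause is a stray byte order mark or a file that is not XML at all (a .txt or .xlsx file, for instance), as opposed to a problem with the Java runtime.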

1.5.3 D&C Treemap

file:///home/ryan/Dropbox/Studies/2020Autumn/Visual_Analytics/03_Material/Lecture3Materials4Students/D&CTreemap/DCTreemap.jar

The aesthetic is very pleasant, as shown in figure 11, but having to mouse over to determine the clusters is very inconvenient (and the font renders incorrectly on my system).

Figure 11: DCTreemap

2 References

Bibliography

McAuley, Julian, and Jure Leskovec. n.d. “Learning to Discover Social Circles in Ego Networks,” 9.

Author: Ryan G

Created: 2020-03-25 Wed 19:38