First, make sure you have installed and loaded the igraph package.
#install.packages("igraph")
library(igraph)
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:dplyr':
##
## as_data_frame, groups, union
## The following objects are masked from 'package:purrr':
##
## compose, simplify
## The following object is masked from 'package:tidyr':
##
## crossing
## The following object is masked from 'package:tibble':
##
## as_data_frame
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
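Next, create a simple adjacency matrix with three rows and three columns. Note that matrix() fills column by column, so the nine values below are read down each column in turn.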
mat1 <- matrix(c(0, 1, 0, 0, 0, 1, 1,0, 0), nrow=3, ncol=3) ### matrix function
mat1
## [,1] [,2] [,3]
## [1,] 0 0 1
## [2,] 1 0 0
## [3,] 0 1 0
Use the igraph function graph_from_adjacency_matrix() to create a network object from your matrix, then use the plot() function to plot it.
mat2 <- graph_from_adjacency_matrix(mat1)
plot(mat2, edge.arrow.size = 1) ## set the size of the arrows
Alternatively, create the same network by telling igraph what links you would like.
mat3 <- graph(edges=c(1,3, 3,2, 2,1), n=3, directed=T ) # use graph function and list edges
plot(mat3, edge.arrow.size = 1)
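As a quick check that the two approaches describe the same network, the sketch below rebuilds both graphs and compares their adjacency matrices. The object names (g_from_matrix, g_from_edges, same_links) are made up for this illustration, and make_graph() is used as the current name for graph().

```r
library(igraph)

# Same adjacency matrix as above (filled column by column)
mat1 <- matrix(c(0, 1, 0, 0, 0, 1, 1, 0, 0), nrow = 3, ncol = 3)

# Route 1: build the graph from the matrix
g_from_matrix <- graph_from_adjacency_matrix(mat1, mode = "directed")

# Route 2: build the graph from an explicit list of from,to pairs
g_from_edges <- make_graph(edges = c(1, 3, 3, 2, 2, 1), n = 3, directed = TRUE)

# Convert both back to dense matrices and compare cell by cell
adj_a <- as_adjacency_matrix(g_from_matrix, sparse = FALSE)
adj_b <- as_adjacency_matrix(g_from_edges, sparse = FALSE)
same_links <- all(adj_a == adj_b)
same_links
```

If same_links is TRUE, both objects contain exactly the links 1->3, 3->2 and 2->1.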
## Network Aesthetics
Many parts of a network can be sized and colored to help communicate results more clearly.
Here, for example, we color the nodes with vertex.color =, set the node size with vertex.size =, and change the size and position of the labels with vertex.label.cex = and vertex.label.dist =.
plot(mat3, edge.arrow.size = 1, vertex.color = "purple", vertex.size = 20, vertex.label.cex = 2, vertex.label.dist = 3.5)
igraph can also generate whole networks for you, from simple fixed structures to random models.
Here we create an empty graph (no links):
eg <- make_empty_graph(50) ## make a graph with 50 nodes
plot(eg, vertex.size=10, vertex.label=NA, vertex.color = "plum") ## no node labels.
eg # view graph object
## IGRAPH 973d5b3 D--- 50 0 --
## + edges from 973d5b3:
And a full graph (all possible links = 780):
fg <- make_full_graph(40)
plot(fg, vertex.size=10, vertex.label=NA, vertex.color = "plum")
fg # view graph
## IGRAPH e9e5d6b U--- 40 780 -- Full graph
## + attr: name (g/c), loops (g/l)
## + edges from e9e5d6b:
## [1] 1-- 2 1-- 3 1-- 4 1-- 5 1-- 6 1-- 7 1-- 8 1-- 9 1--10 1--11 1--12
## [12] 1--13 1--14 1--15 1--16 1--17 1--18 1--19 1--20 1--21 1--22 1--23
## [23] 1--24 1--25 1--26 1--27 1--28 1--29 1--30 1--31 1--32 1--33 1--34
## [34] 1--35 1--36 1--37 1--38 1--39 1--40 2-- 3 2-- 4 2-- 5 2-- 6 2-- 7
## [45] 2-- 8 2-- 9 2--10 2--11 2--12 2--13 2--14 2--15 2--16 2--17 2--18
## [56] 2--19 2--20 2--21 2--22 2--23 2--24 2--25 2--26 2--27 2--28 2--29
## [67] 2--30 2--31 2--32 2--33 2--34 2--35 2--36 2--37 2--38 2--39 2--40
## [78] 3-- 4 3-- 5 3-- 6 3-- 7 3-- 8 3-- 9 3--10 3--11 3--12 3--13 3--14
## + ... omitted several edges
Or a tree graph:
tr <- make_tree(40, children = 3, mode = "undirected")
plot(tr, vertex.size=10, vertex.label=NA, vertex.color = "plum")
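The link counts of these three structures follow simple formulas, which the sketch below checks: an empty graph has 0 links, a full undirected graph on n nodes has n(n-1)/2 links, and a tree on n nodes has n-1 links. The variable names are chosen for this illustration only.

```r
library(igraph)

n_nodes <- 40

empty_g <- make_empty_graph(n_nodes)
full_g  <- make_full_graph(n_nodes)
tree_g  <- make_tree(n_nodes, children = 3, mode = "undirected")

# Count the links in each graph and compare with the formulas
link_counts <- c(empty = ecount(empty_g),
                 full  = ecount(full_g),
                 tree  = ecount(tree_g))
expected <- c(empty = 0,
              full  = n_nodes * (n_nodes - 1) / 2,  # 780 for 40 nodes
              tree  = n_nodes - 1)                  # 39 for 40 nodes
link_counts
all(link_counts == expected)
```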
You can also generate mathematical models of networks in igraph. For example, a very simple model can be generated by using sample_gnm() to create a graph with a specified number of nodes (n) and links (m). The m links are drawn uniformly at random, so every possible link is equally likely to be chosen.
Erdos-Renyi random graph (Again, ‘n’ is number of nodes, ‘m’ is the number of edges).
er <- sample_gnm(n=100, m=40)
plot(er, vertex.size=5, vertex.label=NA, vertex.color = "plum") # vertex color "plum" :)
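To see what "fixed number of links" means in practice, the sketch below contrasts sample_gnm() with its sibling sample_gnp(), where each possible link is included independently with probability p. The seed and the choice of p (picked so that the expected number of links is 40) are assumptions made for this illustration.

```r
library(igraph)

set.seed(42)  # arbitrary seed so the example is repeatable

# G(n, m): exactly m links, placed uniformly at random
er_gnm <- sample_gnm(n = 100, m = 40)

# G(n, p): each of the choose(100, 2) possible links appears with probability p
p_link <- 40 / choose(100, 2)
er_gnp <- sample_gnp(n = 100, p = p_link)

ecount(er_gnm)  # always 40
ecount(er_gnp)  # varies from run to run, around 40 on average
```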
Barabasi-Albert scale-free graph (preferential attachment). sample_pa() grows the network with a simple stochastic algorithm: n is the number of nodes, m is the number of edges added at each step, and power is the power of the preferential attachment. The default power is 1, which gives linear attachment. Try changing the value of power = to 2 and 3 and see what happens!
ba <- sample_pa(n=100, power=1, m=1, directed=F)
plot(ba, vertex.size=6, vertex.label=NA, vertex.color = "plum")
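One rough way to explore the effect of power is to track the largest node degree as power increases. The sketch below does that for power = 1, 2 and 3; the seed and the helper names are assumptions for this illustration, and the exact numbers will differ between runs and igraph versions.

```r
library(igraph)

set.seed(1)  # arbitrary seed for repeatability
powers <- c(1, 2, 3)

# For each power, grow a 100-node network and record its largest degree
max_degree <- sapply(powers, function(p) {
  g <- sample_pa(n = 100, power = p, m = 1, directed = FALSE)
  max(degree(g))
})
names(max_degree) <- paste0("power=", powers)
max_degree
```

Higher powers tend to concentrate links on a few hubs, which should show up as a larger maximum degree.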
You can read in your data directly as an adjacency matrix, but likely this is not the way that you have your data organized. Instead, it might be easier to have two files: a node file and an edge file.
In an edge file, the first two columns are all of your from:to links. Column 1 is always from, Column 2 is always to (the order matters less for undirected networks). The columns after that are your edge attributes (such as weight of link, volume, probability, name etc). In a node file, the first column holds the node names, and the columns after that are your node attributes.
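A minimal sketch of the two-file workflow, assuming the files are stored as CSV. The file names (nodes.csv, edges.csv), the toy data and the column names are all hypothetical; the files are written to a temporary folder first so that the example runs on its own.

```r
library(igraph)

# Hypothetical file locations inside R's temporary folder
node_path <- file.path(tempdir(), "nodes.csv")
edge_path <- file.path(tempdir(), "edges.csv")

# Write two tiny example files (stand-ins for your own data)
write.csv(data.frame(name = c("A", "B", "C"), age = c(30, 41, 25)),
          node_path, row.names = FALSE)
write.csv(data.frame(from = c("A", "B"), to = c("B", "C"), weight = c(2, 5)),
          edge_path, row.names = FALSE)

# Read them back in and build the network
nodes <- read.csv(node_path, stringsAsFactors = FALSE)
edges <- read.csv(edge_path, stringsAsFactors = FALSE)
toy_net <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)

vertex_attr_names(toy_net)  # node attributes taken from the node file
edge_attr_names(toy_net)    # edge attributes taken from the edge file
```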
Here is an example of a simple node list, where all of the nodes are farmers. We include attributes about the node like age, gender and number of years farming.
Nodelist <- data.frame(
Names =c("Jim", "Carole", "Joe", "Michelle", "Jen", "Pete", "Paul", "Tim",
"Jess", "Mark", "Jill", "Cam", "Kate") ,
YearsFarming = c(8.5, 6.5, 4, 1, 3, 10, 5, 5, 5, 1, 1, 6, 6) ,
Age = c(22, 31, 25, 21, 22, 35, 42, 27, 26, 33, 26, 28, 22) ,
Gender = c("Male", "Female", "Male", "Female", "Female", "Male","Male","Male", "Female", "Male", "Female", "Male", "Female"))
Nodelist
## Names YearsFarming Age Gender
## 1 Jim 8.5 22 Male
## 2 Carole 6.5 31 Female
## 3 Joe 4.0 25 Male
## 4 Michelle 1.0 21 Female
## 5 Jen 3.0 22 Female
## 6 Pete 10.0 35 Male
## 7 Paul 5.0 42 Male
## 8 Tim 5.0 27 Male
## 9 Jess 5.0 26 Female
## 10 Mark 1.0 33 Male
## 11 Jill 1.0 26 Female
## 12 Cam 6.0 28 Male
## 13 Kate 6.0 22 Female
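Now an edge list: who shared information with whom in the 2017 growing season? (This example records only who talked to whom; how frequently they did so could be stored in an extra edge attribute column.)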
Edgelist <- data.frame(
From = c("Jim", "Jim", "Jim", "Jill", "Kate", "Pete", "Pete", "Jess", "Jim", "Jim", "Pete"),
To = c("Carole", "Jen", "Pete", "Carole", "Joe", "Carole", "Paul", "Mark", "Cam", "Mark", "Tim")
)
Let’s make our farmer communication network!
FarmNetwork <- graph_from_data_frame(d = Edgelist, vertices = Nodelist, directed = T)
FarmNetwork
## IGRAPH a22c045 DN-- 13 11 --
## + attr: name (v/c), YearsFarming (v/n), Age (v/n), Gender (v/c)
## + edges from a22c045 (vertex names):
## [1] Jim ->Carole Jim ->Jen Jim ->Pete Jill->Carole Kate->Joe
## [6] Pete->Carole Pete->Paul Jess->Mark Jim ->Cam Jim ->Mark
## [11] Pete->Tim
E(FarmNetwork) # view edges
## + 11/11 edges from a22c045 (vertex names):
## [1] Jim ->Carole Jim ->Jen Jim ->Pete Jill->Carole Kate->Joe
## [6] Pete->Carole Pete->Paul Jess->Mark Jim ->Cam Jim ->Mark
## [11] Pete->Tim
V(FarmNetwork) # view nodes
## + 13/13 vertices, named, from a22c045:
## [1] Jim Carole Joe Michelle Jen Pete Paul
## [8] Tim Jess Mark Jill Cam Kate
Plot!
plot(FarmNetwork, edge.arrow.size = .5, vertex.color = "plum", vertex.label.dist = 2.5)
Much more information about making beautiful networks in R using igraph can be found at Katya Ognyanova’s Site. But briefly:
Let’s color our nodes based on gender
V(FarmNetwork)$color <- ifelse(V(FarmNetwork)$Gender == "Male", "orange", "dodgerblue") ## if male, make orange; if not, blue. Go gators!!!!
plot(FarmNetwork, edge.arrow.size = .5, vertex.label.dist = 2.5)
You can also size your nodes based on attributes:
V(FarmNetwork)$size <- V(FarmNetwork)$YearsFarming # size the nodes by number of years farming
plot(FarmNetwork, edge.arrow.size = .5, vertex.label.dist = 2.5)
Scale the node size up a bit..
V(FarmNetwork)$size <- V(FarmNetwork)$YearsFarming *2 ## scale by multiplying by 2
plot(FarmNetwork, edge.arrow.size = .5, vertex.label.dist = 2.5)
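Multiplying by a constant works here, but for attributes on other scales it can be easier to map the values onto a fixed range of sizes. The helper below, rescale_sizes(), is a hypothetical function written for this illustration (it is not part of igraph), and the size range of 5 to 25 is an arbitrary choice.

```r
# Map a numeric attribute linearly onto the range [min_size, max_size]
rescale_sizes <- function(x, min_size = 5, max_size = 25) {
  span <- max(x) - min(x)
  # If every value is identical, fall back to a single mid-range size
  if (span == 0) return(rep((min_size + max_size) / 2, length(x)))
  min_size + (x - min(x)) / span * (max_size - min_size)
}

# Years farming, copied from the node list above
years_farming <- c(8.5, 6.5, 4, 1, 3, 10, 5, 5, 5, 1, 1, 6, 6)
node_sizes <- rescale_sizes(years_farming)
range(node_sizes)  # 5 25
```

With the network from above, the result could be assigned with V(FarmNetwork)$size <- rescale_sizes(V(FarmNetwork)$YearsFarming).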
Use the igraph “graph” function to create a network directly as an igraph object. We will use this network as an example.
Net2 <- graph(edges=c(1,3, 3,2, 2,1, 2,4, 5,4), n=5, directed=T)
Net2
## IGRAPH de60807 D--- 5 5 --
## + edges from de60807:
## [1] 1->3 3->2 2->1 2->4 5->4
plot(Net2, edge.arrow.size = .5, vertex.color = "gold")
What is the degree of each node in our graph? With mode = "all", node degree is the sum of a node's incoming and outgoing links.
deg1 <- degree(Net2, v = V(Net2), mode = c("all"))
deg1 ## node degree of all nodes in the network
## [1] 2 3 2 2 1
V(Net2)$size <- (deg1*10) #size the network nodes by their node degree
plot(Net2, edge.arrow.size = .5, vertex.color = "gold") ## is this what you expected?
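Because this network is directed, the total degree can be split into incoming and outgoing links. The sketch below rebuilds the same five-node network (under the made-up name net) and checks that in-degree plus out-degree equals the total degree.

```r
library(igraph)

# Same links as Net2 above
net <- make_graph(edges = c(1, 3, 3, 2, 2, 1, 2, 4, 5, 4), n = 5, directed = TRUE)

deg_in  <- degree(net, mode = "in")   # links pointing at each node
deg_out <- degree(net, mode = "out")  # links leaving each node
deg_all <- degree(net, mode = "all")  # both directions together

deg_in   # 1 1 1 2 0
deg_out  # 1 2 1 0 1
all(deg_in + deg_out == deg_all)
```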
#### Node eigenvector centrality
Eigenvector centrality takes into account not only how many links a node has, but also how well connected its neighbors are. It is an extension of degree centrality. Note: this could potentially be important in epidemiology because disease risk may become higher if a node is connected to more highly connected nodes, even if the node itself does not have many links.
eig1 <- eigen_centrality(Net2, directed = TRUE)
eig1 ## NOTE: this gives a "list". To pull the eigenvector centrality scores we need to look at eig1$vector
## $vector
## [1] 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.233718e-16
##
## $value
## [1] 1
##
## $options
## $options$bmat
## [1] "I"
##
## $options$n
## [1] 5
##
## $options$which
## [1] "LR"
##
## $options$nev
## [1] 1
##
## $options$tol
## [1] 0
##
## $options$ncv
## [1] 0
##
## $options$ldv
## [1] 0
##
## $options$ishift
## [1] 1
##
## $options$maxiter
## [1] 1000
##
## $options$nb
## [1] 1
##
## $options$mode
## [1] 1
##
## $options$start
## [1] 1
##
## $options$sigma
## [1] 0
##
## $options$sigmai
## [1] 0
##
## $options$info
## [1] 0
##
## $options$iter
## [1] 10
##
## $options$nconv
## [1] 1
##
## $options$numop
## [1] 21
##
## $options$numopb
## [1] 0
##
## $options$numreo
## [1] 11
eig1$vector #like this!
## [1] 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.233718e-16
V(Net2)$size <- (eig1$vector*10) #size the network nodes by eigenvector centrality
plot(Net2, edge.arrow.size = .5, vertex.color = "gold") ## is this what you expected?
What is the betweenness centrality of the nodes in our graph? Betweenness is the number of shortest paths between other pairs of nodes that pass through a given node.
bet1 <- betweenness(Net2, v = V(Net2), directed = TRUE)
bet1 ## betweenness centrality of all nodes in the network
## [1] 1 3 2 0 0
V(Net2)$size <- (bet1*10) #size the network nodes by their betweenness
plot(Net2, edge.arrow.size = .5, vertex.color = "gold") ## is this what you expected?
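Nodes with a betweenness of 0 get a size of 0 and so disappear from the plot. One possible workaround, sketched below on a rebuilt copy of the network (named net here), is to add a constant base size before scaling. The base size of 10 is an arbitrary choice.

```r
library(igraph)

# Same links as Net2 above
net <- make_graph(edges = c(1, 3, 3, 2, 2, 1, 2, 4, 5, 4), n = 5, directed = TRUE)

bet <- betweenness(net, directed = TRUE)
bet  # 1 3 2 0 0

# Base size of 10 keeps zero-betweenness nodes visible
V(net)$size <- 10 + bet * 10
V(net)$size
```

Calling plot(net, edge.arrow.size = .5, vertex.color = "gold") afterwards should show all five nodes, with node 2 the largest.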
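What is the closeness centrality of the nodes in our graph? Closeness is based on the lengths of the shortest paths to/from all the other nodes in the network. By default, igraph returns the inverse of the sum of those path lengths; set normalized = TRUE to get the inverse of the average path length instead.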
cls1 <- closeness(Net2, v = V(Net2), mode = "all")
cls1 ## closeness centrality of all nodes in the network
## [1] 0.1428571 0.2000000 0.1428571 0.1666667 0.1111111
V(Net2)$size <- (cls1*100) #size the network nodes by their node closeness
plot(Net2, edge.arrow.size = .5, vertex.color = "gold") ## is this what you expected?
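The sketch below compares the default closeness scores with the normalized version on a rebuilt copy of the network (named net here). For a connected network with n nodes, the normalized score is the raw score multiplied by n - 1.

```r
library(igraph)

# Same links as Net2 above
net <- make_graph(edges = c(1, 3, 3, 2, 2, 1, 2, 4, 5, 4), n = 5, directed = TRUE)

# Default: 1 / (sum of shortest-path lengths to all other nodes)
cls_raw <- closeness(net, mode = "all")

# Normalized: 1 / (mean shortest-path length to all other nodes)
cls_norm <- closeness(net, mode = "all", normalized = TRUE)

cls_raw
cls_norm
all.equal(cls_norm, cls_raw * (vcount(net) - 1))
```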
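Calculate graph density (ratio of edges to number of possible edges), diameter (length of the longest of all the shortest paths between pairs of nodes), and mean distance (mean shortest path length). We can also calculate vertex connectivity (the minimum number of nodes that must be removed to disconnect the graph) and transitivity (the proportion of connected triples that form closed triangles).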
igraph::edge_density(Net2) #graph density (graph.density() is the older name for this function)
## [1] 0.25
diameter(Net2) # diameter
## [1] 3
mean_distance(Net2) ##mean path length
## [1] 1.6
igraph::vertex_connectivity(Net2)
## [1] 0
igraph::transitivity(Net2)
## [1] 0.5
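The density value can be reproduced by hand. In a directed network without self-loops, each of the n nodes can link to the other n - 1 nodes, so there are n(n-1) possible links. A small sketch on a rebuilt copy of the network (named net here):

```r
library(igraph)

# Same links as Net2 above
net <- make_graph(edges = c(1, 3, 3, 2, 2, 1, 2, 4, 5, 4), n = 5, directed = TRUE)

n_nodes <- vcount(net)  # 5
n_links <- ecount(net)  # 5

# Directed network, no self-loops: n * (n - 1) possible links
manual_density <- n_links / (n_nodes * (n_nodes - 1))
manual_density  # 0.25

all.equal(manual_density, edge_density(net))
```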
One way to see whether a network has a structure that differs from what chance alone would generate is to compare it to many randomly generated graphs of the same size (same number of nodes and links).
Let's go back to our farmer example!!
library(igraph)
plot(FarmNetwork)
Degree_Distribution <- igraph::degree(FarmNetwork, mode = "total")
hist(Degree_Distribution)
Generate one random graph with the same number of nodes (13) and links (11):
new1 <- sample_gnm(13, 11, directed = FALSE, loops = FALSE)
h1<- igraph::degree(new1)
hist(h1)
Make a loop to generate 50 random graphs with that same number of nodes and links!
degamat <- NULL
n <- 50
for(i in 1:n){
newmatrix <- sample_gnm(13,11, directed = FALSE, loops = FALSE)
degmat <- igraph::degree(newmatrix)
degamat<-rbind(degamat,degmat)
}
degamat
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## degmat 2 5 2 1 2 2 2 1 0 1 1 2
## degmat 3 3 2 0 1 2 0 1 1 2 4 1
## degmat 1 1 1 1 2 3 3 1 3 1 1 2
## degmat 1 1 2 3 2 2 1 2 2 0 1 1
## degmat 2 3 0 2 0 6 1 1 1 3 1 1
## degmat 2 1 0 3 3 1 3 3 2 2 0 0
## degmat 2 1 2 4 3 1 1 1 2 1 2 0
## degmat 2 1 1 3 3 2 1 4 1 2 0 1
## degmat 1 2 1 1 1 1 3 0 1 2 5 3
## degmat 1 4 2 1 1 1 0 1 0 3 3 3
## degmat 3 2 0 2 1 2 1 1 3 1 3 0
## degmat 0 3 2 0 3 1 1 4 2 2 2 1
## degmat 1 3 0 2 3 1 3 2 0 1 3 2
## degmat 1 3 0 1 1 4 2 1 1 2 1 3
## degmat 2 1 1 3 4 1 2 2 2 1 1 2
## degmat 1 1 2 4 0 2 0 4 1 3 2 0
## degmat 1 3 2 1 4 1 2 1 1 3 0 3
## degmat 3 1 1 1 1 2 2 3 2 0 1 4
## degmat 0 2 1 2 1 1 2 3 3 3 2 1
## degmat 2 0 4 1 1 2 3 3 2 2 1 0
## degmat 2 3 1 2 2 1 4 0 1 3 0 2
## degmat 1 2 2 1 2 3 1 0 2 3 0 2
## degmat 3 2 1 1 2 3 0 1 3 4 2 0
## degmat 1 1 1 1 3 0 1 3 5 3 2 0
## degmat 2 1 0 1 2 2 3 2 2 1 2 2
## degmat 1 2 3 0 4 0 2 1 1 0 2 2
## degmat 2 0 1 2 1 3 4 2 1 1 1 0
## degmat 1 3 2 2 1 3 0 1 3 2 3 0
## degmat 1 2 3 0 1 2 2 0 2 4 1 2
## degmat 2 3 3 3 3 1 2 1 1 1 2 0
## degmat 4 3 2 0 1 1 2 1 2 1 1 2
## degmat 3 2 0 2 1 2 1 2 1 2 2 1
## degmat 1 0 1 1 3 5 2 2 2 1 2 0
## degmat 1 3 1 2 0 3 2 2 2 1 1 3
## degmat 1 2 1 1 2 4 0 2 2 1 1 2
## degmat 1 2 4 2 1 1 2 0 1 2 2 3
## degmat 3 2 2 1 0 2 3 1 3 2 1 2
## degmat 3 1 1 2 1 2 1 0 1 2 2 4
## degmat 0 2 3 1 4 2 3 1 0 0 2 3
## degmat 4 0 2 3 2 1 2 1 1 1 2 2
## degmat 3 1 3 2 1 2 2 2 0 1 2 1
## degmat 1 4 0 3 3 0 3 1 1 1 2 1
## degmat 2 2 0 3 3 0 2 0 1 2 3 2
## degmat 4 0 3 2 1 1 1 1 4 2 1 1
## degmat 1 1 4 2 1 2 2 0 3 1 2 2
## degmat 0 3 1 2 2 2 2 2 1 1 3 2
## degmat 3 3 0 1 0 1 2 2 5 1 1 1
## degmat 2 2 5 0 2 1 3 1 2 1 1 1
## degmat 2 2 2 2 1 1 2 1 1 3 1 3
## degmat 2 1 3 1 2 1 2 0 2 2 0 4
## [,13]
## degmat 1
## degmat 2
## degmat 2
## degmat 4
## degmat 1
## degmat 2
## degmat 2
## degmat 1
## degmat 1
## degmat 2
## degmat 3
## degmat 1
## degmat 1
## degmat 2
## degmat 0
## degmat 2
## degmat 0
## degmat 1
## degmat 1
## degmat 1
## degmat 1
## degmat 3
## degmat 0
## degmat 1
## degmat 2
## degmat 4
## degmat 4
## degmat 1
## degmat 2
## degmat 0
## degmat 2
## degmat 3
## degmat 2
## degmat 1
## degmat 3
## degmat 1
## degmat 0
## degmat 2
## degmat 1
## degmat 1
## degmat 2
## degmat 2
## degmat 2
## degmat 1
## degmat 1
## degmat 1
## degmat 2
## degmat 1
## degmat 1
## degmat 2
hist(degamat, xlim = c(0,7), breaks = 7)
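The same matrix of simulated degrees can also be built without growing an object inside a loop. The sketch below uses replicate() as an alternative; the seed and the names n_graphs and sim_degrees are assumptions for this illustration. Since every random graph has 11 links and each link adds 1 to the degree of two nodes, every row should sum to 22.

```r
library(igraph)

set.seed(2017)  # arbitrary seed so the example is repeatable
n_graphs <- 50

# replicate() returns one column per graph; t() turns that into one row per graph
sim_degrees <- t(replicate(
  n_graphs,
  degree(sample_gnm(13, 11, directed = FALSE, loops = FALSE))
))

dim(sim_degrees)                 # 50 rows (graphs) by 13 columns (nodes)
all(rowSums(sim_degrees) == 22)  # 11 links x 2 endpoints each
```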
Graph and compare the degree distribution of our surveyed graph with degree distribution of our random networks.
* How do they compare?
* Do we think there are underlying social processes that are driving link formation in this network?
* What could they be?
* You might say that a few people are highly connected, but most are more sparsely connected than we would expect at random.
par(mfrow=c(1,1),
mar=c(2,2,2,2))
hist(Degree_Distribution, xlab = "Node Degree", xlim = c(0,7), breaks = 3, main = "Observed")
hist(degamat, xlim = c(0,7), breaks = 7, xlab = "Node Degree", main = "Simulated")
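Beyond eyeballing the histograms, one rough way to put a number on the comparison is to ask how often a random graph produces a node as well connected as our best-connected farmer. The sketch below is only one possible approach: the choice of statistic (maximum degree), the 1000 simulations and the seed are all assumptions made for this illustration. The farmer network is rebuilt from the edge list above so that the example runs on its own.

```r
library(igraph)

set.seed(123)  # arbitrary seed for repeatability

# Rebuild the farmer network from the node names and edge list above
farmers <- c("Jim", "Carole", "Joe", "Michelle", "Jen", "Pete", "Paul", "Tim",
             "Jess", "Mark", "Jill", "Cam", "Kate")
links <- data.frame(
  from = c("Jim", "Jim", "Jim", "Jill", "Kate", "Pete", "Pete", "Jess", "Jim", "Jim", "Pete"),
  to   = c("Carole", "Jen", "Pete", "Carole", "Joe", "Carole", "Paul", "Mark", "Cam", "Mark", "Tim")
)
farm <- graph_from_data_frame(links, vertices = data.frame(name = farmers),
                              directed = TRUE)

# Observed statistic: the largest total degree in the surveyed network
observed_max <- max(degree(farm, mode = "total"))

# Simulated statistic: the largest degree in each of 1000 random graphs
sim_max <- replicate(1000, max(degree(sample_gnm(13, 11))))

# Share of random graphs whose best-connected node matches or beats ours
prop_as_extreme <- mean(sim_max >= observed_max)
observed_max
prop_as_extreme
```

A small proportion would suggest that the observed hub is more connected than random link formation usually produces, though with only 13 farmers this is a rough indication rather than a formal test.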