1# IGraph R package 2# Copyright (C) 2005-2012 Gabor Csardi <csardi.gabor@gmail.com> 3# 334 Harvard street, Cambridge, MA 02139 USA 4# 5# This program is free software; you can redistribute it and/or modify 6# it under the terms of the GNU General Public License as published by 7# the Free Software Foundation; either version 2 of the License, or 8# (at your option) any later version. 9# 10# This program is distributed in the hope that it will be useful, 11# but WITHOUT ANY WARRANTY; without even the implied warranty of 12# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 13# GNU General Public License for more details. 14# 15# You should have received a copy of the GNU General Public License 16# along with this program; if not, write to the Free Software 17# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 18# 02110-1301 USA 19# 20################################################################### 21 22################################################################### 23# Community structure 24################################################################### 25 26#' Functions to deal with the result of network community detection 27#' 28#' igraph community detection functions return their results as an object from 29#' the \code{communities} class. This manual page describes the operations of 30#' this class. 31#' 32#' Community structure detection algorithms try to find dense subgraphs in 33#' directed or undirected graphs, by optimizing some criteria, and usually 34#' using heuristics. 35#' 36#' igraph implements a number of community detection methods (see them below), 37#' all of which return an object of the class \code{communities}. Because the 38#' community structure detection algorithms are different, \code{communities} 39#' objects do not always have the same structure. Nevertheless, they have some 40#' common operations, these are documented here. 41#' 42#' The \code{print} generic function is defined for \code{communities}, it 43#' prints a short summary. 44#' 45#' The \code{length} generic function call be called on \code{communities} and 46#' returns the number of communities. 47#' 48#' The \code{sizes} function returns the community sizes, in the order of their 49#' ids. 50#' 51#' \code{membership} gives the division of the vertices, into communities. It 52#' returns a numeric vector, one value for each vertex, the id of its 53#' community. Community ids start from one. Note that some algorithms calculate 54#' the complete (or incomplete) hierarchical structure of the communities, and 55#' not just a single partitioning. For these algorithms typically the 56#' membership for the highest modularity value is returned, but see also the 57#' manual pages of the individual algorithms. 58#' 59#' \code{communities} is also the name of a function, that returns a list of 60#' communities, each identified by their vertices. The vertices will have 61#' symbolic names if the \code{add.vertex.names} igraph option is set, and the 62#' graph itself was named. Otherwise numeric vertex ids are used. 63#' 64#' \code{modularity} gives the modularity score of the partitioning. (See 65#' \code{\link{modularity.igraph}} for details. For algorithms that do not 66#' result a single partitioning, the highest modularity value is returned. 67#' 68#' \code{algorithm} gives the name of the algorithm that was used to calculate 69#' the community structure. 70#' 71#' \code{crossing} returns a logical vector, with one value for each edge, 72#' ordered according to the edge ids. The value is \code{TRUE} iff the edge 73#' connects two different communities, according to the (best) membership 74#' vector, as returned by \code{membership()}. 75#' 76#' \code{is_hierarchical} checks whether a hierarchical algorithm was used to 77#' find the community structure. Some functions only make sense for 78#' hierarchical methods (e.g. \code{merges}, \code{cut_at} and 79#' \code{as.dendrogram}). 80#' 81#' \code{merges} returns the merge matrix for hierarchical methods. An error 82#' message is given, if a non-hierarchical method was used to find the 83#' community structure. You can check this by calling \code{is_hierarchical} on 84#' the \code{communities} object. 85#' 86#' \code{cut_at} cuts the merge tree of a hierarchical community finding method, 87#' at the desired place and returns a membership vector. The desired place can 88#' be expressed as the desired number of communities or as the number of merge 89#' steps to make. The function gives an error message, if called with a 90#' non-hierarchical method. 91#' 92#' \code{as.dendrogram} converts a hierarchical community structure to a 93#' \code{dendrogram} object. It only works for hierarchical methods, and gives 94#' an error message to others. See \code{\link[stats]{dendrogram}} for details. 95#' 96#' \code{as.hclust} is similar to \code{as.dendrogram}, but converts a 97#' hierarchical community structure to a \code{hclust} object. 98#' 99#' \code{as_phylo} converts a hierarchical community structure to a \code{phylo} 100#' object, you will need the \code{ape} package for this. 101#' 102#' \code{show_trace} works (currently) only for communities found by the leading 103#' eigenvector method (\code{\link{cluster_leading_eigen}}), and 104#' returns a character vector that gives the steps performed by the algorithm 105#' while finding the communities. 106#' 107#' \code{code_len} is defined for the InfoMAP method 108#' (\code{\link{cluster_infomap}} and returns the code length of the 109#' partition. 110#' 111#' It is possibly to call the \code{plot} function on \code{communities} 112#' objects. This will plot the graph (and uses \code{\link{plot.igraph}} 113#' internally), with the communities shown. By default it colores the vertices 114#' according to their communities, and also marks the vertex groups 115#' corresponding to the communities. It passes additional arguments to 116#' \code{\link{plot.igraph}}, please see that and also 117#' \code{\link{igraph.plotting}} on how to change the plot. 118#' 119#' @rdname communities 120#' @aliases communities membership algorithm crossing cutat merges sizes cut_at 121#' is.hierarchical print.communities plot.communities length.communities 122#' as.dendrogram.communities as.hclust.communities code_len 123#' asPhylo asPhylo.communities showtrace code.length 124#' as_phylo as_phylo.communities show_trace is_hierarchical 125#' @param communities,x,object A \code{communities} object, the result of an 126#' igraph community detection function. 127#' @param graph An igraph graph object, corresponding to \code{communities}. 128#' @param y An igraph graph object, corresponding to the communities in 129#' \code{x}. 130#' @param no Integer scalar, the desired number of communities. If too low or 131#' two high, then an error message is given. Exactly one of \code{no} and 132#' \code{steps} must be supplied. 133#' @param steps The number of merge operations to perform to produce the 134#' communities. Exactly one of \code{no} and \code{steps} must be supplied. 135#' @param col A vector of colors, in any format that is accepted by the regular 136#' R plotting methods. This vector gives the colors of the vertices explicitly. 137#' @param mark.groups A list of numeric vectors. The communities can be 138#' highlighted using colored polygons. The groups for which the polygons are 139#' drawn are given here. The default is to use the groups given by the 140#' communities. Supply \code{NULL} here if you do not want to highlight any 141#' groups. 142#' @param edge.color The colors of the edges. By default the edges within 143#' communities are colored green and other edges are red. 144#' @param hang Numeric scalar indicating how the height of leaves should be 145#' computed from the heights of their parents; see \code{\link{plot.hclust}}. 146#' @param use.modularity Logical scalar, whether to use the modularity values 147#' to define the height of the branches. 148#' @param \dots Additional arguments. \code{plot.communities} passes these to 149#' \code{\link{plot.igraph}}. The other functions silently ignore 150#' them. 151#' @param membership Numeric vector, one value for each vertex, the membership 152#' vector of the community structure. Might also be \code{NULL} if the 153#' community structure is given in another way, e.g. by a merge matrix. 154#' @param algorithm If not \code{NULL} (meaning an unknown algorithm), then a 155#' character scalar, the name of the algorithm that produced the community 156#' structure. 157#' @param merges If not \code{NULL}, then the merge matrix of the hierarchical 158#' community structure. See \code{merges} below for more information on its 159#' format. 160#' @param modularity Numeric scalar or vector, the modularity value of the 161#' community structure. It can also be \code{NULL}, if the modularity of the 162#' (best) split is not available. 163#' @return \code{print} returns the \code{communities} object itself, 164#' invisibly. 165#' 166#' \code{length} returns an integer scalar. 167#' 168#' \code{sizes} returns a numeric vector. 169#' 170#' \code{membership} returns a numeric vector, one number for each vertex in 171#' the graph that was the input of the community detection. 172#' 173#' \code{modularity} returns a numeric scalar. 174#' 175#' \code{algorithm} returns a character scalar. 176#' 177#' \code{crossing} returns a logical vector. 178#' 179#' \code{is_hierarchical} returns a logical scalar. 180#' 181#' \code{merges} returns a two-column numeric matrix. 182#' 183#' \code{cut_at} returns a numeric vector, the membership vector of the 184#' vertices. 185#' 186#' \code{as.dendrogram} returns a \code{\link[stats]{dendrogram}} object. 187#' 188#' \code{show_trace} returns a character vector. 189#' 190#' \code{code_len} returns a numeric scalar for communities found with the 191#' InfoMAP method and \code{NULL} for other methods. 192#' 193#' \code{plot} for \code{communities} objects returns \code{NULL}, invisibly. 194#' 195#' #' @author Gabor Csardi \email{csardi.gabor@@gmail.com} 196#' @seealso See \code{\link{plot_dendrogram}} for plotting community structure 197#' dendrograms. 198#' 199#' See \code{\link{compare}} for comparing two community structures 200#' on the same graph. 201#' 202#' The different methods for finding communities, they all return a 203#' \code{communities} object: \code{\link{cluster_edge_betweenness}}, 204#' \code{\link{cluster_fast_greedy}}, 205#' \code{\link{cluster_label_prop}}, 206#' \code{\link{cluster_leading_eigen}}, 207#' \code{\link{cluster_louvain}}, \code{\link{cluster_leiden}}, 208#' \code{\link{cluster_optimal}}, \code{\link{cluster_spinglass}}, 209#' \code{\link{cluster_walktrap}}. 210#' @keywords graphs 211#' @export 212#' @examples 213#' 214#' karate <- make_graph("Zachary") 215#' wc <- cluster_walktrap(karate) 216#' modularity(wc) 217#' membership(wc) 218#' plot(wc, karate) 219#' 220 221membership <- function(communities) { 222 if (!is.null(communities$membership)) { 223 res <- communities$membership 224 } else if (!is.null(communities$merges) && 225 !is.null(communities$modularity)) { 226 res <- community.to.membership2(communities$merges, communities$vcount, 227 which.max(communities$modularity)) 228 } else { 229 stop("Cannot calculate community membership") 230 } 231 if (igraph_opt("add.vertex.names") && !is.null(communities$names)) { 232 names(res) <- communities$names 233 } 234 class(res) <- "membership" 235 res 236} 237 238#' @method print membership 239#' @export 240 241print.membership <- function(x, ...) print(unclass(x), ...) 242 243#' Declare a numeric vector as a membership vector 244#' 245#' This is useful if you want to use functions defined on 246#' membership vectors, but your membership vector does not 247#' come from an igraph clustering method. 248#' 249#' @param x The input vector. 250#' @return The input vector, with the \code{membership} class added. 251#' @export 252#' @examples 253#' ## Compare to the correct clustering 254#' g <- (make_full_graph(10) + make_full_graph(10)) %>% 255#' rewire(each_edge(p = 0.2)) 256#' correct <- rep(1:2, each = 10) %>% as_membership 257#' fc <- cluster_fast_greedy(g) 258#' compare(correct, fc) 259#' compare(correct, membership(fc)) 260 261as_membership <- function(x) add_class(x, "membership") 262 263#' @rdname communities 264#' @method print communities 265#' @export 266 267print.communities <- function(x, ...) { 268 269 noc <- if (!is.null(x$membership)) max(membership(x)) else NA 270 mod <- if (!is.null(x$modularity)) { 271 modularity(x) %>% format(digits = 2) 272 } else { 273 NA_real_ 274 } 275 alg <- x$algorithm %||% "unknown" 276 277 cat("IGRAPH clustering ", alg, ", groups: ", noc, ", mod: ", mod, "\n", sep="") 278 279 if (!is.null(x$membership)) { 280 grp <- groups(x) 281 cat("+ groups:\n") 282 hp <- function(o) { 283 head_print(o, max_lines = igraph_opt("auto.print.lines"), 284 omitted_footer = "+ ... omitted several groups/vertices\n",) 285 } 286 indent_print(grp, .printer = hp, .indent = " ") 287 288 } else { 289 cat(" + groups not available\n") 290 } 291 292 invisible(x) 293} 294 295#' Creates a communities object. 296#' 297#' This is useful to integrate the results of community finding algorithms 298#' that are not included in igraph. 299#' 300#' @param graph The graph of the community structure. 301#' @param membership The membership vector of the community structure, a 302#' numeric vector denoting the id of the community for each vertex. It 303#' might be \code{NULL} for hierarchical community structures. 304#' @param algorithm Character string, the algorithm that generated 305#' the community structure, it can be arbitrary. 306#' @param merges A merge matrix, for hierarchical community structures (or 307#' \code{NULL} otherwise. 308#' @param modularity Modularity value of the community structure. If this 309#' is \code{TRUE} and the membership vector is available, then it the 310#' modularity values is calculated automatically. 311#' @return A \code{communities} object. 312#' 313#' @aliases create.communities 314#' 315#' @export 316 317make_clusters <- function(graph, membership = NULL, algorithm = NULL, 318 merges = NULL, modularity = TRUE) { 319 320 stopifnot(is.null(membership) || is.numeric(membership)) 321 stopifnot(is.null(algorithm) || 322 (is.character(algorithm) && length(algorithm)==1)) 323 stopifnot(is.null(merges) || 324 (is.matrix(merges) && is.numeric(merges) && ncol(merges)==2)) 325 stopifnot(is.null(modularity) || 326 (is.logical(modularity) && length(modularity) == 1) || 327 (is.numeric(modularity) && 328 length(modularity) %in% c(1, length(membership)))) 329 330 if (is.logical(modularity)) { 331 if (modularity && !is.null(membership)) { 332 modularity <- modularity(graph, membership) 333 } else { 334 modularity <- NULL 335 } 336 } 337 338 res <- list(membership=membership, 339 algorithm=if (is.null(algorithm)) "unknown" else algorithm, 340 modularity=modularity) 341 if (!is.null(merges)) { 342 res$merges <- merges 343 } 344 if (!is.null(membership)) { 345 res$vcount <- length(membership) 346 } else if (!is.null(merges)) { 347 res$vcount <- nrow(merges) + 1 348 } 349 class(res) <- "communities" 350 res 351} 352 353#' @export 354 355modularity <- function(x, ...) 356 UseMethod("modularity") 357 358#' Modularity of a community structure of a graph 359#' 360#' This function calculates how modular is a given division of a graph into 361#' subgraphs. 362#' 363#' \code{modularity} calculates the modularity of a graph with respect to the 364#' given \code{membership} vector. 365#' 366#' The modularity of a graph with respect to some division (or vertex types) 367#' measures how good the division is, or how separated are the different vertex 368#' types from each other. It defined as \deqn{Q=\frac{1}{2m} \sum_{i,j} 369#' (A_{ij}-\frac{k_ik_j}{2m})\delta(c_i,c_j),}{Q=1/(2m) * sum( (Aij-ki*kj/(2m) 370#' ) delta(ci,cj),i,j),} here \eqn{m} is the number of edges, \eqn{A_{ij}}{Aij} 371#' is the element of the \eqn{A} adjacency matrix in row \eqn{i} and column 372#' \eqn{j}, \eqn{k_i}{ki} is the degree of \eqn{i}, \eqn{k_j}{kj} is the degree 373#' of \eqn{j}, \eqn{c_i}{ci} is the type (or component) of \eqn{i}, 374#' \eqn{c_j}{cj} that of \eqn{j}, the sum goes over all \eqn{i} and \eqn{j} 375#' pairs of vertices, and \eqn{\delta(x,y)}{delta(x,y)} is 1 if \eqn{x=y} and 0 376#' otherwise. 377#' 378#' If edge weights are given, then these are considered as the element of the 379#' \eqn{A} adjacency matrix, and \eqn{k_i}{ki} is the sum of weights of 380#' adjacent edges for vertex \eqn{i}. 381#' 382#' \code{modularity_matrix} calculates the modularity matrix. This is a dense matrix, 383#' and it is defined as the difference of the adjacency matrix and the 384#' configuration model null model matrix. In other words element 385#' \eqn{M_{ij}}{M[i,j]} is given as \eqn{A_{ij}-d_i 386#' d_j/(2m)}{A[i,j]-d[i]d[j]/(2m)}, where \eqn{A_{ij}}{A[i,j]} is the (possibly 387#' weighted) adjacency matrix, \eqn{d_i}{d[i]} is the degree of vertex \eqn{i}, 388#' and \eqn{m} is the number of edges (or the total weights in the graph, if it 389#' is weighed). 390#' 391#' @aliases modularity 392#' @param x,graph The input graph. 393#' @param membership Numeric vector, one value for each vertex, the membership 394#' vector of the community structure. 395#' @param weights If not \code{NULL} then a numeric vector giving edge weights. 396#' @param \dots Additional arguments, none currently. 397#' @return For \code{modularity} a numeric scalar, the modularity score of the 398#' given configuration. 399#' 400#' For \code{modularity_matrix} a numeric square matrix, its order is the number of 401#' vertices in the graph. 402#' @author Gabor Csardi \email{csardi.gabor@@gmail.com} 403#' @seealso \code{\link{cluster_walktrap}}, 404#' \code{\link{cluster_edge_betweenness}}, 405#' \code{\link{cluster_fast_greedy}}, \code{\link{cluster_spinglass}}, 406#' \code{\link{cluster_louvain}} and \code{\link{cluster_leiden}} for 407#' various community detection methods. 408#' @references Clauset, A.; Newman, M. E. J. & Moore, C. Finding community 409#' structure in very large networks, \emph{Physical Review E} 2004, 70, 066111 410#' @method modularity igraph 411#' @export 412#' @keywords graphs 413#' @examples 414#' 415#' g <- make_full_graph(5) %du% make_full_graph(5) %du% make_full_graph(5) 416#' g <- add_edges(g, c(1,6, 1,11, 6, 11)) 417#' wtc <- cluster_walktrap(g) 418#' modularity(wtc) 419#' modularity(g, membership(wtc)) 420#' 421 422modularity.igraph <- function(x, membership, weights=NULL, ...) { 423 # Argument checks 424 if (!is_igraph(x)) { stop("Not a graph object") } 425 if (is.null(membership) || (!is.numeric(membership) && !is.factor(membership))) { 426 stop("Membership is not a numerical vector") 427 } 428 membership <- as.numeric(membership) 429 if (!is.null(weights)) weights <- as.numeric(weights) 430 431 on.exit( .Call(C_R_igraph_finalizer) ) 432 # Function call 433 res <- .Call(C_R_igraph_modularity, x, membership-1, weights) 434 res 435} 436 437#' @rdname communities 438#' @method modularity communities 439#' @export 440 441modularity.communities <- function(x, ...) { 442 if (!is.null(x$modularity)) { 443 max(x$modularity) 444 } else { 445 stop("Modularity was not calculated") 446 } 447} 448 449#' @rdname modularity.igraph 450#' @aliases mod.matrix 451#' @export 452 453modularity_matrix <- function(graph, membership, weights=NULL) { 454 # Argument checks 455 if (!is_igraph(graph)) { stop("Not a graph object") } 456 if (!missing(membership)) { 457 warning("The membership argument is deprecated; modularity_matrix does not need it") 458 } 459 460 if (is.null(weights) && "weight" %in% edge_attr_names(graph)) { 461 weights <- E(graph)$weight 462 } 463 if (!is.null(weights) && any(!is.na(weights))) { 464 weights <- as.numeric(weights) 465 } else { 466 weights <- NULL 467 } 468 469 on.exit( .Call(C_R_igraph_finalizer) ) 470 # Function call 471 res <- .Call(C_R_igraph_modularity_matrix, graph, weights) 472 473 res 474} 475 476#' @rdname communities 477#' @method length communities 478#' @export 479 480length.communities <- function(x) { 481 m <- membership(x) 482 max(m) 483} 484 485#' @rdname communities 486#' @export 487 488sizes <- function(communities) { 489 m <- membership(communities) 490 table(`Community sizes`=m) 491} 492 493#' @rdname communities 494#' @export 495 496algorithm <- function(communities) { 497 communities$algorithm 498} 499 500#' @rdname communities 501#' @export 502 503merges <- function(communities) { 504 if (!is.null(communities$merges)) { 505 communities$merges 506 } else { 507 stop("Not a hierarchical community structure") 508 } 509} 510 511#' @rdname communities 512#' @export 513 514crossing <- function(communities, graph) { 515 m <- membership(communities) 516 el <- as_edgelist(graph, names=FALSE) 517 m1 <- m[el[,1]] 518 m2 <- m[el[,2]] 519 res <- m1 != m2 520 if (!is.null(names(m1))) { 521 names(res) <- paste(names(m1), names(m2), sep="|") 522 } 523 res 524} 525 526#' @rdname communities 527#' @export 528 529code_len <- function(communities) { 530 communities$codelength 531} 532 533#' @rdname communities 534#' @export 535 536is_hierarchical <- function(communities) { 537 ! is.null(communities$merges) 538} 539 540complete.dend <- function(comm, use.modularity) { 541 merges <- comm$merges 542 if (nrow(merges) < comm$vcount-1) { 543 if (use.modularity) { 544 stop(paste("`use.modularity' requires a full dendrogram,", 545 "i.e. a connected graph")) 546 } 547 miss <- seq_len(comm$vcount + nrow(merges))[-as.vector(merges)] 548 miss <- c(miss, seq_len(length(miss)-2) + comm$vcount+nrow(merges)) 549 miss <- matrix(miss, byrow=TRUE, ncol=2) 550 merges <- rbind(merges, miss) 551 } 552 storage.mode(merges) <- "integer" 553 554 merges 555} 556 557# The following functions were adapted from the stats R package 558 559#' @rdname communities 560#' @importFrom stats as.dendrogram 561#' @method as.dendrogram communities 562#' @export 563 564as.dendrogram.communities <- function(object, hang=-1, use.modularity=FALSE, 565 ...) { 566 if (!is_hierarchical(object)) { 567 stop("Not a hierarchical community structure") 568 } 569 570 .memberDend <- function(x) { 571 r <- attr(x,"x.member") 572 if(is.null(r)) { 573 r <- attr(x,"members") 574 if(is.null(r)) r <- 1:1 575 } 576 r 577 } 578 579 ## If multiple components, then we merge them in arbitrary order 580 merges <- complete.dend(object, use.modularity) 581 582 storage.mode(merges) <- "integer" 583 584 if (is.null(object$names)) { 585 object$names <- 1:(nrow(merges)+1) 586 } 587 z <- list() 588 if (!use.modularity || is.null(object$modularity)) { 589 object$height <- 1:nrow(merges) 590 } else { 591 object$height <- object$modularity[-1] 592 object$height <- cumsum(object$height - min(object$height)) 593 } 594 nMerge <- length(oHgt <- object$height) 595 if (nMerge != nrow(merges)) 596 stop("'merge' and 'height' do not fit!") 597 hMax <- oHgt[nMerge] 598 one <- 1L 599 two <- 2L 600 leafs <- nrow(merges)+1 601 for (k in 1:nMerge) { 602 x <- merges[k, ]# no sort() anymore! 603 if (any(neg <- x < leafs+1)) 604 h0 <- if (hang < 0) 0 else max(0, oHgt[k] - hang * hMax) 605 if (all(neg)) { # two leaves 606 zk <- as.list(x) 607 attr(zk, "members") <- two 608 attr(zk, "midpoint") <- 0.5 # mean( c(0,1) ) 609 objlabels <- object$names[x] 610 attr(zk[[1]], "label") <- objlabels[1] 611 attr(zk[[2]], "label") <- objlabels[2] 612 attr(zk[[1]], "members") <- attr(zk[[2]], "members") <- one 613 attr(zk[[1]], "height") <- attr(zk[[2]], "height") <- h0 614 attr(zk[[1]], "leaf") <- attr(zk[[2]], "leaf") <- TRUE 615 } 616 else if (any(neg)) { # one leaf, one node 617 X <- as.character(x) 618 ## Originally had "x <- sort(..) above => leaf always left, x[1]; 619 ## don't want to assume this 620 isL <- x[1] < leafs+1 ## is leaf left? 621 zk <- 622 if(isL) list(x[1], z[[X[2]]]) 623 else list(z[[X[1]]], x[2]) 624 attr(zk, "members") <- attr(z[[X[1 + isL]]], "members") + one 625 attr(zk, "midpoint") <- 626 (.memberDend(zk[[1]]) + attr(z[[X[1 + isL]]], "midpoint"))/2 627 attr(zk[[2 - isL]], "members") <- one 628 attr(zk[[2 - isL]], "height") <- h0 629 attr(zk[[2 - isL]], "label") <- object$names[x[2 - isL]] 630 attr(zk[[2 - isL]], "leaf") <- TRUE 631 } 632 else { # two nodes 633 x <- as.character(x) 634 zk <- list(z[[x[1]]], z[[x[2]]]) 635 attr(zk, "members") <- attr(z[[x[1]]], "members") + 636 attr(z[[x[2]]], "members") 637 attr(zk, "midpoint") <- (attr(z[[x[1]]], "members") + 638 attr(z[[x[1]]], "midpoint") + 639 attr(z[[x[2]]], "midpoint"))/2 640 } 641 attr(zk, "height") <- oHgt[k] 642 z[[k <- as.character(k+leafs)]] <- zk 643 } 644 z <- z[[k]] 645 class(z) <- "dendrogram" 646 z 647} 648 649#' @rdname communities 650#' @importFrom stats as.hclust 651#' @method as.hclust communities 652#' @export 653 654as.hclust.communities <- function(x, hang=-1, use.modularity=FALSE, 655 ...) { 656 as.hclust(as.dendrogram(x, hang=hang, use.modularity=use.modularity)) 657} 658 659#' @rdname communities 660#' @export 661 662as_phylo <- function(x, ...) 663 UseMethod("as_phylo") 664 665#' @rdname communities 666#' @method as_phylo communities 667#' @export 668 669as_phylo.communities <- function(x, use.modularity=FALSE, ...) { 670 671 if (!is_hierarchical(x)) { 672 stop("Not a hierarchical community structure") 673 } 674 675 ## If multiple components, then we merge them in arbitrary order 676 merges <- complete.dend(x, use.modularity) 677 678 if (!use.modularity || is.null(x$modularity)) { 679 height <- 1:nrow(merges) 680 } else { 681 height <- x$modularity[-1] 682 height <- cumsum(height - min(height)) 683 } 684 685 if (is.null(x$names)) { 686 labels <- 1:(nrow(merges)+1) 687 } else { 688 labels <- x$names 689 } 690 691 N <- nrow(merges) 692 edge <- matrix(0L, 2*N, 2) 693 edge.length <- numeric(2*N) 694 node <- integer(N) 695 node[N] <- N + 2L 696 cur.nod <- N + 3L 697 j <- 1L 698 for (i in N:1) { 699 edge[j:(j+1), 1] <- node[i] 700 for (l in 1:2) { 701 k <- j + l -1L 702 y <- merges[i, l] 703 if (y > N+1) { 704 edge[k, 2] <- node[y-N-1] <- cur.nod 705 cur.nod <- cur.nod + 1L 706 edge.length[k] <- height[i] - height[y-N-1] 707 } else { 708 edge[k, 2] <- y 709 edge.length[k] <- height[i] 710 } 711 } 712 j <- j + 2L 713 } 714 715 obj <- list(edge=edge, edge.length=edge.length/2, tip.label=labels, 716 Nnode=N) 717 class(obj) <- "phylo" 718 ape::reorder.phylo(obj) 719} 720 721#' @rdname communities 722#' @export 723 724cut_at <- function(communities, no, steps) { 725 726 if (!inherits(communities, "communities")) { 727 stop("Not a community structure") 728 } 729 if (!is_hierarchical(communities)) { 730 stop("Not a hierarchical communitity structure") 731 } 732 733 if ((!missing(no) && !missing(steps)) || 734 ( missing(no) && missing(steps))) { 735 stop("Please give either `no' or `steps' (but not both)") 736 } 737 738 if (!missing(steps)) { 739 mm <- merges(communities) 740 if (steps > nrow(mm)) { 741 warning("Cannot make that many steps") 742 steps <- nrow(mm) 743 } 744 community.to.membership2(mm, communities$vcount, steps) 745 } else { 746 mm <- merges(communities) 747 noc <- communities$vcount - nrow(mm) # final number of communities 748 if (no<noc) { 749 warning("Cannot have that few communities") 750 no=noc 751 } 752 steps <- communities$vcount-no 753 community.to.membership2(mm, communities$vcount, steps) 754 } 755} 756 757#' @rdname communities 758#' @export 759 760show_trace <- function(communities) { 761 762 if (!inherits(communities, "communities")) { 763 stop("Not a community structure") 764 } 765 if (is.null(communities$history)) { 766 stop("History was not recorded") 767 } 768 769 res <- character() 770 i <- 1 771 while (i <= length(communities$history)) { 772 if (communities$history[i] == 2) { # IGRAPH_LEVC_HIST_SPLIT 773 resnew <- paste("Splitting community", communities$history[i+1], 774 "into two.") 775 i <- i + 2 776 } else if (communities$history[i]==3) { # IGRAPH_LEVC_HIST_FAILED 777 resnew <- paste("Failed splitting community", 778 communities$history[i+1], "into two.") 779 i <- i + 2 780 } else if (communities$history[i]==4) { # IGRAPH_LEVC_START_FULL 781 resnew <- "Starting with the whole graph as a community." 782 i <- i + 1 783 } else if (communities$history[i]==5) { # IGRAPH_LEVC_START_GIVEN 784 resnew <- paste("Starting from the", communities$history[i+1], 785 "given communities.") 786 i <- i + 2 787 } 788 789 res <- c(res, resnew) 790 } 791 res 792} 793 794##################################################################### 795 796community.to.membership2 <- function(merges, vcount, steps) { 797 mode(merges) <- "numeric" 798 mode(vcount) <- "numeric" 799 mode(steps) <- "numeric" 800 on.exit( .Call(C_R_igraph_finalizer) ) 801 res <- .Call(C_R_igraph_community_to_membership2, merges-1, vcount, steps) 802 res+1 803} 804 805##################################################################### 806 807 808 809#' Finding communities in graphs based on statistical meachanics 810#' 811#' This function tries to find communities in graphs via a spin-glass model and 812#' simulated annealing. 813#' 814#' This function tries to find communities in a graph. A community is a set of 815#' nodes with many edges inside the community and few edges between outside it 816#' (i.e. between the community itself and the rest of the graph.) 817#' 818#' This idea is reversed for edges having a negative weight, ie. few negative 819#' edges inside a community and many negative edges between communities. Note 820#' that only the \sQuote{neg} implementation supports negative edge weights. 821#' 822#' The \code{spinglass.cummunity} function can solve two problems related to 823#' community detection. If the \code{vertex} argument is not given (or it is 824#' \code{NULL}), then the regular community detection problem is solved 825#' (approximately), i.e. partitioning the vertices into communities, by 826#' optimizing the an energy function. 827#' 828#' If the \code{vertex} argument is given and it is not \code{NULL}, then it 829#' must be a vertex id, and the same energy function is used to find the 830#' community of the the given vertex. See also the examples below. 831#' 832#' @aliases spinglass.community 833#' @param graph The input graph, can be directed but the direction of the edges 834#' is neglected. 835#' @param weights The weights of the edges. Either a numeric vector or 836#' \code{NULL}. If it is null and the input graph has a \sQuote{weight} edge 837#' attribute then that will be used. If \code{NULL} and no such attribute is 838#' present then the edges will have equal weights. Set this to \code{NA} if the 839#' graph was a \sQuote{weight} edge attribute, but you don't want to use it for 840#' community detection. A larger edge weight means a stronger connection 841#' for this function. 842#' @param vertex This parameter can be used to calculate the community of a 843#' given vertex without calculating all communities. Note that if this argument 844#' is present then some other arguments are ignored. 845#' @param spins Integer constant, the number of spins to use. This is the upper 846#' limit for the number of communities. It is not a problem to supply a 847#' (reasonably) big number here, in which case some spin states will be 848#' unpopulated. 849#' @param parupdate Logical constant, whether to update the spins of the 850#' vertices in parallel (synchronously) or not. This argument is ignored if the 851#' second form of the function is used (ie. the \sQuote{\code{vertex}} argument 852#' is present). It is also not implemented in the \dQuote{neg} implementation. 853#' @param start.temp Real constant, the start temperature. This argument is 854#' ignored if the second form of the function is used (ie. the 855#' \sQuote{\code{vertex}} argument is present). 856#' @param stop.temp Real constant, the stop temperature. The simulation 857#' terminates if the temperature lowers below this level. This argument is 858#' ignored if the second form of the function is used (ie. the 859#' \sQuote{\code{vertex}} argument is present). 860#' @param cool.fact Cooling factor for the simulated annealing. This argument 861#' is ignored if the second form of the function is used (ie. the 862#' \sQuote{\code{vertex}} argument is present). 863#' @param update.rule Character constant giving the \sQuote{null-model} of the 864#' simulation. Possible values: \dQuote{simple} and \dQuote{config}. 865#' \dQuote{simple} uses a random graph with the same number of edges as the 866#' baseline probability and \dQuote{config} uses a random graph with the same 867#' vertex degrees as the input graph. 868#' @param gamma Real constant, the gamma argument of the algorithm. This 869#' specifies the balance between the importance of present and non-present 870#' edges in a community. Roughly, a comunity is a set of vertices having many 871#' edges inside the community and few edges outside the community. The default 872#' 1.0 value makes existing and non-existing links equally important. Smaller 873#' values make the existing links, greater values the missing links more 874#' important. 875#' @param implementation Character scalar. Currently igraph contains two 876#' implementations for the Spin-glass community finding algorithm. The faster 877#' original implementation is the default. The other implementation, that takes 878#' into account negative weights, can be chosen by supplying \sQuote{neg} here. 879#' @param gamma.minus Real constant, the gamma.minus parameter of the 880#' algorithm. This specifies the balance between the importance of present and 881#' non-present negative weighted edges in a community. Smaller values of 882#' gamma.minus, leads to communities with lesser negative intra-connectivity. 883#' If this argument is set to zero, the algorithm reduces to a graph coloring 884#' algorithm, using the number of spins as the number of colors. This argument 885#' is ignored if the \sQuote{orig} implementation is chosen. 886#' @return If the \code{vertex} argument is not given, ie. the first form is 887#' used then a \code{\link{cluster_spinglass}} returns a 888#' \code{\link{communities}} object. 889#' 890#' If the \code{vertex} argument is present, ie. the second form is used then a 891#' named list is returned with the following components: 892#' \item{community}{Numeric vector giving the ids of the vertices in the same 893#' community as \code{vertex}.} \item{cohesion}{The cohesion score of the 894#' result, see references.} \item{adhesion}{The adhesion score of the result, 895#' see references.} \item{inner.links}{The number of edges within the community 896#' of \code{vertex}.} \item{outer.links}{The number of edges between the 897#' community of \code{vertex} and the rest of the graph. } 898#' @author Jorg Reichardt for the original code and Gabor Csardi 899#' \email{csardi.gabor@@gmail.com} for the igraph glue code. 900#' 901#' Changes to the original function for including the possibility of negative 902#' ties were implemented by Vincent Traag (\url{http://www.traag.net/}). 903#' @seealso \code{\link{communities}}, \code{\link{components}} 904#' @references J. Reichardt and S. Bornholdt: Statistical Mechanics of 905#' Community Detection, \emph{Phys. Rev. E}, 74, 016110 (2006), 906#' \url{https://arxiv.org/abs/cond-mat/0603718} 907#' 908#' M. E. J. Newman and M. Girvan: Finding and evaluating community structure in 909#' networks, \emph{Phys. Rev. E} 69, 026113 (2004) 910#' 911#' V.A. Traag and Jeroen Bruggeman: Community detection in networks with 912#' positive and negative links, \url{https://arxiv.org/abs/0811.2329} (2008). 913#' @export 914#' @keywords graphs 915#' @examples 916#' 917#' g <- sample_gnp(10, 5/10) %du% sample_gnp(9, 5/9) 918#' g <- add_edges(g, c(1, 12)) 919#' g <- induced_subgraph(g, subcomponent(g, 1)) 920#' cluster_spinglass(g, spins=2) 921#' cluster_spinglass(g, vertex=1) 922#' 923cluster_spinglass <- function(graph, weights=NULL, vertex=NULL, spins=25, 924 parupdate=FALSE, start.temp=1, 925 stop.temp=0.01, cool.fact=0.99, 926 update.rule=c("config", "random", "simple"), 927 gamma=1.0, implementation=c("orig", "neg"), 928 gamma.minus=1.0) { 929 930 if (!is_igraph(graph)) { 931 stop("Not a graph object") 932 } 933 934 if (is.null(weights) && "weight" %in% edge_attr_names(graph)) { 935 weights <- E(graph)$weight 936 } 937 if (!is.null(weights) && any(!is.na(weights))) { 938 weights <- as.numeric(weights) 939 } else { 940 weights <- NULL 941 } 942 943 update.rule <- igraph.match.arg(update.rule) 944 update.rule <- switch(update.rule, "simple"=0, "random"=0, "config"=1) 945 implementation <- switch(igraph.match.arg(implementation), 946 "orig"=0, "neg"=1) 947 948 on.exit( .Call(C_R_igraph_finalizer) ) 949 if (is.null(vertex)) { 950 res <- .Call(C_R_igraph_spinglass_community, graph, weights, 951 as.numeric(spins), as.logical(parupdate), 952 as.numeric(start.temp), 953 as.numeric(stop.temp), as.numeric(cool.fact), 954 as.numeric(update.rule), as.numeric(gamma), 955 as.numeric(implementation), as.numeric(gamma.minus)) 956 res$algorithm <- "spinglass" 957 res$vcount <- vcount(graph) 958 res$membership <- res$membership + 1 959 if (igraph_opt("add.vertex.names") && is_named(graph)) { 960 res$names <- vertex_attr(graph, "name") 961 } 962 class(res) <- "communities" 963 } else { 964 res <- .Call(C_R_igraph_spinglass_my_community, graph, weights, 965 as.igraph.vs(graph, vertex)-1, as.numeric(spins), 966 as.numeric(update.rule), as.numeric(gamma)) 967 res$community <- res$community + 1 968 } 969 res 970} 971 972#' Finding community structure of a graph using the Leiden algorithm of Traag, 973#' van Eck & Waltman. 974#' 975#' @param graph The input graph, only undirected graphs are supported. 976#' @param objective_function Whether to use the Constant Potts Model (CPM) or 977#' modularity. Must be either \code{"CPM"} or \code{"modularity"}. 978#' @param weights Optional edge weights to be used. Can be a vector or an edge 979#' attribute name. If the graph has a \code{weight} edge attribute, then this 980#' is used by default. Supply \code{NA} here if the graph has a \code{weight} 981#' edge attribute, but you want to ignore it. 982#' @param resolution_parameter The resolution parameter to use. Higher 983#' resolutions lead to more smaller communities, while lower resolutions lead 984#' to fewer larger communities. 985#' @param beta Parameter affecting the randomness in the Leiden algorithm. 986#' This affects only the refinement step of the algorithm. 987#' @param initial_membership If provided, the Leiden algorithm 988#' will try to improve this provided membership. If no argument is 989#' provided, the aglorithm simply starts from the singleton partition. 990#' @param n_iterations the number of iterations to iterate the Leiden 991#' algorithm. Each iteration may improve the partition further. 992#' @param vertex_weights the vertex weights used in the Leiden algorithm. 993#' If this is not provided, it will be automatically determined on the 994#' basis of whether you want to use CPM or modularity. If you do provide 995#' this, please make sure that you understand what you are doing. 996#' @return \code{cluster_leiden} returns a \code{\link{communities}} 997#' object, please see the \code{\link{communities}} manual page for details. 998#' @author Vincent Traag 999#' @seealso See \code{\link{communities}} for extracting the membership, 1000#' modularity scores, etc. from the results. 1001#' 1002#' Other community detection algorithms: \code{\link{cluster_walktrap}}, 1003#' \code{\link{cluster_spinglass}}, 1004#' \code{\link{cluster_leading_eigen}}, 1005#' \code{\link{cluster_edge_betweenness}}, 1006#' \code{\link{cluster_fast_greedy}}, 1007#' \code{\link{cluster_label_prop}} 1008#' \code{\link{cluster_louvain}} 1009#' @references Traag, V. A., Waltman, L., & van Eck, N. J. (2019). From Louvain 1010#' to Leiden: guaranteeing well-connected communities. Scientific 1011#' reports, 9(1), 5233. doi: 10.1038/s41598-019-41695-z 1012#' @export 1013#' @keywords graphs 1014#' @examples 1015#' g <- graph.famous("Zachary") 1016#' # By default CPM is used 1017#' g <- cluster_leiden(g, resolution_parameter=0.06) 1018#' 1019cluster_leiden <- function(graph, objective_function=c("CPM", "modularity"), 1020 weights=NULL, resolution_parameter=1, beta=0.01, 1021 initial_membership=NULL, n_iterations=2, vertex_weights=NULL) 1022{ 1023 if (!is_igraph(graph)) { 1024 stop("Not a graph object") 1025 } 1026 1027 # Parse objective function argument 1028 objective_function <- igraph.match.arg(objective_function) 1029 objective_function <- switch(objective_function, "cpm"=0, "modularity"=1) 1030 1031 # Parse edge weights argument 1032 if (is.null(weights) && "weight" %in% edge_attr_names(graph)) { 1033 weights <- E(graph)$weight 1034 } 1035 if (!is.null(weights) && !any(is.na(weights))) { 1036 weights <- as.numeric(weights) 1037 } else { 1038 weights <- NULL 1039 } 1040 1041 # Parse initial_membership argument 1042 if (!is.null(initial_membership) && !any(is.na(initial_membership))) { 1043 initial_membership <- as.numeric(initial_membership) 1044 } else { 1045 initial_membership <- NULL 1046 } 1047 1048 # Parse node weights argument 1049 if (!is.null(vertex_weights) && !any(is.na(vertex_weights))) { 1050 vertex_weights <- as.numeric(vertex_weights) 1051 if (objective_function == 1) { # Using modularity 1052 warning("Providing node weights contradicts using modularity") 1053 } 1054 } else { 1055 if (objective_function == 1) { # Using modularity 1056 # Set correct node weights 1057 vertex_weights <- strength(graph, weights=weights) 1058 # Also correct resolution parameter 1059 resolution_parameter <- resolution_parameter/sum(vertex_weights) 1060 } 1061 } 1062 1063 on.exit( .Call(C_R_igraph_finalizer) ) 1064 membership <- initial_membership 1065 if (n_iterations > 0) { 1066 for (i in 1:n_iterations) { 1067 res <- .Call(C_R_igraph_community_leiden, graph, weights, 1068 vertex_weights, as.numeric(resolution_parameter), 1069 as.numeric(beta), !is.null(membership), 1070 membership) 1071 membership <- res$membership 1072 } 1073 } else { 1074 prev_quality <- -Inf 1075 quality <- 0.0 1076 while (prev_quality < quality) { 1077 prev_quality <- quality; 1078 res <- .Call(C_R_igraph_community_leiden, graph, weights, 1079 vertex_weights, as.numeric(resolution_parameter), 1080 as.numeric(beta), !is.null(membership), 1081 membership) 1082 membership <- res$membership 1083 quality <- res$quality 1084 } 1085 } 1086 res$algorithm <- "leiden" 1087 res$vcount <- vcount(graph) 1088 res$membership <- res$membership + 1 1089 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1090 res$names <- vertex_attr(graph, "name") 1091 } 1092 class(res) <- "communities" 1093 res 1094} 1095 1096#' Community detection algorithm based on interacting fluids 1097#' 1098#' The algorithm detects communities based on the simple idea of 1099#' several fluids interacting in a non-homogeneous environment 1100#' (the graph topology), expanding and contracting based on their 1101#' interaction and density. 1102#' 1103#' @param graph The input graph. The graph must be simple and connected. 1104#' Empty graphs are not supported as well as single vertex graphs. 1105#' Edge directions are ignored. Weights are not considered. 1106#' @param no.of.communities The number of communities to be found. Must be 1107#' greater than 0 and fewer than number of vertices in the graph. 1108#' @return \code{cluster_fluid_communities} returns a \code{\link{communities}} 1109#' object, please see the \code{\link{communities}} manual page for details. 1110#' @author Ferran Parés 1111#' @seealso See \code{\link{communities}} for extracting the membership, 1112#' modularity scores, etc. from the results. 1113#' 1114#' Other community detection algorithms: \code{\link{cluster_walktrap}}, 1115#' \code{\link{cluster_spinglass}}, 1116#' \code{\link{cluster_leading_eigen}}, 1117#' \code{\link{cluster_edge_betweenness}}, 1118#' \code{\link{cluster_fast_greedy}}, 1119#' \code{\link{cluster_label_prop}} 1120#' \code{\link{cluster_louvain}}, 1121#' \code{\link{cluster_leiden}} 1122#' @references Parés F, Gasulla DG, et. al. (2018) Fluid Communities: A Competitive, 1123#' Scalable and Diverse Community Detection Algorithm. In: Complex Networks 1124#' & Their Applications VI: Proceedings of Complex Networks 2017 (The Sixth 1125#' International Conference on Complex Networks and Their Applications), 1126#' Springer, vol 689, p 229, doi: 10.1007/978-3-319-72150-7_19 1127#' @export 1128#' @keywords graphs 1129#' @examples 1130#' g <- graph.famous("Zachary") 1131#' comms <- cluster_fluid_communities(g, 2) 1132 1133cluster_fluid_communities <- cluster_fluid_communities 1134 1135#' Community strucure via short random walks 1136#' 1137#' This function tries to find densely connected subgraphs, also called 1138#' communities in a graph via random walks. The idea is that short random walks 1139#' tend to stay in the same community. 1140#' 1141#' This function is the implementation of the Walktrap community finding 1142#' algorithm, see Pascal Pons, Matthieu Latapy: Computing communities in large 1143#' networks using random walks, https://arxiv.org/abs/physics/0512106 1144#' 1145#' @aliases walktrap.community 1146#' @param graph The input graph, edge directions are ignored in directed 1147#' graphs. 1148#' @param weights The edge weights. Larger edge weights increase the 1149#' probability that an edge is selected by the random walker. In other 1150#' words, larger edge weights correspond to stronger connections. 1151#' @param steps The length of the random walks to perform. 1152#' @param merges Logical scalar, whether to include the merge matrix in the 1153#' result. 1154#' @param modularity Logical scalar, whether to include the vector of the 1155#' modularity scores in the result. If the \code{membership} argument is true, 1156#' then it will be always calculated. 1157#' @param membership Logical scalar, whether to calculate the membership vector 1158#' for the split corresponding to the highest modularity value. 1159#' @return \code{cluster_walktrap} returns a \code{\link{communities}} 1160#' object, please see the \code{\link{communities}} manual page for details. 1161#' @author Pascal Pons (\url{http://psl.pons.free.fr/}) and Gabor Csardi 1162#' \email{csardi.gabor@@gmail.com} for the R and igraph interface 1163#' @seealso See \code{\link{communities}} on getting the actual membership 1164#' vector, merge matrix, modularity score, etc. 1165#' 1166#' \code{\link{modularity}} and \code{\link{cluster_fast_greedy}}, 1167#' \code{\link{cluster_spinglass}}, 1168#' \code{\link{cluster_leading_eigen}}, 1169#' \code{\link{cluster_edge_betweenness}}, \code{\link{cluster_louvain}}, 1170#' and \code{\link{cluster_leiden}} for other community detection 1171#' methods. 1172#' @references Pascal Pons, Matthieu Latapy: Computing communities in large 1173#' networks using random walks, https://arxiv.org/abs/physics/0512106 1174#' @export 1175#' @keywords graphs 1176#' @examples 1177#' 1178#' g <- make_full_graph(5) %du% make_full_graph(5) %du% make_full_graph(5) 1179#' g <- add_edges(g, c(1,6, 1,11, 6, 11)) 1180#' cluster_walktrap(g) 1181#' 1182cluster_walktrap <- function(graph, weights=E(graph)$weight, steps=4, 1183 merges=TRUE, modularity=TRUE, 1184 membership=TRUE) { 1185 if (!is_igraph(graph)) { 1186 stop("Not a graph object!") 1187 } 1188 1189 if (membership && !modularity) { 1190 modularity <- TRUE 1191 } 1192 1193 if (!is.null(weights)) { 1194 weights <- as.numeric(weights) 1195 } 1196 1197 on.exit( .Call(C_R_igraph_finalizer) ) 1198 res <- .Call(C_R_igraph_walktrap_community, graph, weights, as.numeric(steps), 1199 as.logical(merges), as.logical(modularity), as.logical(membership)) 1200 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1201 res$names <- V(graph)$name 1202 } 1203 1204 res$vcount <- vcount(graph) 1205 res$algorithm <- "walktrap" 1206 res$membership <- res$membership + 1 1207 res$merges <- res$merges + 1 1208 class(res) <- "communities" 1209 res 1210} 1211 1212 1213 1214#' Community structure detection based on edge betweenness 1215#' 1216#' Many networks consist of modules which are densely connected themselves but 1217#' sparsely connected to other modules. 1218#' 1219#' The edge betweenness score of an edge measures the number of shortest paths 1220#' through it, see \code{\link{edge_betweenness}} for details. The idea of the 1221#' edge betweenness based community structure detection is that it is likely 1222#' that edges connecting separate modules have high edge betweenness as all the 1223#' shortest paths from one module to another must traverse through them. So if 1224#' we gradually remove the edge with the highest edge betweenness score we will 1225#' get a hierarchical map, a rooted tree, called a dendrogram of the graph. The 1226#' leafs of the tree are the individual vertices and the root of the tree 1227#' represents the whole graph. 1228#' 1229#' \code{cluster_edge_betweenness} performs this algorithm by calculating the 1230#' edge betweenness of the graph, removing the edge with the highest edge 1231#' betweenness score, then recalculating edge betweenness of the edges and 1232#' again removing the one with the highest score, etc. 1233#' 1234#' \code{edge.betweeness.community} returns various information collected 1235#' through the run of the algorithm. See the return value down here. 1236#' 1237#' @aliases edge.betweenness.community cluster_edge_betweenness 1238#' @param graph The graph to analyze. 1239#' @param weights The edge weights. Supply \code{NULL} to omit edge weights. By 1240#' default the \sQuote{\code{weight}} edge attribute is used, if it is present. 1241#' Edge weights are used to calculate weighted edge betweenness. This means 1242#' that edges are interpreted as distances, not as connection strengths. 1243#' @param directed Logical constant, whether to calculate directed edge 1244#' betweenness for directed graphs. It is ignored for undirected graphs. 1245#' @param edge.betweenness Logical constant, whether to return the edge 1246#' betweenness of the edges at the time of their removal. 1247#' @param merges Logical constant, whether to return the merge matrix 1248#' representing the hierarchical community structure of the network. This 1249#' argument is called \code{merges}, even if the community structure algorithm 1250#' itself is divisive and not agglomerative: it builds the tree from top to 1251#' bottom. There is one line for each merge (i.e. split) in matrix, the first 1252#' line is the first merge (last split). The communities are identified by 1253#' integer number starting from one. Community ids smaller than or equal to 1254#' \eqn{N}, the number of vertices in the graph, belong to singleton 1255#' communities, ie. individual vertices. Before the first merge we have \eqn{N} 1256#' communities numbered from one to \eqn{N}. The first merge, the first line of 1257#' the matrix creates community \eqn{N+1}, the second merge creates community 1258#' \eqn{N+2}, etc. 1259#' @param bridges Logical constant, whether to return a list the edge removals 1260#' which actually splitted a component of the graph. 1261#' @param modularity Logical constant, whether to calculate the maximum 1262#' modularity score, considering all possibly community structures along the 1263#' edge-betweenness based edge removals. 1264#' @param membership Logical constant, whether to calculate the membership 1265#' vector corresponding to the highest possible modularity score. 1266#' @return \code{cluster_edge_betweenness} returns a 1267#' \code{\link{communities}} object, please see the \code{\link{communities}} 1268#' manual page for details. 1269#' @author Gabor Csardi \email{csardi.gabor@@gmail.com} 1270#' @seealso \code{\link{edge_betweenness}} for the definition and calculation 1271#' of the edge betweenness, \code{\link{cluster_walktrap}}, 1272#' \code{\link{cluster_fast_greedy}}, 1273#' \code{\link{cluster_leading_eigen}} for other community detection 1274#' methods. 1275#' 1276#' See \code{\link{communities}} for extracting the results of the community 1277#' detection. 1278#' @references M Newman and M Girvan: Finding and evaluating community 1279#' structure in networks, \emph{Physical Review E} 69, 026113 (2004) 1280#' @export 1281#' @keywords graphs 1282#' @examples 1283#' 1284#' g <- sample_pa(100, m = 2, directed = FALSE) 1285#' eb <- cluster_edge_betweenness(g) 1286#' 1287#' g <- make_full_graph(10) %du% make_full_graph(10) 1288#' g <- add_edges(g, c(1,11)) 1289#' eb <- cluster_edge_betweenness(g) 1290#' eb 1291#' 1292cluster_edge_betweenness <- function(graph, weights=E(graph)$weight, 1293 directed=TRUE, 1294 edge.betweenness=TRUE, 1295 merges=TRUE, bridges=TRUE, 1296 modularity=TRUE, 1297 membership=TRUE) { 1298 if (!is_igraph(graph)) { 1299 stop("Not a graph object!") 1300 } 1301 1302 if (!is.null(weights)) { 1303 weights <- as.numeric(weights) 1304 } 1305 1306 on.exit( .Call(C_R_igraph_finalizer) ) 1307 res <- .Call(C_R_igraph_community_edge_betweenness, graph, weights, 1308 as.logical(directed), 1309 as.logical(edge.betweenness), 1310 as.logical(merges), as.logical(bridges), 1311 as.logical(modularity), as.logical(membership)) 1312 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1313 res$names <- V(graph)$name 1314 } 1315 res$vcount <- vcount(graph) 1316 res$algorithm <- "edge betweenness" 1317 res$membership <- res$membership + 1 1318 res$merges <- res$merges + 1 1319 res$removed.edges <- res$removed.edges + 1 1320 res$bridges <- res$bridges + 1 1321 class(res) <- "communities" 1322 res 1323} 1324 1325#' Community structure via greedy optimization of modularity 1326#' 1327#' This function tries to find dense subgraph, also called communities in 1328#' graphs via directly optimizing a modularity score. 1329#' 1330#' This function implements the fast greedy modularity optimization algorithm 1331#' for finding community structure, see A Clauset, MEJ Newman, C Moore: Finding 1332#' community structure in very large networks, 1333#' http://www.arxiv.org/abs/cond-mat/0408187 for the details. 1334#' 1335#' @aliases fastgreedy.community 1336#' @param graph The input graph 1337#' @param merges Logical scalar, whether to return the merge matrix. 1338#' @param modularity Logical scalar, whether to return a vector containing the 1339#' modularity after each merge. 1340#' @param membership Logical scalar, whether to calculate the membership vector 1341#' corresponding to the maximum modularity score, considering all possible 1342#' community structures along the merges. 1343#' @param weights If not \code{NULL}, then a numeric vector of edge weights. 1344#' The length must match the number of edges in the graph. By default the 1345#' \sQuote{\code{weight}} edge attribute is used as weights. If it is not 1346#' present, then all edges are considered to have the same weight. 1347#' Larger edge weights correspond to stronger connections. 1348#' @return \code{cluster_fast_greedy} returns a \code{\link{communities}} 1349#' object, please see the \code{\link{communities}} manual page for details. 1350#' @author Tamas Nepusz \email{ntamas@@gmail.com} and Gabor Csardi 1351#' \email{csardi.gabor@@gmail.com} for the R interface. 1352#' @seealso \code{\link{communities}} for extracting the results. 1353#' 1354#' See also \code{\link{cluster_walktrap}}, 1355#' \code{\link{cluster_spinglass}}, 1356#' \code{\link{cluster_leading_eigen}} and 1357#' \code{\link{cluster_edge_betweenness}}, \code{\link{cluster_louvain}} 1358#' \code{\link{cluster_leiden}} for other methods. 1359#' @references A Clauset, MEJ Newman, C Moore: Finding community structure in 1360#' very large networks, http://www.arxiv.org/abs/cond-mat/0408187 1361#' @export 1362#' @keywords graphs 1363#' @examples 1364#' 1365#' g <- make_full_graph(5) %du% make_full_graph(5) %du% make_full_graph(5) 1366#' g <- add_edges(g, c(1,6, 1,11, 6, 11)) 1367#' fc <- cluster_fast_greedy(g) 1368#' membership(fc) 1369#' sizes(fc) 1370#' 1371cluster_fast_greedy <- function(graph, merges=TRUE, modularity=TRUE, 1372 membership=TRUE, 1373 weights=E(graph)$weight) { 1374 if (!is_igraph(graph)) { 1375 stop("Not a graph object") 1376 } 1377 1378 if (!is.null(weights)) { 1379 weights <- as.numeric(weights) 1380 } 1381 1382 on.exit( .Call(C_R_igraph_finalizer) ) 1383 res <- .Call(C_R_igraph_community_fastgreedy, graph, as.logical(merges), 1384 as.logical(modularity), as.logical(membership), weights) 1385 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1386 res$names <- V(graph)$name 1387 } 1388 res$algorithm <- "fast greedy" 1389 res$vcount <- vcount(graph) 1390 res$membership <- res$membership + 1 1391 res$merges <- res$merges + 1 1392 class(res) <- "communities" 1393 res 1394} 1395 1396igraph.i.levc.arp <- function(externalP, externalE) { 1397 f <- function(v) { 1398 v <- as.numeric(v) 1399 .Call(C_R_igraph_i_levc_arp, externalP, externalE, v) 1400 } 1401 f 1402} 1403 1404 1405 1406#' Community structure detecting based on the leading eigenvector of the 1407#' community matrix 1408#' 1409#' This function tries to find densely connected subgraphs in a graph by 1410#' calculating the leading non-negative eigenvector of the modularity matrix of 1411#' the graph. 1412#' 1413#' The function documented in these section implements the \sQuote{leading 1414#' eigenvector} method developed by Mark Newman, see the reference below. 1415#' 1416#' The heart of the method is the definition of the modularity matrix, 1417#' \code{B}, which is \code{B=A-P}, \code{A} being the adjacency matrix of the 1418#' (undirected) network, and \code{P} contains the probability that certain 1419#' edges are present according to the \sQuote{configuration model}. In other 1420#' words, a \code{P[i,j]} element of \code{P} is the probability that there is 1421#' an edge between vertices \code{i} and \code{j} in a random network in which 1422#' the degrees of all vertices are the same as in the input graph. 1423#' 1424#' The leading eigenvector method works by calculating the eigenvector of the 1425#' modularity matrix for the largest positive eigenvalue and then separating 1426#' vertices into two community based on the sign of the corresponding element 1427#' in the eigenvector. If all elements in the eigenvector are of the same sign 1428#' that means that the network has no underlying comuunity structure. Check 1429#' Newman's paper to understand why this is a good method for detecting 1430#' community structure. 1431#' 1432#' @aliases leading.eigenvector.community 1433#' @param graph The input graph. Should be undirected as the method needs a 1434#' symmetric matrix. 1435#' @param steps The number of steps to take, this is actually the number of 1436#' tries to make a step. It is not a particularly useful parameter. 1437#' @param weights An optional weight vector. The \sQuote{weight} edge attribute 1438#' is used if present. Supply \sQuote{\code{NA}} here if you want to ignore the 1439#' \sQuote{weight} edge attribute. Larger edge weights correspond to stronger 1440#' connections between vertices. 1441#' @param start \code{NULL}, or a numeric membership vector, giving the start 1442#' configuration of the algorithm. 1443#' @param options A named list to override some ARPACK options. 1444#' @param callback If not \code{NULL}, then it must be callback function. This 1445#' is called after each iteration, after calculating the leading eigenvector of 1446#' the modularity matrix. See details below. 1447#' @param extra Additional argument to supply to the callback function. 1448#' @param env The environment in which the callback function is evaluated. 1449#' @return \code{cluster_leading_eigen} returns a named list with the 1450#' following members: \item{membership}{The membership vector at the end of the 1451#' algorithm, when no more splits are possible.} \item{merges}{The merges 1452#' matrix starting from the state described by the \code{membership} member. 1453#' This is a two-column matrix and each line describes a merge of two 1454#' communities, the first line is the first merge and it creates community 1455#' \sQuote{\code{N}}, \code{N} is the number of initial communities in the 1456#' graph, the second line creates community \code{N+1}, etc. } 1457#' \item{options}{Information about the underlying ARPACK computation, see 1458#' \code{\link{arpack}} for details. } 1459#' @section Callback functions: The \code{callback} argument can be used to 1460#' supply a function that is called after each eigenvector calculation. The 1461#' following arguments are supplied to this function: \describe{ 1462#' \item{membership}{The actual membership vector, with zero-based indexing.} 1463#' \item{community}{The community that the algorithm just tried to split, 1464#' community numbering starts with zero here.} 1465#' \item{value}{The eigenvalue belonging to the leading eigenvector the 1466#' algorithm just found.} 1467#' \item{vector}{The leading eigenvector the algorithm just found.} 1468#' \item{multiplier}{An R function that can be used to multiple the actual 1469#' modularity matrix with an arbitrary vector. Supply the vector as an 1470#' argument to perform this multiplication. This function can be used 1471#' with ARPACK.} 1472#' \item{extra}{The \code{extra} argument that was passed to 1473#' \code{cluster_leading_eigen}. } 1474#' The callback function should return a scalar number. If this number 1475#' is non-zero, then the clustering is terminated. 1476#' } 1477#' @author Gabor Csardi \email{csardi.gabor@@gmail.com} 1478#' @seealso \code{\link{modularity}}, \code{\link{cluster_walktrap}}, 1479#' \code{\link{cluster_edge_betweenness}}, 1480#' \code{\link{cluster_fast_greedy}}, \code{\link[stats]{as.dendrogram}} 1481#' @references MEJ Newman: Finding community structure using the eigenvectors 1482#' of matrices, Physical Review E 74 036104, 2006. 1483#' @export 1484#' @keywords graphs 1485#' @examples 1486#' 1487#' g <- make_full_graph(5) %du% make_full_graph(5) %du% make_full_graph(5) 1488#' g <- add_edges(g, c(1,6, 1,11, 6, 11)) 1489#' lec <- cluster_leading_eigen(g) 1490#' lec 1491#' 1492#' cluster_leading_eigen(g, start=membership(lec)) 1493#' 1494cluster_leading_eigen <- function(graph, steps=-1, weights=NULL, 1495 start=NULL, 1496 options=arpack_defaults, 1497 callback=NULL, extra=NULL, 1498 env=parent.frame()){ 1499 1500 # Argument checks 1501 if (!is_igraph(graph)) { stop("Not a graph object") } 1502 steps <- as.integer(steps) 1503 if (is.null(weights) && "weight" %in% edge_attr_names(graph)) { 1504 weights <- E(graph)$weight 1505 } 1506 if (!is.null(weights) && any(!is.na(weights))) { 1507 weights <- as.numeric(weights) 1508 } else { 1509 weights <- NULL 1510 } 1511 if (!is.null(start)) { start <- as.numeric(start)-1 } 1512 options.tmp <- arpack_defaults; options.tmp[ names(options) ] <- options ; options <- options.tmp 1513 1514 on.exit( .Call(C_R_igraph_finalizer) ) 1515 # Function call 1516 res <- .Call(C_R_igraph_community_leading_eigenvector, graph, steps, 1517 weights, options, start, callback, extra, env, 1518 environment(igraph.i.levc.arp)) 1519 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1520 res$names <- V(graph)$name 1521 } 1522 res$algorithm <- "leading eigenvector" 1523 res$vcount <- vcount(graph) 1524 res$membership <- res$membership + 1 1525 res$merges <- res$merges + 1 1526 res$history <- res$history + 1 1527 class(res) <- "communities" 1528 res 1529} 1530 1531#' Finding communities based on propagating labels 1532#' 1533#' This is a fast, nearly linear time algorithm for detecting community 1534#' structure in networks. In works by labeling the vertices with unique labels 1535#' and then updating the labels by majority voting in the neighborhood of the 1536#' vertex. 1537#' 1538#' This function implements the community detection method described in: 1539#' Raghavan, U.N. and Albert, R. and Kumara, S.: Near linear time algorithm to 1540#' detect community structures in large-scale networks. Phys Rev E 76, 036106. 1541#' (2007). This version extends the original method by the ability to take edge 1542#' weights into consideration and also by allowing some labels to be fixed. 1543#' 1544#' From the abstract of the paper: \dQuote{In our algorithm every node is 1545#' initialized with a unique label and at every step each node adopts the label 1546#' that most of its neighbors currently have. In this iterative process densely 1547#' connected groups of nodes form a consensus on a unique label to form 1548#' communities.} 1549#' 1550#' @aliases label.propagation.community 1551#' @param graph The input graph, should be undirected to make sense. 1552#' @param weights An optional weight vector. It should contain a positive 1553#' weight for all the edges. The \sQuote{weight} edge attribute is used if 1554#' present. Supply \sQuote{\code{NA}} here if you want to ignore the 1555#' \sQuote{weight} edge attribute. Larger edge weights correspond to 1556#' stronger connections. 1557#' @param initial The initial state. If \code{NULL}, every vertex will have a 1558#' different label at the beginning. Otherwise it must be a vector with an 1559#' entry for each vertex. Non-negative values denote different labels, negative 1560#' entries denote vertices without labels. 1561#' @param fixed Logical vector denoting which labels are fixed. Of course this 1562#' makes sense only if you provided an initial state, otherwise this element 1563#' will be ignored. Also note that vertices without labels cannot be fixed. 1564#' @return \code{cluster_label_prop} returns a 1565#' \code{\link{communities}} object, please see the \code{\link{communities}} 1566#' manual page for details. 1567#' @author Tamas Nepusz \email{ntamas@@gmail.com} for the C implementation, 1568#' Gabor Csardi \email{csardi.gabor@@gmail.com} for this manual page. 1569#' @seealso \code{\link{communities}} for extracting the actual results. 1570#' 1571#' \code{\link{cluster_fast_greedy}}, \code{\link{cluster_walktrap}}, 1572#' \code{\link{cluster_spinglass}}, \code{\link{cluster_louvain}} and 1573#' \code{\link{cluster_leiden}} for other community detection methods. 1574#' @references Raghavan, U.N. and Albert, R. and Kumara, S.: Near linear time 1575#' algorithm to detect community structures in large-scale networks. \emph{Phys 1576#' Rev E} 76, 036106. (2007) 1577#' @export 1578#' @keywords graphs 1579#' @examples 1580#' 1581#' g <- sample_gnp(10, 5/10) %du% sample_gnp(9, 5/9) 1582#' g <- add_edges(g, c(1, 12)) 1583#' cluster_label_prop(g) 1584#' 1585cluster_label_prop <- function(graph, weights=NULL, initial=NULL, 1586 fixed=NULL) { 1587 # Argument checks 1588 if (!is_igraph(graph)) { stop("Not a graph object") } 1589 if (is.null(weights) && "weight" %in% edge_attr_names(graph)) { 1590 weights <- E(graph)$weight 1591 } 1592 if (!is.null(weights) && any(!is.na(weights))) { 1593 weights <- as.numeric(weights) 1594 } else { 1595 weights <- NULL 1596 } 1597 if (!is.null(initial)) initial <- as.numeric(initial) 1598 if (!is.null(fixed)) fixed <- as.logical(fixed) 1599 1600 on.exit( .Call(C_R_igraph_finalizer) ) 1601 # Function call 1602 res <- .Call(C_R_igraph_community_label_propagation, graph, weights, initial, fixed) 1603 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1604 res$names <- V(graph)$name 1605 } 1606 res$vcount <- vcount(graph) 1607 res$algorithm <- "label propagation" 1608 res$membership <- res$membership + 1 1609 class(res) <- "communities" 1610 res 1611} 1612 1613 1614 1615#' Finding community structure by multi-level optimization of modularity 1616#' 1617#' This function implements the multi-level modularity optimization algorithm 1618#' for finding community structure, see references below. It is based on the 1619#' modularity measure and a hierarchical approach. 1620#' 1621#' This function implements the multi-level modularity optimization algorithm 1622#' for finding community structure, see VD Blondel, J-L Guillaume, R Lambiotte 1623#' and E Lefebvre: Fast unfolding of community hierarchies in large networks, 1624#' \url{https://arxiv.org/abs/0803.0476} for the details. 1625#' 1626#' It is based on the modularity measure and a hierarchical approach. 1627#' Initially, each vertex is assigned to a community on its own. In every step, 1628#' vertices are re-assigned to communities in a local, greedy way: each vertex 1629#' is moved to the community with which it achieves the highest contribution to 1630#' modularity. When no vertices can be reassigned, each community is considered 1631#' a vertex on its own, and the process starts again with the merged 1632#' communities. The process stops when there is only a single vertex left or 1633#' when the modularity cannot be increased any more in a step. 1634#' 1635#' This function was contributed by Tom Gregorovic. 1636#' 1637#' @aliases multilevel.community 1638#' @param graph The input graph. 1639#' @param weights Optional positive weight vector. If the graph has a 1640#' \code{weight} edge attribute, then this is used by default. Supply \code{NA} 1641#' here if the graph has a \code{weight} edge attribute, but you want to ignore 1642#' it. Larger edge weights correspond to stronger connections. 1643#' @return \code{cluster_louvain} returns a \code{\link{communities}} 1644#' object, please see the \code{\link{communities}} manual page for details. 1645#' @author Tom Gregorovic, Tamas Nepusz \email{ntamas@@gmail.com} 1646#' @seealso See \code{\link{communities}} for extracting the membership, 1647#' modularity scores, etc. from the results. 1648#' 1649#' Other community detection algorithms: \code{\link{cluster_walktrap}}, 1650#' \code{\link{cluster_spinglass}}, 1651#' \code{\link{cluster_leading_eigen}}, 1652#' \code{\link{cluster_edge_betweenness}}, 1653#' \code{\link{cluster_fast_greedy}}, 1654#' \code{\link{cluster_label_prop}} 1655#' \code{\link{cluster_leiden}} 1656#' @references Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, 1657#' Etienne Lefebvre: Fast unfolding of communities in large networks. J. Stat. 1658#' Mech. (2008) P10008 1659#' @export 1660#' @keywords graphs 1661#' @examples 1662#' 1663#' # This is so simple that we will have only one level 1664#' g <- make_full_graph(5) %du% make_full_graph(5) %du% make_full_graph(5) 1665#' g <- add_edges(g, c(1,6, 1,11, 6, 11)) 1666#' cluster_louvain(g) 1667#' 1668cluster_louvain <- function(graph, weights=NULL) { 1669 # Argument checks 1670 if (!is_igraph(graph)) { stop("Not a graph object") } 1671 if (is.null(weights) && "weight" %in% edge_attr_names(graph)) { 1672 weights <- E(graph)$weight 1673 } 1674 if (!is.null(weights) && any(!is.na(weights))) { 1675 weights <- as.numeric(weights) 1676 } else { 1677 weights <- NULL 1678 } 1679 1680 on.exit( .Call(C_R_igraph_finalizer) ) 1681 # Function call 1682 res <- .Call(C_R_igraph_community_multilevel, graph, weights) 1683 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1684 res$names <- V(graph)$name 1685 } 1686 res$vcount <- vcount(graph) 1687 res$algorithm <- "multi level" 1688 res$membership <- res$membership + 1 1689 res$memberships <- res$memberships + 1 1690 class(res) <- "communities" 1691 res 1692} 1693 1694 1695 1696#' Optimal community structure 1697#' 1698#' This function calculates the optimal community structure of a graph, by 1699#' maximizing the modularity measure over all possible partitions. 1700#' 1701#' This function calculates the optimal community structure for a graph, in 1702#' terms of maximal modularity score. 1703#' 1704#' The calculation is done by transforming the modularity maximization into an 1705#' integer programming problem, and then calling the GLPK library to solve 1706#' that. Please the reference below for details. 1707#' 1708#' Note that modularity optimization is an NP-complete problem, and all known 1709#' algorithms for it have exponential time complexity. This means that you 1710#' probably don't want to run this function on larger graphs. Graphs with up to 1711#' fifty vertices should be fine, graphs with a couple of hundred vertices 1712#' might be possible. 1713#' 1714#' @section Examples: 1715#' \preformatted{ 1716#' 1717#' ## Zachary's karate club 1718#' g <- make_graph("Zachary") 1719#' 1720#' ## We put everything into a big 'try' block, in case 1721#' ## igraph was compiled without GLPK support 1722#' 1723#' ## The calculation only takes a couple of seconds 1724#' oc <- cluster_optimal(g) 1725#' 1726#' ## Double check the result 1727#' print(modularity(oc)) 1728#' print(modularity(g, membership(oc))) 1729#' 1730#' ## Compare to the greedy optimizer 1731#' fc <- cluster_fast_greedy(g) 1732#' print(modularity(fc)) 1733#' } 1734#' 1735#' @aliases optimal.community 1736#' @param graph The input graph. Edge directions are ignored for directed 1737#' graphs. 1738#' @param weights Optional positive weight vector for optimizing weighted 1739#' modularity. If the graph has a \code{weight} edge attribute, then this is 1740#' used by default. Supply \code{NA} to ignore the weights of a weighted graph. 1741#' Larger edge weights correspond to stronger connections. 1742#' @return \code{cluster_optimal} returns a \code{\link{communities}} object, 1743#' please see the \code{\link{communities}} manual page for details. 1744#' @author Gabor Csardi \email{csardi.gabor@@gmail.com} 1745#' @seealso \code{\link{communities}} for the documentation of the result, 1746#' \code{\link{modularity}}. See also \code{\link{cluster_fast_greedy}} for a 1747#' fast greedy optimizer. 1748#' @references Ulrik Brandes, Daniel Delling, Marco Gaertler, Robert Gorke, 1749#' Martin Hoefer, Zoran Nikoloski, Dorothea Wagner: On Modularity Clustering, 1750#' \emph{IEEE Transactions on Knowledge and Data Engineering} 20(2):172-188, 1751#' 2008. 1752#' @export 1753#' @keywords graphs 1754 1755cluster_optimal <- function(graph, weights=NULL) { 1756 # Argument checks 1757 if (!is_igraph(graph)) { stop("Not a graph object") } 1758 if (is.null(weights) && "weight" %in% edge_attr_names(graph)) { 1759 weights <- E(graph)$weight 1760 } 1761 if (!is.null(weights) && any(!is.na(weights))) { 1762 weights <- as.numeric(weights) 1763 } else { 1764 weights <- NULL 1765 } 1766 1767 on.exit( .Call(C_R_igraph_finalizer) ) 1768 # Function call 1769 res <- .Call(C_R_igraph_community_optimal_modularity, graph, weights) 1770 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1771 res$names <- V(graph)$name 1772 } 1773 res$vcount <- vcount(graph) 1774 res$algorithm <- "optimal" 1775 res$membership <- res$membership + 1 1776 class(res) <- "communities" 1777 res 1778} 1779 1780 1781 1782#' Infomap community finding 1783#' 1784#' Find community structure that minimizes the expected description length of a 1785#' random walker trajectory 1786#' 1787#' Please see the details of this method in the references given below. 1788#' 1789#' @aliases infomap.community 1790#' @param graph The input graph. 1791#' @param e.weights If not \code{NULL}, then a numeric vector of edge weights. 1792#' The length must match the number of edges in the graph. By default the 1793#' \sQuote{\code{weight}} edge attribute is used as weights. If it is not 1794#' present, then all edges are considered to have the same weight. 1795#' Larger edge weights correspond to stronger connections. 1796#' @param v.weights If not \code{NULL}, then a numeric vector of vertex 1797#' weights. The length must match the number of vertices in the graph. By 1798#' default the \sQuote{\code{weight}} vertex attribute is used as weights. If 1799#' it is not present, then all vertices are considered to have the same weight. 1800#' A larger vertex weight means a larger probability that the random surfer 1801#' jumps to that vertex. 1802#' @param nb.trials The number of attempts to partition the network (can be any 1803#' integer value equal or larger than 1). 1804#' @param modularity Logical scalar, whether to calculate the modularity score 1805#' of the detected community structure. 1806#' @return \code{cluster_infomap} returns a \code{\link{communities}} object, 1807#' please see the \code{\link{communities}} manual page for details. 1808#' @author Martin Rosvall wrote the original C++ code. This was ported to 1809#' be more igraph-like by Emmanuel Navarro. The R interface and 1810#' some cosmetics was done by Gabor Csardi \email{csardi.gabor@@gmail.com}. 1811#' @seealso Other community finding methods and \code{\link{communities}}. 1812#' @references The original paper: M. Rosvall and C. T. Bergstrom, Maps of 1813#' information flow reveal community structure in complex networks, \emph{PNAS} 1814#' 105, 1118 (2008) \doi{10.1073/pnas.0706851105}, \url{https://arxiv.org/abs/0707.0609} 1815#' 1816#' A more detailed paper: M. Rosvall, D. Axelsson, and C. T. Bergstrom, The map 1817#' equation, \emph{Eur. Phys. J. Special Topics} 178, 13 (2009). 1818#' \doi{10.1140/epjst/e2010-01179-1}, \url{https://arxiv.org/abs/0906.1405}. 1819#' @export 1820#' @keywords graphs 1821#' @examples 1822#' 1823#' ## Zachary's karate club 1824#' g <- make_graph("Zachary") 1825#' 1826#' imc <- cluster_infomap(g) 1827#' membership(imc) 1828#' communities(imc) 1829#' 1830cluster_infomap <- function(graph, e.weights=NULL, v.weights=NULL, 1831 nb.trials=10, modularity=TRUE) { 1832 1833 # Argument checks 1834 if (!is_igraph(graph)) { stop("Not a graph object") } 1835 if (is.null(e.weights) && "weight" %in% edge_attr_names(graph)) { 1836 e.weights <- E(graph)$weight 1837 } 1838 if (!is.null(e.weights) && any(!is.na(e.weights))) { 1839 e.weights <- as.numeric(e.weights) 1840 } else { 1841 e.weights <- NULL 1842 } 1843 if (is.null(v.weights) && "weight" %in% vertex_attr_names(graph)) { 1844 v.weights <- V(graph)$weight 1845 } 1846 if (!is.null(v.weights) && any(!is.na(v.weights))) { 1847 v.weights <- as.numeric(v.weights) 1848 } else { 1849 v.weights <- NULL 1850 } 1851 nb.trials <- as.integer(nb.trials) 1852 1853 on.exit( .Call(C_R_igraph_finalizer) ) 1854 # Function call 1855 res <- .Call(C_R_igraph_community_infomap, graph, e.weights, 1856 v.weights, nb.trials) 1857 1858 if (igraph_opt("add.vertex.names") && is_named(graph)) { 1859 res$names <- V(graph)$name 1860 } 1861 res$vcount <- vcount(graph) 1862 res$algorithm <- "infomap" 1863 res$membership <- res$membership + 1 1864 if (modularity) { 1865 res$modularity <- modularity(graph, res$membership, weights=e.weights) 1866 } 1867 class(res) <- "communities" 1868 res 1869} 1870 1871#' @rdname communities 1872#' @method plot communities 1873#' @export 1874#' @importFrom graphics plot 1875 1876plot.communities <- function(x, y, 1877 col=membership(x), 1878 mark.groups=communities(x), 1879 edge.color=c("black", "red")[crossing(x,y)+1], 1880 ...) { 1881 1882 plot(y, vertex.color=col, mark.groups=mark.groups, 1883 edge.color=edge.color, 1884 ...) 1885} 1886 1887 1888 1889#' @rdname plot_dendrogram.communities 1890#' @aliases dendPlot 1891#' @export 1892 1893plot_dendrogram <- function(x, mode=igraph_opt("dend.plot.type"), ...) 1894 UseMethod("plot_dendrogram") 1895 1896 1897 1898#' Community structure dendrogram plots 1899#' 1900#' Plot a hierarchical community structure as a dendrogram. 1901#' 1902#' \code{plot_dendrogram} supports three different plotting functions, selected via 1903#' the \code{mode} argument. By default the plotting function is taken from the 1904#' \code{dend.plot.type} igraph option, and it has for possible values: 1905#' \itemize{ \item \code{auto} Choose automatically between the plotting 1906#' functions. As \code{plot.phylo} is the most sophisticated, that is choosen, 1907#' whenever the \code{ape} package is available. Otherwise \code{plot.hclust} 1908#' is used. \item \code{phylo} Use \code{plot.phylo} from the \code{ape} 1909#' package. \item \code{hclust} Use \code{plot.hclust} from the \code{stats} 1910#' package. \item \code{dendrogram} Use \code{plot.dendrogram} from the 1911#' \code{stats} package. } 1912#' 1913#' The different plotting functions take different sets of arguments. When 1914#' using \code{plot.phylo} (\code{mode="phylo"}), we have the following syntax: 1915#' \preformatted{ 1916#' plot_dendrogram(x, mode="phylo", colbar = palette(), 1917#' edge.color = NULL, use.edge.length = FALSE, \dots) 1918#' } The extra arguments not documented above: \itemize{ 1919#' \item \code{colbar} Color bar for the edges. 1920#' \item \code{edge.color} Edge colors. If \code{NULL}, then the 1921#' \code{colbar} argument is used. 1922#' \item \code{use.edge.length} Passed to \code{plot.phylo}. 1923#' \item \code{dots} Attitional arguments to pass to \code{plot.phylo}. 1924#' } 1925#' 1926#' The syntax for \code{plot.hclust} (\code{mode="hclust"}): \preformatted{ 1927#' plot_dendrogram(x, mode="hclust", rect = 0, colbar = palette(), 1928#' hang = 0.01, ann = FALSE, main = "", sub = "", xlab = "", 1929#' ylab = "", \dots) 1930#' } The extra arguments not documented above: \itemize{ 1931#' \item \code{rect} A numeric scalar, the number of groups to mark on 1932#' the dendrogram. The dendrogram is cut into exactly \code{rect} 1933#' groups and they are marked via the \code{rect.hclust} command. Set 1934#' this to zero if you don't want to mark any groups. 1935#' \item \code{colbar} The colors of the rectangles that mark the 1936#' vertex groups via the \code{rect} argument. 1937#' \item \code{hang} Where to put the leaf nodes, this corresponds to the 1938#' \code{hang} argument of \code{plot.hclust}. 1939#' \item \code{ann} Whether to annotate the plot, the \code{ann} 1940#' argument of \code{plot.hclust}. 1941#' \item \code{main} The main title of the plot, the \code{main} argument 1942#' of \code{plot.hclust}. 1943#' \item \code{sub} The sub-title of the plot, the \code{sub} argument of 1944#' \code{plot.hclust}. 1945#' \item \code{xlab} The label on the horizontal axis, passed to 1946#' \code{plot.hclust}. 1947#' \item \code{ylab} The label on the vertical axis, passed to 1948#' \code{plot.hclust}. 1949#' \item \code{dots} Attitional arguments to pass to \code{plot.hclust}. 1950#' } 1951#' 1952#' The syntax for \code{plot.dendrogram} (\code{mode="dendrogram"}): 1953#' \preformatted{ 1954#' plot_dendrogram(x, \dots) 1955#' } The extra arguments are simply passed to \code{as.dendrogram}. 1956#' 1957#' @param x An object containing the community structure of a graph. See 1958#' \code{\link{communities}} for details. 1959#' @param mode Which dendrogram plotting function to use. See details below. 1960#' @param \dots Additional arguments to supply to the dendrogram plotting 1961#' function. 1962#' @param use.modularity Logical scalar, whether to use the modularity values 1963#' to define the height of the branches. 1964#' @param palette The color palette to use for colored plots. 1965#' @return Returns whatever the return value was from the plotting function, 1966#' \code{plot.phylo}, \code{plot.dendrogram} or \code{plot.hclust}. 1967#' @author Gabor Csardi \email{csardi.gabor@@gmail.com} 1968#' @method plot_dendrogram communities 1969#' @export 1970#' @keywords graphs 1971#' @examples 1972#' 1973#' karate <- make_graph("Zachary") 1974#' fc <- cluster_fast_greedy(karate) 1975#' plot_dendrogram(fc) 1976#' 1977plot_dendrogram.communities <- function(x, 1978 mode=igraph_opt("dend.plot.type"), ..., 1979 use.modularity=FALSE, 1980 palette=categorical_pal(8)) { 1981 mode <- igraph.match.arg(mode, c("auto", "phylo", "hclust", "dendrogram")) 1982 1983 old_palette <- palette(palette) 1984 on.exit(palette(old_palette), add = TRUE) 1985 1986 if (mode=="auto") { 1987 have_ape <- requireNamespace("ape", quietly = TRUE) 1988 mode <- if (have_ape) "phylo" else "hclust" 1989 } 1990 1991 if (mode=="hclust") { 1992 dendPlotHclust(x, use.modularity=use.modularity, ...) 1993 } else if (mode=="dendrogram") { 1994 dendPlotDendrogram(x, use.modularity=use.modularity, ...) 1995 } else if (mode=="phylo") { 1996 dendPlotPhylo(x, use.modularity=use.modularity, ...) 1997 } 1998} 1999 2000#' @importFrom grDevices palette 2001#' @importFrom graphics plot 2002#' @importFrom stats rect.hclust 2003 2004dendPlotHclust <- function(communities, rect=length(communities), 2005 colbar=palette(), hang=-1, ann=FALSE, 2006 main="", sub="", xlab="", ylab="", ..., 2007 use.modularity=FALSE) { 2008 hc <- as.hclust(communities, hang=hang, use.modularity=use.modularity) 2009 ret <- plot(hc, hang=hang, ann=ann, main=main, sub=sub, xlab=xlab, 2010 ylab=ylab, ...) 2011 if (rect > 0) { 2012 rect.hclust(hc, k=rect, border=colbar) 2013 } 2014 invisible(ret) 2015} 2016 2017#' @importFrom graphics plot 2018 2019dendPlotDendrogram <- function(communities, hang=-1, ..., 2020 use.modularity=FALSE) { 2021 plot(as.dendrogram(communities, hang=hang, use.modularity=use.modularity), 2022 ...) 2023} 2024 2025#' @importFrom grDevices palette 2026#' @importFrom graphics plot 2027 2028dendPlotPhylo <- function(communities, colbar=palette(), 2029 col=colbar[membership(communities)], 2030 mark.groups=communities(communities), 2031 use.modularity=FALSE, 2032 edge.color="#AAAAAAFF", 2033 edge.lty=c(1,2), ...) { 2034 2035 phy <- as_phylo(communities, use.modularity=use.modularity) 2036 2037 getedges <- function(tip) { 2038 repeat { 2039 ee <- which(! phy$edge[,1] %in% tip & phy$edge[,2] %in% tip) 2040 if (length(ee)<=1) { break } 2041 tip <- c(tip, unique(phy$edge[ee,1])) 2042 } 2043 ed <- which(phy$edge[,1] %in% tip & phy$edge[,2] %in% tip) 2044 eds <- phy$edge[ed, 1] 2045 good <- which(phy$edge[ed,1] %in% which(tabulate(eds) != 1)) 2046 ed[good] 2047 } 2048 gredges <- lapply(mark.groups, getedges) 2049 2050 if (length(mark.groups) > 0) { 2051 ecol <- rep(edge.color, nrow(phy$edge)) 2052 for (gr in seq_along(gredges)) { 2053 ecol[gredges[[gr]]] <- colbar[gr] 2054 } 2055 } else { 2056 ecol <- edge.color 2057 } 2058 2059 elty <- rep(edge.lty[2], nrow(phy$edge)) 2060 elty[ unlist(gredges) ] <- edge.lty[1] 2061 2062 plot(phy, edge.color=ecol, edge.lty=elty, tip.color=col, ...) 2063} 2064 2065#' Compares community structures using various metrics 2066#' 2067#' This function assesses the distance between two community structures. 2068#' 2069#' 2070#' @aliases compare.communities compare.membership compare 2071#' @param comm1 A \code{\link{communities}} object containing a community 2072#' structure; or a numeric vector, the membership vector of the first community 2073#' structure. The membership vector should contain the community id of each 2074#' vertex, the numbering of the communities starts with one. 2075#' @param comm2 A \code{\link{communities}} object containing a community 2076#' structure; or a numeric vector, the membership vector of the second 2077#' community structure, in the same format as for the previous argument. 2078#' @param method Character scalar, the comparison method to use. Possible 2079#' values: \sQuote{vi} is the variation of information (VI) metric of Meila 2080#' (2003), \sQuote{nmi} is the normalized mutual information measure proposed 2081#' by Danon et al. (2005), \sQuote{split.join} is the split-join distance of 2082#' can Dongen (2000), \sQuote{rand} is the Rand index of Rand (1971), 2083#' \sQuote{adjusted.rand} is the adjusted Rand index by Hubert and Arabie 2084#' (1985). 2085#' @return A real number. 2086#' @author Tamas Nepusz \email{ntamas@@gmail.com} 2087#' @seealso See \code{\link{cluster_walktrap}}, 2088#' \code{\link{cluster_spinglass}}, 2089#' \code{\link{cluster_leading_eigen}}, 2090#' \code{\link{cluster_edge_betweenness}}, 2091#' \code{\link{cluster_fast_greedy}}, 2092#' \code{\link{cluster_label_prop}} 2093#' \code{\link{cluster_louvain}} 2094#' \code{\link{cluster_leiden}} 2095#' for various community detection methods. 2096#' @references Meila M: Comparing clusterings by the variation of information. 2097#' In: Scholkopf B, Warmuth MK (eds.). \emph{Learning Theory and Kernel 2098#' Machines: 16th Annual Conference on Computational Learning Theory and 7th 2099#' Kernel Workshop}, COLT/Kernel 2003, Washington, DC, USA. Lecture Notes in 2100#' Computer Science, vol. 2777, Springer, 2003. ISBN: 978-3-540-40720-1. 2101#' 2102#' Danon L, Diaz-Guilera A, Duch J, Arenas A: Comparing community structure 2103#' identification. \emph{J Stat Mech} P09008, 2005. 2104#' 2105#' van Dongen S: Performance criteria for graph clustering and Markov cluster 2106#' experiments. Technical Report INS-R0012, National Research Institute for 2107#' Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000. 2108#' 2109#' Rand WM: Objective criteria for the evaluation of clustering methods. 2110#' \emph{J Am Stat Assoc} 66(336):846-850, 1971. 2111#' 2112#' Hubert L and Arabie P: Comparing partitions. \emph{Journal of 2113#' Classification} 2:193-218, 1985. 2114#' @export 2115#' @keywords graphs 2116#' @examples 2117#' 2118#' g <- make_graph("Zachary") 2119#' sg <- cluster_spinglass(g) 2120#' le <- cluster_leading_eigen(g) 2121#' compare(sg, le, method="rand") 2122#' compare(membership(sg), membership(le)) 2123#' 2124compare <- function(comm1, comm2, method=c("vi", "nmi", 2125 "split.join", "rand", 2126 "adjusted.rand")) 2127 UseMethod("compare") 2128 2129#' @method compare communities 2130#' @export 2131 2132compare.communities <- function(comm1, comm2, 2133 method=c("vi", "nmi", "split.join", "rand", 2134 "adjusted.rand")) { 2135 2136 i_compare(comm1, comm2, method) 2137} 2138 2139#' @method compare membership 2140#' @export 2141 2142compare.membership <- function(comm1, comm2, 2143 method=c("vi", "nmi", "split.join", "rand", 2144 "adjusted.rand")) { 2145 2146 i_compare(comm1, comm2, method) 2147} 2148 2149#' @method compare default 2150#' @export 2151 2152compare.default <- compare.membership 2153 2154i_compare <- function (comm1, comm2, method=c("vi", "nmi", "split.join", 2155 "rand", "adjusted.rand")) { 2156 2157 comm1 <- if (inherits(comm1, "communities")) { 2158 as.numeric(membership(comm1)) 2159 } else { 2160 as.numeric(as.factor(comm1)) 2161 } 2162 comm2 <- if (inherits(comm2, "communities")) { 2163 as.numeric(membership(comm2)) 2164 } else { 2165 as.numeric(as.factor(comm2)) 2166 } 2167 method <- switch(igraph.match.arg(method), vi = 0, nmi = 1, 2168 split.join = 2, rand = 3, adjusted.rand = 4) 2169 on.exit(.Call(C_R_igraph_finalizer) ) 2170 res <- .Call(C_R_igraph_compare_communities, comm1, comm2, method) 2171 res 2172} 2173 2174#' Split-join distance of two community structures 2175#' 2176#' The split-join distance between partitions A and B is the sum of the 2177#' projection distance of A from B and the projection distance of B from 2178#' A. The projection distance is an asymmetric measure and it is defined as 2179#' follows: 2180#' 2181#' First, each set in partition A is evaluated against all sets in 2182#' partition B. For each set in partition A, the best matching set in 2183#' partition B is found and the overlap size is calculated. (Matching is 2184#' quantified by the size of the overlap between the two sets). Then, the 2185#' maximal overlap sizes for each set in A are summed together and 2186#' subtracted from the number of elements in A. 2187#' 2188#' The split-join distance will be returned as two numbers, the first is 2189#' the projection distance of the first partition from the 2190#' second, while the second number is the projection distance of the second 2191#' partition from the first. This makes it easier to detect whether a 2192#' partition is a subpartition of the other, since in this case, the 2193#' corresponding distance will be zero. 2194#' 2195#' @param comm1 The first community structure. 2196#' @param comm2 The second community structure. 2197#' @return Two integer numbers, see details below. 2198#' 2199#' @references 2200#' van Dongen S: Performance criteria for graph clustering and Markov 2201#' cluster experiments. Technical Report INS-R0012, National Research 2202#' Institute for Mathematics and Computer Science in the Netherlands, 2203#' Amsterdam, May 2000. 2204#' 2205#' @export 2206 2207split_join_distance <- function(comm1, comm2) { 2208 comm1 <- if (inherits(comm1, "communities")) { 2209 as.numeric(membership(comm1)) 2210 } else { 2211 as.numeric(comm1) 2212 } 2213 comm2 <- if (inherits(comm2, "communities")) { 2214 as.numeric(membership(comm2)) 2215 } else { 2216 as.numeric(comm2) 2217 } 2218 on.exit(.Call(C_R_igraph_finalizer) ) 2219 res <- .Call(C_R_igraph_split_join_distance, comm1, comm2) 2220 unlist(res) 2221} 2222 2223#' Groups of a vertex partitioning 2224#' 2225#' Create a list of vertex groups from some graph clustering or community 2226#' structure. 2227#' 2228#' Currently two methods are defined for this function. The default method 2229#' works on the output of \code{\link{components}}. (In fact it works on any 2230#' object that is a list with an entry called \code{membership}.) 2231#' 2232#' The second method works on \code{\link{communities}} objects. 2233#' 2234#' @aliases groups groups.default groups.communities 2235#' @param x Some object that represents a grouping of the vertices. See details 2236#' below. 2237#' @return A named list of numeric or character vectors. The names are just 2238#' numbers that refer to the groups. The vectors themselves are numeric or 2239#' symbolic vertex ids. 2240#' @seealso \code{\link{components}} and the various community finding 2241#' functions. 2242#' @examples 2243#' g <- make_graph("Zachary") 2244#' fgc <- cluster_fast_greedy(g) 2245#' groups(fgc) 2246#' 2247#' g2 <- make_ring(10) + make_full_graph(5) 2248#' groups(components(g2)) 2249#' @export 2250 2251groups <- function(x) 2252 UseMethod("groups") 2253 2254#' @method groups default 2255#' @export 2256 2257groups.default <- function(x) { 2258 vids <- names(x$membership) 2259 if (is.null(vids)) vids <- seq_along(x$membership) 2260 tapply(vids, x$membership, simplify=FALSE, function(x) x) 2261} 2262 2263#' @method groups communities 2264#' @export 2265 2266groups.communities <- function(x) { 2267 m <- membership(x) 2268 groups.default(list(membership = m)) 2269} 2270 2271#' @export 2272 2273communities <- groups.communities 2274 2275#' @method "[" communities 2276#' @export 2277 2278`[.communities` <- function(x, i) { 2279 groups(x)[i] 2280} 2281 2282#' @method "[[" communities 2283#' @export 2284 2285`[[.communities` <- function(x, i) { 2286 groups(x)[[i]] 2287} 2288 2289 2290#' Contract several vertices into a single one 2291#' 2292#' This function creates a new graph, by merging several vertices into one. The 2293#' vertices in the new graph correspond to sets of vertices in the input graph. 2294#' 2295#' The attributes of the graph are kept. Graph and edge attributes are 2296#' unchanged, vertex attributes are combined, according to the 2297#' \code{vertex.attr.comb} parameter. 2298#' 2299#' @aliases contract.vertices contract 2300#' @param graph The input graph, it can be directed or undirected. 2301#' @param mapping A numeric vector that specifies the mapping. Its elements 2302#' correspond to the vertices, and for each element the id in the new graph is 2303#' given. 2304#' @param vertex.attr.comb Specifies how to combine the vertex attributes in 2305#' the new graph. Please see \code{\link{attribute.combination}} for details. 2306#' @return A new graph object. 2307#' @author Gabor Csardi \email{csardi.gabor@@gmail.com} 2308#' @keywords graphs 2309#' @examples 2310#' 2311#' g <- make_ring(10) 2312#' g$name <- "Ring" 2313#' V(g)$name <- letters[1:vcount(g)] 2314#' E(g)$weight <- runif(ecount(g)) 2315#' 2316#' g2 <- contract(g, rep(1:5, each=2), 2317#' vertex.attr.comb=toString) 2318#' 2319#' ## graph and edge attributes are kept, vertex attributes are 2320#' ## combined using the 'toString' function. 2321#' print(g2, g=TRUE, v=TRUE, e=TRUE) 2322#' 2323#' @export 2324 2325contract <- contract 2326