Network Science Tool Resources

Below are tools for network science chosen by COMBINE fellows as particularly useful. Tool descriptions and resources are curated by fellows.

Circos | Cytoscape | Graph-tool | JavaScript Libraries for Network Analysis | MNE – MEG and EEG Analysis and Visualization | Network Visualization using igraph in R | NetworkX | Stanford Network Analysis Project (SNAP)

Network Visualization using igraph in R
Leonard Campanello and Anshuman Swain

What are its uses?
Visualization is essential for conveying information about networks to an audience. Too often, a network is drawn as an illegible hairball or with clumsy graphics, which can affect how the work is perceived. There are numerous paid and free resources for effective network visualization. What sets the igraph package in R apart from the others is its versatility and its support within the comprehensive R framework.

Who will find it useful?
Anyone with some familiarity with R can use the tool, and it’s free. The resulting visualizations can be exported through R’s graphics devices to common formats such as PDF, PNG, and SVG, and can be used in publications, websites, blogs, and so on under the free license.

What are its strengths and weaknesses?
Because this igraph package is built for R, it can draw on the strong statistical framework that R offers. R’s repertoire of colors, fonts, and shapes is comprehensive and flexible enough to accommodate complex graphics, which speeds up customization of network plots. The package also offers various modes and methods of network visualization depending on the needs of the user, making it a one-stop solution for networks. Because igraph is itself a package for network statistics, all computations can be done within the same framework and then plotted, which avoids extra input or conversion steps. The main weakness is that you need to be well-versed in R to make the more advanced customizations.

Where can I learn more?
Refer to the igraph package manual for more details. For tutorials, search for “R Network Visualization Workshop: Polnet 2015”, a workshop held in Portland, OR in 2015.

Cytoscape
Jacob Isbell, Hadi Vafaei, and Haley Wight

What is it?
Features:

  • Desktop app: Cytoscape
  • Web application: Cytoscape.js
  • Easily assign colors/labels/attributes to both nodes and edges.
  • Compute network summary statistics.
  • Perform more complex network analyses, e.g., clustering.
  • Apps are available for network and molecular profiling analyses, new layouts, additional file format support, scripting, and connection with databases.

Cytoscape vs. Cytoscape.js:
The Cytoscape desktop application has an intuitive user interface that does not require prior programming knowledge. While the majority of operations have been designed within the Java-based desktop app, Cytoscape.js is its web-based successor: a network data visualization and analysis engine for web applications. It is a pure JavaScript library, so no browser plugin is required, and it is designed to be a building block for complex data visualization web applications in HTML5. Cytoscape and Cytoscape.js share design-level concepts, such as Visual Styles, but their code bases are completely independent of each other. Using Cytoscape.js does require some web development knowledge.

Audience
Those interested in visualizing molecular interaction networks and biological pathways. Knowledge of computer programming is not required.

Strengths and Weaknesses
Strengths:

  • Cytoscape has a large user base, especially among biologists.
  • Cytoscape has consistent maintenance and support (last release: v3.7.0 10/22/2018).

Weaknesses:

  • Doesn’t handle very large networks well and may crash.
  • Complex layouts take longer to compute than in Gephi and other tools.
  • Cytoscape will compute basic network statistics, but computationally intensive data analysis should be done externally and the results imported into Cytoscape.

More information
Cytoscape Tutorials; Cytoscape.js Demo

Circos
Danielle Middlebrooks and Muzi Li

What is it?
Circos is a free software package used for visualizing data and information. This package was created by Martin Krzywinski, who released it to the world in 2009 with his paper: “Circos: an Information Aesthetic for Comparative Genomics”. Circos is written in Perl and can be deployed on any operating system for which Perl is available. It produces bitmap (PNG) and vector (SVG) images using plain text configuration and input files which makes it very appealing for its ease of use.
It was initially designed for displaying genomic data (particularly in cancer genomics and comparative genomics) and other molecular biology data. However, it can create figures from data in a variety of fields, with uses ranging from visualizing migration to mathematical artwork. Circos has been accepted by the biological community as a standard for displaying sequence relationships and genome rearrangements.

Audience
Circos will be most useful for those who want to represent their network data in a circular fashion. Installing the program requires substantial effort and familiarity with a Linux/Unix command-line environment.

Strengths and Weaknesses
There are many strengths and few weaknesses to this particular software package. The circular layout supports larger data domains by layering different data sets to create highly informative infographics with texture and visual appeal. Circular layouts have several advantages over linear ones: equal focus across scale, continuity in data tracks, and adjacency among similar data points. Circos also attempts to balance flexibility with ease of use. It makes no assumptions about your data and uses an extremely simple input data format, which makes image creation and customization easy. Because it is controlled entirely through plain-text configuration files, however, there is no interactive user interface.
There is a clear trend in the literature to include as much information as possible on single plots in order to demonstrate the amount of work that went into a project. This may lead people to go overboard and plot many different datasets with clashing color schemes and tiny points on a single image, which some would view as a weakness. The key with all visualization is to focus on showing only the interesting patterns in the data, rather than plotting everything possible.
One clear limitation is that Circos can only visualize data in a circular layout. If a circular layout is not ideal for your dataset, then Circos may not be the tool for you.

More information (and getting started)
Some familiarity with the shell is the only background needed to use this resource. After downloading and unpacking Circos, users need to have Perl and all required Perl modules installed. Once users have determined how to present their data, they can parse the data into Circos format and create a configuration file for generating a Circos plot. Various R packages now let you create a Circos-style plot by writing R code. These include circlize, RCircos, CIRCUS, and OmicCircos. Each of these packages has strengths and weaknesses relative to the others. Click here for tutorials.
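The data-preparation step described above can be sketched in Python. This is only an illustration and not part of Circos itself: it assumes the whitespace-delimited karyotype format (chr - ID LABEL START END COLOR) and link format (ID1 START1 END1 ID2 START2 END2) described in the Circos documentation, and it treats each network node as one ideogram segment. The function names, the segment length, and the sample edges are hypothetical.

```python
def circos_karyotype(nodes, segment_length=100, color="grey"):
    """Return one karyotype line per node: chr - ID LABEL START END COLOR."""
    return [f"chr - {node} {node} 0 {segment_length} {color}" for node in nodes]


def circos_links(edges, segment_length=100):
    """Return one link line per edge: ID1 START1 END1 ID2 START2 END2.

    Each link spans the whole segment of both endpoints. This is an
    assumption made to keep the sketch short; real data would supply
    real coordinates on each segment.
    """
    return [
        f"{src} 0 {segment_length} {dst} 0 {segment_length}"
        for src, dst in edges
    ]


# Hypothetical three-node network.
edges = [("geneA", "geneB"), ("geneB", "geneC")]
nodes = sorted({node for edge in edges for node in edge})

karyotype_lines = circos_karyotype(nodes)
link_lines = circos_links(edges)

print("\n".join(karyotype_lines))
print("\n".join(link_lines))
```

The two lists would be written to separate plain-text files, which a Circos configuration file then references.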

MNE – MEG and EEG Analysis and Visualization
Phillip Alvarez and Peng Zan

What is it?
MNE-Python is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as magnetoencephalography (MEG), electroencephalography (EEG), stereoelectroencephalography (sEEG), and electrocorticography (ECoG). In general, it allows analysis of trial-based time-sequence data recorded from the human brain, from which functional networks can be derived in either sensor space or neural source space.

Audience
It is designed to aid neuroscience researchers who analyze and visualize electrophysiological data and derive functional networks from multi-channel temporal recordings. However, it can be used by any researcher who is trying to estimate connectivity between channels or sensors based on time-sequence data.

Strengths and Weaknesses
Strengths
It can be used to visualize multi-channel recordings and to perform multiple network analyses between sensors or between neural sources, such as spectral connectivity, phase slope index, and seed-based connectivity. Below are several commands for these tasks.
MNE data objects: MNE-Python commands mostly operate on MNE objects such as Raw, Epochs, Evoked, Info, and SourceEstimate. The Info object, which is always embedded in other objects, contains metadata such as the recording date, subject information, sampling rate, sensor positions, and other recording details. Other objects combine the information from the Info object with their own data.
Visualize spatial-temporal data: Because it carries this dimensional metadata, the Info object facilitates visualization of other objects. For example, you can plot multi-channel recordings with the simple command raw.plot() and multi-trial data with epochs.plot(), and the sampling rate and time information from raw.info or epochs.info is used to label the time axis.
Functional connectivity in sensor space: The mne.connectivity module contains multiple functions for computing different network measures from sensor-space time sequences, for example spectral_connectivity, which operates on Epochs objects, and the helper seed_target_indices, which selects the channel pairs to analyze.
Neural source estimation: To compute network connectivity in neural source space, a linear mapping from sensor space to source space is required; finding it is called the inverse problem. With MNE, you can integrate structural MRI data and MEG recordings to solve the inverse problem, also known as source localization. The algorithm used is minimum norm estimation (MNE), which is also the source of the toolbox’s name.
Functional connectivity in source space: Based on source-space data, one can compute network connectivity in source space using different measures, for example seed-based coherence (with seed_target_indices selecting the seed and target pairs), full-spectrum connectivity (spectral_connectivity), and phase slope index (phase_slope_index). Finally, you can visualize the network as a circular graph with plot_connectivity_circle, or as an all-to-all connectivity 3D brain plot with mayavi.mlab.
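To make the seed-based measures above more concrete, here is a plain-Python sketch of the pairing that a seed/target index helper such as seed_target_indices sets up: every seed channel is paired with every target channel, and connectivity is then estimated only for those pairs. This is an illustration written for this page, not MNE-Python code. The function name seed_target_pairs and the channel numbers are hypothetical, and MNE's own helper returns arrays rather than lists.

```python
def seed_target_pairs(seeds, targets):
    """Pair every seed channel with every target channel.

    Returns two equal-length lists; position k in each list gives the
    two channels of the k-th connection to be estimated.
    """
    seed_idx = [s for s in seeds for _ in targets]
    target_idx = [t for _ in seeds for t in targets]
    return seed_idx, target_idx


# Hypothetical example: one seed channel (0) against three target channels.
seed_idx, target_idx = seed_target_pairs([0], [1, 2, 3])
print(list(zip(seed_idx, target_idx)))  # [(0, 1), (0, 2), (0, 3)]
```

Restricting the computation to selected pairs in this way avoids estimating a full all-to-all connectivity matrix when only a few seed regions are of interest.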
Weaknesses
This toolbox is targeted at researchers in neuroscience, especially those dealing with time-sequence data, so it is not a general-purpose toolbox for network visualization. This can be compensated for by integrating other toolboxes such as mayavi, networkx, and eelbrain.

More information
Get started here: https://martinos.org/mne/stable/documentation.html
Various examples are here: https://martinos.org/mne/stable/auto_examples/index.html and here: https://github.com/mne-tools/mne-python/tree/master/examples
Various tutorials are here: https://github.com/mne-tools/mne-python/tree/master/tutorials

JavaScript Libraries for Network Analysis
Sigfried Gold and Brook Stacy

What is it?
Software developers or research programmers use JavaScript and its wealth of libraries to build custom applications that can be deployed to any user with access to the internet and a web browser. The tools listed below serve a different overall purpose from those most commonly used by scientists and data analysts. They are not best for one-off analysis of a dataset, but for building reusable tools, often for others.
D3.js is the most widely used and influential library for building interactive visualizations in JavaScript. Its prodigious examples gallery provides working code with small sample datasets that can be applied to countless network visualization modalities.
Sigma.js is a JavaScript library dedicated to graph drawing. It makes it easy to publish networks on web pages and allows developers to integrate network exploration into rich web applications. Unlike Cytoscape.js, described above, it generates graphs primarily using canvas rather than SVG and can handle much larger graphs with good performance.
Graphology is a specification and reference implementation for a robust & multipurpose JavaScript Graph object. It aims at supporting various kinds of graphs with the same unified interface. If you are developing custom graph analytics functions, graphology provides a standardized API to build on with basic functions for adding nodes and edges, configuring graph parameters, querying graph data, and more.

Audience
Anyone interested in representing their network data in a user-friendly way, with a rich set of resources on how to customize visualizations. Those interested in interactive visualizations will get a lot out of these libraries. Programming knowledge is critical.

Strengths and Weaknesses
The advantages of browser-based programming in JavaScript include:

  • Browsers are ubiquitous. Your users will not need to install Python or other programming platforms or libraries. Your application will supply all its dependencies over the web.
  • Modern browsers (Chrome, Firefox, etc.) provide a rich platform for highly interactive applications with beautiful and flexible graphics and a huge supply of open-source libraries. Helpful resources for learning more about these include:
  • For resources specific to graph/network analysis, try this curated list of JavaScript graph drawing libraries.

The primary disadvantages of JavaScript and browsers for network analytics are:

  • They do not provide the scientific depth, performance, and scalability of platforms like Python, C/C++, R, or Fortran;
  • They are less familiar to scientific users;
  • The JavaScript ecosystem of frameworks and libraries grows so fast that keeping up with it is a full-time job. Science-focused ecosystems (e.g., around NumPy or matplotlib in Python) are also growing quickly, but they are more coherent and stable than JavaScript’s.

More information
Main information about D3.js is here: https://d3js.org/ with examples here: https://github.com/d3/d3/wiki/Gallery/
Main information about Sigma.js is here: http://sigmajs.org/
Main information about Graphology is here: https://graphology.github.io/
More resources about browser usage can be found in Strengths and Weaknesses above

Stanford Network Analysis Project (SNAP)
Abhilash Sahoo and Kevin Armengol

What is it?
SNAP is a general-purpose network analysis tool and graph mining library. The SNAP website is also a repository of network-science-related information.

What are its uses?
– Manipulate large graphs and calculate structural properties.
– Provides large network datasets for mining, e.g., gene-disease, gene-gene, cell-cell, drug interactions, social, economic, geological, web-based, etc.
– Written in C++ and scales to graphs with hundreds of millions of nodes and billions of edges.
– Also available through a Python interface.
– Available through NodeXL, a graphical front-end for network analysis in Microsoft Office and Excel.

Who will find it useful?
– Those with large network datasets that require computational efficiency.
– Students interested in learning network science using a hands-on approach.
– Those with moderate coding experience and familiarity with basic network analysis.

Strengths of SNAP:
– The SNAP library (both the C++ and Python versions) supports efficient analysis of massive network datasets and provides access to rich pre-built functionality and structural attributes.
– It can convert between multiple formats (e.g., XML and GraphML) for easy transferability and, through NodeXL, provides an easy-to-use graphical front-end for non-programmers. SNAP does not have large outside dependencies and works on all major operating systems (Windows, Mac OS X, Linux, and other Unix variants). The SNAP website contains large volumes of network-science-related information (datasets, tutorials, courses, and events) for easy access and learning. The website also contains a number of network-analysis-based projects by the SNAP group.

Weaknesses of SNAP:
– The SNAP library is not currently widely adopted in the network science community, so good community-based support for this tool is hard to find.
– The SNAP library also does not provide an easy-to-use solution for information visualization, although its ability to convert network data across many file formats can be useful for transferring information into specialized network visualization tools.
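As a rough illustration of the kind of format conversion mentioned above, the following sketch turns a tab-separated edge list (the layout used by many datasets on the SNAP website, with # comment lines) into GraphML using only the Python standard library. It does not use the SNAP library; the function name and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def edge_list_to_graphml(text, directed=False):
    """Convert a whitespace-separated edge list into a GraphML string."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        source, target = line.split()[:2]
        edges.append((source, target))

    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    graph = ET.SubElement(
        root,
        "graph",
        {"id": "G", "edgedefault": "directed" if directed else "undirected"},
    )
    # Declare each node once, in order of first appearance.
    for node in dict.fromkeys(n for edge in edges for n in edge):
        ET.SubElement(graph, "node", {"id": node})
    for source, target in edges:
        ET.SubElement(graph, "edge", {"source": source, "target": target})
    return ET.tostring(root, encoding="unicode")


# Hypothetical three-node triangle in SNAP-style edge-list form.
sample = "# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n"
graphml = edge_list_to_graphml(sample)
print(graphml)
```

The resulting string can be saved to a .graphml file and opened in visualization tools that read GraphML.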

Where can I learn more?
– The website for SNAP can be accessed at http://snap.stanford.edu/. This website contains links to the C++ and Python versions of the SNAP network analysis and graph mining tools, large network datasets, network-science-related tutorials, events, courses, and links to other commonly used software tools.

Graph-tool
Cindy Li and Lauren Weiss

What are its uses?
Graph-tool is a Python package for analysis, manipulation, and visualization of networks. Implemented in C++, Graph-tool enables easy and fast computation of network properties and statistics for Python users of all levels. Graph-tool also provides many options for network visualization, including animated and interactive visualization.

Who will find it useful?
With its rich and powerful features, Graph-tool could attract a wide audience interested in analyzing and visualizing networks: sociologists studying social networks, epidemiologists seeking to visualize outbreaks, and business analysts building consumer cluster models, to name a few.

What are its strengths and weaknesses?
Graph-tool is advantageous in many ways. The core data structures and algorithms of Graph-tool are implemented in C++, making extensive use of template metaprogramming and based heavily on the Boost Graph Library. As a result, the efficiency of Graph-tool is comparable to that of a pure C/C++ library in both memory usage and computation time, while its Python interface keeps it accessible to those with Python experience who do not know C++. In addition, there is extensive Graph-tool documentation available, including troubleshooting and a wealth of other resources in the project’s Git repository.

However, Graph-tool also has its weaknesses. Installation can be challenging. At the time of writing, Graph-tool was not available through common installation and environment management tools such as pip install or conda install, and the easiest way to download and install it was through Docker, which requires additional configuration of the environment. As for efficiency, although the C++ implementation and the resulting computation speed are a strength of this software, the temptation, particularly for new users, to execute main loops in Python rather than in C++ negates this speed advantage. Much of the Graph-tool troubleshooting on StackOverflow, for example, seeks to understand why Graph-tool-based scripts run so much slower than equivalent scripts using modules like NetworkX.

Where can I learn more?
If you are interested in reading more about Graph-tool, here is a list of useful resources:
–    Graph-tool documentation: https://graph-tool.skewed.de/static/doc/index.html
–    Graph-tool Git repository: https://git.skewed.de/count0/graph-tool
–    Introduction to graph-tool: http://connor-johnson.com/2015/04/14/introduction-to-graph-tool/
–    Mosky Liu’s “Graph-Tool in Practice”: https://www.slideshare.net/moskytw/graphtool-in-practice

NetworkX
Domenick Braccia and Chelsea Haakenson

What is it and what are its uses?
NetworkX (https://networkx.github.io/) is a Python package for creating, manipulating, and studying the features of complex networks. This tool is useful for studying the structure and dynamics of networks and can handle the computational challenges of large, nonstandard data sets. It includes functions for visualizing networks in a variety of ways, in addition to pre-made algorithms for calculating network measures such as degree and centrality. Furthermore, programs written with NetworkX can interface with existing algorithms and code written in other languages (C, C++, and FORTRAN) to facilitate collaborative projects.
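As a small illustration of the measures mentioned above, the sketch below computes node degree and normalized degree centrality (degree divided by n - 1) for a toy undirected graph. To stay dependency-free it uses plain Python rather than NetworkX itself, so the helper names and the edge list are hypothetical; NetworkX offers ready-made equivalents such as nx.degree_centrality.

```python
from collections import defaultdict


def degrees(edges):
    """Count the distinct neighbours of each node in an undirected edge list."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


def degree_centrality(edges):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    deg = degrees(edges)
    n = len(deg)
    if n <= 1:
        return {node: 0.0 for node in deg}
    return {node: d / (n - 1) for node, d in deg.items()}


# Hypothetical four-node graph: "a" is connected to every other node.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
print(degrees(edges))
print(degree_centrality(edges))
```

Here node "a" reaches the maximum centrality of 1.0 because it is adjacent to all three other nodes.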

Who will find it useful?
NetworkX is intended for a variety of people – including mathematicians, physicists, biologists, computer scientists, and social scientists. However, use of this tool requires coding, so previous coding experience is recommended.

What are its strengths and weaknesses?
Some of the greatest strengths of NetworkX are its versatility and the large community that works with this open-source tool. However, it can run into scalability and memory problems on very large networks, and its plots of dense networks can turn into hairballs.

Where can I learn more?
Main documentation is here: https://networkx.github.io/documentation/stable/
A tutorial to get you started is here: https://networkx.github.io/documentation/stable/tutorial.html
An example of this tool being used in the context of social network creation can be found here: https://blog.dominodatalab.com/social-network-analysis-with-networkx/