Gene Expression Data Analysis Algorithms

To facilitate analysis of experimental data, Ariadne has included a proprietary implementation of the Gene Set Enrichment Analysis (GSEA) methodology in the Pathway Studio Desktop and Enterprise Editions. Building on its GSEA implementation, Ariadne has also developed a Sub-Network Enrichment Analysis (SNEA) algorithm that helps find common effectors for the most differentially expressed genes.

Gene Set Enrichment Analysis (GSEA)

Ariadne's implementation of GSEA finds the pathways and groups, drawn from different pathway collections and gene ontologies, that were most affected in the experiment.

Around 2003, researchers at the Broad Institute faced a challenge in gene expression research that traditional profiling approaches did not address. While gene expression microarrays appeared to be the right instrument for generating experimental data, the traditional approach to microarray data analysis was recognized as not fully characterizing the biological response. The Broad researchers addressed this problem with a new algorithmic approach called Gene Set Enrichment Analysis.

The following two papers describe the original work:

  • Mootha, V. K., Lindgren, C. M., Eriksson, K. F., Subramanian, A., Sihag, S., Lehar, J., Puigserver, P., Carlsson, E., Ridderstrale, M., Laurila, E., et al. (2003). PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 34, 267-273.
  • Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Paulovich, A., Pomeroy, S. L., Golub, T. R., Lander, E. S., and Mesirov, J. P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102, 15545-15550.

Gene Set Enrichment Analysis

GSEA focuses on gene sets: groups of genes that share a common biological function, chromosomal location, or regulation. The methodology is valuable for revealing small differential changes in transcript levels, and it requires no prior selection of genes based on those levels. The method can therefore reveal changes in transcripts whose relative concentrations may be regulated not only at the level of transcription.
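To make the idea of scoring a whole gene set against a ranked list concrete, here is a minimal Python sketch of an unweighted, Kolmogorov-Smirnov-like running sum in the spirit of the papers listed above. It is an illustration only and not Ariadne's proprietary implementation; the function name `enrichment_score`, its parameters, and the step sizes chosen here are assumptions made for this example, and gene identifiers are assumed to be unique.

```python
import math


def enrichment_score(ranked_genes, gene_set):
    """Return the largest deviation from zero of a KS-like running sum.

    ranked_genes: gene IDs ordered from most up- to most down-regulated
                  (assumed unique for this sketch).
    gene_set:     collection of gene IDs forming the set under test.
    """
    members = set(gene_set) & set(ranked_genes)
    n, g = len(ranked_genes), len(members)
    if g == 0 or g == n:
        return 0.0  # nothing to compare against

    # Step sizes are chosen so the running sum returns to zero at the end.
    hit_step = math.sqrt((n - g) / g)    # added for a gene in the set
    miss_step = math.sqrt(g / (n - g))   # subtracted for a gene outside it

    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list: no gene is pre-selected by a threshold.
ranked = ["g%d" % i for i in range(1, 11)]
print(enrichment_score(ranked, {"g1", "g2", "g3"}))    # set clustered at the top
print(enrichment_score(ranked, {"g8", "g9", "g10"}))   # set clustered at the bottom
```

A set whose members cluster near the top of the ranking yields a large positive score and one clustered near the bottom a large negative score, even when no single gene would pass a fold-change cutoff on its own. In practice a significance estimate, for example by permutation, would accompany the score.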

Based on the algorithmic descriptions in the original papers, Ariadne has included a proprietary implementation of GSEA in Pathway Studio. The implementation includes the following general modifications to the described algorithm:

  1. Ariadne offers the Mann-Whitney U test in addition to the Kolmogorov-Smirnov test described in the original papers.
  2. Ariadne Gene Set Enrichment Analysis can be executed against reference pathways provided by Ariadne, pathways built by the user or organization, gene sets defined by the Gene Ontology Consortium’s GO, and gene sets defined by the Ariadne gene ontology. In addition, users may import their own gene ontologies for use by Ariadne’s GSEA.
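As a rough illustration of the first modification, the Mann-Whitney U test can compare the expression values of genes in a set against those of all remaining genes. The Python sketch below uses only the standard library and a simplified normal approximation without tie or continuity correction; the helper names `average_ranks` and `mann_whitney_p` and the sample values are hypothetical, and nothing here should be read as the way Pathway Studio computes its results.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_p(in_set, background):
    """Return (U, two-sided p) for in_set versus background.

    Simplification for this sketch: normal approximation with no
    tie correction and no continuity correction.
    """
    n1, n2 = len(in_set), len(background)
    ranks = average_ranks(list(in_set) + list(background))
    rank_sum = sum(ranks[:n1])                 # ranks of the gene-set values
    u = rank_sum - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical log-ratios: genes in a pathway versus all other genes.
pathway = [2.1, 1.8, 2.5, 1.9]
others = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]
u, p = mann_whitney_p(pathway, others)
print(u, p)
```

Because the test works on ranks rather than raw values, it is less sensitive to a few extreme measurements than a comparison of means would be.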

Sub-Network Enrichment Analysis

Pathway Studio also includes a proprietary algorithm called Sub-Network Enrichment Analysis (SNEA), which leverages the relationship information contained in the ResNet database to identify key regulators, targets, and binding partners based on experimental data. SNEA uses Ariadne’s GSEA implementation as a key component. SNEA also downplays the importance of highly promiscuous targets that happen to be downstream of many different regulators.

During execution, SNEA builds sub-networks from the relationships in the ResNet database based on the criterion specified by the user, such as “Expression Targets” or “Binding Partners.” SNEA applies the Mann-Whitney version of the GSEA algorithm to each sub-network and calculates a p-value. Pathway Studio presents the list of sub-networks, together with the central “seed” of each, to the end user, prioritized by p-value. Users can use this information to build networks and continue interpreting their experimental results.
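The workflow described above can be sketched in a few lines of Python. This is a hedged illustration, not the SNEA implementation: the `(seed, relation_type, target)` triples are a hypothetical stand-in for relationships from the ResNet database, the function names and the `min_size` parameter are assumptions, the p-value uses a simplified normal approximation, and the down-weighting of promiscuous targets is omitted.

```python
import math
from collections import defaultdict


def mann_whitney_p(member_values, other_values):
    """Two-sided p-value from U counted pair by pair (ties count 0.5).

    Simplification for this sketch: normal approximation without
    tie or continuity correction.
    """
    n1, n2 = len(member_values), len(other_values)
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0
            for x in member_values for y in other_values)
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(z) / math.sqrt(2))


def snea(relations, expression, relation_type, min_size=2):
    """Rank seeds by how strongly their measured neighbors are shifted.

    relations:  iterable of (seed, relation_type, target) triples
                (hypothetical format).
    expression: dict mapping measured gene ID -> expression change.
    """
    # 1. Build one sub-network per seed from relations of the chosen type.
    subnets = defaultdict(set)
    for seed, rtype, target in relations:
        if rtype == relation_type and target in expression:
            subnets[seed].add(target)

    # 2. Score each sub-network against all remaining measured genes.
    results = []
    for seed, targets in subnets.items():
        if len(targets) < min_size or len(targets) == len(expression):
            continue
        inside = [expression[g] for g in targets]
        outside = [v for g, v in expression.items() if g not in targets]
        results.append((seed, sorted(targets), mann_whitney_p(inside, outside)))

    # 3. Present sub-networks prioritized by p-value.
    return sorted(results, key=lambda r: r[2])


expression = {"g1": 3.0, "g2": 2.5, "g3": 2.8, "g4": 0.1, "g5": -0.2,
              "g6": 0.0, "g7": 0.2, "g8": -0.1, "g9": 0.05, "g10": 0.15}
relations = [
    ("TF_A", "Expression Targets", "g1"), ("TF_A", "Expression Targets", "g2"),
    ("TF_A", "Expression Targets", "g3"), ("TF_B", "Expression Targets", "g4"),
    ("TF_B", "Expression Targets", "g5"), ("TF_B", "Expression Targets", "g1"),
    ("TF_C", "Binding Partners", "g6"), ("TF_C", "Binding Partners", "g7"),
]
for seed, targets, p in snea(relations, expression, "Expression Targets"):
    print(seed, targets, p)
```

Note that the seeds in this example have no expression values of their own: a seed is scored only through its measured neighbors, so it does not need to have been measured in the experiment.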

Sub-Network Enrichment Analysis (SNEA)

Sub-Network Enrichment Analysis helps find common regulators for the most differentially expressed genes, even if the regulators themselves were not present on the chip.