Advanced Analysis

To open the Project Designer window, go to Data & Analysis > Analyze.

Open Project Designer

In the Project Designer, create a new project by typing its name in the top left field and pressing Enter.

Create Project

Select the newly created project, and panels with the available analyses will appear.

Project Designer Window

The project designer window shows several available types of analysis:

Differential Gene Expression and Gene Sets

To perform the differential gene expression analysis, go to DESeq. DESeq or DESeq2 is chosen automatically based on the number of replicates. DESeq allows the statistical analysis to be performed even if the data include monoplicates (conditions with a single sample). If you have at least duplicates for all of the sample conditions, DESeq2 is used because it utilizes more powerful statistical approaches. As an example, we will perform the analysis described in our paper (Kartashov, Barski, in preparation).
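The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this guide, not the tool's actual code; the function name `choose_deseq_version` and the dictionary layout are assumptions made for the example.

```python
def choose_deseq_version(replicates_per_condition):
    """Pick the tool from the number of replicates in each condition.

    replicates_per_condition maps a condition name to its replicate count
    (a hypothetical layout used only for this sketch).
    """
    if not replicates_per_condition:
        raise ValueError("at least one condition is required")
    # DESeq2 is used only when every condition has at least duplicates.
    if min(replicates_per_condition.values()) >= 2:
        return "DESeq2"
    # Any monoplicate condition falls back to DESeq.
    return "DESeq"


print(choose_deseq_version({"control": 2, "treated": 3}))
print(choose_deseq_version({"control": 1, "treated": 3}))
```

A single condition with only one sample is enough to switch the whole analysis to DESeq.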

To add raw data to the project, first create a condition by typing the name in the top right box (highlighted in purple below).
Create Replicate Group

Next, identify datasets related to this condition. Open the RNA-Seq data tab by clicking the >> button (highlighted in red above).

In the “RNA-Seq data” tab, all of the RNA-seq experiments available to you are listed. These can be filtered by folder or searched by a keyword.
DESeq Main Panel

To add a dataset to the specified condition, drag it with your mouse from the “RNA-Seq data” tab to the appropriate subfolder of the Raw Data folder of the “Genes Lists” tab.

After defining the conditions and adding the desired raw datasets to these conditions, click the Run DESeq icon in the far right column to set up the DESeq analysis.
Run DESeq

In the next window, list the conditions that you want to compare. To include additional conditions, use the “+” button. Provide a unique name for the analysis and select whether you want to compare expression by isoform, TSS or gene. The series type is used when more than 2 conditions are compared. Assuming that we have 3 conditions (1, 2 and 3), selecting “Pairwise series” will perform all pairwise comparisons (1-2, 1-3, 2-3), whereas “Time series” will compare consecutive conditions (1-2 and 2-3) and “Kinetics series” will compare each condition with the first one (1-2 and 1-3). After setting up the conditions, click the run button.
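The three series types can be illustrated with a short Python sketch that lists the comparisons each one would produce. The function name `series_comparisons` and the lowercase type labels are assumptions made for this example. The behavior for more than 3 conditions is extrapolated from the 3-condition example above, so treat it as an assumption as well.

```python
from itertools import combinations


def series_comparisons(conditions, series_type):
    """Return the list of (condition, condition) pairs for a series type."""
    if len(conditions) < 2:
        raise ValueError("at least two conditions are required")
    if series_type == "pairwise":
        # Every condition against every other condition.
        return list(combinations(conditions, 2))
    if series_type == "time":
        # Each condition against the next one in order.
        return list(zip(conditions, conditions[1:]))
    if series_type == "kinetics":
        # Every later condition against the first one.
        first = conditions[0]
        return [(first, other) for other in conditions[1:]]
    raise ValueError("unknown series type: " + series_type)


for kind in ("pairwise", "time", "kinetics"):
    print(kind, series_comparisons([1, 2, 3], kind))
```

With conditions 1, 2 and 3 this reproduces the comparisons listed above: 1-2, 1-3, 2-3 for pairwise, 1-2 and 2-3 for time, and 1-2 and 1-3 for kinetics.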

DESeq analysis can take up to 10 minutes. After the analysis is complete, data can be saved in a .csv file by clicking the Save icon.

After the analysis is complete, we can filter genes and create gene sets by clicking the Filter icon.
Filter Window

In order to create gene sets, DESeq results can be filtered using parameters such as p-value, p-adjusted, RPKMs, and chromosome, as well as logical operators (e.g. AND/OR).
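The same kind of filter can also be applied outside the interface to a saved .csv file. The Python sketch below shows the idea on a small made-up table. The column names (`gene`, `chrom`, `padj`, `rpkm_a`, `rpkm_b`), the rows and the thresholds are all illustrative assumptions, so check the header of your own exported file before adapting it.

```python
import csv
import io

# Toy table for illustration only; real exports may use different columns.
EXAMPLE_CSV = """gene,chrom,padj,rpkm_a,rpkm_b
GeneA,chr1,0.001,12.0,48.5
GeneB,chr2,0.20,5.0,5.5
GeneC,chrX,0.03,3.0,0.9
GeneD,chr1,0.04,30.0,2.0
GeneE,chr3,0.01,0.2,0.3
"""


def load_rows(text):
    """Parse CSV text and convert the numeric columns to floats."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for column in ("padj", "rpkm_a", "rpkm_b"):
            row[column] = float(row[column])
        rows.append(row)
    return rows


def gene_set(rows, max_padj=0.05, min_rpkm=1.0, exclude_chroms=()):
    """Keep genes that pass: padj AND (RPKM in A OR RPKM in B) AND chromosome."""
    selected = []
    for row in rows:
        significant = row["padj"] < max_padj
        expressed = row["rpkm_a"] >= min_rpkm or row["rpkm_b"] >= min_rpkm
        allowed = row["chrom"] not in exclude_chroms
        if significant and expressed and allowed:
            selected.append(row["gene"])
    return selected


rows = load_rows(EXAMPLE_CSV)
print(gene_set(rows))
print(gene_set(rows, exclude_chroms=("chrX",)))
```

The OR on the two RPKM columns keeps genes that are expressed in at least one condition, which is one common way to combine the logical operators mentioned above.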

Average Tag Density Profiles and Heatmaps

To compare the chromatin environment between gene sets using tag density profiles, select the “ATDP & Heatmaps” option in the Project designer window. Gene sets that were created as specified in the Differential Gene Expression and Gene Sets section of this guide are already present in this view. ChIP-Seq datasets can be added to the project in a manner similar to how RNA-Seq data are added (see Differential Gene Expression and Gene Sets section of this guide). To create average tag density profiles from an available gene set, select the graph icon of that gene set.
Run ATDP

In the Average Tag Density settings window, the graphs can be set up in the “ATDP input” area by providing the name for the graph as a whole and a combination of gene set, ChIP-Seq dataset and line name for each plotted line.

The graph will be calculated in a few minutes and can be viewed by pressing the magnifier icon.
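Conceptually, an average tag density profile is the per-position mean of the tag counts across all genes in a gene set. The Python sketch below illustrates only that averaging step. How the tool bins positions, sizes the window or normalizes counts is not described in this guide, so the input layout (one list of binned counts per gene) is an assumption.

```python
def average_tag_density(per_gene_counts):
    """Average binned tag counts across genes, one value per bin.

    per_gene_counts is a list with one entry per gene; each entry is a list
    of tag counts in equally sized bins around the same reference point
    (for example the TSS). This layout is assumed for the sketch.
    """
    if not per_gene_counts:
        raise ValueError("the gene set is empty")
    n_bins = len(per_gene_counts[0])
    if any(len(counts) != n_bins for counts in per_gene_counts):
        raise ValueError("all genes must have the same number of bins")
    n_genes = len(per_gene_counts)
    # zip(*...) walks the bins column by column across all genes.
    return [sum(column) / n_genes for column in zip(*per_gene_counts)]


# Two genes, three bins each (toy numbers).
print(average_tag_density([[0, 2, 4], [2, 2, 0]]))
```

Each plotted line in the graph would correspond to one such profile, computed for one combination of gene set and ChIP-Seq dataset.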

To test whether the level of modification (tag density) differs significantly between gene sets, use the Mann-Whitney-Wilcoxon (MWW) test: highlight the area where you want to compare tag density and confirm the area dimensions in the next window.
ATDP TSS

The resulting box plot will show the tag density distribution for each group, together with MWW-based p-values.
Box plots
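For readers who want to see what the MWW test compares, the following Python sketch computes the U statistic and a two-sided p-value from two lists of per-gene tag densities. It uses the normal approximation without tie or continuity correction, which is a simplification chosen for this example and not necessarily what the tool does. In practice a library routine such as `scipy.stats.mannwhitneyu` is a better choice, especially for small samples.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, two-sided p-value by normal approximation)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    # U counts the pairs in which the value from A exceeds the value from B;
    # tied pairs contribute one half.
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value


# Toy tag densities for two gene sets within the highlighted area.
set_a = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
set_b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
u, p = mann_whitney_u(set_a, set_b)
print(u, p)
```

A small p-value indicates that tag densities in one gene set tend to be higher than in the other within the selected area.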

Other tabs in the Average Tag Density Profile tab of the Project Designer window will show a similar graph for the gene body and tag density heatmaps.
ATDP gene body
Heatmaps

In the heatmaps, ordering of genes and the color scale can be adjusted using buttons above the graphs.

Differential ChIP-Seq Enrichment - MANorm

Areas of the genome that are differentially modified can be identified using MANorm. The set-up window can be opened by clicking the MANorm icon in the Project Designer window. Adding raw data and identifying conditions are done in the same way as for the DESeq analysis (see Differential Gene Expression and Gene Sets section of this guide).

MANorm

MANorm Run

MANorm analysis can take up to an hour. The results can be viewed or saved using the icons next to the analysis result. In addition to displaying the islands, the table shows the neighboring genes and where the island is located relative to these genes. We recommend filtering the list on the basis of both the p-value and rescaled M (log2 fold change).

IMPORTANT NOTES for interpretation of MANorm results

We have identified a couple of bugs that can affect the results. However, if used with caution, MANorm is still a good tool for this analysis, because it can adjust for enrichment when comparing ChIP-Seq experiments.

A. MANorm has a bug when calculating low p-values: a -log P of 0 actually corresponds to a significant p-value. When selecting significant changes, therefore, use a condition such as (-log P > 2 or -log P = 0).

B. MANorm is not commutative. In other words, comparing experiment A with experiment B produces results that differ from comparing B with A.

To identify the islands that are stronger in condition A or in condition B, proceed as follows:
1. Run both MANorm (A vs. B) and MANorm (B vs. A).
2. Apply the p-value threshold: select the islands that satisfy the p-value condition (e.g. -log P > 2 or -log P = 0).
3. Use the same M-value cut-off for both comparisons. For example, use M > 1 in A vs. B to identify islands that are stronger in A and M > 1 in B vs. A to identify islands that are stronger in B. Alternatively, use M < -1 in A vs. B to identify islands that are stronger in B and M < -1 in B vs. A to identify islands that are stronger in A.
Steps 2 and 3 can be done in Excel.
If you are doing multiple comparisons, make sure to set them all up in the same direction.
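Steps 2 and 3 can also be scripted instead of being done in a spreadsheet. The Python sketch below applies the p-value condition and the M cut-off to both comparisons. The field names (`island`, `neg_log_p`, `m`) and the rows are made up for illustration and will not match the column names of a real MANorm export.

```python
def is_significant(neg_log_p, threshold=2.0):
    """Apply the condition (-log P > threshold or -log P = 0)."""
    # A value of exactly 0 is treated as significant because of the bug
    # described in note A.
    return neg_log_p > threshold or neg_log_p == 0


def stronger_islands(results, m_cutoff=1.0, p_threshold=2.0):
    """Islands that pass the p-value condition and have M above the cut-off."""
    return [
        row["island"]
        for row in results
        if is_significant(row["neg_log_p"], p_threshold) and row["m"] > m_cutoff
    ]


# Toy results of the two runs (hypothetical islands and values).
a_vs_b = [
    {"island": "chr1:1000-2000", "neg_log_p": 5.2, "m": 1.8},
    {"island": "chr2:500-900", "neg_log_p": 0.0, "m": 2.5},
    {"island": "chr3:100-400", "neg_log_p": 1.1, "m": 3.0},
    {"island": "chr4:700-1200", "neg_log_p": 4.0, "m": -1.6},
]
b_vs_a = [
    {"island": "chr4:700-1200", "neg_log_p": 3.7, "m": 1.4},
    {"island": "chr1:1000-2000", "neg_log_p": 4.9, "m": -1.5},
]

# The same M > 1 cut-off is used in both directions (step 3).
print("stronger in A:", stronger_islands(a_vs_b))
print("stronger in B:", stronger_islands(b_vs_a))
```

Using one function for both runs keeps the thresholds identical in the two directions, which is the point of step 3.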

R language

Run R
IDR
IDR errors
PCA Source

Fig2.png - Create Project (14.7 KB) Andrey Kartashov, 05/13/2015 10:18 AM

Fig1.png - Open Project Designer (18.7 KB) Andrey Kartashov, 05/13/2015 10:18 AM

Fig3.png - Project Designer Window (119 KB) Andrey Kartashov, 05/13/2015 10:24 AM