GitHub - theandygross/TCGA_differential_expression: Differential expression analysis on TCGA samples.
README
This repository contains all of the analysis notebooks required to reproduce the manuscript:
Andrew M. Gross, Jason F. Kreisberg, Trey Ideker
Analysis
All analysis for the manuscript is recorded in a series of Jupyter (formerly IPython) Notebooks.
To view them, please follow the GitHub or NBviewer links.
Dependencies
This code uses a number of features of the scientific Python stack as well as a small set of standard R libraries. Thus far, it has only been tested in a Linux environment, and it may take some modification to run on other operating systems. I highly recommend installing a scientific Python distribution such as Anaconda or Enthought to handle the majority of the Python dependencies in this project (other than rPy2). These are both free for academic use.
Python Dependencies
- Numpy and Scipy, numeric calculations and statistics in Python
- matplotlib, plotting in Python
- Pandas, data-frames for Python, handles the majority of data-structures
- rPy2, communication between R and Python
- NOT IN DISTRIBUTIONS
- I recommend installing with pip install rpy2
- Needs R to be compiled with shared libraries
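Before running the notebooks, it can help to confirm that the Python dependencies listed above are visible to the interpreter. The following is a minimal sketch using only the standard library; the function name missing_packages and the REQUIRED list are illustrative and not part of this repository. It only checks that each package can be found, not that rPy2 can actually reach a shared-library build of R.

```python
import importlib.util

# Import names of the Python dependencies listed in this README.
REQUIRED = ["numpy", "scipy", "matplotlib", "pandas", "rpy2"]


def missing_packages(names):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed Python dependencies can be found.")
```

Using find_spec avoids importing the packages themselves, so the check stays fast and does not start an embedded R session.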
My Internal Package Dependencies
These are Python packages that I use internally for things such as statistics and visualization. They are all available on my GitHub page; I recommend downloading them and installing each one with python setup.py install. I apologize for the generic names. I am hoping to develop these a bit more and turn them into proper packages in my next code refactor.
- Figures
- Code for better figure generation, mainly using Pandas data-structures
- I am slowly phasing this out and replacing with the very nice seaborn library
- Stats
- Contains two packages, Stats and Helpers
- Stats has a number of helper functions that wrap calls to R or scipy statistics functions and allow them to play nicer with Pandas data-structures
- Helpers has a number of common tasks that I invoke to make code a bit more readable
- NotebookImport
- Utility for importing IPython notebooks as modules
- Code taken from MinRK's Gist
- This depends on the IPython/Jupyter version you are using, so you may get deprecation warnings. I am trying to keep this up to date, but I'm not sure how backwards compatible things are
- MethylTools
- Utility for organizing probe annotations for the Illumina methylation450k chip
- Has some R dependencies
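To give a rough idea of what importing a notebook as a module involves (the job NotebookImport does in this project), here is a simplified standard-library sketch. It is not the NotebookImport code or MinRK's Gist: the function name load_notebook_as_module is hypothetical, the sketch assumes the nbformat 4 layout with a top-level "cells" list, and it simply drops lines that start with % or ! instead of translating IPython magics.

```python
import json
import os
import tempfile
import types


def load_notebook_as_module(path, module_name="notebook_module"):
    """Run the code cells of a .ipynb file and return them as a module object."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)

    module = types.ModuleType(module_name)
    module.__file__ = path

    # Assumes nbformat 4: cells live in a top-level "cells" list.
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        # Simplification: skip IPython magics and shell escapes entirely.
        lines = [line for line in source.splitlines()
                 if not line.lstrip().startswith(("%", "!"))]
        exec("\n".join(lines), module.__dict__)
    return module


if __name__ == "__main__":
    # Build a tiny throwaway notebook so the sketch can be run end to end.
    demo = {
        "nbformat": 4,
        "nbformat_minor": 0,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Helpers"]},
            {"cell_type": "code", "metadata": {}, "outputs": [],
             "execution_count": None,
             "source": ["%matplotlib inline\n",
                        "def double(x):\n",
                        "    return 2 * x\n"]},
        ],
    }
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.ipynb")
        with open(demo_path, "w", encoding="utf-8") as handle:
            json.dump(demo, handle)
        helpers = load_notebook_as_module(demo_path)
        print(helpers.double(21))
```

Because every code cell is executed on import, a notebook loaded this way runs all of its analysis, not just its function definitions. That is worth keeping in mind when one notebook imports another.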
R Dependencies
- IlluminaHumanMethylation450kanno.ilmn12.hg19
- Used for methylation probe annotations
- Should only be a dependency for the MethylTools package
- mgsa
- Model based GSEA
- I don't use this in the paper, but it appears in some exploratory analysis