This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
data [2013/02/14 17:52] wbiesing created |
data [2014/08/10 20:57] (current) ychen [DANN data] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | Publicly available datasets, released as part of our research. | + | ===== DANN data ===== |
+ | |||
+ | The raw data can be found here: [[http://krishna.gs.washington.edu/martin/download/cadd_training/]]. The real SNV, insertion, and deletion samples total 16,627,775. We randomly sample an equal number of simulated samples (SNV, insertion, and deletion) and combine them with the real data, yielding a dataset of 33,255,550 samples. | ||
+ | |||
+ | This dataset is transformed into svmlight format with the script impute2svmlight.py, which was provided by Dr. Martin Kircher (the author of the CADD paper), and the Python package [[https://github.com/mblondel/svmlight-loader|svmlight-loader]]. We roughly partition the dataset into 80% for training, 10% for validation, and 10% for testing. Their svmlight files are here: | ||
+ | |||
+ | |||
+ | |||
+ | ===== tree-hmm sample .bam's from chr19 ===== | ||
+ | |||
+ | Sample data for tree-hmm from the human ENCODE project: [[http://cbcl.ics.uci.edu/public_data/tree-hmm-sample-data|http://cbcl.ics.uci.edu/public_data/tree-hmm-sample-data]] | ||
+ | |||
+ | ===== LRH-1 ChIP-seq Data ===== | ||
+ | |||
+ | Our ChIP-seq analysis of LRH-1 can be found at: [[http://cbcl.ics.uci.edu/public_data/LRH-1|http://cbcl.ics.uci.edu/public_data/LRH-1]] | ||
+ | |||
+ | ===== FXR ChIP-seq Data ===== | ||
+ | |||
+ | [[http://cbcl.ics.uci.edu/public_data/FXR/|http://cbcl.ics.uci.edu/public_data/FXR/]] | ||
+ | |||
+ | ===== SREBP-2 ChIP-seq Data ===== | ||
+ | |||
+ | ChIP-seq was performed on SREBP-2, and peaks were called using [[http://web.me.com/kaestnerlab1/GLITR/|GLITR]]. We also re-analyzed SREBP-1 using GLITR. Supplemental tables, figures, and datasets are available at: [[http://cbcl.ics.uci.edu/public_data/SREBP2/|SREBP2]] | ||
+ | |||
+ | ===== SREBP-1 ChIP-seq Data ===== | ||
+ | |||
+ | Included here are the raw and processed ChIP-seq data from: | ||
+ | |||
+ | //Genome-wide analysis of SREBP-1 binding in mouse liver chromatin reveals a preference for promoter proximal binding to a new motif.// PNAS 2009 106:13765-13769; Young-Kyo Seo, Hansook Kim Chong, Aniello M. Infante, Seung-Soon Im, Xiaohui Xie, and Timothy F. Osborne | ||
+ | |||
+ | * Raw sequence data (raw_data/%%*%%_sequence.txt) was processed using eland to create raw/%%*%%_eland_multi.txt | ||
+ | * ChipSeq-mini 2.0 (http:<nowiki>//</nowiki>woldlab.caltech.edu/html/software, now part of the ERANGE package) was used to process the eland files and produce the processed/%%*%%.bed files. | ||
+ | * IgG files are from the control run; SREBP1 files are from after fasting and refeeding. See the paper for details. | ||
+ | |||
+ | - [[http://cbcl.ics.uci.edu/public_data/SREBP1/processed/|Processed Data]] | ||
+ | - [[http://cbcl.ics.uci.edu/public_data/SREBP1/raw_data/|Raw Data]] |
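
The DANN data section above describes drawing as many simulated samples as there are real ones, combining the two sets, and partitioning the result roughly 80/10/10. A minimal sketch of that procedure follows. It is not the impute2svmlight.py script, which is not reproduced on this page. The function names, the label convention (+1 for simulated, -1 for real), the fixed seed, and the toy feature vectors are all assumptions made for illustration.

```python
import random

# Sanity check on the counts quoted in the DANN data section.
assert 2 * 16_627_775 == 33_255_550


def balanced_split(real, simulated, fractions=(0.8, 0.1, 0.1), seed=0):
    """Draw len(real) simulated samples, label and shuffle, then split.

    Labels are a hypothetical convention: -1 = real, +1 = simulated.
    """
    rng = random.Random(seed)
    sampled = rng.sample(simulated, len(real))  # equal number of simulated samples
    data = [(-1, x) for x in real] + [(1, x) for x in sampled]
    rng.shuffle(data)

    n = len(data)
    n_train = round(fractions[0] * n)
    n_valid = round(fractions[1] * n)
    train = data[:n_train]
    valid = data[n_train:n_train + n_valid]
    test = data[n_train + n_valid:]  # whatever remains, roughly 10%
    return train, valid, test


def to_svmlight_line(label, features):
    """Format one example as '<label> <index>:<value> ...'.

    Feature indices are 1-based and ascending; zero-valued features are omitted.
    """
    pairs = " ".join(f"{i}:{v!r}" for i, v in enumerate(features, start=1) if v != 0)
    return f"{label} {pairs}".rstrip()


if __name__ == "__main__":
    # Toy feature vectors standing in for the real annotation matrix.
    real = [[1.0, 0.0, 2.5] for _ in range(10)]
    simulated = [[0.0, 1.0, 0.0] for _ in range(25)]
    train, valid, test = balanced_split(real, simulated)
    for label, features in train[:3]:
        print(to_svmlight_line(label, features))
```

Splitting after a single shuffle keeps the three partitions disjoint, and a fixed seed makes the partition reproducible. Whether the original split was seeded or stratified by variant type is not stated above.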