Differences

This shows you the differences between two versions of the page.

data [2013/07/08 15:59] wbiesing
data [2014/08/10 20:57] (current) ychen [DANN data]
Line 1: Line 1:
 +===== DANN data =====
 +
 +The raw data can be found here: [[http://krishna.gs.washington.edu/martin/download/cadd_training/]]. The real SNV, insertion, and deletion samples total 16,627,775. We randomly sample an equal number of simulated samples (SNV, insertion, and deletion), combine them with the real data, and obtain a dataset of 33,255,550 samples.
 +
 +This dataset is transformed into svmlight format with the script impute2svmlight.py, provided by Dr. Martin Kircher (an author of the CADD paper), together with the Python package [[https://github.com/mblondel/svmlight-loader|svmlight-loader]]. We partition the dataset roughly into 80% for training, 10% for validation, and 10% for testing. The corresponding svmlight files are here: 
 +
 +
 +
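The rough 80/10/10 partition described above can be illustrated with a minimal sketch. This is not the script used to build the DANN dataset: the function name `split_svmlight_lines`, the seed, and the toy rows are assumptions made for illustration. It assigns each svmlight line to a split independently at random, which is one simple way to get an approximate (not exact) 80/10/10 split.

```python
import random


def split_svmlight_lines(lines, fractions=(0.8, 0.1, 0.1), seed=0):
    """Randomly assign svmlight lines to train/valid/test splits.

    Hypothetical helper, not part of impute2svmlight.py or svmlight-loader.
    Each line is assigned independently, so the split sizes only
    approximate the requested fractions.
    """
    rng = random.Random(seed)
    train_cut = fractions[0]
    valid_cut = fractions[0] + fractions[1]
    parts = {"train": [], "valid": [], "test": []}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        r = rng.random()
        if r < train_cut:
            parts["train"].append(line)
        elif r < valid_cut:
            parts["valid"].append(line)
        else:
            parts["test"].append(line)
    return parts


# Toy rows in svmlight layout: "<label> <index>:<value> ..."
toy = ["%d 1:%d 2:0.5" % (i % 2, i) for i in range(1000)]
parts = split_svmlight_lines(toy)
```

For a file too large to hold in memory, the same per-line draw could be applied while streaming the input and writing each line directly to one of three output files.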
 ===== tree-hmm sample .bam's from chr19 =====
-tree-hmm sample data from the ENCODE human project [[http://cbcl.ics.uci.edu/public_data/tree-hmm-sample-data]]
 +
 +tree-hmm sample data from the ENCODE human project [[http://​cbcl.ics.uci.edu/​public_data/​tree-hmm-sample-data|http://​cbcl.ics.uci.edu/​public_data/​tree-hmm-sample-data]]
  
 ===== LRH-1 ChIP-seq Data =====
-Our ChIP-seq analysis of LRH-1 can be found at: 
-[[http://​cbcl.ics.uci.edu/​public_data/​LRH-1]] 
  
 +Our ChIP-seq analysis of LRH-1 can be found at: [[http://​cbcl.ics.uci.edu/​public_data/​LRH-1|http://​cbcl.ics.uci.edu/​public_data/​LRH-1]]
  
 ===== FXR ChIP-seq Data =====
-[[http://​cbcl.ics.uci.edu/​public_data/​FXR/​]] 
- 
  
 +[[http://​cbcl.ics.uci.edu/​public_data/​FXR/​|http://​cbcl.ics.uci.edu/​public_data/​FXR/​]]
  
 ===== SREBP-2 ChIP-seq Data =====
-ChipSeq was performed on SREBP-2 and peaks were called using [[http://​web.me.com/​kaestnerlab1/​GLITR/​|GLITR]]. ​ We also re-analyzed SREBP-1 using GLITR. 
-Supplemental Tables, Figures, and datasets are available: 
-[[http://​cbcl.ics.uci.edu/​public_data/​SREBP2/​|SREBP2]] 
- 
  
 +ChipSeq was performed on SREBP-2 and peaks were called using [[http://​web.me.com/​kaestnerlab1/​GLITR/​|GLITR]]. ​ We also re-analyzed SREBP-1 using GLITR. Supplemental Tables, Figures, and datasets are available: [[http://​cbcl.ics.uci.edu/​public_data/​SREBP2/​|SREBP2]]
  
 ===== SREBP-1 ChIP-seq Data =====
Line 23: Line 27:
 Included here is ChIP-Seq raw and processed data from:
  
-//Genome-wide analysis of SREBP-1 binding in mouse liver chromatin reveals a preference for promoter proximal binding to a new motif.//
-PNAS 2009 106:13765-13769; Young-Kyo Seo, Hansook Kim Chong, Aniello M. Infante, Seung-Soon Im, Xiaohui Xie, and Timothy F. Osborne
 +//Genome-wide analysis of SREBP-1 binding in mouse liver chromatin reveals a preference for promoter proximal binding to a new motif.// PNAS 2009 106:13765-13769; Young-Kyo Seo, Hansook Kim Chong, Aniello M. Infante, Seung-Soon Im, Xiaohui Xie, and Timothy F. Osborne
 +
 +    * Raw sequence data (raw_data/​%%*%%_sequence.txt) was processed using eland to create raw/​%%*%%_eland_multi.txt 
 +    * ChipSeq-mini 2.0 (http:<​nowiki>//</​nowiki>​woldlab.caltech.edu/​html/​software,​ now part of the ERANGE package) was used to process the eland files and produce the processed/​%%*%%.bed files. 
 +    * IgG files are for the control run, SREBP1 files are after fasting and refeeding. See paper for details.
  
-  *  Raw sequence data (raw_data/*_sequence.txt) was processed using eland to create raw/*_eland_multi.txt
-  *  ChipSeq-mini 2.0 (http:<nowiki>//</nowiki>woldlab.caltech.edu/html/software, now part of the ERANGE package) was used to process the eland files and produce the processed/*.bed files.
-  *  IgG files are for the control run, SREBP1 files are after fasting and refeeding. See paper for details.
 +    - [[http://cbcl.ics.uci.edu/public_data/SREBP1/processed/|Processed Data]]
 +    - [[http://cbcl.ics.uci.edu/public_data/SREBP1/raw_data/|Raw Data]]
 +
  
-  - [[http://​cbcl.ics.uci.edu/​public_data/​SREBP1/​processed/​|Processed Data]] 
-  - [[http://​cbcl.ics.uci.edu/​public_data/​SREBP1/​raw_data/​|Raw Data]] 