===== DANN data =====

The raw data can be found here: [[http://krishna.gs.washington.edu/martin/download/cadd_training/]].

The real SNV, insertion and deletion samples sum to 16,627,775. We randomly sample an equal number of simulated samples (SNV, insertion and deletion), combine them with the real data, and obtain a dataset of 33,255,550 samples. This dataset is transformed into svmlight format with the script impute2svmlight.py, which was provided by Dr. Martin Kircher (the author of the CADD paper), and the Python package [[https://github.com/mblondel/svmlight-loader|svmlight-loader]]. We roughly partition the dataset into 80% for training, 10% for validation and 10% for testing. Their svmlight files are here:

===== tree-hmm sample .bam's from chr19 =====

tree-hmm sample data from the ENCODE human project: [[http://cbcl.ics.uci.edu/public_data/tree-hmm-sample-data|http://cbcl.ics.uci.edu/public_data/tree-hmm-sample-data]]

===== LRH-1 ChIP-seq Data =====

Our ChIP-seq analysis of LRH-1 can be found at: [[http://cbcl.ics.uci.edu/public_data/LRH-1|http://cbcl.ics.uci.edu/public_data/LRH-1]]

===== FXR ChIP-seq Data =====

[[http://cbcl.ics.uci.edu/public_data/FXR/|http://cbcl.ics.uci.edu/public_data/FXR/]]

===== SREBP-2 ChIP-seq Data =====

ChIP-seq was performed on SREBP-2 and peaks were called using [[http://web.me.com/kaestnerlab1/GLITR/|GLITR]]. We also re-analyzed SREBP-1 using GLITR. Supplemental tables, figures, and datasets are available: [[http://cbcl.ics.uci.edu/public_data/SREBP2/|SREBP2]]

===== SREBP-1 ChIP-seq Data =====

Included here are the raw and processed ChIP-seq data from:

//Genome-wide analysis of SREBP-1 binding in mouse liver chromatin reveals a preference for promoter proximal binding to a new motif.// PNAS 2009 106:13765-13769; Young-Kyo Seo, Hansook Kim Chong, Aniello M. Infante, Seung-Soon Im, Xiaohui Xie, and Timothy F. Osborne

  * Raw sequence data (raw_data/%%*%%_sequence.txt) was processed using eland to create raw/%%*%%_eland_multi.txt
  * ChipSeq-mini 2.0 (http://woldlab.caltech.edu/html/software, now part of the ERANGE package) was used to process the eland files and produce the processed/%%*%%.bed files.
  * IgG files are for the control run; SREBP1 files are after fasting and refeeding. See the paper for details.

  - [[http://cbcl.ics.uci.edu/public_data/SREBP1/processed/|Processed Data]]
  - [[http://cbcl.ics.uci.edu/public_data/SREBP1/raw_data/|Raw Data]]
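
For a quick sanity check of the .bed files under processed/, the following minimal Python sketch counts intervals per chromosome. It assumes the files follow the standard BED layout (chromosome, 0-based start, exclusive end in the first three whitespace-separated columns); the exact columns written by ChipSeq-mini are not documented on this page, so verify against a real file first. The names ''BedInterval'', ''parse_bed'' and ''intervals_per_chromosome'' are hypothetical helpers, not part of ERANGE.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class BedInterval(NamedTuple):
    """One BED record (assumed standard layout, first three columns only)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def parse_bed(lines: Iterable[str]) -> Iterator[BedInterval]:
    """Yield intervals from BED-formatted lines, skipping headers and blanks."""
    for line in lines:
        stripped = line.strip()
        # Skip empty lines and the optional comment/track/browser header lines.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        fields = stripped.split()
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 columns, got: {line!r}")
        yield BedInterval(fields[0], int(fields[1]), int(fields[2]))


def intervals_per_chromosome(intervals: Iterable[BedInterval]) -> Counter:
    """Count how many intervals fall on each chromosome."""
    return Counter(iv.chrom for iv in intervals)


# Made-up example lines standing in for a real processed .bed file.
example_lines = [
    "track name=example",
    "chr1\t100\t136",
    "chr1\t500\t536",
    "chr19\t42\t78",
]
example_intervals = list(parse_bed(example_lines))
example_counts = intervals_per_chromosome(example_intervals)
```

To run this on a downloaded file, pass an open file handle instead of the example list, e.g. ''with open(path) as fh: counts = intervals_per_chromosome(parse_bed(fh))'', where ''path'' is wherever you saved the file.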