######################################################################
 
README
 
######################################################################

=====================================
Usage Description
=====================================
The mostCons element identification pipeline detects conserved elements from a 12-way whole-genome alignment.
Two steps are performed: (1) phyloFit fits the conserved and nonconserved models; (2) phastCons identifies the mostCons elements.

 
=====================================
Directory Contents
=====================================
 
This directory includes the README file, the input directory, the output directory and the scripts directory.
 
 
The sections below include:
 
	scripts directory
		step1_phyloFit.sh
		step2_mostCons.sh
	Input directory
		CDS_bed directory
			Chr[1-7]_CDS.bed files
		maf directory
			12way.maf
			chr_maf directory
				chr[1-7].maf
	Output directory
		mostCons directory
			allMostCons.bed file
			chr[1-7] directory
				chr[1-7]-mostCons.bed file
				chr[1-7]-scores.wig file
	README file
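
Before running the scripts, it can help to confirm that the input layout listed above is in place. The following is a minimal sketch, not part of the pipeline: the function name check_inputs is hypothetical, and the path to the input directory is supplied by the caller because its exact name on disk is not stated here.

```shell
#!/usr/bin/env bash
# Sketch: report which expected input files are missing under a given root.
# File names follow the directory listing; check_inputs is a hypothetical helper.

check_inputs() {   # $1 = path to the input directory
    local root="$1" missing=0 i f
    for i in 1 2 3 4 5 6 7; do
        for f in "CDS_bed/Chr${i}_CDS.bed" "maf/chr_maf/chr${i}.maf"; do
            if [ ! -f "$root/$f" ]; then
                echo "missing: $f"
                missing=$((missing + 1))
            fi
        done
    done
    if [ ! -f "$root/maf/12way.maf" ]; then
        echo "missing: maf/12way.maf"
        missing=$((missing + 1))
    fi
    # Return success only when every expected file was found.
    [ "$missing" -eq 0 ]
}
```

Calling check_inputs with the path of the input directory prints one line per missing file and returns a non-zero status if any file is absent.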
 
 
=====================================
run.sh file
=====================================
##############################################################################################
# Step1 phyloFit to fit models
# Usage: fit two models (conserved and nonconserved) with phyloFit based on the 12-way alignment
#      Dependency_tools: phast package;
#      Input: 12-way multiple alignment, CDS annotation, tree.newick
#      Output: Chr*.cons.mod, Chr*.noncons.mod
# Step2 mostCons elements identification
# Usage: identify mostCons elements with phastCons using the two fitted models
#      Dependency_tools: phast package, bedtools;
#      Input: Chr*.cons.mod, Chr*.noncons.mod, 12-way alignment MAF file;
#      Output: mostCons elements;
##############################################################################################
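
The two steps described above can be sketched as a dry run that only prints the command lines. This is an illustration under stated assumptions, not the content of run.sh or of the step scripts: the paths, the helper names phylofit_cmd and phastcons_cmd, and the choice of options are all assumptions. Only the tools (phyloFit, phastCons) and the file name patterns come from this README.

```shell
#!/usr/bin/env bash
# Dry-run sketch: these functions only print command lines; nothing is executed.
# All paths are assumptions for illustration, not taken from the actual scripts.

tree="input/tree.newick"
maf_dir="input/maf/chr_maf"
out_dir="output"

# Step 1: one phyloFit call per model.
# $1 = chromosome number, $2 = cons|noncons, $3 = alignment format, $4 = alignment file
phylofit_cmd() {
    printf 'phyloFit --tree %s --msa-format %s --out-root %s %s\n' \
        "$tree" "$3" "$out_dir/Chr$1.$2" "$4"
}

# Step 2: phastCons takes the alignment and the two models (conserved first),
# writes the elements to --most-conserved and the per-base scores to stdout.
# $1 = chromosome number
phastcons_cmd() {
    local chr_out="$out_dir/mostCons/chr$1"
    printf 'phastCons --msa-format MAF --most-conserved %s %s %s > %s\n' \
        "$chr_out/chr$1-mostCons.bed" \
        "$maf_dir/chr$1.maf" \
        "$out_dir/Chr$1.cons.mod,$out_dir/Chr$1.noncons.mod" \
        "$chr_out/chr$1-scores.wig"
}

# Print the step 2 command for each of the seven chromosomes.
for i in 1 2 3 4 5 6 7; do
    phastcons_cmd "$i"
done
```

The alignment passed to phylofit_cmd is left as an argument because the conserved and nonconserved models are fitted on different site sets, whose file names are not given here. phastCons tuning options such as --target-coverage and --expected-length are omitted because their values are not stated in this README.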


=====================================
scripts directory
=====================================
This directory provides two scripts:
(1) step1_phyloFit.sh fits the two models;
(2) step2_mostCons.sh detects the conserved elements.

=====================================
Input directory
=====================================
This directory provides the 12-way alignment file in MAF format;
the tree.newick file required by phyloFit and phastCons;
the CDS_bed directory, which contains the coding sequence annotation of the cucumber genome.

=====================================
Output directory
=====================================
This directory provides the main output files, including:
the fitted models;
the sites directory, which contains the three types of sites used to fit the models (1st-codon, 2nd-codon and 4d sites);
the mostCons elements directory.
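
The mostCons directory holds both per-chromosome BED files and a combined allMostCons.bed. One way such a combined file could be produced is sketched below with plain cat and sort. This is an assumption for illustration: the actual script may build the file differently (for example with bedtools), and merge_mostcons is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: concatenate per-chromosome mostCons BED files and sort them
# by chromosome name, then by numeric start position.
# merge_mostcons is a hypothetical helper, not part of the pipeline scripts.

merge_mostcons() {   # $1 = path to the mostCons directory
    local dir="$1"
    cat "$dir"/chr[1-7]/chr[1-7]-mostCons.bed \
        | LC_ALL=C sort -k1,1 -k2,2n > "$dir/allMostCons.bed"
}
```

Calling merge_mostcons with the path of the mostCons directory writes allMostCons.bed into that same directory.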

=====================================
README file
=====================================
It is this file.

 



