######################################################################
 
README
 
######################################################################

=====================================
Usage Description
=====================================
This module identifies multiple alignment anchors (MAAs) from the 15-way multiple alignment file, 15way.multiz.
It runs in two main steps:
	(1) Using the mafTool-master tool, select from the 15-way multiple alignment the aligned blocks in which a subgroup of n species co-occur;
	(2) Using in-house Perl scripts, select MAAs according to block identity and length.
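
Step (1) can be illustrated with a minimal Python sketch. It is not the mafTool-master
implementation and does not reproduce its command line. It assumes the standard MAF layout
("a" lines open a block, "s" lines carry the sequences) and that the src field is named
"species.chromosome". The function names and the species name "melon" are hypothetical
placeholders.

```python
def read_maf_blocks(lines):
    """Yield each alignment block as a list of its 's' lines, split into fields."""
    block = []
    for line in lines:
        line = line.strip()
        if line == "a" or line.startswith("a "):
            # An 'a' line opens a new block; emit the previous one first.
            if block:
                yield block
            block = []
        elif line.startswith("s "):
            # Fields: s, src, start, size, strand, srcSize, text
            block.append(line.split())
    if block:
        yield block


def species_of(src):
    # Assumption: src looks like "species.chromosome".
    return src.split(".", 1)[0]


def blocks_with_all(blocks, wanted):
    """Keep only the blocks in which every wanted species is present."""
    wanted = set(wanted)
    for block in blocks:
        present = {species_of(row[1]) for row in block}
        if wanted <= present:
            yield block


# Toy input: the first block holds both species, the second only one.
maf = """##maf version=1
a score=10
s cucumber.chr1 0 4 + 100 ACGT
s melon.chr2 5 4 + 200 ACGT

a score=5
s cucumber.chr1 10 3 + 100 ACG
""".splitlines()

kept = list(blocks_with_all(read_maf_blocks(maf), ["cucumber", "melon"]))
```

With the toy input, only the first block is kept, because the second block lacks one of
the two requested species.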

 
=====================================
Directory Contents
=====================================
 
This directory contains the README, an input directory, an output directory, and a scripts directory.
 
 
The sections below include:
 
	scripts directory
		pl_scripts
			run.sh
			pl_scripts
				maf_parse.pl
	Input directory
		species_list directory
			species[1~15].list files
		15way.multiz
	Output directory
		MAAs.table
		MAAs.maf
		MAAs.fasta
	README file
 
 
=====================================
run.sh file
=====================================
##############################################################################################
# Prepare dataset: 15way.multiz (produced by alignment.sh) and the 15 species.list files
# Description: Step 1. From the 15-way multiple alignment, use the mafTool-master tool to
#              select subgroups of n species that co-occur in one aligned block, and save
#              them as n-way multiple alignment anchors [n-way.MAAs].
#              Steps 1.1 to 1.14 are run one after another.
# Usage: Species are indexed by d, where d increases with the divergence from
#        cucumber according to the phylogenetic tree in Figure 1.
#        Dependency_tools: mafTool-master
#        Input: 15way.multiz, 15 species.list files
#        Output: n-way.MAAs [n=1~15]
##############################################################################################



=====================================
scripts directory
=====================================
This directory provides the one Perl script that MAAs_identify.sh depends on.
(1) maf_parse.pl is used to parse maf files in steps 2.1~2.3 of MAAs_identify.sh.
#############################################################################################################################
#	This script selects MAAs from MAAs-15.maf according to each block's identity and minimum length.
#	Usage:
#		perl maf_parse.pl identity_cut length_cut output_label <1: output_fasta; 2: output_table>
#		identity_cut: A number in [0,1] giving the minimum ratio of identical bases
#			across all 15 species in one block, with cucumber as reference. Default: 0
#		length_cut: The minimum length of a block, with cucumber as reference. Default: 1
#		output_label: 1:len_identity.maf, 2:len_identity.fasta, 3:len_identity.table
#			7 columns in table: (1) MAAs-ID (2) species (3) chromosome (4) start (5) end (6) strand (7) aligned_seq
#############################################################################################################################
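
The identity_cut and length_cut filters can be sketched in Python as follows. This is an
illustration, not the code of maf_parse.pl. It assumes that block length is the number of
non-gap bases in the reference (cucumber) row, and that identity is the fraction of those
reference positions where every aligned species carries the same base. The function names
and the choice of row 0 as the reference are hypothetical.

```python
def block_identity_and_length(seqs, ref_index=0):
    """Return (identity, length) for one block of equal-length aligned rows.

    Assumption: columns where the reference row has a gap are skipped, so
    both values are measured against the reference sequence.
    """
    length = 0
    identical = 0
    for column in zip(*seqs):
        ref_base = column[ref_index].upper()
        if ref_base == "-":
            continue
        length += 1
        # A column counts as identical only if every row matches the reference.
        if all(base.upper() == ref_base for base in column):
            identical += 1
    identity = identical / length if length else 0.0
    return identity, length


def passes(seqs, identity_cut=0.0, length_cut=1):
    """Apply both cutoffs, using the defaults given in the usage text."""
    identity, length = block_identity_and_length(seqs)
    return identity >= identity_cut and length >= length_cut


# Toy block: the reference row comes first, and column 5 is a gap in it.
block = ["ACGT-A", "ACCT-A", "ACGTTA"]
identity, length = block_identity_and_length(block)
```

For the toy block, five reference positions are counted and four of them are identical in
all rows, so the block passes identity_cut = 0.8 but fails identity_cut = 0.9.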


=====================================
Input directory
=====================================
This directory provides 15way.multiz (produced by alignment.sh) and the species_list/species[1~15].list files required by MAAs_identify.sh.


=====================================
Output directory
=====================================
This directory provides three types of MAAs files: fasta, maf, and table.
The table file has 7 columns: (1) MAAs-ID (2) species (3) chromosome (4) start (5) end (6) strand (7) aligned_seq
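
A small Python sketch for reading one row of the table file is given here. It assumes the
7 columns are tab-separated and that start and end are integers; neither is confirmed by
this README. The MaaRow type, the parse_table_line function, and the example values are
hypothetical.

```python
from collections import namedtuple

# One field per column, in the order listed above.
MaaRow = namedtuple(
    "MaaRow", "maa_id species chromosome start end strand aligned_seq"
)


def parse_table_line(line, sep="\t"):
    """Split one table line into a MaaRow (assumes tab-separated columns)."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 7:
        raise ValueError("expected 7 columns, got %d" % len(fields))
    maa_id, species, chromosome, start, end, strand, aligned_seq = fields
    return MaaRow(maa_id, species, chromosome, int(start), int(end), strand, aligned_seq)


# Made-up example row; the ID format and coordinates are placeholders.
row = parse_table_line("MAA_1\tcucumber\tchr1\t100\t104\t+\tACGT\n")
```

Named fields such as row.species and row.aligned_seq make downstream filtering easier to
read than positional indexing.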


=====================================
README file
=====================================
This is the file you are reading.

 



