######################################################################
 
README
 
######################################################################

=====================================
Usage Description
=====================================
The collinear segment detection module detects collinear segments at the n-way and 2way-n levels among 15 species, using MAAs as genomic markers.
It runs three main steps:
	(1) Step 1: Detect the collinear segments among the 15 species, using MAAs as genomic markers.
	(2) Step 2: Parse segments.txt to collect the n-way collinear segments.
	(3) Step 3: Convert the segment tables to gff3 format.

 
=====================================
Directory Contents
=====================================
 
This directory contains this README, an input directory, an output directory, and a scripts directory.
 
 
The sections below include:

	scripts directory
		run.sh
		pl_scripts
			write_scaffold_lst_files.pl
			write_families.pl
			write_genome_lst.pl		
			Step2.pl
			Step3.pl
	Input directory
		MAAs_gff3_dir directory
			MAAs.gff3 files
		MAAs.table
		ini
		segments.txt
	Output directory
		segments_gff3 directory
			n-way directory
				{n}-way_segments.gff3
			2way-n directory
				2way-{n}_segments.gff3
		segments_table directory
			n-way directory
				{n}-way_segments.table
			2way-n directory
				2way-{n}_segments.table		
	README file
 
 
=====================================
run.sh file
=====================================
#############################################################################################################
# Prepare dataset: MAAs.table produced by MAAs_identification.sh.													
# Description:
#	Step 1 Detect the collinear segments among 15 species, using MAAs as genomic markers.
#		The ini file gives the running parameters used in i-ADHoRe.
#		Step1.1~1.4 prepare the families.csv and data.ini required by i-ADHoRe;
#		Step1.5 runs i-ADHoRe to detect the collinear segments among 15 species
#	Step 2 Parse segments.txt to collect the n-way collinear segments
#		Output as n-way_segments.table with 7 columns including (1)segmentID;(2)multipliconID;
#		(3)species;(4)chr;(5)start_anchorID;(6)end_anchorID;(7)order
#	Step 3 Convert the table to gff3 format for collinear segments
# Dependency tools: write_scaffold_lst_files.pl, write_families.pl, write_genome_lst.pl, Step2.pl, Step3.pl
# Usage: sh run.sh MAAs.table
#	Input: MAAs.table, ini which includes the i-adhore running parameters					
#	Output: 2way-n.collinear.gff3 [n=2~15] and n-way.collinear.gff3 [n=3~15]					
#############################################################################################################



=====================================
scripts directory
=====================================
This directory provides run.sh and the five Perl scripts it depends on.

(1)	write_scaffold_lst_files.pl writes the scaffold list files referenced by the data.ini file, in step 1.1 of run.sh
#################################################################################################
#	This script can be used to prepare the species_scaffold_lst/* files used in the i-ADHoRe tool.
#	Usage:	perl write_scaffold_lst_files.pl MAAs_gff(or genes_gff) species.
#	Input:	MAAs_gff, species																
#	Output:	species_scaffold_lst/*scaffold.lst												
#################################################################################################
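
The preparation of scaffold lists can be illustrated with a minimal Python sketch. It is not the Perl script itself: the function name `scaffold_lists`, the use of the `ID` attribute as the marker name, and the choice to append the strand sign to each name are assumptions made for illustration. The sketch groups gff3 features by scaffold and orders them by start position.

```python
from collections import defaultdict


def scaffold_lists(gff3_lines, id_key="ID"):
    """Group gff3 features by scaffold and return marker names in positional order.

    Hypothetical helper: the attribute key and the name+strand output form
    are assumptions, not taken from write_scaffold_lst_files.pl.
    """
    by_scaffold = defaultdict(list)
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and gff3 directives/comments
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column feature line
        seqid, start, strand, attrs = cols[0], int(cols[3]), cols[6], cols[8]
        fields = dict(a.split("=", 1) for a in attrs.split(";") if "=" in a)
        if id_key not in fields:
            continue
        # Markers without a stated strand are treated as forward (assumption).
        sign = strand if strand in ("+", "-") else "+"
        by_scaffold[seqid].append((start, fields[id_key] + sign))
    # Sort each scaffold's markers by start coordinate.
    return {s: [name for _, name in sorted(feats)] for s, feats in by_scaffold.items()}


example = [
    "##gff-version 3",
    "scaf1\tdemo\tMAA\t500\t600\t.\t-\t.\tID=MAA2",
    "scaf1\tdemo\tMAA\t100\t200\t.\t+\t.\tID=MAA1",
    "scaf2\tdemo\tMAA\t10\t90\t.\t+\t.\tID=MAA3",
]
lists = scaffold_lists(example)
```

Each value in `lists` would then be written to its own file under species_scaffold_lst/, one marker per line.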

(2)	write_families.pl prepares the families.csv file in step 1.2 of run.sh
#################################################################################
#	This script can be used to prepare the family file used in the i-ADHoRe tool.
#	Usage: perl write_families.pl MAAs.table > families.csv					
#	Input: MAAs.table														
#	Output: families.csv													
#	Output has two columns: (1)MAAs_ID;(2)HOM_ID.
#################################################################################
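
As a rough illustration of the two-column family file, the Python sketch below maps each marker to its homologous group. The layout of MAAs.table is not described in this README, so the column positions (`maa_col`, `hom_col`), the tab delimiter, and the function name `families_rows` are all hypothetical.

```python
def families_rows(table_lines, maa_col=0, hom_col=1):
    """Return 'MAAs_ID<TAB>HOM_ID' rows from a tab-delimited table.

    Hypothetical helper: the column positions are assumptions, not the
    documented layout of MAAs.table.
    """
    seen = set()
    rows = []
    for line in table_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        maa_id, hom_id = cols[maa_col], cols[hom_col]
        if maa_id in seen:
            continue  # keep the first family seen for a marker (a choice made for this sketch)
        seen.add(maa_id)
        rows.append(f"{maa_id}\t{hom_id}")
    return rows


table = ["MAA1\tHOM1", "MAA2\tHOM1", "MAA3\tHOM2", "MAA1\tHOM9"]
family_rows = families_rows(table)
```

Writing `family_rows` out one per line gives a file of the same shape as families.csv.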

(3)	write_genome_lst.pl writes the genome.lst referenced by the data.ini file, in step 1.3 of run.sh
#####################################################################################
#	This script can be used to prepare the genome.lst file used in the i-ADHoRe tool.
#	Usage:	perl write_genome_lst.pl 15 species									
#	Input:	15 species name, genome_scaffold_lst								
#	Output:	genome.lst															
#####################################################################################

(4)	Step2.pl prepares the segments.table files in step 2 of run.sh
#####################################################################################################################################################################
#	This script can be used to prepare the segments.table file used in run.sh.
#	n-way[2way-n].segments.table: select the multiplicon segments that include the determined species subgroups.												
#	Usage: perl Step2.pl segments.txt reference_genome subgroup_of_species_index > n-way[2way-n].segments.table.												
#	Input: segments.txt.																																		
#	Output: n-way[2way-n].segments.table as 7 columns including (1)segmentID;(2)multipliconID;(3)species;(4)scaffold;(5)start_anchorID;(6)end_anchorID;(7)order.
#####################################################################################################################################################################
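
The selection of multiplicon segments for a species subgroup can be sketched in Python as follows. This is an illustration under stated assumptions, not the logic of Step2.pl: the function name `select_segments` is hypothetical, rows are assumed to be already split into the 7 columns listed above, and the rule used here (keep a multiplicon only when every required species is present, and keep only the rows of those species) is one plausible reading of the description.

```python
from collections import defaultdict


def select_segments(segment_rows, required_species):
    """Keep segments of multiplicons that contain every required species.

    Each row is a 7-tuple: (segmentID, multipliconID, species, scaffold,
    start_anchorID, end_anchorID, order). The selection rule is an
    assumption for illustration and may differ from Step2.pl.
    """
    required = set(required_species)
    by_multiplicon = defaultdict(list)
    for row in segment_rows:
        by_multiplicon[row[1]].append(row)  # group rows by multipliconID

    kept = []
    for rows in by_multiplicon.values():
        present = {r[2] for r in rows}  # species found in this multiplicon
        if required <= present:
            kept.extend(r for r in rows if r[2] in required)
    return kept


rows = [
    ("1", "1", "spA", "scaf1", "MAA1", "MAA5", "0"),
    ("2", "1", "spB", "scaf7", "MAA11", "MAA15", "1"),
    ("3", "2", "spA", "scaf2", "MAA20", "MAA24", "0"),
    ("4", "2", "spC", "scaf3", "MAA30", "MAA34", "1"),
]
selected = select_segments(rows, ["spA", "spB"])
```

In this example only multiplicon 1 contains both `spA` and `spB`, so its two segments are kept and multiplicon 2 is dropped.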

(5)	Step3.pl produces the segments.gff3 files in step 3 of run.sh
#####################################################################################################################
#	This script can be used to obtain segments.gff3 in run.sh as Step 3.
#	n-way[2way-n]_segments.gff3: Collect the start and end information from MAAs.table and segments.table.		
#	Usage: perl Step3.pl n-way[2way-n]_segments.table MAAs.table > n-way[2way-n].segments.gff3.					
#	Input: n-way[2way-n]_segments.table, MAAs.table.															
#	Output: n-way[2way-n].segments.gff3.																		
#	output with 9 Columns as: (1)scaffold;(2)ADHore;(3)coline;(4)Start;(5)End;(6).;(7)strand;(8)spec;(9)annotation	
#####################################################################################################################
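
The table-to-gff3 conversion can be sketched in Python as below. It is a minimal illustration, not Step3.pl: `segment_to_gff3` and the `coords` lookup (marker ID to start and end position, which would be built from MAAs.table) are hypothetical, and three output choices are assumptions: the strand is written as ".", column 8 ("spec") carries the species name, and the attribute keys in column 9 are invented for the example.

```python
def segment_to_gff3(segment, coords):
    """Format one 7-column segment row as a 9-column gff3 line.

    `coords` maps a marker ID to (start, end). Strand, the meaning of the
    "spec" column, and the attribute keys are assumptions of this sketch.
    """
    seg_id, multiplicon_id, species, scaffold, first, last, order = segment
    # The segment spans from the outermost coordinate of its first anchor
    # to the outermost coordinate of its last anchor.
    positions = [*coords[first], *coords[last]]
    start, end = min(positions), max(positions)
    attributes = f"ID=segment{seg_id};multiplicon={multiplicon_id};order={order}"
    return "\t".join(
        [scaffold, "ADHore", "coline", str(start), str(end), ".", ".", species, attributes]
    )


coords = {"MAA1": (100, 200), "MAA5": (900, 1000)}
segment = ("1", "1", "spA", "scaf1", "MAA1", "MAA5", "0")
gff3_line = segment_to_gff3(segment, coords)
```

Taking the minimum and maximum over both anchors keeps `start <= end` even when the last anchor lies upstream of the first one.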



=====================================
Input directory
=====================================
This directory provides the inputs required by run.sh: MAAs.table, produced by MAAs_identification.sh; the ini file, which sets the i-ADHoRe running parameters; MAAs_gff3_dir, which contains the MAAs.gff3 files; and segments.txt, produced by i-ADHoRe.


=====================================
Output directory
=====================================
This directory provides the collinear segments in two file types, table and gff3, with the two levels (n-way and 2way-n) stored in separate subdirectories.
(1) Table files have 7 columns: (1)segmentID;(2)multipliconID;(3)species;(4)scaffold;(5)start_anchorID;(6)end_anchorID;(7)order.
(2) gff3 files have 9 columns: (1)scaffold;(2)ADHore;(3)coline;(4)Start;(5)End;(6).;(7)strand;(8)spec;(9)annotation

=====================================
README file
=====================================
This file.

 



