######################################################################
 
README
 
######################################################################

=====================================
Usage Description
=====================================
The Function_transfer module transfers GO terms from 10 species in the Uniprot-proteome database to cucumber protein-coding genes, guided by the orthologous relationships among the 15-way species inferred by OPPs_identify.sh.
It performs three main steps:
	(1) Step 1: Perform function transfer at the 2way-n level.
	(2) Step 2: Perform function transfer at the n-way level.
	(3) Step 3: Combine all results into Final.anno.

 
=====================================
Directory Contents
=====================================
 
This directory contains the README file and the input, output, and scripts directories.
 
 
The sections below include:

	scripts directory
		run.sh
		pl_scripts
			function_transfer.pl		
	Input directory
		genemap_good directory
			n-way directory
				ref_spec_{n}way_good.genemap
			2way-n directory
				ref_spec_2way-{n}_good.genemap
		uniprot-proteome directory
			uniprot-proteome_species.txt
	Output directory
		n-way directory
			ref_spec_{n}-way.anno
		2way-n directory
			ref_spec_2way-{n}.anno
		2way-n_all.anno file
		n-way_all.anno file
		Final.anno file
	README file
 
 
=====================================
run.sh file
=====================================
#####################################################################################################################################
# Prepare dataset: genemap_good files produced by OPPs_identification.sh; uniprot-proteome files for 10 species
# Description: Step 1: perform function transfer at the 2way-n level.
#              Step 2: perform function transfer at the n-way level.
#              Step 3: combine all results into Final.anno.
# Dependency tools: function_transfer.pl							 																
# Usage: sh run.sh
#	Input: genemap_good/n-way(2way-n)/good.genemap, uniprot-proteome.txt												
#	Output: Final.anno																									
#	Output has 7 columns: (1) segmentID; (2) ref_geneID; (3) spec_geneID; (4) GO-terms; (5) Evidence; (6) species; (7) way
#####################################################################################################################################
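
As a rough illustration of Step 3, the sketch below merges per-level .anno files by plain concatenation. This is an assumption about how the results are combined, not a copy of run.sh: it presumes .anno files are headerless text with one annotation per line. The function name combine_anno is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of Step 3: merge annotation tables by concatenation.
# Assumption: .anno files have no header line, so concatenating them
# yields a valid combined table.

# combine_anno OUT IN...
# Concatenate the input files into OUT, skipping any input that is
# missing or empty (for example, an unmatched glob pattern).
combine_anno() {
    out=$1
    shift
    : > "$out"                  # create or truncate the output file
    for f in "$@"; do
        if [ -s "$f" ]; then    # -s: file exists and is non-empty
            cat "$f" >> "$out"
        fi
    done
}
```

With the layout described in this README, and assuming the directories are named output/2way-n and output/n-way, the function could be called as:

	combine_anno output/2way-n_all.anno output/2way-n/*.anno
	combine_anno output/n-way_all.anno output/n-way/*.anno
	combine_anno output/Final.anno output/2way-n_all.anno output/n-way_all.anno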



=====================================
scripts directory
=====================================
This directory provides the one Perl script that run.sh depends on.

(1)	function_transfer.pl performs the function transfer required in Step 1 and Step 2 of run.sh
#########################################################################################################################################################
# This script transfers functional annotations from the uniprot-proteome database to cucumber protein-coding genes, guided by the OPPs results.
# Usage: perl function_transfer.pl uniprot_anno_spec.txt ref_spec_n-way[2way-n].good.genemap n-way[2way-n] spec > Final.anno
#	Input:
#	(1) uniprot_anno_spec.txt: uniprot-proteome annotation file; the proteomes of ten species were downloaded from http://www.uniprot.org/proteomes/
#	(2) good.genemap: ref_spec_nway_good.genemap obtained in Step 3.1 of OPPs_identify.sh
#	(3) n-way[2way-n]: the segment level that supports the inference of OPPs
#	(4) spec: the species from which the functional annotation originates
#	Output:	Final.anno
#	Final.anno has 7 columns: 1: segmentID; 2: ref_geneID; 3: spec_geneID; 4: GO-term; 5: evidence; 6: spec; 7: n-way[2way-n]
#########################################################################################################################################################
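
To make the argument order in the Usage line concrete, the sketch below prints (without executing) the command for one genemap file. The helper name print_transfer_cmd is hypothetical. The output naming is an assumption too: it strips the "_good.genemap" suffix and appends ".anno", which matches the 2way-n file names listed in this README but not the hyphenated n-way output names.

```shell
#!/bin/sh
# Hypothetical dry-run helper: show the function_transfer.pl command
# for one genemap file, following the argument order in the Usage line.

# print_transfer_cmd UNIPROT GENEMAP LEVEL SPEC OUTDIR
print_transfer_cmd() {
    uniprot=$1      # uniprot-proteome annotation file
    genemap=$2      # *_good.genemap file
    level=$3        # segment level label (n-way or 2way-n)
    spec=$4         # species the annotation originates from
    outdir=$5       # directory for the resulting .anno file
    # Assumed output name: drop the directory and "_good.genemap" suffix.
    base=$(basename "$genemap" _good.genemap)
    printf 'perl function_transfer.pl %s %s %s %s > %s/%s.anno\n' \
        "$uniprot" "$genemap" "$level" "$spec" "$outdir" "$base"
}
```

Printing the command first allows the arguments to be checked by eye; once they look right, the printed line can be run by piping it to sh.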


=====================================
Input directory
=====================================
This directory provides:
(1) genemap_good/*_good.genemap files obtained by OPPs_identification.sh. 
(2) uniprot-proteome/uniprot-proteome_species.txt downloaded from http://www.uniprot.org/proteomes/.


=====================================
Output directory
=====================================
This directory provides: 
(1) The 2way-n directory contains ref_spec_2way-{n}.anno files with 7 columns: 1: segmentID; 2: ref_geneID; 3: spec_geneID; 4: GO-term; 5: evidence; 6: spec; 7: 2way-n.
(2) The n-way directory contains ref_spec_{n}-way.anno files with 7 columns: 1: segmentID; 2: ref_geneID; 3: spec_geneID; 4: GO-term; 5: evidence; 6: spec; 7: n-way.
(3) 2way-n_all.anno holds all annotation results at the 2way-n level, with 7 columns: 1: segmentID; 2: ref_geneID; 3: spec_geneID; 4: GO-term; 5: evidence; 6: spec; 7: 2way-n.
(4) n-way_all.anno holds all annotation results at the n-way level, with 7 columns: 1: segmentID; 2: ref_geneID; 3: spec_geneID; 4: GO-term; 5: evidence; 6: spec; 7: n-way.
(5) Final.anno contains all annotations from both levels and is used for further analysis.
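
Because every .anno file is described as having 7 columns, a quick sanity check can catch truncated or malformed lines before further analysis. The sketch below assumes the columns are tab-separated, which this README does not state; check_anno_columns is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical sanity check for .anno files.
# Assumption: columns are separated by tab characters.

# check_anno_columns FILE
# Report every line that does not have exactly 7 fields.
# Exit status is 0 when all lines pass and 1 otherwise.
check_anno_columns() {
    awk -F '\t' '
        BEGIN { bad = 0 }
        NF != 7 {
            printf "%s:%d: expected 7 columns, found %d\n", FILENAME, FNR, NF
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

For example, "check_anno_columns Final.anno && echo OK" prints OK only when every line has 7 fields.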


=====================================
README file
=====================================
It is this file.

 



