This script is used to convert some Genbank format files to the GFF3 format (including Fasta). All features describes in the sheet will result in a GFF entry.
GFF entries will also refer to original Genbank file with an additional attribute to allow the download of original sheet for any entry.
The script is located in solr/bin directory of the distribution and requires BioPerl.
Calling genbank2gff.pl with no option will output help usage. Basic usage is:
perl genbank2gff3.pl --dir path_to_files --outdir path_to_converted_files
A --json option is available to extract sequence data (nucleic or proteic) and to write them in a flat file in the JSON format. This format is supported as input for riak and mongodb. Furthermore mongodb provides an import tool that takes input from json files, really convenient for bulk uploads (first import for example).