The AGINFRAplus4BioCos Virtual Research Environment has been deployed to enable BioCoS members to exploit AGINFRAplus facilities and services and to reconsider their current solutions.
BioCoS's novel bioinformatics tool, the “TRUE PLANT” Biomarkers Computational System (“TRUE-PLANT” BioCoS), can identify potential genetic markers (SNPs, SSRs, etc.) and/or species-specific genomic loci to be applied as artificial markers, using DNA information from thousands of publicly available complete genomes. The tool works by processing thousands of genomes (~15,000), collectively the pan-genome, excluding those associated with the species/organism of interest. The aim is to find short sequences of a specific length that are not tolerated in the pan-genome but are evolutionarily conserved in the species/organism of interest. Post-processing of these sequences isolates potential genetic markers for use in DNA authenticity solutions, as well as artificial ones to be applied as tags in DNA traceability.
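The functional principle described above can be illustrated with a minimal sketch. It is not the actual “TRUE-PLANT” BioCoS implementation, whose algorithms are not given here. It assumes that "short sequences of specific length" can be modelled as k-mers, that "conserved" means present in every genome of the species of interest, and that "not tolerated" means absent from every pan-genome sequence. The function names, the value of k and the toy sequences are all hypothetical, and reverse complements are ignored for simplicity.

```python
from typing import Iterable, Set


def kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of an upper-cased DNA sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_markers(target_genomes: Iterable[str],
                      pan_genome: Iterable[str],
                      k: int) -> Set[str]:
    """Hypothetical helper: k-mers found in every target genome
    and in none of the pan-genome sequences."""
    targets = list(target_genomes)
    if not targets:
        return set()

    # Step 1: keep only k-mers conserved across all target genomes.
    conserved = kmers(targets[0], k)
    for genome in targets[1:]:
        conserved &= kmers(genome, k)

    # Step 2: discard every k-mer that occurs anywhere in the pan-genome.
    for genome in pan_genome:
        conserved -= kmers(genome, k)
        if not conserved:
            break  # nothing left to filter
    return conserved


# Toy data (invented for illustration only).
species_of_interest = ["ACGTACGGA", "TTACGGAAC"]
pan_genome = ["GGTACGGTT", "CCCCCCCCC"]
print(sorted(candidate_markers(species_of_interest, pan_genome, k=5)))  # ['ACGGA']
```

No genome is aligned against another in this sketch; only sets of short substrings are compared, which is the sense in which such an approach is alignment free. Holding these sets in memory for thousands of complete genomes is also one plausible reason why the real process is resource intensive.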
“TRUE-PLANT” BioCoS is based on alignment-free algorithms (no whole-genome alignments) and reduces the need for de novo sequencing of samples. Although the algorithms behind the tool have already been optimized, several stages of the process remain computationally demanding, depending on the application. For example, for the olive genome, completing the first stage of the process (extraction of the non-tolerated sequences from the pan-genome; ~15,000 complete genomes) required 6 days using ~250 GB of RAM and 8 TB of disk storage. Applying the tool to several other species of high commercial interest in the food industry would therefore increase the computational needs; e.g., processing 100 organisms on the same computational set-up would require ~600 days.
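The ~600-day figure follows from running the organisms one after another at 6 days each. The short sketch below reproduces that arithmetic. The `parallel_jobs` parameter is a hypothetical addition, not something stated above: it assumes that independent runs could be executed side by side, each with its own ~250 GB of RAM and storage, and that every organism takes as long as the olive genome did.

```python
import math


def estimated_days(organisms: int,
                   days_per_organism: float = 6.0,
                   parallel_jobs: int = 1) -> float:
    """Rough wall-clock estimate for the first processing stage.

    Assumes every organism takes the same time as the olive genome
    (6 days) and that parallel jobs, if any, do not slow each other down.
    """
    if organisms < 0 or parallel_jobs < 1:
        raise ValueError("organisms must be >= 0 and parallel_jobs >= 1")
    # Organisms are processed in batches of `parallel_jobs` at a time.
    batches = math.ceil(organisms / parallel_jobs)
    return batches * days_per_organism


print(estimated_days(100))                    # 600.0 (sequential, as in the text)
print(estimated_days(100, parallel_jobs=10))  # 60.0 (hypothetical 10 concurrent runs)
```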
This VRE is private; only selected members are allowed to use it.