Journal of NBC Protection Corps

Modern Bioinformatics Solutions Used for Genetic Data Analysis

https://doi.org/10.35825/2587-5728-2023-7-4-366-383

Abstract

Effective counteraction to biological threats, both natural and man-made, requires means and methods for rapid and reliable identification of microorganisms and for comprehensive study of their basic biological properties. Over the past decade, domestic microbiologists have gained numerous methods for analyzing the genomes of pathogens, primarily based on nucleic acid sequencing. The purpose of this work is to inform the reader about the capabilities of the modern technical and methodological tools used for in-depth molecular genetic study of microorganisms, including bioinformatics solutions for genetic data analysis. The source base for this research is English-language scientific literature available via the Internet and bioinformatics software documentation. The research method is an analysis of scientific sources from the general to the specific. We considered the features of sequencing platforms, the main stages of genetic information analysis, and current bioinformatics utilities, along with their interaction and organization into a single workflow.

Results and discussion. The performance of modern genetic analyzers allows complete sequencing of a bacterial genome within one day, including the time required to prepare the sample. The key factor that largely determines the effectiveness of the genetic analysis methods used is competent use of the necessary bioinformatics utilities. The standard stages of primary genetic data analysis are quality control, data preprocessing, mapping to a reference genome or de novo genome assembly, genome annotation, typing and identification of significant genetic determinants (resistance to antibacterial drugs, pathogenicity factors, etc.), and phylogenetic analysis. For each stage, bioinformatics utilities have been developed that differ in the analysis algorithms they implement.

Conclusion. Open-source utilities that do not require access to remote resources for their operation are of greatest interest, given the specific nature of the activities of NBC Protection Corps units.

About the Authors

Ya. A. Kibirev
Branch Office of the Federal State Budgetary Establishment «48 Central Scientific Research Institute» of the Ministry of Defence
Russian Federation

Yaroslav A. Kibirev - Chief of the Department. Cand. Sci. (Biol.)

Oktyabrsky Avenue 119, Kirov 610000



A. V. Kuznetsovskiy
Branch Office of the Federal State Budgetary Establishment «48 Central Scientific Research Institute» of the Ministry of Defence
Russian Federation

Andrey V. Kuznetsovskiy - Deputy Chief of the Branch Office. Cand. Sci. (Biol.)

Oktyabrsky Avenue 119, Kirov 610000



S. G. Isupov
Branch Office of the Federal State Budgetary Establishment «48 Central Scientific Research Institute» of the Ministry of Defence
Russian Federation

Sergey G. Isupov - Deputy Chief of the Department. Cand. Sci. (Med.)

Oktyabrsky Avenue 119, Kirov 610000



I. V. Darmov
Branch Office of the Federal State Budgetary Establishment «48 Central Scientific Research Institute» of the Ministry of Defence
Russian Federation

Ilya V. Darmov - Leading Researcher. Dr. Sci. (Med.), Professor

Oktyabrsky Avenue 119, Kirov 610000



Review

For citations:


Kibirev Ya.A., Kuznetsovskiy A.V., Isupov S.G., Darmov I.V. Modern Bioinformatics Solutions Used for Genetic Data Analysis. Journal of NBC Protection Corps. 2023;7(4):366-383. (In Russ.) https://doi.org/10.35825/2587-5728-2023-7-4-366-383


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2587-5728 (Print)
ISSN 3034-2791 (Online)