Development of knowledge bases with augmented generation using basic models of artificial intelligence
DOI: 10.31673/2412-9070.2024.050247
https://doi.org/10.31673/2412-9070.2024.050247
Abstract
The global growth of technology, the falling cost of computing power, and the continuous increase in the volume of data that must be managed and processed call for a fundamental reconsideration and improvement of methods and approaches to data governance and management. At the same time, recent achievements in artificial intelligence (AI), machine learning (ML), and natural language processing (NLP) have significantly simplified data management and interaction, and make it possible to automate much of the routine processing of textual data. The article describes the theoretical aspects and principles of building knowledge bases using semantic search (or similarity search) and retrieval-augmented generation (RAG), based on foundation models of artificial intelligence. It examines the fundamental principles of the similarity search algorithm as a core functional component of knowledge bases, and its implementation using vector databases. The article also discusses a practical example of implementing such a knowledge base on the managed Amazon Bedrock service. The examples of organizing data in the form of knowledge bases can be used in practical projects to automate and enhance processes that operate on textual data. The proposed knowledge base implementation can serve as a foundation for developing intelligent chatbots that are familiar with the subject domain, for automating user support systems. Knowledge bases integrated with retrieval-augmented generation can also be used in scientific research for a variety of tasks, such as data analysis and classification, automated literature review, question answering, and data extraction from unstructured sources. Knowledge bases can also serve as collaborative research platforms.
Keywords: knowledge bases, retrieval-augmented generation, artificial intelligence, generative foundation models of artificial intelligence, natural language processing, similarity search, semantic search, vector databases
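As an illustration of the similarity search step that the abstract names as the core component of a knowledge base, the following is a minimal sketch and not the implementation discussed in the article. It assumes cosine similarity as the distance measure, uses hand-made three-dimensional vectors in place of embeddings produced by a foundation model, and replaces the vector database with a plain in-memory list. The names cosine_similarity, top_k, build_prompt, and INDEX are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, passages):
    """Augment the question with retrieved passages (the 'RAG' step)."""
    context = "\n".join(f"- {text}" for _, text in passages)
    return (
        "Answer using only the context below.\n"
        f"{context}\n"
        f"Question: {question}"
    )


# ASSUMPTION: toy vectors standing in for real embeddings; a real system
# would obtain them from an embedding model and store them in a vector database.
INDEX = [
    ("How to reset a password", [0.9, 0.1, 0.0]),
    ("Billing and invoices", [0.1, 0.9, 0.1]),
    ("Changing account credentials", [0.8, 0.2, 0.1]),
]

# ASSUMPTION: the query vector is also hand-made for illustration.
query = [1.0, 0.0, 0.0]
hits = top_k(query, INDEX, k=2)
prompt = build_prompt("How do I change my password?", hits)
print(prompt)
```

In this sketch the two password-related passages are retrieved and the billing passage is left out, so only relevant context reaches the generative model. A production knowledge base would typically replace the linear scan with an approximate nearest-neighbor index.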