Journal article

A theoretically proposed algorithm in a decision tree format for choosing an efficient storage type of large datasets

Year:

2022

Published in:

Technology Center PC

Keywords:

large datasets
non-relational database
column-oriented database
document-oriented database
key-value database
graph database

The object of research is methods and approaches for improving storage efficiency and optimizing access to large amounts of data. The study is important because big data is now widespread and choosing the right technologies helps improve the efficiency of big data processing systems. The choice is complicated by the large number of data storages and databases currently available, so a sound decision requires a deep understanding of the advantages, disadvantages and features of each. A further difficulty is the lack of a universal algorithm for deciding on the optimal repository. Accordingly, based on experiments and an analysis of existing projects and research papers, a decision-making algorithm was proposed that determines the best way to store large datasets, depending on their characteristics and additional system requirements. Its purpose is to simplify system design in the early stages of big data processing projects. By highlighting the key differences, as well as the advantages and disadvantages of each type of storage and database, a list of key characteristics of the data and the future system that should be considered during design was compiled. The algorithm is a theoretical proposal based on the studied research papers. Used at the design stage of a system, it would make it possible to determine the optimal type of storage for large datasets quickly and clearly. The paper considers column-oriented, document-oriented, graph and key-value types of databases, as well as distributed file systems and cloud services.
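
The abstract does not reproduce the decision tree itself, so the following is only a minimal illustrative sketch of what a decision-tree selector over the storage categories named above might look like. The `Workload` fields, the order of the questions, and the mapping from answers to categories are all assumptions drawn from commonly cited strengths of each storage type; they are not the criteria or branches proposed in the paper.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a dataset and its system requirements."""
    structured: bool          # records follow some schema, even a loose one
    highly_connected: bool    # relationships between records drive the queries
    lookup_by_key_only: bool  # access is almost always "get value for key"
    analytical_scans: bool    # queries aggregate a few columns over many rows
    prefer_managed: bool      # the team wants a hosted service


def choose_storage(w: Workload) -> str:
    """Walk an illustrative (assumed) decision tree and return a category."""
    if not w.structured:
        # Raw files such as logs or media are assumed to fit file-style storage.
        return "cloud service" if w.prefer_managed else "distributed file system"
    if w.highly_connected:
        return "graph database"
    if w.lookup_by_key_only:
        return "key-value database"
    if w.analytical_scans:
        return "column-oriented database"
    # Fallback assumption: semi-structured records with varied access patterns.
    return "document-oriented database"


if __name__ == "__main__":
    clickstream = Workload(
        structured=True,
        highly_connected=False,
        lookup_by_key_only=False,
        analytical_scans=True,
        prefer_managed=False,
    )
    print(choose_storage(clickstream))
```

Encoding each question as a boolean field keeps the tree easy to read and to reorder; a real selector would presumably weigh more characteristics (volume, consistency needs, write rate) than this sketch does.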

Related by author

2025
Journal article

The development of an electronic circuit simulation system using variable tabular bases

Publisher: Technology Center PC

Authors: Vadym Yaremenko, Bogdan Bulakh, Yaroslav Kornachevskyy, Oleksandr Beznosyk, Kostyantyn Kharchenko

2021
Journal article

A comparative analysis of text data classification accuracy and speed using neural networks, Bloom filter and naive Bayes

Publisher: Technology Center PC

Authors: Olena Hryshchenko, Vadym Yaremenko

2020
Journal article

Model of a multi-agent system for semantic analysis of texts

Publisher: Луцький національний технічний університет

Authors: Vadym Yaremenko, Andrii Khudiakov

2019
Journal article

Comparative analysis of software libraries for the classification of text data using artificial neural networks

Publisher: Таврійський національний університет ім. В.І. Вернадського

Authors: Vadym Yaremenko, Mykola Tarasenko

2020
Working paper

Development of a Multi‑Agent System for Solving Domain Dictionary Construction Problem

Publisher: SSRN

Authors: Vadym Yaremenko, Oleksandr Syrotiuk

2024
Journal article

Forecasting software development costs in scrum iterations using ordinary least squares method

Publisher: Technology Center PC

Authors: Vadym Yaremenko, Kostyantyn Kharchenko, Oleksandr Beznosyk, Bogdan Bulakh, Bogdan Kyriusha

2020
Journal article

Using artificial neural networks to detect the presence of cardiovascular and liver diseases with small datasets

Publisher: Луцький національний технічний університет

Authors: Vadym Yaremenko, Sofiia Materynska

2019
Journal article

An approach to using the Bloom filter for multi-class classification of text data in real time

Publisher: Technology Center PC

Authors: Vadym Yaremenko, Dmytro Budonnyi

2021
Working paper

Neural Networks and Monte‑Carlo Method Usage in Multi‑Agent Systems for Sudoku Problem Solving

Publisher: SSRN

Authors: Vadym Yaremenko, Kateryna Poloziuk

2020
Journal article

Mobile Driving License System Deployment Model With Security Enhancement

Publisher: Theoretical and cryptographic problems of cybersecurity

Authors: Vadym Yaremenko, V. Blynkov