A theoretically proposed algorithm in a decision tree format for choosing an efficient storage type of large datasets
Year: 2022
Published in: Technology Center PC
The object of research is methods and approaches for improving storage efficiency and optimizing access to large amounts of data. The study matters because big data is now widespread and the right choice of technologies improves the efficiency of big data processing systems. The choice is complicated by the large number of data storages and databases now available, so the best decision requires a deep understanding of the advantages, disadvantages and features of each. The difficulty lies in the lack of a universal algorithm for deciding on the optimal storage. Accordingly, based on experiments and an analysis of existing projects and research papers, a decision-making algorithm was proposed that determines the best way to store large datasets, depending on their characteristics and additional system requirements. This is necessary to simplify system design in the early stages of big data processing projects. By highlighting the key differences, as well as the disadvantages and advantages of each type of storage and database, a list of key characteristics of the data and the future system that should be considered during design was compiled. The algorithm is a theoretical proposal based on the studied research papers. Using it at the design stage of a system would make it possible to determine the optimal type of storage for large datasets quickly and clearly. The paper considers column-oriented, document-oriented, graph and key-value types of databases, as well as distributed file systems and cloud services.
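The abstract does not list the branches of the proposed decision tree, so the following is only a minimal sketch of what an algorithm "in a decision tree format" could look like. The six storage categories come from the abstract; every input characteristic (`managed_infrastructure`, `raw_files`, and so on) and the order of the checks are assumptions made for illustration, not the rules from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Storage(Enum):
    """Storage types named in the abstract."""
    COLUMN = "column-oriented database"
    DOCUMENT = "document-oriented database"
    GRAPH = "graph database"
    KEY_VALUE = "key-value database"
    DFS = "distributed file system"
    CLOUD = "cloud service"


@dataclass
class Requirements:
    # Hypothetical data and system characteristics (not taken from the paper).
    managed_infrastructure: bool = False  # team does not want to run its own cluster
    raw_files: bool = False               # data arrives as large unstructured files
    relationship_heavy: bool = False      # queries traverse links between entities
    lookup_by_key_only: bool = False      # access is almost always by a single key
    analytical_scans: bool = False        # queries aggregate a few columns over many rows


def choose_storage(req: Requirements) -> Storage:
    """Walk an illustrative decision tree from the root to a leaf."""
    if req.managed_infrastructure:
        return Storage.CLOUD
    if req.raw_files:
        return Storage.DFS
    if req.relationship_heavy:
        return Storage.GRAPH
    if req.lookup_by_key_only:
        return Storage.KEY_VALUE
    if req.analytical_scans:
        return Storage.COLUMN
    # Assumed default: semi-structured records without a dominant access pattern.
    return Storage.DOCUMENT


if __name__ == "__main__":
    print(choose_storage(Requirements(relationship_heavy=True)).value)
```

Encoding the tree as ordered checks keeps each design question explicit, which matches the stated goal of choosing a storage type early in the design stage. The actual questions and their order would have to be taken from the paper itself.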
Related by author
12 publications found
The development of an electronic circuit simulation system using variable tabular bases
Publisher: Technology Center PC
Authors: Vadym Yaremenko, Bogdan Bulakh, Yaroslav Kornachevskyy, Oleksandr Beznosyk, Kostyantyn Kharchenko
A comparative analysis of text data classification accuracy and speed using neural networks, Bloom filter and naive Bayes
Publisher: Technology Center PC
Authors: Olena Hryshchenko, Vadym Yaremenko
A MULTI-AGENT SYSTEM MODEL FOR SEMANTIC TEXT ANALYSIS
Publisher: Луцький національний технічний університет
Authors: Vadym Yaremenko, Andrii Khudiakov
COMPARATIVE ANALYSIS OF SOFTWARE LIBRARIES FOR THE CLASSIFICATION OF TEXT DATA USING ARTIFICIAL NEURAL NETWORKS
Publisher: Таврійський національний університет ім. В.І. Вернадського
Authors: Vadym Yaremenko, Mykola Tarasenko
Development of a Multi‑Agent System for Solving Domain Dictionary Construction Problem
Publisher: SSRN
Authors: Vadym Yaremenko, Oleksandr Syrotiuk
Forecasting software development costs in scrum iterations using ordinary least squares method
Publisher: Technology Center PC
Authors: Vadym Yaremenko, Kostyantyn Kharchenko, Oleksandr Beznosyk, Bogdan Bulakh, Bogdan Kyriusha
Using artificial neural networks to detect cardiovascular and liver diseases with small datasets.
Publisher: Луцький національний технічний університет
Authors: Vadym Yaremenko, Sofiia Materynska
An approach to using the Bloom filter for real-time multiclass classification of text data.
Publisher: Technology Center PC
Authors: Vadym Yaremenko, Dmytro Budonnyi
Neural Networks and Monte‑Carlo Method Usage in Multi‑Agent Systems for Sudoku Problem Solving
Publisher: SSRN
Authors: Vadym Yaremenko, Kateryna Poloziuk
Mobile Driving License System Deployment Model With Security Enhancement
Publisher: Theoretical and cryptographic problems of cybersecurity
Authors: Vadym Yaremenko, V. Blynkov