Abstract:
This paper presents test results for a distributed storage system with data splitting, trialed
on the internal network of an organization. One of the main parameters of the tested
system is the splitting level, which determines the parity dimension generated
during splitting. In a series of tests with gradually increasing splitting levels,
the optimal value for the organization's information network was established.
As the splitting level and the associated parity dimension increase, so does the tolerance to
the loss of file parts distributed among the nodes that store the information.
At the same time, the information processing time in the system also increases with the
splitting level. Testing has shown that the optimal splitting levels are
the second and third, which offer a reasonable compromise between
storage reliability and system latency. At these levels, full recovery of files is
guaranteed when up to 33% of nodes fail, and the probability of information loss was
0.07 when 44% of nodes failed. Distributed storage with data splitting is most relevant
for large data volumes that require multiple storage locations, where multiple
redundancy and replication (Hadoop FS, ZFS) are known to be virtually the only methods
of achieving reliability. The described storage method, which offers comparable reliability
with less redundancy, can be recommended for use in Big Data systems.
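
The trade-off between redundancy and loss tolerance can be illustrated with a minimal Python
sketch of a generic k-of-n splitting scheme. This is not the algorithm of the tested system:
the hypergeometric failure model, the function names loss_probability and storage_overhead,
and the example parameters (9 parts of which any 6 suffice, placed on 9 or 18 nodes) are
assumptions chosen only for illustration, and the sketch is not intended to reproduce the
0.07 figure reported above.

```python
from math import comb


def loss_probability(total_nodes, failed_nodes, parts, required):
    """Probability that a file cannot be rebuilt after node failures.

    Assumed model (hypothetical, not taken from the tested system):
    each of the `parts` pieces sits on a distinct node, any `required`
    pieces are enough to rebuild the file, and the `failed_nodes`
    failed nodes are drawn uniformly at random from `total_nodes`.
    """
    tolerated = parts - required  # pieces that may be lost safely
    unfavourable = 0
    # Count failure patterns that destroy more pieces than tolerated.
    for lost in range(tolerated + 1, min(parts, failed_nodes) + 1):
        unfavourable += comb(parts, lost) * comb(total_nodes - parts,
                                                 failed_nodes - lost)
    return unfavourable / comb(total_nodes, failed_nodes)


def storage_overhead(parts, required):
    """Stored volume divided by original volume for a k-of-n scheme."""
    return parts / required


if __name__ == "__main__":
    # One third of 9 nodes fails: still within the tolerated loss.
    print(loss_probability(9, 3, 9, 6))
    # The same file spread over a larger, 18-node network, 8 failures.
    print(loss_probability(18, 8, 9, 6))
    print(storage_overhead(9, 6))
```

Under these assumptions the 6-of-9 scheme stores 1.5 times the original volume, compared with
3 times for three-way replication, while still tolerating the loss of any three parts.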