Perfect Infrastructure for Big-Data and Machine Learning
Hardware
The CfADS Data-Analytic Cluster (DA-Cluster) consists of 18 physical server nodes connected internally over an InfiniBand network with a bandwidth of 56 Gbit/s. Combined with the Hadoop framework, the DA-Cluster is designed to process large amounts of data efficiently and to parallelize computations across its nodes. In addition, several state-of-the-art AI tools are available, so the differing approaches and requirements of a professional data science workflow can be covered. As a result, the DA-Cluster provides a highly adaptable environment for different types of data science projects.
The DA-Cluster's architecture was designed to achieve a very high level of data security and integrity. For data protection, four security tiers are set up, spanning from the user interface to the backup layer. Data transfer into the cluster is fully encrypted, and access permissions to stored data are kept as restrictive as possible. Consequently, the DA-Cluster supports the responsible handling of highly sensitive data. Moreover, the distributed, redundant file system (Hadoop HDFS) and the attached backup layer (NAS) protect against data loss.