BDL is the industry's only truly secure, enterprise-ready Big Data distribution aimed at understanding unstructured data. BDL addresses the complete needs of data-at-rest and dark data, powers a new breed of customer applications, and delivers unprecedented insights to accelerate innovation.
Hadoop Distributed File System (HDFS) is the core component of the BigConnect Data Lake (BDL) for data-at-rest and dark data. HDFS provides scalable, fault-tolerant and cost-efficient storage for your Big Data Lake. Operational data sits on top of HDFS in Druid, Accumulo, or BigConnect Graph Engine.
BDL includes a versatile range of processing engines that let you interact with the same data in multiple ways, at the same time. This means applications for big data analytics can interact with the data in the way that suits them best: from federated, interactive SQL to low-latency access with NoSQL or linked data through Cypher.
Best-of-breed tools for big data analytics are available to derive instant knowledge and insights. Work with BigConnect Explorer to search, aggregate and see how data is linked, or with BigConnect Discovery to create stunning dashboards and reports.
Emerging use cases for data science are enabled with Apache Spark, TensorFlow, H2O, Jupyter Notebooks and Prodigy, all in a seamless, integrated way. Train, test and productionize your state-of-the-art model with StreamSets data pipelines or data preparation workflows in BigConnect Discovery.
BDL brings data access and management to a new level with powerful tools for data governance and integration. They provide a reliable, repeatable, and complete framework for managing the flow of data in and out of your Data Lake. This controlled structure, along with a set of tooling to ease and automate the application of schema or metadata on sources, is critical for successful integration of a Data Lake into your modern data architecture.
More info: BigConnect Data Collector
Security is woven and integrated into BDL in multiple layers. Critical features for authentication, authorization, accountability and data protection are in place to help secure BDL across these key requirements.
Operations teams deploy, monitor and manage a BDL cluster within their broader enterprise data ecosystem. Apache Ambari simplifies this experience. Ambari is an open-source management platform for provisioning, managing, monitoring, and securing the BigConnect Data Lake, enabling it to fit seamlessly into your enterprise environment.
More and more enterprise architectures are shifting to hybrid and multi-cloud environments. While this shift allows for more flexibility and agility, it also means having to separate compute from storage, creating new challenges in how data needs to be managed and orchestrated across frameworks, clouds and storage systems. BDL provides an in-memory, HDFS-compatible Virtual File System to work with on-premises and cloud data in a unified way, enabling hybrid data.
More info: Alluxio
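To make the idea of a unified namespace concrete, here is a minimal sketch of how a virtual file system can map one set of paths onto several storage backends. It is an illustration only, not BDL or Alluxio code: the mount table, the host name, the bucket name and the `resolve` and `backend` helpers are all assumptions made up for this example.

```python
from urllib.parse import urlparse

# Hypothetical mount table: virtual path prefix -> underlying storage URI.
# Hosts, buckets and paths are invented for illustration.
MOUNTS = {
    "/onprem": "hdfs://namenode.example.com:8020/data",
    "/cloud": "s3a://example-bucket/landing",
}


def resolve(virtual_path: str) -> str:
    """Map a path in the unified namespace to its backing storage URI."""
    # Try longer prefixes first so nested mount points resolve correctly.
    for prefix in sorted(MOUNTS, key=len, reverse=True):
        if virtual_path == prefix or virtual_path.startswith(prefix + "/"):
            return MOUNTS[prefix] + virtual_path[len(prefix):]
    raise KeyError(f"no mount point covers {virtual_path!r}")


def backend(virtual_path: str) -> str:
    """Return the URI scheme of the storage system behind a virtual path."""
    return urlparse(resolve(virtual_path)).scheme


if __name__ == "__main__":
    for path in ("/onprem/logs/2020.csv", "/cloud/events/a.parquet"):
        print(path, "->", resolve(path))
```

An application written against the virtual paths does not need to know whether a file lives in on-premises HDFS or in cloud object storage; only the mount table changes when data moves.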