The first open-source Big Data Application platform
Would you spend thousands of man-days to build a Big Data Platform?
We wouldn't, and this is why we built BigConnect.
A Big Data fusion platform to understand
any amount of data, from any source, in any format
Ingest any kind of information
Databases, Documents (PDF, Office files, text documents etc.), Images, Audio, Video, and Web sites (using Sponge)
Get data in using Drag & Drop, Flink, Spark, ETL tools (NiFi, Oracle, IBM, Microsoft, Pentaho) or through the API
Extract and apply knowledge
Extract relevant knowledge like text, image tags, video frames, audio text and more.
With embedded text, image, audio and video processing pipelines and auto-trainable DNN models, the possibilities are limited only by your imagination.
Search through the data lake
Combine Boolean operators with full-text, spatial, range, fuzzy, and wildcard queries. An intuitive search interface lets you drill down, filter and sort using the Query Builder.
Or, use Cypher to build specialized queries.
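As a rough illustration of the two query styles mentioned above, the sketch below builds a Boolean search that mixes full-text, wildcard, fuzzy, range and spatial clauses, and defines a small Cypher query. It is written in Elasticsearch Query DSL because the platform is built on Elasticsearch; BigConnect's own search API may look different. All field names, labels and relationship types (`body`, `author`, `created`, `location`, `Person`, `Document`, `MENTIONED_IN`) are hypothetical.

```python
def build_search(text, author_prefix, since, lat, lon, radius_km):
    """Return an Elasticsearch-style bool query (hypothetical field names)."""
    return {
        "bool": {
            # Clauses that must all match (Boolean AND).
            "must": [
                {"match": {"body": text}},                              # full-text
                {"wildcard": {"author": {"value": author_prefix + "*"}}},  # wildcard
            ],
            # Optional clause that only boosts the score (Boolean OR-like).
            "should": [
                {"fuzzy": {"body": {"value": text, "fuzziness": "AUTO"}}},  # fuzzy
            ],
            # Non-scoring restrictions.
            "filter": [
                {"range": {"created": {"gte": since}}},                 # range
                {"geo_distance": {                                      # spatial
                    "distance": f"{radius_km}km",
                    "location": {"lat": lat, "lon": lon},
                }},
            ],
        }
    }


# A Cypher query for a more specialized, graph-shaped question:
# which people are mentioned in the most documents since a given date?
# Labels, relationship type and properties are assumptions for illustration.
CYPHER_TOP_MENTIONS = """
MATCH (p:Person)-[:MENTIONED_IN]->(d:Document)
WHERE d.created >= $since
RETURN p.name AS person, count(d) AS mentions
ORDER BY mentions DESC
LIMIT 10
"""

query = build_search("fraud", "jo", "2020-01-01", 44.43, 26.10, 10)
```

The dictionary form maps closely onto what a query builder produces, while the Cypher text suits questions about how objects are linked.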
See the data
Build visualizations and aggregations on top of your searches, perform link analysis or spatial analysis, or define advanced behaviours to identify relevant knowledge.
Use Hadoop, Spark or ML to do your magic.
Connect your systems
Whether it is BigConnect talking to other systems or you talking to BigConnect, we have it covered:
- JDBC driver to run SQL queries against BigConnect data
- Java, Node and PHP clients to connect to BigConnect remotely
- Spark, Apache NiFi, Pentaho Data Integration and more to follow
Train, Learn, Evaluate
BigConnect has text, image, audio and video processing pipelines that use auto-trainable Deep Convolutional Neural Network models to extract relevant information from unstructured data:
- Named Entity Extraction
- Real-time Image/Video Object Detection
- Face Detection, Face Recognition
The visual console is collaborative, and all actions are taken inside a space.
All data that is added, modified, queried or analyzed can be contained to that space alone, can be shared with specific users or made available to everybody.
Customize, Extend, Adapt
BigConnect is designed to be highly extensible and customizable at all levels. The platform can be extended using back-end and front-end plugins. The storage and search layers are also pluggable.
See the online guides for details on the available extension points, and have a look at some sample plugins.
At the data level, access can be controlled down to an object's individual attributes, and only users with specific authorizations can access the data.
In the visual console, security is enforced through roles and privileges that control what users can do at the global, space and data levels.
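To make the attribute-level access idea concrete, here is a minimal conceptual sketch. It assumes a simplified model in which each attribute carries a set of required authorization labels and a user sees the attribute only when holding all of them. This is not BigConnect's authorization API; the class, function and label names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    value: object
    # Labels a reader must hold (all of them) to see this attribute.
    required_auths: frozenset = frozenset()


def visible_attributes(attributes, user_auths):
    """Return only the attributes the given authorizations allow."""
    user_auths = frozenset(user_auths)
    return {
        a.name: a.value
        for a in attributes
        if a.required_auths <= user_auths  # subset test: every label is held
    }


# One object whose attributes have different sensitivity levels.
person = [
    Attribute("name", "Jane Doe"),
    Attribute("phone", "555-0100", frozenset({"pii"})),
    Attribute("case_notes", "under review", frozenset({"pii", "investigator"})),
]

analyst_view = visible_attributes(person, {"pii"})
public_view = visible_attributes(person, set())
```

Two users querying the same object therefore get different results: here `analyst_view` contains the name and phone number, while `public_view` contains only the name.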
Built on top of Hadoop, Accumulo and Elasticsearch, the platform can scale massively to thousands of nodes.
Benchmark tests show that a cluster of 1,000 nodes can ingest up to 100 million records per second, and that a search across 1 TB of records completes in near real time.
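For a sense of scale, the benchmark figures above can be broken down with some quick arithmetic. This sketch assumes the load is spread evenly across nodes and that the peak rate is sustained, neither of which the benchmark statement guarantees. The one-billion-record workload is a made-up example.

```python
CLUSTER_NODES = 1_000
CLUSTER_RECORDS_PER_SECOND = 100_000_000  # quoted peak ("up to") rate


def per_node_rate(total_rate, nodes):
    """Average records per second handled by each node."""
    return total_rate / nodes


def seconds_to_ingest(records, total_rate):
    """Time to ingest a batch at a sustained cluster-wide rate."""
    return records / total_rate


node_rate = per_node_rate(CLUSTER_RECORDS_PER_SECOND, CLUSTER_NODES)
batch_seconds = seconds_to_ingest(1_000_000_000, CLUSTER_RECORDS_PER_SECOND)

print(node_rate)      # 100000.0 records per second per node
print(batch_seconds)  # 10.0 seconds for one billion records
```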