Pentaho and NoSQL Databases
Access, Integrate, Visualize, Explore and Predict
Pentaho Business Analytics provides easy-to-use visual development tools and big data analytics that let users prepare, model, visualize and explore data sets stored in NoSQL databases such as MongoDB, Cassandra and HBase. Pentaho simplifies the end-to-end NoSQL data life cycle by providing a complete platform from data preparation to predictive analytics.
Visual development for NoSQL data preparation and modeling
Pentaho’s visual development tools can make designing, developing and deploying NoSQL analytics solutions as much as 15x faster than traditional custom coding and ETL approaches.
Pentaho provides a powerful visual user interface for ingesting and manipulating data within NoSQL databases, as well as making it easy to enrich NoSQL data by integrating with reference data from other sources. Pentaho makes it easy to access NoSQL data, either directly, or through rapid visual extraction into data marts/warehouses optimized for expressive aggregate queries. A visual tool for defining business metadata models helps developers prepare their data for analytics.
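To make the enrichment idea concrete, here is a minimal sketch in plain Python of what joining NoSQL documents with reference data amounts to. It is not Pentaho code: the `enrich` function, the order documents and the product lookup are all hypothetical names invented for illustration.

```python
from typing import Iterable


def enrich(documents: Iterable[dict], reference: dict, key: str) -> list[dict]:
    """Merge reference attributes into each document that has a matching key."""
    enriched = []
    for doc in documents:
        # Documents without a match (or without the key) pass through unchanged.
        extra = reference.get(doc.get(key), {})
        # Document fields win on conflict, so the source record is never overwritten.
        enriched.append({**extra, **doc})
    return enriched


# Hypothetical order documents, shaped like records read from a document store.
orders = [
    {"order_id": 1, "product_id": "p1", "quantity": 2},
    {"order_id": 2, "product_id": "p9", "quantity": 1},
]
# Hypothetical reference data, for example loaded from a relational product table.
products = {"p1": {"product_name": "Widget", "category": "Hardware"}}

enriched_orders = enrich(orders, products, "product_id")
```

In this sketch the first order gains `product_name` and `category`, while the second has no matching product and is kept as it was.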
With a simple, point-and-click alternative to writing custom code, Pentaho exposes a familiar ETL-style user interface. NoSQL databases easily become usable by IT and data scientists, not just developers with specialized coding skills.
Would you rather do this ... or this?
Directly connect for immediate analysis
Pentaho can report directly against NoSQL databases, eliminating any need to extract data and load it into conventional relational databases or data marts.
The real-time, high-performance query characteristics of NoSQL databases make direct connection an ideal way to report quickly and easily against data in NoSQL systems.
Visual interface and drag & drop orchestration
Pentaho provides a powerful library of graphical job steps for orchestrating execution of jobs for NoSQL databases and other large data warehouses. These include conditional checking steps, event waiting steps, execution steps and notification steps.
Together these steps enable easy visual assembly of powerful job flow logic, across multiple jobs and data sources.
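The four kinds of job steps named above can be pictured as a small control flow. The following is a hedged sketch in plain Python, not Pentaho's job engine or API; `wait_for`, `run_job` and the callbacks passed to them are hypothetical names chosen for this example.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], timeout_s: float, poll_s: float = 0.01) -> bool:
    """Event waiting step: poll until the condition holds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_s)
    return condition()


def run_job(source_ready, has_new_data, execute, notify, timeout_s: float = 1.0) -> str:
    """Chain waiting, conditional checking, execution and notification steps."""
    if not wait_for(source_ready, timeout_s):   # event waiting step
        notify("source not ready")
        return "timed_out"
    if not has_new_data():                      # conditional checking step
        notify("nothing to do")
        return "skipped"
    try:
        execute()                               # execution step
    except Exception as exc:
        notify(f"failed: {exc}")                # notification step on failure
        return "failed"
    notify("succeeded")                         # notification step on success
    return "succeeded"


messages = []
loaded = []
status = run_job(
    source_ready=lambda: True,
    has_new_data=lambda: True,
    execute=lambda: loaded.append("batch-1"),
    notify=messages.append,
)
```

A visual job designer expresses the same branching as connected boxes rather than nested `if` statements.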
An end-to-end analytical platform, Pentaho Business Analytics provides visual development tools for IT developers and analysts to immediately integrate and orchestrate NoSQL data with relational data warehouses and marts, enterprise applications and data stored in cloud applications.
Pentaho also provides complete business analytics for NoSQL databases, including direct-connect reporting, visualization, dashboards, interactive analysis and advanced statistical and predictive analytics.
Complete big data analytics
Either through direct-connect interactive reporting and visualization, or by simplifying the process of extracting data from a NoSQL database into a relational database for interactive data exploration, Pentaho provides the ability to immediately deploy powerful analytics for data in NoSQL databases.
The tightly coupled data integration and business analytics platform enables IT and business users to easily explore data in NoSQL databases through:
- Rich visualization – Interactive web-based interfaces for ad hoc reporting, charting and dashboards.
- Flexible exploration – Views of data across dimensions such as time, product and geography and across measures such as revenue and quantity.
- Predictive analysis – Powerful predictive analytics capabilities using advanced statistical algorithms such as classification, regression, clustering and association rules.
Instant and interactive NoSQL analytics for data analysts
Pentaho Instaview takes data analysts from data to visualization in minutes with interactive self-service access and analytics for Hadoop. Preparation of Hadoop data for analysis is greatly simplified and automated, enabling users to accelerate the big data analytics cycle from days and weeks to minutes and hours.
Learn more: Pentaho Instaview
Extending NoSQL query languages
Many NoSQL database access methods lack analytic query operations such as grouping and sorting of data. Pentaho effectively extends the capabilities of these NoSQL databases by post-processing query results. A library of operations is available and can be applied to prepare data for analytics.
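As an illustration of post-processing, the sketch below groups and sorts rows after they have been fetched, which is the kind of work a client has to do when the store itself offers no grouping or sorting. It is plain Python under assumed data, not a Pentaho operation; `group_and_sum` and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_and_sum(rows, group_key, measure):
    """Group rows by one field, sum a measure, and sort by the total (largest first)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[measure]
    # Sort descending by total; ties are broken alphabetically by group key.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical rows, shaped like the raw results of a key-based NoSQL read.
rows = [
    {"region": "EMEA", "revenue": 120.0},
    {"region": "APAC", "revenue": 80.0},
    {"region": "EMEA", "revenue": 30.0},
    {"region": "AMER", "revenue": 200.0},
]
revenue_by_region = group_and_sum(rows, "region", "revenue")
# revenue_by_region == [("AMER", 200.0), ("EMEA", 150.0), ("APAC", 80.0)]
```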
Scalability for even the most complex organizations
Pentaho's Java-based engine is multi-threaded. Each step in a job executes on its own thread, taking advantage of the multi-core processors on each node of the cluster. As a result, Pentaho's ETL jobs for NoSQL databases often execute many times faster than equivalent hand-coded jobs.
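The thread-per-step model can be sketched as a pipeline in which each step runs on its own thread and hands rows to the next step through a queue. The code below is an illustrative Python sketch of that structure only, not Pentaho's engine; `run_step`, `run_pipeline` and the two sample transforms are hypothetical. Note that CPython threads do not run CPU-bound work in parallel the way Java threads can, so the sketch shows the shape of the design, not its performance.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the row stream


def run_step(transform, inbox, outbox):
    """Run one step: read rows, transform them, pass results downstream."""
    while True:
        row = inbox.get()
        if row is _DONE:
            outbox.put(_DONE)  # propagate end-of-stream to the next step
            return
        result = transform(row)
        if result is not None:  # None means the row was filtered out
            outbox.put(result)


def run_pipeline(rows, transforms):
    """Connect the steps with queues and run each step on its own thread."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=run_step, args=(t, queues[i], queues[i + 1]))
        for i, t in enumerate(transforms)
    ]
    for thread in threads:
        thread.start()
    for row in rows:
        queues[0].put(row)
    queues[0].put(_DONE)
    output = []
    # Drain the last queue until the sentinel arrives.
    while (row := queues[-1].get()) is not _DONE:
        output.append(row)
    for thread in threads:
        thread.join()
    return output


result = run_pipeline(
    range(6),
    [lambda x: x * 10, lambda x: x if x >= 20 else None],
)
```

Because every step starts consuming rows as soon as the previous step emits them, a slow step does not block the steps upstream from reading ahead.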
Unparalleled support for NoSQL databases