
WSO2Con USA 2015 – Day 1 – WSO2 Data Analytics Server tutorial

Thijs Volders
Strategic Technology Officer

San Francisco, November 2, 2015 – Yenlo is a Silver Sponsor of WSO2Con USA 2015, which will be held from November 2 to 4 at the Park Central Hotel, San Francisco. WSO2Con USA 2015 presents opportunities to interact with experts, find innovative solutions, and be inspired by keynotes from Google, Forrester Research and West Interactive and by customer use cases on successful WSO2 product deployments. This blog gives a summary of the WSO2 Data Analytics Server (DAS) tutorial on Day 1, written by Thijs Volders from Yenlo.

Today, November 2, 2015, WSO2Con USA starts off with the tutorial day, during which WSO2 technical engineers give several tutorials. I am starting my day with “WSO2 Analytics Platform: One stop shop for all Your Data Needs”, a tutorial about the WSO2 Data Analytics Server.
The Data Analytics Server (DAS for short) is a new product in the WSO2 product range. It is an evolution of the Business Activity Monitor (BAM). The main change in the DAS is that many underlying libraries have been replaced. For instance, the Apache Hive batch analytics and Hadoop Distributed File System libraries have been replaced by Apache Spark, which performs much better than Hadoop.
Previously WSO2 had two products in the analytics space: the Business Activity Monitor (BAM) and the Complex Event Processor (CEP). These products had a large amount of overlap and were based on the same core components, so it was a natural evolution to merge them into a single, more powerful product, now called the Data Analytics Server.
The DAS is capable of both batch and realtime analytics. Batch processing, previously done with Apache Hive and Hadoop, is now done with Apache Spark. Realtime analytics is still done using the Siddhi engine, as was already the case in the WSO2 Complex Event Processor.
The DAS has several additional functions, among which is an Analytics REST API. This API provides access to the analytics tables used inside the persistent event store, which allows you to use other tools to do analytics on the raw data.
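To give an idea of what calling such an API from another tool could look like, here is a minimal Python sketch that only builds (and does not send) an authenticated request. The host, port, credentials and the `/analytics/tables` path are assumptions for illustration and were not shown in the tutorial; check the DAS documentation for the actual endpoints.

```python
import base64
import urllib.request

# Assumptions for illustration only: host, port and base path are placeholders.
BASE_URL = "https://localhost:9443/analytics"


def build_request(path, username, password):
    """Build (but do not send) a GET request with HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(f"{BASE_URL}/{path.lstrip('/')}", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


# Hypothetical call: ask for the list of analytics tables.
list_tables = build_request("tables", "admin", "admin")
print(list_tables.get_method(), list_tables.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it. Against a server with a self-signed certificate you would also need to configure an SSL context.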
These analytics tables were previously (in BAM and CEP) stored in Cassandra. Together with Hive, Hadoop and Zookeeper you’d then have a complete analytics platform. The downside of that solution was that a proper production setup needed several servers to manage all components of the platform, resulting in a minimum of around 12 instances. With Apache Spark the number of instances goes down, as Spark needs fewer servers in a production setup. It’s much simpler in form and function compared to Apache Hive.
A long-awaited addition to the DAS is RDBMS support for the analytics data store. Many customers know their RDBMS very well and find the introduction of a NoSQL database rather problematic. In the DAS we can now choose to use a ‘regular’ RDBMS instead of Cassandra, so you can basically use any JDBC-compatible datastore as database engine. Adding RDBMS support as analytics store also created the need for an abstraction layer for accessing the analytics tables in the underlying database. Thus you don’t need to write database-specific statements to access the analytics tables of the DAS; you can use a database-agnostic language to access the analytics data.
Interactive analytics has been added to the DAS. Apache Lucene (indeed, WSO2 uses lots of components from Apache…) has been added as an indexing library to offer full-text search capabilities. By using the Apache Lucene query language you can search through its index. With the Interactive Analytics Console you can execute separate analytics statements and see the results of these statements immediately. You can use this to define and test your analytics script before deploying it into the DAS. The console allows you to input SparkSQL, just as you would in the analytics script, and see the results of the SparkSQL execution.
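For readers who have not seen Lucene queries before, the sketch below builds a simple `field:"value"` query joined with `AND`, which is standard Lucene query syntax. The helper functions and the field names (`service`, `status`) are hypothetical and only serve as illustration; they are not part of the DAS.

```python
def lucene_term(field, value):
    """Return field:"value", escaping backslashes and double quotes in the value."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def lucene_and(**terms):
    """Join field/value pairs into one query in which all terms must match."""
    return " AND ".join(lucene_term(field, value) for field, value in terms.items())


# Hypothetical fields of an indexed analytics table.
query = lucene_and(service="OrderService", status="FAILED")
print(query)  # service:"OrderService" AND status:"FAILED"
```

Quoting every value as a phrase keeps the escaping simple, at the cost of not supporting wildcards or ranges in this sketch.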
The DAS admin console offers a way of defining an event stream through the UI, which makes it easy to create a new event stream definition. There is also a new screen through which you can simulate an event submission. This helps to easily test the stream definition, event processing, etc.
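To make the idea of a stream definition and a simulated event more concrete, here is a small Python sketch that checks a test event against a definition. The definition is loosely modeled on a JSON stream definition with a name, a version and typed payload attributes; the stream name, the attributes and the exact schema are assumptions for illustration, not the format shown in the tutorial.

```python
# Hypothetical stream definition; name, version and attributes are illustrative,
# and the exact schema the DAS expects is an assumption.
STREAM_DEFINITION = {
    "name": "org.example.orders",
    "version": "1.0.0",
    "payloadData": [
        {"name": "orderId", "type": "STRING"},
        {"name": "amount", "type": "DOUBLE"},
        {"name": "quantity", "type": "INT"},
    ],
}

# Mapping from attribute type names to the Python types accepted in this sketch.
PYTHON_TYPES = {"STRING": str, "DOUBLE": float, "INT": int, "BOOL": bool}


def validate_event(definition, payload):
    """Return a list of problems found when checking payload values against the definition."""
    attributes = definition["payloadData"]
    if len(payload) != len(attributes):
        return [f"expected {len(attributes)} values, got {len(payload)}"]
    problems = []
    for attribute, value in zip(attributes, payload):
        expected = PYTHON_TYPES[attribute["type"]]
        # Strict check: an int is not accepted for DOUBLE, nor a bool for INT.
        if type(value) is not expected:
            problems.append(
                f"{attribute['name']}: expected {attribute['type']}, got {type(value).__name__}"
            )
    return problems


print(validate_event(STREAM_DEFINITION, ["A-1", 19.99, 2]))
print(validate_event(STREAM_DEFINITION, ["A-1", "19.99", 2]))
```

The first event matches the definition and yields no problems; the second passes the amount as a string and is reported. The simulation screen serves the same purpose inside the product itself: you find out early whether your events fit the stream.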
In the DAS we no longer deploy Toolboxes as tbox files. They are now contained within CAR files, just like many other artifacts that are deployed on WSO2 products. This makes the overall deployment model simpler, as WSO2 has consolidated yet another file format into the CAR deployment model.
Creating dashboards in the DAS is also really easy. Wizards are available to create dashboards and to fill them with gadgets. Creating a gadget is also really simple, as it’s backed by a wizard as well.
When creating a dashboard you can select a layout such as a grid layout, a single-column or multi-column layout, and various other types. You can create a new gadget that displays data as a bar graph, pie chart or line diagram in a couple of simple steps: select the table or event stream you want the data to come from, then define the chart type and the attributes to use from that table or event stream. Once done, the gadget is added to the gadget store, from which you can add it to your dashboard.
My overall impression of the DAS is that it has made some great improvements over the CEP and BAM products. I’m looking forward to using it at customers!
