Big data use cases
We are trying to create a dashboard using big data tools. The transactional data currently lives in SQL Server, and the front end is built in MVC. Because the data volume is too high to analyse in SQL Server itself, we decided to move the analysis to a big data stack. I chose Cloudera Manager with CDH, Sqoop to import data from SQL Server into Hive, and Impala to run the analytics. The plan is to hook the results up to MicroStrategy to deliver charts to clients on mobile. Any ideas or suggestions to improve the process are welcome.
Looks like you're off to a great start. Remember your analytics can be done with a multitude of tools, not just Impala.
Once your data is in Hadoop, Hive and Pig give you a lot of power (even more with UDFs) and have a gentle learning curve.
If you eventually want to tackle iterative use cases or machine learning, check out Spark: both are in its wheelhouse, and it is not constrained by the MapReduce execution model.
Tons of great tools available. Enjoy the journey.
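I would consider splitting this into two stages: data analysis and data visualisation. Using two stages makes the solution more flexible and decouples the responsibilities. The first three steps below belong to the analysis stage, the last two to the visualisation stage.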
- Ingest the data (including cleaning). Sqoop can handle the ingest step, but cleaning the data might require extra steps.
- Explore/analyse the data. Apache Spark is a very flexible and powerful tool for this.
- Store the analysis results in a specified format.
- Load the data produced by the data analysis phase.
- Visualise it using Highcharts, Kibana, or Dashing, or use D3 to create a customised dashboard.
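
To make the decoupling concrete, here is a minimal sketch of the two stages in plain Python. It is only an illustration under assumptions: the `SALES` rows, the function names, and the choice of JSON as the "specified format" are all hypothetical, and the in-memory aggregation stands in for whatever Spark, Hive, or Impala would do on the real data. The point is that the visualisation stage only needs to know the stored format, not how the results were computed.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical cleaned rows, standing in for data ingested from SQL Server.
SALES = [
    {"region": "north", "amount": 120.0},
    {"region": "south", "amount": 80.0},
    {"region": "north", "amount": 30.0},
]


def analyse(rows):
    """Analysis stage: total amount per region, sorted by region name."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return [{"region": r, "total": t} for r, t in sorted(totals.items())]


def store(results, path):
    """Analysis stage output: write results in the agreed format (JSON here)."""
    Path(path).write_text(json.dumps(results))


def load(path):
    """Visualisation stage input: read results back, knowing only the format."""
    return json.loads(Path(path).read_text())


def to_chart_series(results):
    """Visualisation stage: reshape into categories plus one data series,
    a shape that most charting libraries can consume with little glue."""
    return {
        "categories": [r["region"] for r in results],
        "data": [r["total"] for r in results],
    }


def run_pipeline(rows, path):
    """Run both stages, with the stored file as the only hand-off between them."""
    store(analyse(rows), path)
    return to_chart_series(load(path))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_pipeline(SALES, Path(tmp) / "results.json"))
```

Because the stored file is the only contract between the stages, you could later swap the analysis engine or the charting front end without touching the other side.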