Friday, June 22, 2012

Jumbune - MapReduce Execution Flow Profiler



Welcome to Jumbune [juhm-b-yoon], the industry's first MapReduce Flow Profiler. It helps you to:

  • Analyze cluster-wide Hadoop MapReduce job flow execution
  • Profile MapReduce jobs
  • Monitor Hadoop clusters
  • Validate HDFS data

Jumbune - High-Level Overview

Where Jumbune Helps:
Hadoop has become the vocabulary of Big Data. Every enterprise that wants to
analyze its terabytes or petabytes of data is either actively evaluating Hadoop
or has already started using Hadoop and its ecosystem of technologies. The
scale of this processing invariably runs into clusters of tens or hundreds of
machines.

The parallelism of the computation brings its own set of programming
challenges, including identifying faults in the MapReduce logic, discrepancies
in the working data or the analytics logic, and unhealthy nodes.

As a MapReduce developer, it is important for you to profile your code,
validate the input/output data sets, and understand how your MapReduce jobs
execute across the Hadoop cluster for debugging purposes, all of which can be
extremely painful to do by hand.

Jumbune Usage Overview:

Jumbune supports both UI-based and shell-based execution. A Jumbune run is
triggered by the user submitting a workflow, expressed as a simple YAML
configuration; the configuration consists of instructions for the individual
Jumbune components. Jumbune then presents intuitive, self-explanatory reports
for the submitted workflow. An illustrative sketch of such a configuration
appears below.
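To make the workflow idea concrete, here is a minimal sketch of what such a YAML configuration might look like. The key names and values below are purely illustrative assumptions, not the actual Jumbune configuration schema; please refer to the Jumbune datasheet and documentation for the real format.

    # Hypothetical Jumbune workflow configuration.
    # All key names below are assumptions for illustration only,
    # not the official Jumbune schema.
    jobName: wordcount-analysis
    master:
      host: namenode.example.com      # assumed cluster master node
      user: hadoop
    jobJar: /jobs/wordcount.jar       # MapReduce job jar to analyze
    modules:
      debugger: enabled               # trace the MapReduce flow execution
      profiler: enabled               # profile map and reduce phases
      monitoring: enabled             # monitor Hadoop cluster health
      dataValidation:
        hdfsInputPath: /data/input    # HDFS data set to validate
        nullCheck: true               # flag records with null fields

In this sketch, each module key corresponds to one of the Jumbune capabilities listed above, so a single submitted workflow can exercise debugging, profiling, monitoring, and data validation together.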

We presented Jumbune at Hadoop Summit 2012 in San Jose, CA, and received good,
constructive feedback from many technologists there. The Jumbune datasheet can
be found here. Please feel free to contact Impetus for demonstrations,
evaluation copies, or any queries around it.