Hadoop ToolRunner and Tool

When your create a MapReduce Job, you can implement the Tool interface’s method

int run(String[] args) throws Exception;

This is the only method in Tool interface. This would allow the ToolRunner to invoke our implemented run(String[]) method of our MapReduce job. In the main method, you call ToolRunner.run(new Configuration(), new MapReduceJob(), args) as follows:

public static void main(String[] args) {

int status = ToolRunner.run(new Configuration(), new MapReduceJob(), args);

}

If you take a look at the ToolRunner class, you will find the method, since our MapReduceJob is an implementation of Tool interface, ToolRunner will in turn invokes our implementation of run(String[] args)

public static int run(Configuration conf, Tool tool, String[] args)

throws Exception{

if(conf == null) {

conf = new Configuration();

}

GenericOptionsParser parser = new GenericOptionsParser(conf, args);

//set the configuration back, so that Tool can configure itself

tool.setConf(conf);

//get the args w/o generic hadoop args

String[] toolArgs = parser.getRemainingArgs();

return tool.run(toolArgs);

}

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s