Hello. I'm having a hard time working with MapReduce. Whenever I run the application I get no result: the map function apparently runs, but the reduce phase stays at 0%.
When I check the files generated on the server where Hadoop is installed, the input is fine, in .txt as defined, but reduce generates a folder with the name I gave the output file (filename.txt). Inside that folder there is a log folder and the file "part-00000", which is empty.
What should I do to get the result of the operation?
Here is the driver that connects the application to Hadoop:
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public static void main(String args[]) throws IOException {
    System.out.println("Olá");
    ControleBD conBD = new ControleBD();
    ControleArq conAR = new ControleArq();
    conAR.gravar(conBD.pesquisa());
    JobConf conf = new JobConf(Principal.class); // the class the job takes as its main class
    conf.setJobName("TestePrincipal"); // name of the job that will run on the virtual machine
    FileInputFormat.addInputPath(conf, new Path("/user/hadoop-user/input/DadosBancarios.txt")); // input file
    FileOutputFormat.setOutputPath(conf, new Path("/user/hadoop-user/output/saidaDadosBancarios.txt")); // output path
    conf.setMapperClass(ClasseMapper.class); // mapper class
    conf.setReducerClass(ClasseReducer.class); // reducer class
    conf.setOutputKeyClass(Text.class); // expected output key type for map and reduce, here Text
    conf.setOutputValueClass(IntWritable.class); // expected output value type for map and reduce, here IntWritable
    JobClient.runJob(conf); // runs the job with the given configuration
}
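A note on reading the result: with the old mapred API, FileOutputFormat.setOutputPath names an output directory, not a single file, so the folder called saidaDadosBancarios.txt with part-00000 inside is the normal layout, and the result should be in that part file. As a minimal sketch of reading it back (the class name LerSaida is just illustrative; the path mirrors the one above):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LerSaida {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration(); // picks up the cluster's fs.default.name
        FileSystem fs = FileSystem.get(conf);
        Path parte = new Path("/user/hadoop-user/output/saidaDadosBancarios.txt/part-00000");
        BufferedReader leitor = new BufferedReader(new InputStreamReader(fs.open(parte)));
        String linha;
        while ((linha = leitor.readLine()) != null) {
            System.out.println(linha); // each line is "key<TAB>value" as the reducer wrote it
        }
        leitor.close();
    }
}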
Here is the Mapper class:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ClasseMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable chave, Text valor, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        String linha = valor.toString();
        System.out.println(linha);
        String ano = "";
        int valorIndice = 0;
        if (linha.contains("year:")) { // take the year from lines containing "year:"
            String[] divisor = linha.split(":");
            ano = divisor[1];
        }
        if (linha.contains("value:")) { // take the value from lines containing "value:"
            String[] divisor = linha.split(":");
            valorIndice = Integer.parseInt(divisor[1]);
        }
        output.collect(new Text(ano), new IntWritable(valorIndice));
    }
}
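To see what the mapper emits for each kind of line, the parsing logic above can be run outside Hadoop; this is a minimal standalone sketch (the sample lines are my guess at the input format, inferred from the split on ":"):

public class TesteParse {
    public static void main(String[] args) {
        String[] linhas = { "year:1990", "value:42", "other" }; // assumed line shapes
        for (String linha : linhas) {
            String ano = "";
            int valorIndice = 0;
            if (linha.contains("year:")) {
                ano = linha.split(":")[1];
            }
            if (linha.contains("value:")) {
                valorIndice = Integer.parseInt(linha.split(":")[1]);
            }
            // Mirrors output.collect(new Text(ano), new IntWritable(valorIndice))
            System.out.println("(" + ano + ", " + valorIndice + ")");
        }
    }
}

Against those sample lines it prints (1990, 0), (, 42) and (, 0), which may help check whether a year and its value ever end up in the same emitted pair.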
Here is the Reducer class:
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class ClasseReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text chave, Iterator<IntWritable> valor, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        int maxValue = Integer.MIN_VALUE; // start below any real value so the first input always replaces it
        while (valor.hasNext()) {
            maxValue = Math.max(maxValue, valor.next().get());
        }
        output.collect(chave, new IntWritable(maxValue));
    }
}
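The max logic can likewise be exercised by calling the reducer by hand with an in-memory group (TesteReducer and the sample values are mine, not part of the job):

import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class TesteReducer {
    public static void main(String[] args) throws IOException {
        // One group of values for one key, as the framework would pass them in.
        Iterator<IntWritable> valores = Arrays.asList(
                new IntWritable(3), new IntWritable(7), new IntWritable(5)).iterator();
        OutputCollector<Text, IntWritable> coletor = new OutputCollector<Text, IntWritable>() {
            public void collect(Text chave, IntWritable valor) throws IOException {
                System.out.println(chave + "\t" + valor); // expect "1990" then 7
            }
        };
        new ClasseReducer().reduce(new Text("1990"), valores, coletor, Reporter.NULL);
    }
}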
Here is the log generated by the Eclipse plugin:
15/09/17 17:13:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
15/09/17 17:13:38 INFO mapred.FileInputFormat: Total input paths to process : 1
15/09/17 17:13:38 INFO mapred.FileInputFormat: Total input paths to process : 1
15/09/17 17:13:39 INFO mapred.JobClient: Running job: job_201509170444_0002
15/09/17 17:13:40 INFO mapred.JobClient: map 0% reduce 0%
15/09/17 17:13:48 INFO mapred.JobClient: map 100% reduce 0%
15/09/17 17:13:53 INFO mapred.JobClient: Job complete: job_201509170444_0002
15/09/17 17:13:53 INFO mapred.JobClient: Counters: 16
15/09/17 17:13:53 INFO mapred.JobClient: File Systems
15/09/17 17:13:53 INFO mapred.JobClient: HDFS bytes read=152753
15/09/17 17:13:53 INFO mapred.JobClient: HDFS bytes written=10
15/09/17 17:13:53 INFO mapred.JobClient: Local bytes read=44044
15/09/17 17:13:53 INFO mapred.JobClient: Local bytes written=88160
15/09/17 17:13:53 INFO mapred.JobClient: Job Counters
15/09/17 17:13:53 INFO mapred.JobClient: Launched reduce tasks=1
15/09/17 17:13:53 INFO mapred.JobClient: Launched map tasks=2
15/09/17 17:13:53 INFO mapred.JobClient: Data-local map tasks=2
15/09/17 17:13:53 INFO mapred.JobClient: Map-Reduce Framework
15/09/17 17:13:53 INFO mapred.JobClient: Reduce input groups=1
15/09/17 17:13:53 INFO mapred.JobClient: Combine output records=0
15/09/17 17:13:53 INFO mapred.JobClient: Map input records=6240
15/09/17 17:13:53 INFO mapred.JobClient: Reduce output records=1
15/09/17 17:13:53 INFO mapred.JobClient: Map output bytes=31200
15/09/17 17:13:53 INFO mapred.JobClient: Map input bytes=149856
15/09/17 17:13:53 INFO mapred.JobClient: Combine input records=0
15/09/17 17:13:53 INFO mapred.JobClient: Map output records=6240
15/09/17 17:13:53 INFO mapred.JobClient: Reduce input records=6240
PS: I am not running Hadoop directly on my computer; it runs on a virtual machine ( https://developer.yahoo.com/hadoop/tutorial/ ).
Thanks,
Rafael Muniz