I use MultipleOutputs in my reducer. The multiple output writes its files to a folder called newidentities. The code is shown below:

    private MultipleOutputs<Text, Text> mos;

    @Override
    public void reduce(Text inputKey, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        ......
            // output change report
            if (isChangeReport.equals("true")) {
                mos.write(new Text(e.getHid()),
                        new Text(changeReport.deleteCharAt(changeReport.length() - 1).toString()),
                        "newidentities/");
            }
        }
    }

    @Override
    public void setup(Context context) {
        mos = new MultipleOutputs<Text, Text>(context);
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        mos.close();
    }

It ran fine previously. When I ran it today, it threw the exception below. The Hadoop version is 2.4.0.

    Error: org.apache.hadoop.fs.FileAlreadyExistsException: /captureonlymatchindex9/temp/changereport/newidentities/-r-00000 for client 192.168.71.128 already exists
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2297)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2225)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2178)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:520)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:354)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1604)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1390)
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:394)
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:390)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:390)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:334)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
        at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:132)
        at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:475)
        ...
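
The file name in the exception, newidentities/-r-00000, seems to follow from the base output path "newidentities/" being joined with the task type and a five-digit partition number. Here is a minimal sketch of that naming. The class OutputFileName and the helper uniqueFile are hypothetical names that only imitate the pattern; they are not Hadoop code.

```java
class OutputFileName {
    // Hypothetical helper: imitates the "<base>-<m|r>-<5-digit partition>" pattern.
    static String uniqueFile(String baseOutputPath, boolean isReduce, int partition) {
        return String.format("%s-%s-%05d", baseOutputPath, isReduce ? "r" : "m", partition);
    }

    public static void main(String[] args) {
        // A base path ending in "/" gives a file literally named "-r-00000".
        System.out.println(uniqueFile("newidentities/", true, 0));
        // A base path with a file prefix gives the more familiar "part-r-00000".
        System.out.println(uniqueFile("newidentities/part", true, 0));
    }
}
```

Under that assumption, every write from reducer 0 goes to the same single file inside the newidentities folder, which is the path the exception complains about.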
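I found the reason for it. One of the reducers ran out of memory, and the out-of-memory exception was thrown implicitly, so it did not show up directly. Hadoop then stopped the current multiple output. Maybe another thread of the reducer then wanted to write output and created a new MultipleOutputs object for the same file, and that is where the collision happened.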
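
To illustrate the kind of collision I suspect, here is a small sketch on the local filesystem rather than HDFS. It assumes the failed attempt left its file behind and that the next attempt opens the same path without overwrite. The class RetryCollision and the method openOutput are hypothetical names for this illustration only, and none of this is Hadoop's actual code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

class RetryCollision {
    // Hypothetical stand-in for an attempt opening its output file without
    // overwrite. Returns false if the file is already there.
    static boolean openOutput(Path file) {
        try {
            Files.createDirectories(file.getParent());
            Files.createFile(file); // throws if the file already exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("changereport");
        Path out = dir.resolve("newidentities").resolve("-r-00000");

        System.out.println("first attempt created file: " + openOutput(out));
        // Assumption: the first attempt dies here (for example from an
        // out-of-memory error) and never removes the file it created.
        System.out.println("second attempt created file: " + openOutput(out));
    }
}
```

If the explanation above is right, the second call corresponds to the create that failed with FileAlreadyExistsException in my job.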