YARN AM kills the job when a remote job is submitted to the cluster

I have been trying to identify a problem without many clues. Any help pointing me in the right direction would be greatly appreciated.

I’m spinning up a Hadoop cluster using emr-5.25.0 with the following applications: Hive 2.3.5, Spark 2.4.3, Oozie 5.1.0. I also have one EMR client with the same configuration.

All environments are running on CentOS.

I have successfully submitted some Spark jobs locally on the cluster.

When I submitted MapReduce jobs remotely, the AM failed the job with this error:

           SecretManager$InvalidToken: appattempt_1564396684837_0011_000002 not found in AMRMTokenSecretManager

I searched /var/log/oozie on both the client and the cluster and found no unusual messages.
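As a side note on reading the error: the identifier follows YARN's `appattempt_<cluster timestamp>_<application number>_<attempt number>` layout, so it refers to the second AM attempt of `application_1564396684837_0011`. The sketch below splits the ID and prints the standard YARN CLI commands for listing that application's attempts and pulling its aggregated logs. The helper name `parse_app_attempt` is hypothetical, and the commands assume that log aggregation is enabled and that the `yarn` client on the node is configured for the cluster.

```python
from typing import NamedTuple


class AppAttempt(NamedTuple):
    application_id: str   # application_<cluster timestamp>_<application number>
    attempt_number: int   # 1-based counter of AM attempts


def parse_app_attempt(attempt_id: str) -> AppAttempt:
    """Split a YARN application attempt ID into application ID and attempt number."""
    # Unpacking raises ValueError if the ID does not have exactly four parts.
    prefix, cluster_ts, app_number, attempt = attempt_id.split("_")
    if prefix != "appattempt":
        raise ValueError(f"not an application attempt ID: {attempt_id!r}")
    return AppAttempt(f"application_{cluster_ts}_{app_number}", int(attempt))


if __name__ == "__main__":
    # ID copied from the error message in the question.
    parsed = parse_app_attempt("appattempt_1564396684837_0011_000002")
    print(f"failing attempt number: {parsed.attempt_number}")
    # Standard YARN CLI commands, to be run where a yarn client is configured.
    print(f"yarn applicationattempt -list {parsed.application_id}")
    print(f"yarn logs -applicationId {parsed.application_id}")
```

The aggregated container logs of the launcher are usually more informative than /var/log/oozie for this kind of failure, since the exception is raised inside the Oozie LauncherAM container rather than in the Oozie server.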

Here is the content of workflow.xml:

<?xml version="1.0" encoding="UTF-8"?>

<workflow-app xmlns="uri:oozie:workflow:1.0" name="shell-wf">

    <start to="spark-node"/>

    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:1.0">
            <resource-manager>hdfs://xxx:8032</resource-manager>
            <name-node>hdfs://xxx:8020</name-node>
            <master>${master}</master>
            <mode>${mode}</mode>
            <name>snapshot-${table_name}</name>
            <jar>${continuous_ingestion}/workflow-snapshot.py</jar>
            <spark-opts>--py-files ${continuous_ingestion}/release/continuous_ingestion_release.zip --files ${continuous_ingestion}/${env}.cfg --conf spark.executorEnv.PYSPARK_PYTHON=/usr/bin/python3.6 --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/bin/python3.6 --conf spark.driver.memory=4g --conf spark.driver.maxResultSize=1g --conf spark.driver.cores=1 --conf spark.executor.instances=1 --conf spark.executor.cores=2 --conf spark.executor.memory=4g --conf spark.eventLog.enabled=false --conf spark.sql.broadcastTimeout=36000 --conf spark.shuffle.service.enabled=true --conf spark.dynamicAllocation.enabled=true --conf spark.dynamicAllocation.initialExecutors=1 --conf spark.dynamicAllocation.minExecutors=1 --conf spark.dynamicAllocation.maxExecutors=30 --packages org.lz4:lz4-java:1.4.0 </spark-opts>
            <arg>${table_name}</arg>
            <arg>${env}</arg>
        </spark>
        <ok to="end" />
        <error to="fail_spark" />
    </action>
    <kill name="fail_spark">
        <message>Bash action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
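The workflow relies on several `${...}` parameters whose values are supplied at submission time (typically through a job.properties file, which is not shown here). A rough sketch for listing them, so they can be checked against whatever the remote client actually submits; the helper name `workflow_parameters` is hypothetical and the sample string is only an excerpt of the workflow:

```python
import re
from typing import Dict, List

# Matches plain ${name} parameters. EL function calls such as
# ${wf:errorMessage(...)} contain ':' right after the name and are skipped.
PARAM_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def workflow_parameters(xml_text: str) -> List[str]:
    """Return the distinct ${name} parameters in order of first appearance."""
    seen: Dict[str, None] = {}
    for name in PARAM_RE.findall(xml_text):
        seen.setdefault(name)
    return list(seen)


if __name__ == "__main__":
    # Excerpt only; in practice, read the full workflow.xml from disk.
    excerpt = """
    <master>${master}</master>
    <mode>${mode}</mode>
    <name>snapshot-${table_name}</name>
    <jar>${continuous_ingestion}/workflow-snapshot.py</jar>
    <message>failed[${wf:errorMessage(wf:lastErrorNode())}]</message>
    """
    print(workflow_parameters(excerpt))
```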

The task log output on the cluster was:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1564396684837_0011_000002 not found in AMRMTokenSecretManager.
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1489)
    at org.apache.hadoop.ipc.Client.call(Client.java:1435)
    at org.apache.hadoop.ipc.Client.call(Client.java:1345)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy7.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
    ... 20 more
Exception in thread "main" org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1564396684837_0011_000002 not found in AMRMTokenSecretManager.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
    at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
    at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:94)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
    at com.sun.proxy.$Proxy8.finishApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:472)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.unregisterApplicationMaster(AMRMClientAsyncImpl.java:187)
    at org.apache.oozie.action.hadoop.LauncherAM.unregisterWithRM(LauncherAM.java:352)
    at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:278)
    at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1564396684837_0011_000002 not found in AMRMTokenSecretManager.
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1489)
    at org.apache.hadoop.ipc.Client.call(Client.java:1435)
    at org.apache.hadoop.ipc.Client.call(Client.java:1345)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy7.finishApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:92)
    ... 19 more

  • Can I translate this into Portuguese? It was posted in a Portuguese-language community.

  • I edited and translated the question. Anyone who can, please pitch in and help the user.

No answers
