Negative delegation token renewal time in Spark Structured Streaming

Time: 2021-02-10 16:01:52

Tags: apache-spark hadoop kerberos spark-structured-streaming

I have a Spark Structured Streaming (3.0.1) job running on a Cloudera cluster. The job consumes data from a Kerberized Kafka and writes it to ADLS Gen2. ADLS access is established through delegation tokens.

Everything works, but I see HadoopDelegationTokenManager logging negative renewal times. Sometimes the message appears several times per second:

2021-02-10 15:28:26 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663379723 ms.
2021-02-10 15:28:30 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663382500 ms.
2021-02-10 15:28:33 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663385182 ms.
2021-02-10 15:28:37 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663387844 ms.
2021-02-10 15:28:41 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663390793 ms.
2021-02-10 15:28:44 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663393576 ms.
2021-02-10 15:28:48 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663396341 ms.
2021-02-10 15:28:52 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663399054 ms.

Here is a complete log entry:

2021-02-10 15:31:10 INFO  HadoopFSDelegationTokenProvider:57 - getting token for: AzureBlobFileSystem{uri=abfs://BUCKET@cdpdlmain.dfs.core.windows.net, user='some_user', primaryUserGroup='some_user', Statistics: {{Context=AbfsContext}{AbfsID=beb93638-9a3a-477d-9b10-f875b5f8d40c}{AbfsBucket=cdpdlmain.dfs.core.windows.net}{op_create=0}{op_open=0}{op_get_file_status=0}{op_append=0}{op_create_non_recursive=0}{op_delete=0}{op_exists=0}{op_get_delegation_token=0}{op_list_status=0}{op_mkdirs=0}{op_rename=0}{directories_created=0}{directories_deleted=0}{files_created=0}{files_deleted=0}{error_ignored=0}{connections_made=0}{send_requests=0}{get_responses=0}{bytes_sent=0}{bytes_received=0}{read_throttles=0}{write_throttles=0}}} with renewer yarn/dev-de-main-master0.SOMEDOMAIN.CLOUDERA.SITE@SOMEDOMAIN.CLOUDERA.SITE
2021-02-10 15:31:10 INFO  logger:49 - Using default JAAS configuration
2021-02-10 15:31:10 INFO  HadoopFSDelegationTokenProvider:57 - getting token for: DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_1941596994_53, ugi=some_user@SOMEDOMAIN.CLOUDERA.SITE (auth:KERBEROS)]] with renewer yarn/dev-de-main-master0.SOMEDOMAIN.CLOUDERA.SITE@SOMEDOMAIN.CLOUDERA.SITE
2021-02-10 15:31:10 INFO  DFSClient:706 - Created token for some_user: HDFS_DELEGATION_TOKEN owner=some_user@SOMEDOMAIN.CLOUDERA.SITE, renewer=yarn, realUser=, issueDate=1612971070483, maxDate=1613575870483, sequenceNumber=13700, masterKeyId=24 on ha-hdfs:ns1
2021-02-10 15:31:11 INFO  HadoopDelegationTokenManager:57 - Scheduling renewal in -1209663503490 ms.
2021-02-10 15:31:11 INFO  HadoopDelegationTokenManager:57 - Updating delegation tokens.
2021-02-10 15:31:11 INFO  HadoopDelegationTokenManager:57 - Attempting to login to KDC using principal: some_user
2021-02-10 15:31:11 INFO  HadoopDelegationTokenManager:57 - Successfully logged into KDC.
2021-02-10 15:31:11 INFO  SparkHadoopUtil:57 - Updating delegation tokens for current user.

I'm afraid this has some impact on performance, and it clearly looks like a bug. (Milliseconds instead of nanoseconds?) Any hints on this?
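
As a rough sanity check on the logged numbers, here is a small Python sketch that works backwards from them. It assumes (this is not confirmed anywhere in the logs) that the delay is computed as ratio * (next_renewal - now) with a ratio of 0.75, and it approximates "now" with the issueDate of the HDFS token logged one second earlier. The function name is made up for illustration.

```python
from datetime import datetime, timezone

# Values copied from the log lines in the question.
ISSUE_DATE_MS = 1612971070483       # issueDate of the HDFS token (15:31:10)
LOGGED_DELAY_MS = -1209663503490    # "Scheduling renewal in ... ms." (15:31:11)
FIRST_DELAY_MS = -1209663379723     # logged at 15:28:26
LAST_DELAY_MS = -1209663399054      # logged at 15:28:52

# Assumption: delay = ratio * (next_renewal - now), with ratio = 0.75.
ASSUMED_RATIO = 0.75


def implied_next_renewal_ms(delay_ms, now_ms, ratio=ASSUMED_RATIO):
    """Invert the assumed formula to recover the absolute renewal timestamp."""
    return now_ms + delay_ms / ratio


# Use the token issue time as a stand-in for "now" (off by at most ~1 s).
next_renewal_ms = implied_next_renewal_ms(LOGGED_DELAY_MS, ISSUE_DATE_MS)
next_renewal_utc = datetime.fromtimestamp(next_renewal_ms / 1000, tz=timezone.utc)

# Cross-check the ratio: how fast the logged delay shrinks per wall-clock ms
# over the 26 seconds between the first and last line of the short excerpt.
elapsed_ms = 26 * 1000
observed_ratio = (FIRST_DELAY_MS - LAST_DELAY_MS) / elapsed_ms

print(f"implied next renewal: {next_renewal_ms:.0f} ms since epoch ({next_renewal_utc})")
print(f"observed ratio: {observed_ratio:.3f}")
```

Under these assumptions the implied next renewal time comes out at roughly 86.4 million ms, about one day after the Unix epoch, and the delay shrinks at about 0.74 ms per elapsed ms, which is close to the assumed ratio. That would point to an absolute renewal timestamp near 1970-01-02 rather than a milliseconds/nanoseconds mix-up, possibly because one of the tokens reports no issue date, but that last part is only a guess.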

0 answers:

No answers yet.