Hive data ingestion problem: java.lang.OutOfMemoryError: unable to create new native thread

Asked: 2018-09-17 20:08:31

Tags: hive hiveql hortonworks-data-platform apache-tez hive-query

I am new to Hive and am having trouble loading a large (1 TB) HDFS file into a partitioned Hive managed table. Can you help me figure this out? I suspect I have something misconfigured somewhere, because my reducer jobs never finish.

Here is my query:

DROP TABLE IF EXISTS ts_managed;

SET hive.enforce.sorting = true;

CREATE TABLE IF NOT EXISTS ts_managed (
 svcpt_id VARCHAR(20),
 usage_value FLOAT,
 read_time SMALLINT)
PARTITIONED BY (read_date INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy","orc.create.index"="true","orc.bloom.filter.columns"="svcpt_id");

SET hive.vectorized.execution.enabled = true;
SET hive.vectorized.execution.reduce.enabled = true;
SET hive.cbo.enable=true;
SET hive.tez.auto.reducer.parallelism=true;
SET hive.exec.reducers.max=20000;
SET yarn.nodemanager.pmem-check-enabled = true;
SET hive.optimize.sort.dynamic.partition=true;
SET hive.exec.max.dynamic.partitions=10000;

INSERT OVERWRITE TABLE ts_managed
PARTITION (read_date)
SELECT svcpt_id, usage, read_time, read_date
FROM ts_raw
DISTRIBUTE BY svcpt_id
SORT BY svcpt_id;
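
Since the INSERT relies on dynamic partitioning over read_date, one sanity check worth running first (a small sketch against my ts_raw source table) is the number of distinct partition values, which has to stay under hive.exec.max.dynamic.partitions:

-- Number of partitions the dynamic-partition INSERT will create;
-- must stay below hive.exec.max.dynamic.partitions (10000 above).
SELECT COUNT(DISTINCT read_date) FROM ts_raw;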

My cluster specs are:

  • VM cluster
  • 4 nodes total
  • 4 data nodes
  • 32 cores
  • 140 GB RAM
  • Hortonworks HDP 3.0
  • Apache Tez as the default Hive execution engine
  • I am the only user of the cluster

My YARN configuration is:

yarn.nodemanager.resource.memory-mb = 32GB
yarn.scheduler.minimum-allocation-mb = 512MB
yarn.scheduler.maximum-allocation-mb = 8192MB
yarn-heapsize = 1024MB

My Hive configuration is:

hive.tez.container.size = 682MB
hive.heapsize = 4096MB
hive.metastore.heapsize = 1024MB
hive.exec.reducers.bytes.per.reducer = 1GB
hive.auto.convert.join.noconditionaltask.size = 2184.5MB
hive.tez.auto.reducer.parallelism = True
hive.tez.dynamic.partition.pruning = True

My Tez configuration is:

tez.am.resource.memory.mb = 5120MB
tez.grouping.max-size = 1073741824 Bytes
tez.grouping.min-size = 16777216 Bytes
tez.grouping.split-waves = 1.7
tez.runtime.compress = True
tez.runtime.compress.codec = org.apache.hadoop.io.compress.SnappyCodec

I have tried countless configurations, including:

  • Partitioning by date
  • Partitioning by date and clustering by svcpt_id into buckets (a sketch of this variant follows this list)
  • Partitioning by date with a bloom filter on svcpt_id, sorted by svcpt_id
  • Partitioning by date with a bloom filter on svcpt_id, distributed and sorted by svcpt_id
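
For reference, the bucketed variant looked roughly like this (a sketch; the bucket count is illustrative, not the exact value I used):

-- Same table, but clustered into buckets on the bloom-filtered key.
-- The bucket count (64) is illustrative.
CREATE TABLE IF NOT EXISTS ts_managed (
 svcpt_id VARCHAR(20),
 usage_value FLOAT,
 read_time SMALLINT)
PARTITIONED BY (read_date INT)
CLUSTERED BY (svcpt_id) INTO 64 BUCKETS
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy","orc.create.index"="true","orc.bloom.filter.columns"="svcpt_id");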

My map vertex completes, but my first reducer vertex never finishes. Here is the most recent run of the query above:

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED   1043       1043        0        0       0       0
Reducer 2        container       RUNNING   9636          0        0     9636       1       0
Reducer 3        container        INITED   9636          0        0     9636       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/03  [=>>-------------------------] 4%    ELAPSED TIME: 6804.08 s
----------------------------------------------------------------------------------------------

The error is:

Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1537061583429_0010_2_01, diagnostics=[Task failed, taskId=task_1537061583429_0010_2_01_000070, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: unable to create new native thread
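
"unable to create new native thread" is an OS-level thread/process limit rather than JVM heap exhaustion; with roughly 9,600 planned reducers, the threads of all the task JVMs add up fast. For what it is worth, the knobs that bound the planned reducer count can be set per session; a sketch with illustrative values, not settings from my actual runs:

-- Plan fewer, larger reducers (values are illustrative):
-- 4 GB of input per reducer instead of 1 GB
SET hive.exec.reducers.bytes.per.reducer=4294967296;
-- hard cap on the number of reducers
SET hive.exec.reducers.max=1000;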

Either I hit OOM errors that I cannot get past, or my data nodes go offline and the cluster can no longer satisfy my replication factor.

At this point I have been troubleshooting for over two weeks. Contact details for a professional consultant I could pay to solve this would also be appreciated.

Thanks in advance!

1 answer:

Answer 0 (score: 0)

After speaking with Hortonworks support, I finally solved the problem. It turned out I was over-partitioning the table. Instead of partitioning by day over roughly four years of data, I partitioned by month, and it works great.
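
A minimal sketch of what the monthly layout looks like, assuming read_date is an INT in YYYYMMDD form (that format is my assumption for illustration; derive read_month however fits your data):

-- Monthly partitioning sketch. With read_date as an INT in YYYYMMDD
-- form, integer division by 100 yields a YYYYMM month key; adapt the
-- derivation to your actual schema.
CREATE TABLE IF NOT EXISTS ts_managed (
 svcpt_id VARCHAR(20),
 usage_value FLOAT,
 read_time SMALLINT,
 read_date INT)
PARTITIONED BY (read_month INT)
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy","orc.create.index"="true","orc.bloom.filter.columns"="svcpt_id");

INSERT OVERWRITE TABLE ts_managed
PARTITION (read_month)
SELECT svcpt_id, usage, read_time, read_date,
       CAST(read_date / 100 AS INT) AS read_month
FROM ts_raw
DISTRIBUTE BY svcpt_id
SORT BY svcpt_id;

Roughly four years of data becomes about 48 monthly partitions instead of about 1,460 daily ones, so each reducer opens far fewer ORC writers (and threads) at once.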