pthread_create failed: Resource temporarily unavailable on MongoDB

Asked: 2017-02-09 02:20:03

Tags: mongodb apache-spark docker pyspark stratio

Currently, I am running a Spark cluster in standalone mode with Docker on a physical machine with 16GB of RAM, running Ubuntu 16.04.1 x64.

RAM configuration of the Spark cluster containers: master 4g, slave1 2g, slave2 2g, slave3 2g

docker run -itd --net spark -m 4g -p 8080:8080 --name master --hostname master MyAccount/spark &> /dev/null
docker run -itd --net spark -m 2g -p 8080:8080 --name slave1 --hostname slave1 MyAccount/spark &> /dev/null
docker run -itd --net spark -m 2g -p 8080:8080 --name slave2 --hostname slave2 MyAccount/spark &> /dev/null
docker run -itd --net spark -m 2g -p 8080:8080 --name slave3 --hostname slave3 MyAccount/spark &> /dev/null
docker exec -it master sh -c 'service ssh start' > /dev/null
docker exec -it slave1 sh -c 'service ssh start' > /dev/null
docker exec -it slave2 sh -c 'service ssh start' > /dev/null
docker exec -it slave3 sh -c 'service ssh start' > /dev/null
docker exec -it master sh -c '/usr/local/spark/sbin/start-all.sh' > /dev/null

There is about 170GB of data in my MongoDB database. I run MongoDB with ./mongod on the local host, without Docker and without any replication or sharding.
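As a side note, whether that mongod instance is reachable from inside the "master" container can be sanity-checked with a short script like the rough sketch below (assuming pymongo is installed; 'MyPublicIP' is the same placeholder used in code.py):

from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

# Connect to the standalone mongod (no replica set, no sharding).
client = MongoClient('MyPublicIP', 27017, serverSelectionTimeoutMS=5000)

try:
    # Forces a round trip; fails if the socket cannot be opened.
    print('server version:', client.server_info()['version'])
except ServerSelectionTimeoutError as e:
    print('mongod is not reachable:', e)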

I use the Stratio/Spark-MongoDB connector.

I run the following command on the "master" container:

/usr/local/spark/bin/spark-submit --master spark://master:7077 --executor-memory 2g --executor-cores 1 --packages com.stratio.datasource:spark-mongodb_2.11:0.12.0 code.py

code.py:

from pyspark import SparkContext
from pyspark.sql import SparkSession

# Reuse the session created by spark-submit (or create one if run locally).
spark = SparkSession.builder.getOrCreate()

# Register the MongoDB collection as a temporary view through the Stratio connector.
spark.sql("CREATE TEMPORARY VIEW tmp_tb USING com.stratio.datasource.mongodb OPTIONS (host 'MyPublicIP:27017', database 'firewall', collection 'log_data')")

# Query the view and print the first rows.
df = spark.sql("select * from tmp_tb")
df.show()
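The same read can also be written with the DataFrame reader API instead of the SQL view, which may be easier to parameterize. A rough equivalent based on the Stratio connector's data source name (untested here):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Load the MongoDB collection through the connector and look at the first rows.
df = (spark.read
      .format("com.stratio.datasource.mongodb")
      .options(host="MyPublicIP:27017", database="firewall", collection="log_data")
      .load())
df.show()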

I modified the ulimit values in /etc/security/limits.conf and /etc/security/limits.d/20-nproc.conf:
* soft nofile unlimited
* hard nofile 131072
* soft nproc unlimited
* hard nproc unlimited
* soft fsize unlimited
* hard fsize unlimited
* soft memlock unlimited
* hard memlock unlimited
* soft cpu unlimited
* hard cpu unlimited
* soft as unlimited
* hard as unlimited

root soft nofile unlimited
root hard nofile 131072
root soft nproc unlimited
root hard nproc unlimited
root soft fsize unlimited
root hard fsize unlimited
root soft memlock unlimited
root hard memlock unlimited
root soft cpu unlimited
root hard cpu unlimited
root soft as unlimited
root hard as unlimited

$ ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63682
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
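To confirm what a Python process (such as the PySpark driver) actually sees for these limits, the standard resource module can be used; a minimal sketch:

import resource

# Each call returns a (soft, hard) pair; -1 means RLIM_INFINITY, i.e. unlimited.
print("nofile:", resource.getrlimit(resource.RLIMIT_NOFILE))
print("nproc: ", resource.getrlimit(resource.RLIMIT_NPROC))
print("as:    ", resource.getrlimit(resource.RLIMIT_AS))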

In addition, I added the following to /etc/sysctl.conf:

kernel.pid_max=200000
vm.max_map_count=600000
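Whether these values actually took effect after the reboot can be verified by reading them back from /proc/sys, for example:

# Read the kernel parameters configured in /etc/sysctl.conf back from /proc/sys.
for path in ("/proc/sys/kernel/pid_max", "/proc/sys/vm/max_map_count"):
    with open(path) as f:
        print(path, "=", f.read().strip())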

Then, after rebooting, I ran the Spark program again.

I still get the following errors: pthread_create failed: Resource temporarily unavailable and com.mongodb.MongoException$Network: Exception opening the socket.

Error screenshots:

pyspark error

mongodb error

Is the physical memory insufficient, or did I get some part of the configuration wrong?

Thanks.

0 Answers

No answers yet.