AWS EMR - How to automatically edit a file on all slave nodes?

Time: 2018-03-26 18:13:57

Tags: python matplotlib edit amazon-emr worker

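I am running a Python script on all slave nodes of an AWS EMR cluster. I need to configure matplotlib to use a non-interactive backend on every slave node; otherwise I run into an error (detailed description provided here).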

My current solution is to SSH into each slave node by hand and manually edit /usr/local/lib64/python2.7/site-packages/matplotlib/mpl-data/matplotlibrc.

Obviously, this approach is time-consuming and inefficient.

Can anyone provide a small (pseudo)code snippet that automates this task on all slave nodes?

1 answer:

Answer 0: (score: 0)

The simplest solution is to supply a shell script through a bootstrap action, which runs on every node after Amazon EMR launches the instances:

#!/bin/sh
# Replace every occurrence of TkAgg with the non-interactive agg backend in matplotlibrc.
sudo sed -i 's/TkAgg/agg/g' /usr/local/lib64/python2.7/site-packages/matplotlib/mpl-data/matplotlibrc
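
The sed command above replaces every occurrence of TkAgg in the file, including any that appear in comments. If a narrower edit is preferred, the same change could be sketched in Python roughly as follows. Note that `set_backend` is a hypothetical helper written for this illustration, not part of matplotlib or EMR, and it assumes the file contains an active `backend : ...` line, which is what the sed command in the answer implies.

```python
import re

# Matches an active "backend : <value>" line and captures the text around the value.
# Commented-out lines ("#backend ...") and keys such as "backend.qt4" do not match.
_BACKEND_LINE = re.compile(r"^(\s*backend\s*:\s*)\S+(.*)$")


def set_backend(rc_path, backend="agg"):
    """Rewrite the active backend setting in a matplotlibrc file.

    Hypothetical helper for illustration only.
    Returns True if an active backend line was found and rewritten.
    """
    with open(rc_path) as f:
        lines = f.readlines()

    changed = False
    for i, line in enumerate(lines):
        match = _BACKEND_LINE.match(line)
        if match:
            # Keep the original spacing and any trailing comment, swap only the value.
            lines[i] = "{}{}{}\n".format(match.group(1), backend, match.group(2))
            changed = True

    if changed:
        with open(rc_path, "w") as f:
            f.writelines(lines)
    return changed
```

To use it, one would call `set_backend` with the matplotlibrc path from the question. The script needs write permission on that file, so on the cluster it would probably have to run with sudo, just like the sed command.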