Spark cannot load a large file?

Time: 2018-05-29 15:23:15

Tags: apache-spark pyspark

I have a very large CSV file that I want to load, so I tried PySpark, but the Jupyter notebook returns this error:

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)
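
The message above is emitted by the notebook server when a cell sends output faster than `NotebookApp.iopub_data_rate_limit` allows, so one thing worth trying is to display only a small slice of the data rather than the whole file. The sketch below illustrates that idea using only the standard library. `preview_csv` and its parameters are hypothetical names, not part of the original post, and it is an assumption that the original code printed the full dataset to the cell.

```python
import csv
from itertools import islice


def preview_csv(path, n=5):
    """Return the header and the first n data rows of a CSV file.

    Hypothetical helper: it stops reading after n rows, so the amount
    of data returned (and shown in the notebook) stays small no matter
    how large the file is.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)      # first line, or None for an empty file
        rows = list(islice(reader, n))   # at most n further rows
    return header, rows
```

With a PySpark DataFrame, the analogous step would be something like `df.show(5)` rather than printing the result of `df.collect()`. Alternatively, the limit itself can be raised through the config variable named in the message.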

0 answers:

No answers