How to split the strings in a column into new columns, one per character, in pandas

Time: 2018-09-20 18:51:50

Tags: python-3.x pandas

I have a pandas DataFrame that looks like this:

                flag
0        NNxxNxNNxNN
1        xxNNNNNNNNN
2        xxxNNxNNNNN
3        xxxxNxxxxxN
4        xxxxxxNxxxx
5        xxxxxxxNxNN

I want to split each string into new columns, one character per column, like this:

         col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11
0        N    N    x    x    N    x    N    N    x    N    N
1        x    x    N    N    N    N    N    N    N    N    N
2        x    x    x    N    N    x    N    N    N    N    N
3        x    x    x    x    N    x    x    x    x    x    N
4        x    x    x    x    x    x    N    x    x    x    x
5        x    x    x    x    x    x    x    N    x    N    N

My DataFrame has several million rows. Is there an efficient way to do this?

2 answers:

Answer 0: (score: 0)

You can do it like this:

[first code snippet not preserved in the source]

[second code snippet not preserved in the source]

To get the column names, just add [code not preserved in the source] to either of the calls above:

[output not preserved in the source]
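The column names the question asks for (col1 through col11) can be produced by relabeling the integer columns after the split. The following is only a minimal sketch under that assumption, using a list-based split; it is not necessarily the method this answer's author had in mind, and the variable name `chars` is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({"flag": ["NNxxNxNNxNN", "xxNNNNNNNNN", "xxxNNxNNNNN"]})

# Turn each string into a list of characters, then expand the lists
# into one column per character position.
chars = pd.DataFrame(df["flag"].map(list).tolist(), index=df.index)

# Relabel the default 0-based integer labels as col1, col2, ...
# (the naming scheme comes from the question; this relabeling step is an assumption).
chars.columns = [f"col{i + 1}" for i in range(chars.shape[1])]

print(chars)
```

Assigning `chars.columns` directly avoids a second pass over the data, which may matter with several million rows.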

Answer 1: (score: 0)

Use tolist together with pd.DataFrame

pd.DataFrame(df.flag.apply(list).tolist())
Out[905]: 
  0  1  2  3  4  5  6  7  8  9  10
0  N  N  x  x  N  x  N  N  x  N  N
1  x  x  N  N  N  N  N  N  N  N  N
2  x  x  x  N  N  x  N  N  N  N  N
3  x  x  x  x  N  x  x  x  x  x  N
4  x  x  x  x  x  x  N  x  x  x  x
5  x  x  x  x  x  x  x  N  x  N  N
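The same idea can be written as a small self-contained function. This is a sketch, not a tuned implementation: the helper name `split_chars` is hypothetical, and the example deliberately uses strings of unequal length to show that shorter rows end up with missing values in the trailing columns.

```python
import pandas as pd


def split_chars(series: pd.Series) -> pd.DataFrame:
    """Return one column per character position (hypothetical helper)."""
    # list(s) splits a string into its characters; the DataFrame constructor
    # pads shorter rows with missing values.
    return pd.DataFrame([list(s) for s in series], index=series.index)


flags = pd.Series(["NNx", "xN"], index=[10, 20])
wide = split_chars(flags)
print(wide)
```

Passing `index=series.index` keeps the original row labels, so the result can be joined back onto the source DataFrame.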

A method using extractall

df.flag.str.extractall('(.)')[0].unstack()
Out[931]: 
match 0  1  2  3  4  5  6  7  8  9  10
0      N  N  x  x  N  x  N  N  x  N  N
1      x  x  N  N  N  N  N  N  N  N  N
2      x  x  x  N  N  x  N  N  N  N  N
3      x  x  x  x  N  x  x  x  x  x  N
4      x  x  x  x  x  x  N  x  x  x  x
5      x  x  x  x  x  x  x  N  x  N  N
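A runnable version of the extractall approach, with the intermediate steps spelled out. This is a sketch on two sample rows; the variable names are illustrative. Two caveats worth checking on real data: the pattern `(.)` does not match newline characters, and rows with no match at all (for example empty strings) should be absent from the result of `extractall`.

```python
import pandas as pd

df = pd.DataFrame({"flag": ["NNx", "xxN"]})

# extractall returns one row per regex match, indexed by
# (original row label, match number); column 0 holds capture group 1.
matches = df["flag"].str.extractall("(.)")

# Select the captured character and pivot the match number into columns.
wide = matches[0].unstack()

print(wide)
```

Because this goes through a regex and a pivot, it is likely to be slower than the list-based approach on several million rows, so it is worth timing both on a sample first.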