Splitting several columns in a data frame

Time: 2019-05-06 09:14:46

Tags: python pandas

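I have a data frame, and I want to split the strings in every column from the third to the last into two columns each, with the header kept on the first of the two split columns. Here is the data frame: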

Sample  Pop     a1      a10     a100
F295    Pesche  AC      AT      AA
F296    Pesche  GT      CG      AC
F297    Pesche  AA      GG      TT
F298    Pesche  AC      AG      CG

This is the data frame I want, with each column from the third onward split into two tab-separated columns of single characters:

Sample  Pop     a1              a10             a100
F295    Pesche  A       C       A       T       A       A
F296    Pesche  G       T       C       G       A       C
F297    Pesche  A       A       G       G       T       T
F298    Pesche  A       C       A       G       C       G

This is not the same as the existing "split one column" questions. Please help.

2 answers:

Answer 0: (score: 0)

You can create a MultiIndex in the columns: convert each string to a list of characters, build one DataFrame per column in a list comprehension, and join the pieces with concat:

df1 = df.set_index(['Sample','Pop'])
comp = [pd.DataFrame(df1[x].apply(list).values.tolist(), index=df1.index) for x in df1.columns]
df2 = pd.concat(comp, axis=1, keys=df1.columns)
print (df2)

              a1    a10    a100
               0  1   0  1    0  1
Sample Pop
F295   Pesche  A  C   A  T    A  A
F296   Pesche  G  T   C  G    A  C
F297   Pesche  A  A   G  G    T  T
F298   Pesche  A  C   A  G    C  G

If you need to avoid the MultiIndex, first join the two column levels with f-strings so that the column names are not duplicated, then call DataFrame.reset_index.
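The step of flattening the MultiIndex with f-strings and then calling DataFrame.reset_index could look like the minimal sketch below. It rebuilds df2 from the question's sample data so that it runs on its own. The a1_0-style names produced by the f-string are just one possible naming choice, not something the question asks for.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    'Sample': ['F295', 'F296', 'F297', 'F298'],
    'Pop': ['Pesche', 'Pesche', 'Pesche', 'Pesche'],
    'a1': ['AC', 'GT', 'AA', 'AC'],
    'a10': ['AT', 'CG', 'GG', 'AG'],
    'a100': ['AA', 'AC', 'TT', 'CG'],
})

# Build the MultiIndex version: one two-column DataFrame per original column.
df1 = df.set_index(['Sample', 'Pop'])
comp = [pd.DataFrame(df1[x].apply(list).values.tolist(), index=df1.index)
        for x in df1.columns]
df2 = pd.concat(comp, axis=1, keys=df1.columns)

# Join the two column levels into flat, unique names (assumed naming scheme),
# then move Sample and Pop back from the index into ordinary columns.
df2.columns = [f'{a}_{b}' for a, b in df2.columns]
df2 = df2.reset_index()
print(df2)
```

After this, df2 has ordinary single-level columns, so it can be written out with to_csv like any other data frame.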

Answer 1: (score: 0)

You can use a for loop:

import pandas as pd

data = {
    'Sample': ['F295','F296','F297','F298'],
    'Pop': ['Pesche', 'Pesche', 'Pesche', 'Pesche'],
    'a1': ['AC', 'GT', 'AA', 'AC'],
    'a10': ['AT', 'CG', 'GG', 'AG'],
    'a100': ['AA', 'AC', 'TT', 'CG']
}

df = pd.DataFrame(data) # For reproducibility, you should include this kind of code in your next questions :)

for col_name in list(df.columns[2:]): # iterate over every column from the third one onward
    df[col_name] = df[col_name].apply(lambda x: f"{x[0]}\t{x[1]}") # insert a tab between the two characters

df
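The loop above keeps one column per marker and only puts a tab inside each string. If two real columns per marker are needed, a variant could look like the sketch below. It assumes every value is exactly two characters long. The _2 suffix for the second column is a hypothetical name, used because pandas columns are easier to work with when their names are unique.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    'Sample': ['F295', 'F296', 'F297', 'F298'],
    'Pop': ['Pesche', 'Pesche', 'Pesche', 'Pesche'],
    'a1': ['AC', 'GT', 'AA', 'AC'],
    'a10': ['AT', 'CG', 'GG', 'AG'],
    'a100': ['AA', 'AC', 'TT', 'CG'],
})

# Start from the two identifier columns and add two columns per marker.
out = df[['Sample', 'Pop']].copy()
for col_name in df.columns[2:]:
    out[col_name] = df[col_name].str[0]          # first character keeps the original header
    out[f'{col_name}_2'] = df[col_name].str[1]   # second character, hypothetical "_2" name

# Write the result as tab-separated text.
print(out.to_csv(sep='\t', index=False))
```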