I have a data frame, and I want to split the strings in every column from the third through the last into two columns each, keeping the original header on the first column of each split pair. The output should be tab-separated. This is the data frame:
Sample Pop a1 a10 a100
F295 Pesche AC AT AA
F296 Pesche GT CG AC
F297 Pesche AA GG TT
F298 Pesche AC AG CG
This question is not the same as the existing "split one column" questions. Please help.
Answer 0 (score: 0)
You can create a MultiIndex in the columns: convert each string to a list of its characters, build one DataFrame per original column from those lists, and join the pieces with concat:
df1 = df.set_index(['Sample','Pop'])
comp = [pd.DataFrame(df1[x].apply(list).values.tolist(), index=df1.index) for x in df1.columns]
df2 = pd.concat(comp, axis=1, keys=df1.columns)
print (df2)
              a1    a10    a100
               0  1   0  1    0  1
Sample Pop
F295   Pesche  A  C   A  T    A  A
F296   Pesche  G  T   C  G    A  C
F297   Pesche  A  A   G  G    T  T
F298   Pesche  A  C   A  G    C  G
If you need to avoid the MultiIndex, first join the two column levels into flat names with f-strings, so that the column names are not duplicated, and then call DataFrame.reset_index.
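The code for that flattening step did not survive in this answer, so the following is only a sketch of what it could look like, not the answerer's original code. It assumes the sample data from the question; the underscore separator in the flattened names (`a1_0`, `a1_1`, ...) is an assumption chosen for illustration.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    'Sample': ['F295', 'F296', 'F297', 'F298'],
    'Pop': ['Pesche', 'Pesche', 'Pesche', 'Pesche'],
    'a1': ['AC', 'GT', 'AA', 'AC'],
    'a10': ['AT', 'CG', 'GG', 'AG'],
    'a100': ['AA', 'AC', 'TT', 'CG'],
})

# Build the MultiIndex columns exactly as in the answer.
df1 = df.set_index(['Sample', 'Pop'])
comp = [pd.DataFrame(df1[x].apply(list).values.tolist(), index=df1.index)
        for x in df1.columns]
df2 = pd.concat(comp, axis=1, keys=df1.columns)

# Flatten the two column levels into single names (separator is an assumption),
# then move Sample and Pop back from the index into ordinary columns.
df2.columns = [f'{a}_{b}' for a, b in df2.columns]
df2 = df2.reset_index()
print(df2)
```

Joining both levels into one name keeps every column label unique, which is what makes reset_index safe to call afterwards.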
Answer 1 (score: 0)
You can use a for loop:
import pandas as pd
data = {
    'Sample': ['F295','F296','F297','F298'],
    'Pop': ['Pesche', 'Pesche', 'Pesche', 'Pesche'],
    'a1': ['AC', 'GT', 'AA', 'AC'],
    'a10': ['AT', 'CG', 'GG', 'AG'],
    'a100': ['AA', 'AC', 'TT', 'CG']
}
df = pd.DataFrame(data)  # For reproducibility, you should include this kind of code in your next questions :)
for col_name in list(df.columns[2:]):  # iterate over every column from the third one onward
    df[col_name] = df[col_name].apply(lambda x: f"{x[0]}\t{x[1]}")  # put a tab between the two characters
print(df)
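Note that the loop above keeps one column per marker and only embeds a tab inside each string. If genuinely separate columns are wanted, a variation along the following lines might work. This is a sketch under assumptions, not part of the original answer: the `_2` suffix for the second column of each pair is a hypothetical naming choice, used only to keep the column names unique.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    'Sample': ['F295', 'F296', 'F297', 'F298'],
    'Pop': ['Pesche', 'Pesche', 'Pesche', 'Pesche'],
    'a1': ['AC', 'GT', 'AA', 'AC'],
    'a10': ['AT', 'CG', 'GG', 'AG'],
    'a100': ['AA', 'AC', 'TT', 'CG'],
})

# Start with the two identifier columns, then add two columns per marker.
parts = [df[['Sample', 'Pop']]]
for col_name in df.columns[2:]:
    # First character keeps the original header.
    parts.append(df[col_name].str[0].rename(col_name))
    # Second character gets a suffixed header (hypothetical naming).
    parts.append(df[col_name].str[1].rename(f"{col_name}_2"))

out = pd.concat(parts, axis=1)
print(out)
```

Writing `out` with `out.to_csv(path, sep='\t', index=False)` would then produce one tab-separated field per character.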