Question

I have a sample dataset:

import pandas as pd

df = {'READID': [1, 1, 1, 1, 1, 5, 5, 5, 5],
      'VG': ['LV5-F*01', 'LV5-F*01', 'LV5-F*01', 'LV5-F*01', 'LV5-F*01',
             'LV5-A*01', 'LV5-A*01', 'LV5-A*01', 'LV5-A*01'],
      'Pro': [1, 1, 1, 1, 1, 2, 2, 2, 2]}

df = pd.DataFrame(df)

It looks like this:

df
Out[23]: 
     Pro  READID     VG
0    1       1   LV5-F*01
1    1       1   LV5-F*01
2    1       1   LV5-F*01
3    1       1   LV5-F*01
4    1       1   LV5-F*01
5    2       5   LV5-A*01
6    2       5   LV5-A*01
7    2       5   LV5-A*01
8    2       5   LV5-A*01

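This is only a sample; the real dataset has many more columns and rows, with different groupby combinations. I want to group by the 3 columns and write each group to its own file, with the VG value as part of the file name: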

Desired output:

'LV5-F*01.txt':

     Pro  READID     VG
0    1       1   LV5-F*01
1    1       1   LV5-F*01
2    1       1   LV5-F*01
3    1       1   LV5-F*01
4    1       1   LV5-F*01

'LV5-A*01.txt':

    Pro  READID     VG
5    2       5   LV5-A*01
6    2       5   LV5-A*01
7    2       5   LV5-A*01
8    2       5   LV5-A*01

My attempt:

(df.groupby(['READID','VG','Pro'])
.apply(lambda gp: gp.to_csv('{}.txt'.format(gp.VG.name), sep='\t', index=False))
 )

However, the

  '{}.txt'.format(gp.VG.name)

part only produces a single file named 'VG.txt' that contains just one row, which is not what I want.

Answer 1

You don't need groupby; you can simply select the rows you want and write them to a text file.

import pandas as pd

df = {'READID': [1, 1, 1, 1, 1, 5, 5, 5, 5],
      'VG': ['LV5-F*01', 'LV5-F*01', 'LV5-F*01', 'LV5-F*01', 'LV5-F*01',
             'LV5-A*01', 'LV5-A*01', 'LV5-A*01', 'LV5-A*01'],
      'Pro': [1, 1, 1, 1, 1, 2, 2, 2, 2]}
df = pd.DataFrame(df)

with open('LV5-F*01.txt', 'w') as fil:
    fil.write(df[df['VG'] == 'LV5-F*01'].to_string())

with open('LV5-A*01.txt', 'w') as fil:
    fil.write(df[df['VG'] == 'LV5-A*01'].to_string())
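With many distinct VG values, writing one `with` block per value does not scale. A minimal sketch of the same row-selection idea in a loop follows. It makes two assumptions that are not in the answer above: files go into a scratch directory created with `tempfile.mkdtemp()`, and `*` is replaced by `_` in the file name because `*` is not a valid file name character on Windows.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'READID': [1, 1, 1, 1, 1, 5, 5, 5, 5],
    'VG': ['LV5-F*01'] * 5 + ['LV5-A*01'] * 4,
    'Pro': [1, 1, 1, 1, 1, 2, 2, 2, 2],
})

# Assumption: a scratch directory keeps the example from cluttering the working directory.
out_dir = tempfile.mkdtemp()

written = []
for vg in df['VG'].unique():
    # Assumption: swap '*' for '_' so the name is also valid on Windows.
    safe_name = vg.replace('*', '_')
    path = os.path.join(out_dir, '{}.txt'.format(safe_name))
    # Select only the rows for this VG value and write them tab-separated.
    df[df['VG'] == vg].to_csv(path, sep='\t', index=False)
    written.append(path)
```

Each entry in `written` is the path of one output file, in the order the VG values first appear in the data.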

Answer 2

g = df.groupby(['READID','VG','Pro'])
for group in g:
    group[1].to_csv('{}.txt'.format(group[0][1]), sep='\t', index=False)

You may need to remove the * character from the file name if it causes problems.

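Also note that you are grouping by three keys but using only one of them in the file name, so groups that share the same VG value may overwrite each other's files.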
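One way to avoid the overwrite risk is to put all three keys into the file name. The sketch below is an illustration under assumptions, not part of the original answer: the `VG_READID_Pro.txt` naming scheme, the `_` replacement for `*`, and the temporary output directory are all made up for the example. It also shows where the group key comes from: `group['VG'].name` is just the column label `'VG'`, which is why the attempt in the question produced `VG.txt`, whereas iterating over the groupby yields the key tuple directly.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'READID': [1, 1, 1, 1, 1, 5, 5, 5, 5],
    'VG': ['LV5-F*01'] * 5 + ['LV5-A*01'] * 4,
    'Pro': [1, 1, 1, 1, 1, 2, 2, 2, 2],
})

# Assumption: write into a scratch directory for the purpose of the example.
out_dir = tempfile.mkdtemp()

paths = []
# Iterating over a groupby yields (key, sub-frame) pairs; with three
# grouping columns the key is a 3-tuple that can be unpacked directly.
for (readid, vg, pro), group in df.groupby(['READID', 'VG', 'Pro']):
    # Hypothetical naming scheme: all three keys, with '*' replaced by '_'.
    name = '{}_{}_{}.txt'.format(vg.replace('*', '_'), readid, pro)
    path = os.path.join(out_dir, name)
    group.to_csv(path, sep='\t', index=False)
    paths.append(path)
```

Because every key combination maps to a distinct file name, two groups with the same VG but a different READID or Pro no longer write to the same file.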
