DataFrame: converting a Python list into DataFrames grouped by name?

Time: 2020-07-30 14:51:56

Tags: python pandas numpy dataframe

I have a list, data_list:

    [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]

Each entry of data_list is passed to a function here:

    for name_list in data_list:
        convertMerge(name_list)

And there is a function that takes the list, converts it to a DataFrame, and saves it:

    def convertMerge(name_list):
        df = pd.DataFrame([name_list],columns=['name','id'])
        df.to_csv('names.csv')

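I am trying to merge/append/concatenate the DataFrames that share the same name.

(This must happen inside the convertMerge function.)

The resulting output should be DataFrames like this: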

    df with mark

            mark.csv

                name    id
            0   mark    1
            1   mark    3
            2   mark    4

    df with tom 
          
            tom.csv

                name    id
            0   tom     2
            1   tom     5


    df with stuart

            stuart.csv

                name    id
            0   stuart  6

5 answers:

Answer 0 (score: 0)

Try df.groupby, as follows:

>>> master_df = pd.DataFrame(data_list, columns = ['name', 'ID'])
>>> for key, sub_df in master_df.groupby('name'):
...     sub_df.reset_index(drop=True).to_csv(key + '.csv')

For your function:

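import pandas as pd

def convertMerge(name_list):
    # name_list is the full list of [name, id] pairs, not a single pair
    df = pd.DataFrame(name_list,columns=['name','id'])
    # write one CSV per distinct name, with rows renumbered from 0
    for key, sub_df in df.groupby('name'):
        sub_df.reset_index(drop=True).to_csv(key + '.csv')

convertMerge(data_list)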

If you print it out, it looks like this:

>>> master_df = pd.DataFrame(data_list, columns = ['name', 'ID'])
>>> for key, sub_df in master_df.groupby('name', sort=False):
...     print(key + '.csv')
...     sub_df.reset_index(drop=True)

# output:
mark.csv
   name  ID
0  mark   1
1  mark   3
2  mark   4
tom.csv
  name  ID
0  tom   2
1  tom   5
stuart.csv
     name  ID
0  stuart   6
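
The groupby approach above can be checked end to end with a small sketch. This is only an illustration under assumptions: the helper name `split_by_name` and the `out_dir` parameter are hypothetical, and files are written to a temporary directory rather than the working directory.

```python
import tempfile
from pathlib import Path

import pandas as pd

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]


def split_by_name(rows, out_dir):
    """Write one CSV per distinct name and return the paths (hypothetical helper)."""
    df = pd.DataFrame(rows, columns=['name', 'id'])
    paths = []
    # sort=False keeps the groups in order of first appearance
    for key, sub_df in df.groupby('name', sort=False):
        path = Path(out_dir) / f'{key}.csv'
        # reset_index(drop=True) renumbers each group's rows from 0
        sub_df.reset_index(drop=True).to_csv(path)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    written = split_by_name(data_list, tmp)
    names_written = [p.name for p in written]
    # read one file back to confirm its contents
    mark_df = pd.read_csv(Path(tmp) / 'mark.csv', index_col=0)
```

Reading `mark.csv` back with `index_col=0` should give the three mark rows with a fresh 0-based index, as in the question's expected output.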

Answer 1 (score: 0)

This can also be solved with unique:

import pandas as pd

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]
df = pd.DataFrame(data_list, columns=['name', 'id'])
# select the rows for each distinct name and write them to <name>.csv
for name in df['name'].unique():
    df.loc[df['name'] == name].to_csv(name + '.csv')
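
One detail worth noting: selecting rows with a boolean mask keeps the original row labels, so the index written for mark would be 0, 2, 3 rather than 0, 1, 2. A minimal in-memory sketch of the difference follows; the variable names are illustrative only.

```python
import pandas as pd

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]
df = pd.DataFrame(data_list, columns=['name', 'id'])

# Boolean-mask selection keeps the row labels of the original frame
mark_rows = df.loc[df['name'] == 'mark']
original_labels = mark_rows.index.tolist()

# reset_index(drop=True) renumbers the rows from 0
renumbered = mark_rows.reset_index(drop=True)
new_labels = renumbered.index.tolist()

# to_csv returns the CSV text as a string when no path is given
csv_text = renumbered.to_csv()
```

Adding `.reset_index(drop=True)` before `to_csv` in the loop should make the files match the layout shown in the question.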

Answer 2 (score: 0)

You can do this with apply:

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]
pd.DataFrame(data_list, columns = ['name', 'ID']).groupby('name').apply(lambda d: d.to_csv(f'{d.name}.csv', index=False))
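
Here apply is used only for its side effect of writing files. As a rough alternative sketch, iterating over the GroupBy object yields (key, sub-frame) pairs directly and does not rely on `d.name`. The dictionary `csv_by_name` is a hypothetical name; it holds the CSV text in memory so the result can be inspected before anything is written to disk.

```python
import pandas as pd

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]
df = pd.DataFrame(data_list, columns=['name', 'id'])

# Map each target filename to the CSV text for that name's rows.
# to_csv returns a string when no path is given; index=False omits the row index.
csv_by_name = {
    f'{key}.csv': sub_df.to_csv(index=False)
    for key, sub_df in df.groupby('name')
}
```

Each entry could then be written out with, for example, `pathlib.Path(filename).write_text(text)`.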

Answer 3 (score: 0)

This should do what you want:

import pandas as pd

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]
def convertMerge():
  # DataFrame.append was removed in pandas 2.0, so build one frame per row and concatenate
  df = pd.concat([pd.DataFrame([name_list],columns=['name','id']) for name_list in data_list])
  # write one CSV per name, once, after all rows have been collected
  for x, y in df.groupby('name'):
    y.reset_index(drop = True).to_csv(x + '.csv', index = False)
convertMerge()

Answer 4 (score: 0)

I think this is the solution you are after, with the logic contained inside convertMerge:

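import os

import pandas as pd

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]

def convertMerge(name_list):
    name = name_list[0]
    df = pd.DataFrame([name_list],columns=['name','id'])

    if not os.path.isfile(f'{name}.csv'):
        # first row for this name: create the file with a header
        df.to_csv(f'{name}.csv')
    else:
        # file already exists: append the row without repeating the header
        # (every appended row carries index 0, since each df has a single row)
        df.to_csv(f'{name}.csv', mode='a', header=False)

for name_list in data_list:
    convertMerge(name_list)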
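
A variant of this per-row approach, sketched under assumptions: the helper name `append_row` and the `out_dir` parameter are hypothetical, and `index=False` is used so the repeated index value 0 is not written for every row. Note that, like the code above, running it twice against the same directory would append the rows again.

```python
import os
import tempfile

import pandas as pd

data_list = [['mark', 1], ['tom', 2], ['mark', 3], ['mark', 4], ['tom', 5], ['stuart', 6]]


def append_row(name_list, out_dir):
    """Append one [name, id] row to <out_dir>/<name>.csv (hypothetical helper)."""
    path = os.path.join(out_dir, f'{name_list[0]}.csv')
    df = pd.DataFrame([name_list], columns=['name', 'id'])
    # Write the header only when the file does not exist yet
    is_new = not os.path.isfile(path)
    df.to_csv(path, mode='a', header=is_new, index=False)


with tempfile.TemporaryDirectory() as tmp:
    for name_list in data_list:
        append_row(name_list, tmp)
    files = sorted(os.listdir(tmp))
    # read_csv assigns a fresh 0-based index when the file has no index column
    mark_df = pd.read_csv(os.path.join(tmp, 'mark.csv'))
```

Because no index column is stored, reading a file back with `pd.read_csv` yields rows numbered 0, 1, 2, which matches the layout the question asks for.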