我有以下数据框:
0 1 2 3 4
0 1.JPG NaN NaN NaN NaN
1 2883 2957.0 3412.0 3340.0 miscellaneous
2 3517 3007.0 4062.0 3371.0 miscellaneous
3 5678 3158.0 6299.0 3423.0 miscellaneous
4 1627 3287.0 2149.0 3694.0 miscellaneous
5 2894 3272.0 3421.0 3664.0 miscellaneous
6 3525 3271.0 4064.0 3672.0 miscellaneous
7 4759 3337.0 5321.0 3640.0 miscellaneous
8 6141 3289.0 6664.0 3654.0 miscellaneous
9 1017 3598.0 1539.0 3979.0 miscellaneous
10 1624 3586.0 2155.0 3993.0 miscellaneous
11 2252 3612.0 2777.0 3967.0 miscellaneous
12 3211 3548.0 3735.0 3944.0 miscellaneous
13 6052 3616.0 6572.0 3983.0 miscellaneous
14 691 3911.0 1204.0 4223.0 miscellaneous
15 2.JPG NaN NaN NaN NaN
16 3.JPG NaN NaN NaN NaN
17 5384 2841.0 5963.0 3095.0 miscellaneous
18 5985 2797.0 6611.0 3080.0 miscellaneous
19 3512 3012.0 4025.0 3366.0 miscellaneous
20 5085 2974.0 5587.0 3367.0 miscellaneous
21 2593 3224.0 3148.0 3469.0 miscellaneous
22 1044 3630.0 1511.0 3928.0 miscellaneous
23 4764 3619.0 5283.0 3971.0 miscellaneous
24 5103 3613.0 5635.0 3928.0 miscellaneous
我想将这个数据框拆分成多个csv,这样:首先应该将csv命名为1.csv并且所有数据都低于1.jpg,依此类推。 例如,导出的CSV应该是:
1.csv
2883 2957 3412 3340 miscellaneous
3517 3007 4062 3371 miscellaneous
5678 3158 6299 3423 miscellaneous
1627 3287 2149 3694 miscellaneous
2894 3272 3421 3664 miscellaneous
3525 3271 4064 3672 miscellaneous
4759 3337 5321 3640 miscellaneous
6141 3289 6664 3654 miscellaneous
1017 3598 1539 3979 miscellaneous
1624 3586 2155 3993 miscellaneous
2252 3612 2777 3967 miscellaneous
3211 3548 3735 3944 miscellaneous
6052 3616 6572 3983 miscellaneous
691 3911 1204 4223 miscellaneous
2.csv(此csv应为空白)
3.csv
5384 2841 5963 3095 miscellaneous
5985 2797 6611 3080 miscellaneous
3512 3012 4025 3366 miscellaneous
5085 2974 5587 3367 miscellaneous
2593 3224 3148 3469 miscellaneous
1044 3630 1511 3928 miscellaneous
4764 3619 5283 3971 miscellaneous
5103 3613 5635 3928 miscellaneous
如何使用python和pandas执行此操作?
答案 0 :(得分:2)
您可以使用:
for n,g in df.assign(grouper = df['0'].where(df['1'].isnull())
.ffill().astype('category'))\
.dropna().groupby('grouper'):
g.drop('grouper', axis=1).to_csv(n+'.csv', header=None, index=False)
注意: 使用astype('类别')来提取没有记录的群组
输出!dir *.JPG.csv
04/30/2018 03:43 PM 657 1.JPG.csv
04/30/2018 03:43 PM 0 2.JPG.csv
04/30/2018 03:43 PM 376 3.JPG.csv
列出1.jpg.csv的内容
2883,2957.0,3412.0,3340.0,miscellaneous
3517,3007.0,4062.0,3371.0,miscellaneous
5678,3158.0,6299.0,3423.0,miscellaneous
1627,3287.0,2149.0,3694.0,miscellaneous
2894,3272.0,3421.0,3664.0,miscellaneous
3525,3271.0,4064.0,3672.0,miscellaneous
4759,3337.0,5321.0,3640.0,miscellaneous
6141,3289.0,6664.0,3654.0,miscellaneous
1017,3598.0,1539.0,3979.0,miscellaneous
1624,3586.0,2155.0,3993.0,miscellaneous
2252,3612.0,2777.0,3967.0,miscellaneous
3211,3548.0,3735.0,3944.0,miscellaneous
6052,3616.0,6572.0,3983.0,miscellaneous
691,3911.0,1204.0,4223.0,miscellaneous
答案 1 :(得分:0)
# program splits one big csv file into individiual image csv 's
import pandas as pd
import numpy as np
df = pd.read_csv('results.csv', header=None)
#df1 = df.replace(np.nan, '1', regex=True)
print(df)
for n,g in df.assign(grouper = df[0].where(df[1].isnull())
.ffill().astype('category'))\
.dropna().groupby('grouper'):
g.drop('grouper', axis=1).to_csv(n+'.csv',float_format="%.0f", header=None, index=False )
这会产生所需的结果