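Frequency counts in pandas

Time: 2019-11-29 03:11:33

Tags: python pandas

I want to split this data into two CSV files. File names with a frequency of 1 should be saved to single_face.csv, and all others to multi_faces.csv.

So far I have computed the frequency of each file name, which looks like this: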

21_Festival_Festival_21_68.jpg                                                     346
8_Election_Campain_Election_Campaign_8_531.jpg                                     278
2_Demonstration_Demonstration_Or_Protest_2_17.jpg                                  266
18_Concerts_Concerts_18_542.jpg                                                    218
10_People_Marching_People_Marching_10_People_Marching_People_Marching_10_88.jpg    209
                                                                                  ... 
36_Football_americanfootball_ball_36_53.jpg                                          1
48_Parachutist_Paratrooper_Parachutist_Paratrooper_48_48.jpg                         1
55_Sports_Coach_Trainer_sportcoaching_55_837.jpg                                     1
22_Picnic_Picnic_22_586.jpg                                                          1
9_Press_Conference_Press_Conference_9_873.jpg                                        1

Here is my code:

import pandas as pd

def classification(file):

    df = pd.read_csv(file)

    # Count how many times each file name occurs (most frequent first)
    frequency = df['file_name'].value_counts()
    print(frequency)

def main():

    classification('ground_truth.csv')

if __name__ == '__main__':
    main()

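How can I split this data into two CSV files? I tried using if frequency == 1:, but that raises an error.

1 Answer:

Answer 0 (score: 0)

You can use the condition directly as a boolean index on the Series and call .to_csv() on each result.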

In [4]: s = pd.Series(np.random.randint(1, 4, 20), index=list("abcdefghijkilmnopqrs"))

In [5]: s[s == 1]
Out[5]:
a    1
g    1
k    1
s    1
dtype: int32

In [6]: s[s > 1]
Out[6]:
b    2
c    3
d    3
e    2
f    2
h    2
i    2
j    2
i    2
l    3
m    3
n    2
o    3
p    2
q    2
r    3
dtype: int32

In [7]: s[s > 1].to_csv("multi.csv")
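
Applied to the question's data, the same idea might look like the sketch below. The reason if frequency == 1: fails is that frequency == 1 produces a whole boolean Series rather than a single True or False, and pandas refuses to use a Series as an if condition (it raises a ValueError about an ambiguous truth value). The sketch assumes you want to keep the original rows of ground_truth.csv rather than only the counts. The helper name split_by_frequency and the small sample DataFrame are made up for illustration, since the real file is not shown.

```python
import pandas as pd


def split_by_frequency(df, column="file_name"):
    """Split the rows of df by how often each value in `column` occurs."""
    counts = df[column].value_counts()   # occurrences of each file name
    per_row = df[column].map(counts)     # that count, aligned with each original row
    single = df[per_row == 1]            # rows whose file name appears exactly once
    multi = df[per_row > 1]              # rows whose file name appears more than once
    return single, multi


if __name__ == "__main__":
    # Hypothetical stand-in for pd.read_csv('ground_truth.csv')
    df = pd.DataFrame({"file_name": ["a.jpg", "b.jpg", "a.jpg", "c.jpg"]})
    single, multi = split_by_frequency(df)
    single.to_csv("single_face.csv", index=False)
    multi.to_csv("multi_faces.csv", index=False)
```

If you only need the file names and their counts, frequency[frequency == 1].to_csv("single_face.csv") and frequency[frequency > 1].to_csv("multi_faces.csv") on the value_counts() result should be enough.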