I am trying to filter a dataframe created by extracting rows from several .csv files. When I run, for example, df_data["GPS_COD_SINRUT"], it prints:
1 T307 E0 00R
2 T307 C0 00I
3 T307 C0 00R
4 T307 C0 00R
...
2069747 T307 E0 00R
2069748 T307 C0 00I
2069749 T307 C0 00I
2069750 T307 C0 00I
2069751 T307 C0 00I
Name: GPS_COD_SINRUT, Length: 2069752, dtype: object
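But when I try the following: df_data[df_data["GPS_COD_SINRUT"]=="T307 C0 00R"], it prints an empty dataframe. I have another dataframe with the same columns but different values, and there the same filter works fine. The other columns of df_data are numeric, and filtering on their values works as expected. I tried putting str() before the value I am searching for, and I tried converting the column to strings with .astype(str). I thought the problem might be in the code that extracts the data, but I have not changed anything in it. The code is: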
import glob

outfile = open('output2.csv', 'w')
outfile.write("Patente; GPS_COD_SINRUT; idx_user; Date; LAT; LON, x_UTM; y_UTM; dist_rute; dist_to_rute; velocity; idx_empresa; idx_expedition")
for filename in glob.glob('*.gps'):
    if filename == 'output2.csv':  # Skip the file we're writing.
        continue
    with open(filename, 'r') as infile:
        count = 0
        lineno = 0
        for line in infile:
            lineno += 1
            if lineno == 1:  # Skip the header line.
                continue
            fields = line.split(';')
            Patente = str(fields[0])
            GPS_COD_SINRUT = str(fields[1])
            idx_user = str(fields[2])
            Date = str(fields[3])
            LAT = float(fields[4])
            LON = float(fields[5])
            x_UTM = float(fields[6])
            y_UTM = float(fields[7])
            dist_rute = float(fields[8])
            dist_to_rute = float(fields[9])
            velocity = float(fields[10])
            idx_empresa = str(fields[11])
            idx_expedition = str(fields[12])
            if "307" in idx_user:
                outfile.write('%s; %s; %s; %s; %f; %f; %f; %f; %f; %f; %f; %s; %s' % (Patente, GPS_COD_SINRUT, idx_user, Date, LAT, LON, x_UTM, y_UTM, dist_rute, dist_to_rute, velocity, idx_empresa, idx_expedition))
                count += 1
        if count == 0:  # Handle the case when no lines were found.
            outfile.write('NA ; NA ; NA ; NA ; NA ; NA ; NA ; NA ; NA ; NA ; NA ; NA; NA ; %s\n' % filename)
outfile.close()
I found this code on SO as an answer to a question. I hope you can help me!
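
One thing that may be worth checking is whether the string values carry hidden whitespace, since the script writes its fields separated by '; ' (a semicolon followed by a space). This is only an assumption; nothing above confirms it. A minimal sketch with pandas follows, using a made-up two-row sample (the `sample` string and its Patente values are hypothetical, and reading with sep=";" is also assumed). It prints the raw values with repr() and compares an exact filter with one applied after str.strip():

```python
import io

import pandas as pd

# Hypothetical sample, written the way the script writes rows: "; " between fields.
sample = (
    "Patente; GPS_COD_SINRUT; idx_user\n"
    "AB1234; T307 C0 00R; 307\n"
    "CD5678; T307 E0 00R; 307\n"
)

# Assumption: the file is read with sep=";" and without skipinitialspace,
# so the space after each ';' stays attached to the following field.
df_data = pd.read_csv(io.StringIO(sample), sep=";")

# Strip the header names so the column can be addressed as in the question.
df_data.columns = df_data.columns.str.strip()

# repr() makes otherwise invisible leading or trailing whitespace visible.
raw_values = [repr(v) for v in df_data["GPS_COD_SINRUT"].unique()]
print(raw_values)

# Exact comparison on the untouched values versus on the stripped values.
exact_matches = df_data[df_data["GPS_COD_SINRUT"] == "T307 C0 00R"]
stripped_matches = df_data[df_data["GPS_COD_SINRUT"].str.strip() == "T307 C0 00R"]
print(len(exact_matches), len(stripped_matches))
```

In this sample the exact comparison matches no rows while the stripped comparison matches one. If the real data behaves the same way, passing skipinitialspace=True to pd.read_csv, or stripping the column once after loading, might be enough.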