Pandas: create a new column indicating whether a file exists

Date: 2018-03-14 12:36:54

Tags: python pandas

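I am parsing data from a dataset in which some images are unavailable, so I want to create a new column, exists, that is True or False depending on whether the image file <id>.jpg is present for each row.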

My attempt fails with a Unicode coercion error:

import os
import pandas as pd
from pandas import Series
train = pd.read_csv('train.csv')

In [16]: train['exists'] = Series(str(os.path.isfile('training_images/' + train['id'] + '.jpg')))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-4ada5144d198> in <module>()
----> 1 train['exists'] = Series(str(os.path.isfile('training_images/' + train['id'] + '.jpg')))
/usr/lib/python2.7/genericpath.pyc in isfile(path)
     35     """Test whether a path is a regular file"""
     36     try:
---> 37         st = os.stat(path)
     38     except os.error:
     39         return False
TypeError: coercing to Unicode: need string or buffer, Series found

2 Answers:

Answer 0 (score: 1)

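os.path.isfile expects a single path string, not a whole Series, which is why the call fails. I suggest building the file paths with vectorized string concatenation and then mapping os.path.isfile over them: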

train['filename'] = 'training_images/' + train['id'] + '.jpg'
train['exists'] = train['filename'].map(os.path.isfile)

The result is a boolean pd.Series.
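To try this approach without the original dataset, here is a minimal sketch. It assumes the ids are strings, and it uses a hypothetical three-row DataFrame and a temporary directory in place of train.csv and training_images/. Note that only the path construction is vectorized; map still calls os.path.isfile once per row.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for train.csv: three ids, only two of which have images.
train = pd.DataFrame({"id": ["a1", "b2", "c3"]})

with tempfile.TemporaryDirectory() as image_dir:
    # Create empty placeholder files for two of the three ids.
    for image_id in ["a1", "c3"]:
        open(os.path.join(image_dir, image_id + ".jpg"), "w").close()

    # Vectorized string concatenation builds one path per row.
    train["filename"] = image_dir + os.sep + train["id"] + ".jpg"
    # map calls os.path.isfile once per element and collects the results.
    train["exists"] = train["filename"].map(os.path.isfile)

exists_values = train["exists"].tolist()
print(exists_values)
```

The check has to run while the files are still on disk, so the map call sits inside the with block; the resulting column remains usable afterwards.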

Answer 1 (score: 0)

You can use apply to do this:

train['exists'] = train['id'].apply(lambda x: os.path.isfile('training_images/' + x + '.jpg'))
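One thing to watch for: if read_csv parsed the id column as integers, concatenating it with a string raises a TypeError. The sketch below is one possible way around that, assuming Python 3; the helper image_exists, the sample ids, and the temporary directory are hypothetical and not part of the original question. Formatting each id into the file name works for both string and numeric ids.

```python
import os
import tempfile

import pandas as pd


def image_exists(image_id, image_dir):
    """Return True if <image_dir>/<image_id>.jpg is a regular file."""
    # The f-string converts numeric ids to text before building the path.
    return os.path.isfile(os.path.join(image_dir, f"{image_id}.jpg"))


# Hypothetical data with integer ids; only 102.jpg is created on disk.
frame = pd.DataFrame({"id": [101, 102, 103]})

with tempfile.TemporaryDirectory() as image_dir:
    open(os.path.join(image_dir, "102.jpg"), "w").close()
    # apply calls the helper once per id, just like the lambda in the answer.
    frame["exists"] = frame["id"].apply(lambda x: image_exists(x, image_dir))

flags = frame["exists"].tolist()
print(flags)
```

Using os.path.join instead of hard-coding '/' also keeps the path construction portable across operating systems.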