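I'm parsing data from a dataset in which some images are unavailable, so I want to create a new column, exists, that is set to True or False depending on whether the image file <id>.jpg is present for each row.
Instead I get a Unicode error: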
import os
import pandas as pd
from pandas import Series
train = pd.read_csv('train.csv')
In [16]: train['exists'] = Series(str(os.path.isfile('training_images/' + train['id'] + '.jpg')))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-16-4ada5144d198> in <module>()
----> 1 train['exists'] = Series(str(os.path.isfile('training_images/' + train['id'] + '.jpg')))
/usr/lib/python2.7/genericpath.pyc in isfile(path)
35 """Test whether a path is a regular file"""
36 try:
---> 37 st = os.stat(path)
38 except os.error:
39 return False
TypeError: coercing to Unicode: need string or buffer, Series found
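Answer 0 (score: 1)
os.path.isfile expects a single path string, not a whole Series, which is why the call fails. I'd suggest building the paths with vectorized string concatenation and then mapping os.path.isfile over them, like this: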
train['filename'] = 'training_images/' + train['id'] + '.jpg'
train['exists'] = train['filename'].map(os.path.isfile)
The result will be a boolean pd.Series.
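To see the approach above end to end, here is a minimal sketch. It assumes the id column holds strings, and it uses a temporary folder with two made-up image files in place of training_images/; the folder, the ids, and the file names are all hypothetical and only there so the snippet runs on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical setup: a temporary folder stands in for 'training_images/'.
image_dir = tempfile.mkdtemp()
for image_id in ["a1", "c3"]:
    # Create empty placeholder files; "b2" is deliberately left missing.
    open(os.path.join(image_dir, image_id + ".jpg"), "w").close()

# Hypothetical stand-in for pd.read_csv('train.csv').
train = pd.DataFrame({"id": ["a1", "b2", "c3"]})

# String concatenation is applied element-wise across the Series ...
train["filename"] = image_dir + os.sep + train["id"] + ".jpg"
# ... while os.path.isfile takes one path at a time, so it is mapped per row.
train["exists"] = train["filename"].map(os.path.isfile)

print(train[["id", "exists"]])
```

Note that map still calls os.path.isfile once per row, so only the path construction is vectorized; the file system check itself remains a per-row operation.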
Answer 1 (score: 0)
You can use apply to do this, so that os.path.isfile receives one id at a time:
train['exists'] = train['id'].apply(lambda x: os.path.isfile('training_images/' + x + '.jpg'))
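One caveat, offered as an assumption because the question does not show the contents of train.csv: if pandas reads the id column as integers, 'training_images/' + x + '.jpg' raises a TypeError. A sketch that converts each id to a string first could look like this; image_exists is a hypothetical helper, and the temporary folder and ids are made up for the demonstration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical setup: a temporary folder stands in for 'training_images/'.
image_dir = tempfile.mkdtemp()
open(os.path.join(image_dir, "101.jpg"), "w").close()

# Assumed case: the ids were parsed as integers rather than strings.
train = pd.DataFrame({"id": [101, 102]})


def image_exists(image_id, folder=image_dir):
    """Return True if <folder>/<image_id>.jpg is a regular file."""
    # str() makes the concatenation work for both integer and string ids.
    return os.path.isfile(os.path.join(folder, str(image_id) + ".jpg"))


train["exists"] = train["id"].apply(image_exists)
print(train)
```

os.path.join is used here instead of a hard-coded slash, which keeps the path construction portable across operating systems.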