I have a column made up of strings, as shown below:
npa = pd.read_csv("file_names.csv", usecols=[3,5,6, 7, 8, 9], header=None)
npa.iloc[:,0]
XML_0_1841729699_001
XML_0_1841729699_00nn
XML_0_1841729699_00145
XML_0_1841729699_00145
XML_0_1841729699_00178
XML_0_1841729699_001jklm
XML_0_1841729699_001fjmfd
My PNG file names look like this:
path_img = "/images"
os.chdir(path_img)
images_name = glob.glob("*.png")
set_img = set([x.rsplit('.', 1)[0] for x in images_name])
set_img
set(['XML_0_1841729699_001fjmfd', 'XML_0_1841729699_00145', 'XML_0_1841729699_001', 'XML_0_1841729699_00178'])
Before doing any processing, I want to check whether the names in set_img match the names in the DataFrame:
for i in range(1, 30):
    for img_name in set_img:
        if (img_name==npa.iloc[i,0]): # 0 corresponds to the column of strings
            print("it works")
However, the condition never evaluates to true. What is wrong?
EDIT1:
f = open("file_names.csv", 'rt')
reader = csv.reader(f)
for row in reader:
    if cpt >= 1: # skip header
        characs.append(str(row[5]))
    cpt += 1
path_img = "/images"
os.chdir(path_img)
images_name = glob.glob("*.png")
set_img = set([x.rsplit('.', 1)[0] for x in images_name])
mask = npa.iloc[:,0].isin(set_img)
for img in set_img:
    img = cv2.imread(path_img+'/'+ img +'.png')
    print(img.shape)
    images = []
    images_names = []
    WIDTH=[]
    HEIGHT=[]
    for i in range(1, nb_charac):
        if (img==npa[mask].iloc[i,0]):
            print("hello")
            coords = npa.iloc[[i]]
            charac = characs[i - 1]
I get the following error:
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if (img==npa[mask].iloc[i,0]):
Traceback (most recent call last):
  File "/to_test.py", line 186, in <module>
    if (img==npa[mask].iloc[i,0]):
  File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1225, in __getitem__
    return self._getitem_tuple(key)
  File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1449, in _getitem_tuple
    self._has_valid_tuple(tup)
  File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 127, in _has_valid_tuple
    if not self._has_valid_type(k, i):
  File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1417, in _has_valid_type
    return self._is_valid_integer(key, axis)
  File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1431, in _is_valid_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
EDIT2:
I then replaced:
if (img==npa[mask].iloc[i,0]):
with
if (img==npa[mask][3][i]):
It works up to a certain row, and then I get the following error:
    if (img==npa[mask][3][i]):
  File "/usr/lib/python2.7/dist-packages/pandas/core/series.py", line 557, in __getitem__
    result = self.index.get_value(self, key)
  File "/usr/lib/python2.7/dist-packages/pandas/core/index.py", line 1790, in get_value
    return self._engine.get_value(s, k)
  File "pandas/index.pyx", line 103, in pandas.index.IndexEngine.get_value (pandas/index.c:3204)
  File "pandas/index.pyx", line 111, in pandas.index.IndexEngine.get_value (pandas/index.c:2903)
  File "pandas/index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)
  File "pandas/hashtable.pyx", line 303, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6525)
  File "pandas/hashtable.pyx", line 309, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6463)
KeyError: 2035
Answer 0 (score: 2)
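Use isin to create a boolean mask, then use that mask to filter the DataFrame. This is equivalent to looping over every row and checking whether the value in the first column is in the set.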
mask = npa.iloc[:,0].isin(set_img)
npa[mask]
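
As a minimal sketch of the above, here is a self-contained version that uses a small hand-built DataFrame in place of file_names.csv (the data, the column label 3, and the names matched, loop_mask and kept_labels are assumptions made for illustration). It also shows that the filtered frame has fewer rows and keeps the original row labels with gaps. That would be consistent with the errors in the question: iloc[i] goes out of bounds once i reaches the filtered length, and a lookup by label such as npa[mask][3][i] raises KeyError for any row that was filtered out. Iterating over the filtered rows directly avoids both.

```python
import pandas as pd

# Hypothetical stand-in for the column read from file_names.csv.
npa = pd.DataFrame({3: [
    "XML_0_1841729699_001",
    "XML_0_1841729699_00nn",
    "XML_0_1841729699_00145",
    "XML_0_1841729699_00145",
    "XML_0_1841729699_00178",
    "XML_0_1841729699_001jklm",
    "XML_0_1841729699_001fjmfd",
]})

# Hypothetical stand-in for the PNG base names found on disk.
set_img = {
    "XML_0_1841729699_001fjmfd",
    "XML_0_1841729699_00145",
    "XML_0_1841729699_001",
    "XML_0_1841729699_00178",
}

# Boolean mask: True where the first column's value is in set_img.
mask = npa.iloc[:, 0].isin(set_img)
matched = npa[mask]

# The explicit per-row membership check that the mask replaces.
loop_mask = [name in set_img for name in npa.iloc[:, 0]]

# The filtered frame keeps the original row labels, so they have gaps.
kept_labels = list(matched.index)

# Iterate over the surviving rows instead of indexing with range(...).
for label, name in matched.iloc[:, 0].items():
    print(label, name)
```

Note also that in the question's EDIT1 the loop variable img is overwritten with the array returned by cv2.imread, so the later comparison is between an image array and a string, which is most likely what triggers the FutureWarning. Keeping the file name and the loaded image in separate variables avoids that.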