Question

I have created the following data frame:

|id    |  Image_name   | result   | classified  |
-------------------------------------------------
|01    |  1.bmp        |  0       |  10         |
|02    |  2.bmp        |  1       |  11         |
|03    |  3.bmp        |  0       |  10         |
|04    |  4.bmp        |  2       |  12         |

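Now, my directory contains a folder named images that stores all the .bmp files (1.bmp, 2.bmp, 3.bmp, 4.bmp, and so on).

I am trying to write a script that automatically finds these files in the "Image_name" column of the data frame and returns the result and classified values for each.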

 import pandas as pd
 import glob
 import os
 data = pd.read_csv("filename.csv")
 for file in glob.glob("*.bmp"):
     fname = os.path.basename(file)

So this is my initial code. I want to take every extracted fname, check whether it exists in the data frame, and display its result and classified columns.

Answer 1

First, get all the image names from the folder and store them in a list:

import os

all_files_names=os.listdir("#path to the dir")

Then keep only the rows whose Image_name appears in that list:

df.loc[df['Image_name'].isin(all_files_names)]

If all four files are there, this returns every row of the data frame.
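As a minimal, self-contained sketch of the isin filter, the example below builds the question's table in memory and lists a temporary folder of empty stand-in files. The temporary directory, the missing 3.bmp, and the names image_dir and matched are assumptions made purely for illustration; in the real script the folder would be the images directory.

```python
import os
import tempfile

import pandas as pd

# The question's table, built in memory instead of being read from filename.csv.
df = pd.DataFrame({
    "id": ["01", "02", "03", "04"],
    "Image_name": ["1.bmp", "2.bmp", "3.bmp", "4.bmp"],
    "result": [0, 1, 0, 2],
    "classified": [10, 11, 10, 12],
})

with tempfile.TemporaryDirectory() as image_dir:
    # Create empty stand-in files; 3.bmp is deliberately left out.
    for name in ["1.bmp", "2.bmp", "4.bmp", "notes.txt"]:
        open(os.path.join(image_dir, name), "w").close()

    all_files_names = os.listdir(image_dir)

# Keep only the rows whose Image_name exists in the folder.
matched = df.loc[
    df["Image_name"].isin(all_files_names),
    ["Image_name", "result", "classified"],
]
print(matched)
```

Files that are in the folder but not in the table (notes.txt here) are simply ignored, and rows without a file on disk (3.bmp here) drop out of the result.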

Answer 2

It seems you only want to access the rows where Image_name matches the file, and get the result and classified columns.

Try this:

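import pandas as pd
from io import StringIO

# Rebuild the question's table from pipe-delimited text.
# A regex separator needs the Python parsing engine.
df = pd.read_csv(StringIO("""
id    |  Image_name   | result   | classified
01    |  1.bmp        |  0       |  10
02    |  2.bmp        |  1       |  11
03    |  3.bmp        |  0       |  10
04    |  4.bmp        |  2       |  12
"""), sep=r"\s*\|\s*", engine="python")

file_example = "2.bmp"

# Select the matching row, then keep only the two columns of interest.
print(df[df['Image_name'] == file_example][["result", "classified"]])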

Answer 3

You can use a boolean mask for this. You can read more about it at the link below. https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html

for file_name in df['Image_name']:
    print(df[df['Image_name']== file_name][['result', 'classified']])
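To connect the mask to the loop in the question, here is a sketch that iterates over file paths instead of over the column itself. The paths list is a hypothetical stand-in for what glob.glob would return for the images folder, and found is an illustrative name, not part of the original answer.

```python
import os

import pandas as pd

df = pd.DataFrame({
    "Image_name": ["1.bmp", "2.bmp", "3.bmp", "4.bmp"],
    "result": [0, 1, 0, 2],
    "classified": [10, 11, 10, 12],
})

# Hypothetical paths, standing in for the output of glob.glob on the images folder.
paths = ["images/2.bmp", "images/4.bmp", "images/9.bmp"]

found = {}
for path in paths:
    fname = os.path.basename(path)
    mask = df["Image_name"] == fname  # boolean Series, one entry per row
    rows = df.loc[mask, ["result", "classified"]]
    if rows.empty:
        print(f"{fname}: not in the data frame")
        continue
    # Image names are unique in this table, so take the first match.
    found[fname] = rows.iloc[0].to_dict()
    print(fname, found[fname])
```

Checking rows.empty before reading the values avoids an IndexError for files that have no row in the table.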

Hope this helps!

Answer 4

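If you need the same approach for a large number of images (thousands or hundreds of thousands), it is better to make the column used for filtering the index of the DataFrame before calling .isin().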

image_file_names=os.listdir("#path to the dir")

df = df.set_index(df['Image_name'])

df = df.loc[df.index.isin(image_file_names)]
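A small sketch of the index-based variant follows, assuming the image names are unique. The list image_file_names is a hypothetical stand-in for the os.listdir result, and the names indexed, subset, and single are illustrative. No timing was measured here, so whether this is actually faster than filtering on the column depends on the data.

```python
import pandas as pd

df = pd.DataFrame({
    "Image_name": ["1.bmp", "2.bmp", "3.bmp", "4.bmp"],
    "result": [0, 1, 0, 2],
    "classified": [10, 11, 10, 12],
})

# Hypothetical directory listing; 7.bmp has no row in the table.
image_file_names = ["1.bmp", "3.bmp", "7.bmp"]

# Move Image_name into the index so rows can be addressed by file name.
indexed = df.set_index("Image_name")

# Same isin filter as above, applied to the index instead of a column.
subset = indexed.loc[indexed.index.isin(image_file_names), ["result", "classified"]]
print(subset)

# With the name as the index, one file becomes a direct label lookup.
single = indexed.loc["3.bmp", ["result", "classified"]]
print(single)
```

Once the name is the index, a label lookup with .loc raises a KeyError for names that are missing, which is why the isin filter is still used for the full directory listing.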

Hope this helps :)

How do I select the corresponding column field values in a data frame?
