How do I use a one-hot encoded CSV file as ground truth to move images into folders?

Time: 2019-05-11 18:08:35

Tags: python pandas

I am trying to use a GroundTruth.csv file to transfer images from one folder into separate subfolders, one per class.

Image of the CSV file header:

https://user-images.githubusercontent.com/45392637/57573574-36e5f780-742a-11e9-8e16-bf14bf389f0e.JPG

Here is my code:

 import pandas as pd
 import os
 import shutil

Read in the data

 ground_truth = pd.read_csv('GroundTruth.csv')

Get a list of every image in the input folder

 folder = os.listdir('Training_Input')

Get the list of training images and create the base directory

 train_list = list(ground_truth['image'])
 base_dir = 'base_dir'
 os.mkdir(base_dir)

Create the directory structure

 train_dir = os.path.join(base_dir, 'train_dir')
 os.mkdir(train_dir)
 # one subfolder per class label
 nv = os.path.join(train_dir, 'NV')
 os.mkdir(nv)
 mel = os.path.join(train_dir, 'MEL')
 os.mkdir(mel)
 bkl = os.path.join(train_dir, 'BKL')
 os.mkdir(bkl)
 bcc = os.path.join(train_dir, 'BCC')
 os.mkdir(bcc)
 scc = os.path.join(train_dir, 'SCC')
 os.mkdir(scc)
 vasc = os.path.join(train_dir, 'VASC')
 os.mkdir(vasc)
 df = os.path.join(train_dir, 'DF')
 os.mkdir(df)
 unk = os.path.join(train_dir, 'UNK')
 os.mkdir(unk)
 ak = os.path.join(train_dir, 'AK')
 os.mkdir(ak)
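
The directory setup above can also be written as a loop. This is only a minimal sketch: the class names are taken from the folder names created above, while `make_class_dirs` and `CLASS_NAMES` are hypothetical names introduced here for illustration. It uses `os.makedirs(..., exist_ok=True)` so that rerunning the cell does not fail when the folders already exist.

```python
import os

# Class names copied from the folders created in the question
CLASS_NAMES = ["NV", "MEL", "BKL", "BCC", "SCC", "VASC", "DF", "UNK", "AK"]


def make_class_dirs(train_dir, class_names=CLASS_NAMES):
    """Create train_dir plus one subfolder per class; return {name: path}."""
    paths = {}
    for name in class_names:
        path = os.path.join(train_dir, name)
        # makedirs also creates missing parents and tolerates existing folders
        os.makedirs(path, exist_ok=True)
        paths[name] = path
    return paths
```

With the question's layout this would be called as `make_class_dirs(os.path.join('base_dir', 'train_dir'))`.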

The problem occurs when I try to transfer the images, here:

 # Transfer the training images
 for row in ground_truth.iterrows():

     image = row[1].image
     label = row[1].idxmax()
     fname = image + '.jpg'

     if fname in folder:
         # source path to image
         src = os.path.join('Training_Input', fname)
         # destination path to image
         dst = os.path.join(train_dir, label, fname)
         # copy the image from the source to the destination
         shutil.copyfile(src, dst)

I get this error:

  TypeError                                 Traceback (most recent call last)
  <ipython-input-35-9ee8f2c063cc> in <module>()
       2 
       3     image = row[1].image
 ----> 4     label = row[1].idxmax()
       5     fname = image + '.jpg'
       6 

 1 frames
 /usr/local/lib/python3.6/dist-packages/pandas/core/nanops.py in _f(*args, **kwargs)
      71             if any(self.check(obj) for obj in obj_iter):
      72                 msg = 'reduction operation {name!r} not allowed for this dtype'
 ---> 73                 raise TypeError(msg.format(name=f.__name__.replace('nan', '')))
      74             try:
      75                 with np.errstate(invalid='ignore'):

   TypeError: reduction operation 'argmax' not allowed for this dtype

I think the problem is the dtype of the columns!
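
One way to check that suspicion is to look at the dtype of a single row. The sketch below uses a hypothetical two-row stand-in for GroundTruth.csv (the image names and the choice of `MEL` and `NV` as columns are assumptions, since the real header is only shown as a screenshot). A row that mixes the string `image` value with float class scores comes back with `object` dtype, and dropping the string field before calling `idxmax` avoids reducing over mixed types.

```python
import pandas as pd

# Hypothetical stand-in for GroundTruth.csv: one string column, float one-hot columns
sample = pd.DataFrame({
    "image": ["ISIC_0000000", "ISIC_0000001"],
    "MEL": [0.0, 1.0],
    "NV": [1.0, 0.0],
})

first_row = sample.iloc[0]
# The row mixes a string with floats, so pandas falls back to object dtype
row_dtype = first_row.dtype

# Drop the string field and cast back to float before reducing
numeric_part = first_row.drop("image").astype(float)
label = numeric_part.idxmax()

print(row_dtype, label)
```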

1 Answer:

Answer 0 (score: 0)

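I tried this and it works. The argmax error occurs because the first column (image) holds strings. Using row[1].values[1:] skips the first column, and the + 1 maps the position back into the full row index.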

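    for row in ground_truth.iterrows():
        image = row[1].image
        # skip the string column, then shift by 1 to index into the full row
        label = row[1].index[row[1].values[1:].argmax() + 1]
        fname = image + '.jpg'
        if fname in folder:
            # the source folder must be the same one that was listed into `folder`
            src = os.path.join('ISIC_2019_Training_Input', fname)
            dst = os.path.join(train_dir, label, fname)
            shutil.copyfile(src, dst)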
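
As an alternative to the row-by-row argmax, the labels can be computed for all rows at once. This is a sketch rather than the answerer's method: it assumes that `image` is the only non-numeric column, and `copy_by_label` is a hypothetical helper name. Setting `image` as the index leaves only the one-hot columns, so `idxmax(axis=1)` returns the class name for each image.

```python
import os
import shutil

import pandas as pd


def copy_by_label(ground_truth, src_dir, dst_root, ext=".jpg"):
    """Copy each listed image into dst_root/<label>/; return how many were copied."""
    # Index by image name so that only the one-hot class columns remain
    one_hot = ground_truth.set_index("image")
    # For every row, the name of the column holding the largest value
    labels = one_hot.idxmax(axis=1)

    available = set(os.listdir(src_dir))  # set lookup instead of scanning a list
    copied = 0
    for image, label in labels.items():
        fname = image + ext
        if fname not in available:
            continue  # listed in the CSV but missing on disk
        label_dir = os.path.join(dst_root, label)
        os.makedirs(label_dir, exist_ok=True)
        shutil.copyfile(os.path.join(src_dir, fname),
                        os.path.join(label_dir, fname))
        copied += 1
    return copied
```

With the names from the question this would be called as `copy_by_label(ground_truth, 'Training_Input', train_dir)`. Images listed in the CSV but absent from the source folder are skipped silently, which matches the `if fname in folder` check above.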