OSError: cannot identify image file <_io.BufferedReader

Asked: 2019-06-14 21:31:37

Tags: python-3.x python-imaging-library nvidia-jetson

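I am porting code that trains a neural network. I wrote the code as part of a Udacity project, and it ran fine in the Udacity environment.

Now I am porting the code to an Nvidia Jetson Nano running Ubuntu 18.04 and Python 3.6.8.

While iterating over the training data, a "._" somehow sneaks into the file path in front of the file name, and an error is raised.

When I run the file, I get the following error message: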

Traceback (most recent call last):
  File "train_rev6.py", line 427, in <module>
    main()
  File "train_rev6.py", line 419, in main
    train_model(in_args)
  File "train_rev6.py", line 221, in train_model
    for inputs, labels in trainloader:
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 560, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 560, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/usr/local/lib/python3.6/dist-packages/torchvision/datasets/folder.py", line 132, in __getitem__
    sample = self.loader(path)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/datasets/folder.py", line 178, in default_loader
    return pil_loader(path)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/datasets/folder.py", line 160, in pil_loader
    img = Image.open(f)
  File "/usr/local/lib/python3.6/dist-packages/PIL/Image.py", line 2705, in open
    % (filename if filename else fp))
OSError: cannot identify image file <_io.BufferedReader name='/home/mme/Documents/001_UdacityFinalProjectFlowersRev2/flowers/train/40/._image_04589.jpg'>

I suspect the error is caused by the "._" in front of the file name "image...", because it is not part of the file name. When I run

sudo find / -name image_00824.jpg

I get the correct path:

/home/mme/Documents/001_UdacityFinalProjectFlowersRev2/flowers/train/81/image_00824.jpg

with no "._" in front of the file name.

My problem seems to be the same as the one in the question "OSError: cannot identify image file".

(Adapting and running the suggestion from the answer there, from PIL import Image; Image.open(open("path/to/file", 'rb')), does not raise an error.)

The file path is given on the command line:

python3 train_rev6.py --file_path "/home/mme/Documents/001_UdacityFinalProjectFlowersRev2/flowers" --arch "vgg16" --epochs 5 --gpu "gpu" --running_loss True --valid_loss True --valid_accuracy True --test True

The code below shows the two relevant functions.

Any idea how I can get rid of this "._"?

def load_data(in_args):
    """
    Function to:
        - Specify directories for training, validation and test set.
        - Define your transforms for the training, validation and testing sets.
        - Load the datasets with ImageFolder.
        - Using the image datasets and the transforms, define the dataloaders.
        - Label mapping.
    """
    # Specify directories for training, validation and test set.
    data_dir = in_args.file_path
    train_dir = data_dir + "/train"
    valid_dir = data_dir + "/valid"
    test_dir = data_dir + "/test"

    # Define your transforms for the training, validation, and testing sets
    # Means: [0.485, 0.456, 0.406]. Standard deviations: [0.229, 0.224, 0.225]. Calculated from ImageNet images.
    # Transformation on training set: random rotation, random resized crop to 224 x 224 pixels, random horizontal and vertical flip, transform to a tensor and normalize data.
    train_transforms = transforms.Compose([transforms.RandomRotation(23),
                                           transforms.RandomResizedCrop(224),
                                           transforms.RandomHorizontalFlip(),
                                           transforms.RandomVerticalFlip(),
                                           transforms.ToTensor(),
                                           transforms.Normalize([0.485, 0.456, 0.406],
                                                                [0.229, 0.224, 0.225])])

    # Transformation on validation set: resize and center crop to 224 x 224 pixels, transform to a tensor and normalize data.
    valid_transforms = transforms.Compose([transforms.Resize(255),
                                           transforms.CenterCrop(224),
                                           transforms.ToTensor(),
                                           transforms.Normalize([0.485, 0.456, 0.406],
                                                                [0.229, 0.224, 0.225])])

    # Transformation on test set: resize and center crop to 224 x 224 pixels, transform to a tensor and normalize data.
    test_transforms = transforms.Compose([transforms.Resize(255),
                                          transforms.CenterCrop(224),
                                          transforms.ToTensor(),
                                          transforms.Normalize([0.485, 0.456, 0.406],
                                                               [0.229, 0.224, 0.225])])

    # Load the datasets with ImageFolder
    global train_dataset
    global valid_dataset
    global test_dataset
    train_dataset = datasets.ImageFolder(data_dir + "/train", transform=train_transforms)
    valid_dataset = datasets.ImageFolder(data_dir + "/valid", transform=valid_transforms)
    test_dataset = datasets.ImageFolder(data_dir + "/test", transform=test_transforms)

    # Using the image datasets and the transforms, define the dataloaders, as global variables.
    global trainloader
    global validloader
    global testloader
    trainloader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
    validloader = torch.utils.data.DataLoader(valid_dataset, batch_size=64)
    testloader = torch.utils.data.DataLoader(test_dataset, batch_size=64)

    # Label mapping.
    global cat_to_name
    with open("cat_to_name.json", "r") as f:
        cat_to_name = json.load(f)

    print("Done loading data...")

    return

def train_model(in_args):
    """
    Function to build and train model.
    """
    # Number of epochs.
    global epochs
    epochs = in_args.epochs
    # Set running_loss to 0
    running_loss = 0

    # Prepare lists to print losses and accuracies.
    global list_running_loss
    global list_valid_loss
    global list_valid_accuracy
    list_running_loss, list_valid_loss, list_valid_accuracy = [], [], []

    # If in testing mode, set loop counter to prematurely return to main().
    if in_args.test == True:
        loop_counter = 0

    # for loop to train model.
    for epoch in range(epochs):
        # for loop to iterate through training dataloader.
        for inputs, labels in trainloader:
            # If in testing mode, increase loop counter to prematurely return to main() after 5 loops.
            if in_args.test == True:
                loop_counter +=1
                if loop_counter == 5:
                    return

            # Move input and label tensors to the default device.
            inputs, labels = inputs.to(device), labels.to(device)

            # Set gradients to 0 to avoid accumulation
            optimizer.zero_grad()

            # Forward pass, back propagation, gradient descent and updating weights and bias.
            # Forward pass through model to get log of probabilities.
            log_ps = model.forward(inputs)
            # Calculate loss of model output based on model prediction and labels.
            loss = criterion(log_ps, labels)
            # Back propagation of loss through model / gradient descent.
            loss.backward()
            # Update weights / gradient descent.
            optimizer.step()

            # Accumulate loss for training image set for print out in terminal
            running_loss += loss.item()

            # Calculate loss for verification image set and accuracy for print out in terminal.
            # Validation pass and print out the validation accuracy.
            # Set loss of validation set and accuracy to 0.
            valid_loss = 0
            # test_loss = 0
            valid_accuracy = 0
            # test_accuracy = 0

            # Set model to evaluation mode to turn off dropout so all images in the validation & test set are passed through the model.
            model.eval()

            # Turn off gradients for validation, saves memory and computations.
            with torch.no_grad():
                # for loop to evaluate loss of validation image set and its accuracy.
                for valid_inputs, valid_labels in validloader:
                    # Move input and label tensors to the default device.
                    valid_inputs, valid_labels = valid_inputs.to(device), valid_labels.to(device)

                    # Run validation image set through model.
                    valid_log_ps = model.forward(valid_inputs)

                    # Calculate loss for validation image set.
                    valid_batch_loss = criterion(valid_log_ps, valid_labels)

                    # Accumulate loss for validation image set.
                    valid_loss += valid_batch_loss.item()

                    # Calculate probabilities
                    valid_ps = torch.exp(valid_log_ps)

                    # Get the most likely class using the ps.topk method.
                    valid_top_k, valid_top_class = valid_ps.topk(1, dim=1)

                    # Check if the predicted classes match the labels.
                    valid_equals = valid_top_class == valid_labels.view(*valid_top_class.shape)

                    # Calculate the percentage of correct predictions.
                    valid_accuracy += torch.mean(valid_equals.type(torch.FloatTensor)).item()

            # Print out losses and accuracies
            # Create string for running_loss.
            str1 = ["Train loss: {:.3f} ".format(running_loss) if in_args.running_loss == True else ""]
            str1 = "".join(str1)
            # Create string for valid_loss.
            str2 = ["Valid loss: {:.3f} ".format(valid_loss/len(validloader)) if in_args.valid_loss == True else ""]
            str2 = "".join(str2)
            # Create string for valid_accuracy.
            str3 = ["Valid accuracy: {:.3f} ".format(valid_accuracy/len(validloader)) if in_args.valid_accuracy == True else ""]
            str3 = "".join(str3)
            # Print strings
            print(f"{epoch+1}/{epochs} " + str1 + str2 + str3)

            # Append current losses and accuracy to lists to print losses and accuracies.
            list_running_loss.append(running_loss)
            list_valid_loss.append(valid_loss/len(validloader))
            list_valid_accuracy.append(valid_accuracy/len(validloader))

            # Set running_loss to 0.
            running_loss = 0

            # Set model back to train mode.
            model.train()

    print("Done training model...")

    return

1 answer:

Answer 0 (score: 0)

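A colleague at work pointed out that on Linux, files whose names start with a period are hidden files. So I selected "Show hidden files" in the file browser, and there they were. I deleted them, which solved the problem (see the commands below).

Find and list all files starting with "._" in all subfolders (list the matches first to make sure these are the files you want to delete):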

find test -name '._*' -print

Find and delete all files starting with "._" in all subfolders:

find test -name '._*' -delete
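
For anyone who prefers to do the same cleanup from Python, here is a minimal sketch of the two find commands above using only the standard library. The function name find_dot_underscore_files and its delete parameter are made up for this example and are not part of PIL or torchvision. Unlike the find commands, it only considers regular files, not directories.

```python
from pathlib import Path


def find_dot_underscore_files(root, delete=False):
    """Return all regular files under root whose names start with '._'.

    Hypothetical helper for illustration. With delete=True the matching
    files are also removed, so call it with delete=False first and
    inspect the result.
    """
    # rglob walks all subfolders; pathlib patterns also match hidden files.
    matches = sorted(p for p in Path(root).rglob("._*") if p.is_file())
    if delete:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # "test" is the folder name used in the find commands above.
    for path in find_dot_underscore_files("test", delete=False):
        print(path)
```

Running it with delete=False prints the candidates, which corresponds to the -print step. Calling it again with delete=True corresponds to -delete. The train and valid folders would presumably need the same treatment, since the traceback in the question points at a file under flowers/train.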