How do I integrate LIME with PyTorch?

Time: 2019-03-20 09:05:44

Tags: pytorch

Using this MNIST image classification model:

%reset -f

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import torch.utils.data as data_utils
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from matplotlib import pyplot
from pandas import DataFrame
import torchvision.datasets as dset
import os
import torch.nn.functional as F
import time
import random
import pickle
from sklearn.metrics import confusion_matrix
import pandas as pd
import sklearn


trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (1.0,))])

root = './data'
if not os.path.exists(root):
    os.mkdir(root)
train_set = dset.MNIST(root=root, train=True, transform=trans, download=True)
test_set = dset.MNIST(root=root, train=False, transform=trans, download=True)

batch_size = 64

train_loader = torch.utils.data.DataLoader(
                 dataset=train_set,
                 batch_size=batch_size,
                 shuffle=True)
test_loader = torch.utils.data.DataLoader(
                dataset=test_set,
                batch_size=batch_size,
                shuffle=True)

class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(28*28, 500)
        self.fc2 = nn.Linear(500, 256)
        self.fc3 = nn.Linear(256, 2)
    def forward(self, x):
        x = x.view(-1, 28*28)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

num_epochs = 2
random_sample_size = 200

values_0_or_1 = [t for t in train_set if (int(t[1]) == 0 or int(t[1]) == 1)]
values_0_or_1_testset = [t for t in test_set if (int(t[1]) == 0 or int(t[1]) == 1)]

print(len(values_0_or_1))
print(len(values_0_or_1_testset))

train_loader_subset = torch.utils.data.DataLoader(
                 dataset=values_0_or_1,
                 batch_size=batch_size,
                 shuffle=True)

test_loader_subset = torch.utils.data.DataLoader(
                 dataset=values_0_or_1_testset,
                 batch_size=batch_size,
                 shuffle=False)

train_loader = train_loader_subset

# Hyper-parameters 
input_size = 100
hidden_size = 100
num_classes = 2
# learning_rate = 0.00001
learning_rate = .0001
# Device configuration
device = 'cpu'
print_progress_every_n_epochs = 1

model = NeuralNet().to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  

N = len(train_loader)
# Train the model
total_step = len(train_loader)

most_recent_prediction = []
test_actual_predicted_dict = {}

rm = random.sample(list(values_0_or_1), random_sample_size)
train_loader_subset = data_utils.DataLoader(rm, batch_size=4)

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader_subset):  
        # Move tensors to the configured device
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if (epoch) % print_progress_every_n_epochs == 0:
        print ('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, i+1, total_step, loss.item()))


predicted_test = []
model.eval()  # eval mode (batchnorm uses moving mean/variance instead of mini-batch mean/variance)
probs_l = []

predicted_values = []
actual_values = []
labels_l = []

with torch.no_grad():
    for images, labels in test_loader_subset:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        predicted_test.append(predicted.cpu().numpy())

        sm = torch.nn.Softmax(dim=1)
        probabilities = sm(outputs) 
        probs_l.append(probabilities)  
        labels_l.append(labels.cpu().numpy())

    predicted_values.append(np.concatenate(predicted_test).ravel())
    actual_values.append(np.concatenate(labels_l).ravel())

if (epoch) % 1 == 0:
    print('test accuracy : ', 100 * len((np.where(np.array(predicted_values[0])==(np.array(actual_values[0])))[0])) / len(actual_values[0]))

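I am trying to integrate LIME ("Local Interpretable Model-Agnostic Explanations for machine learning classifiers"): https://marcotcr.github.io/lime/

PyTorch support does not appear to be enabled, since it is not mentioned in the docs or in the following tutorial:

https://marcotcr.github.io/lime/tutorials/Tutorial%20-%20images.html

With my updated PyTorch code: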

from lime import lime_image
import time

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(images[0].reshape(28,28), model(images[0]), top_labels=5, hide_color=0, num_samples=1000)

this causes the error:

/opt/conda/lib/python3.6/site-packages/skimage/color/colorconv.py in gray2rgb(image, alpha)
    830     is_rgb = False
    831     is_alpha = False
--> 832     dims = np.squeeze(image).ndim
    833 
    834     if dims == 3:

AttributeError: 'Tensor' object has no attribute 'ndim'

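So it appears a TensorFlow object is expected here?

How can I integrate LIME with PyTorch image classification?

1 answer:

Answer 0: (score: 0)

Here is my solution:

LIME expects the input image as a numpy array, which is why you get the attribute error. One solution is to convert the image from a Tensor to numpy before passing it to the explainer object. Another is to select a specific image with test_loader_subset and convert it with img = img.numpy().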
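A minimal sketch of that conversion, assuming a single normalized MNIST image of shape (1, 28, 28) as the DataLoader in the question would yield it. The random tensor `img_tensor` is only a stand-in for a real image and is not part of the original code:

```python
import numpy as np
import torch

# Stand-in for one normalized MNIST image tensor: (channels, height, width)
img_tensor = torch.rand(1, 28, 28) - 0.5

# Drop the channel axis, detach from autograd, move to CPU, convert to NumPy
img = img_tensor.squeeze(0).detach().cpu().numpy()

# A NumPy array has the .ndim attribute that skimage's gray2rgb relies on
print(type(img).__name__, img.shape, img.ndim)
```

The resulting 2-D array can be passed as the image argument where the question passed `images[0].reshape(28,28)`.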

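Second, to make LIME work with PyTorch (or any other framework), you need to specify a batch prediction function that outputs the prediction scores of every class for every image. The name of this function (here I call it batch_predict) is then passed to explainer.explain_instance(img, batch_predict, ...). batch_predict needs to take all the images passed to it, convert them to Tensors, make a prediction, and finally return the list of prediction scores (as numpy values). This is how I got it working. Also note that the images must have shape (..., ..., 3) or (..., ..., 1) in order to be segmented properly by the default segmentation algorithm. This means you may have to use np.transpose(img, (...)). If the results are poor, you can also specify the segmentation algorithm yourself.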
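The batch prediction function described above might look roughly like the sketch below. It rests on a few assumptions: the small `nn.Sequential` model is a hypothetical stand-in for the trained NeuralNet from the question, and the input batch is assumed to have shape (N, 28, 28, 3), because LIME converts a 2-D grayscale image to RGB before perturbing it.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-in for the trained two-class NeuralNet from the question
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 2))
model.eval()

def batch_predict(images):
    """Map a NumPy batch (N, 28, 28, 3) to class probabilities (N, 2)."""
    batch = np.asarray(images, dtype=np.float32)
    # The three channels are copies of the grayscale image, so keep one
    # and add a channel axis: (N, 28, 28, 3) -> (N, 1, 28, 28)
    batch = torch.from_numpy(batch[..., 0]).unsqueeze(1).contiguous()
    with torch.no_grad():
        logits = model(batch)
        probs = F.softmax(logits, dim=1)
    # LIME expects a NumPy array of per-class scores, one row per image
    return probs.cpu().numpy()

# Shape check with a fake batch of five perturbed images
fake_batch = np.random.rand(5, 28, 28, 3).astype(np.float32)
scores = batch_predict(fake_batch)
print(scores.shape)
```

With the lime package installed, the function itself (not its output) would be passed as the second argument, for example `explainer.explain_instance(img, batch_predict, top_labels=2, hide_color=0, num_samples=1000)`, where `img` is a 28x28 NumPy array.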

Finally, you need to display the LIME image mask on top of the original image. This snippet shows how that can be done:

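import matplotlib.pyplot as plt
from skimage.segmentation import mark_boundaries

# Image and superpixel mask for the top predicted label
temp, mask = explanation.get_image_and_mask(explanation.top_labels[0], positive_only=False, num_features=5, hide_rest=False)
# Draw the boundaries of the selected superpixels on top of the image
img_boundry = mark_boundaries(temp, mask)
plt.imshow(img_boundry)
plt.show()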

This notebook is a good reference: https://github.com/marcotcr/lime/blob/master/doc/notebooks/Tutorial%20-%20images%20-%20Pytorch.ipynb