Question

I am using my own file instead of a Python dictionary, but when I run a for loop over that file I get this error:

TypeError: string indices must be integers, not str

My code is given below; "sai.json" is the file that contains the dictionary.

import json
from naiveBayesClassifier import tokenizer
from naiveBayesClassifier.trainer import Trainer
from naiveBayesClassifier.classifier import Classifier

nTrainer = Trainer(tokenizer)

ofile = open("sai.json","r")

dataset=ofile.read()
print dataset

for n in dataset:
    nTrainer.train(n['text'], n['category'])

nClassifier = Classifier(nTrainer.data, tokenizer)

unknownInstance = "Even if I eat too much, is not it possible to lose some weight"

classification = nClassifier.classify(unknownInstance)
print classification

Answer 1

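You are importing the json module, but you are not using it!

You can use json.load to load the JSON data from an open file into a Python dict, or you can read the file into a string and then use json.loads to load the data into a dict.

For example,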

ofile = open("sai.json","r")
data = json.load(ofile)
ofile.close()

Or, even better:

with open("sai.json", "r") as ifile:
    data = json.load(ifile)

Alternatively, using json.loads:

with open("sai.json", "r") as ifile:
    dataset = ifile.read()
data = json.loads(dataset)

You can then access the contents of data with data['text'] and data['category'], assuming the dictionary has those keys.
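As a rough check that the two approaches give the same result, here is a minimal sketch. It writes a made-up one-entry sample to a temporary file (the sample and the temporary file are assumptions for the demo, not your real sai.json) and parses it both ways:

```python
import json
import os
import tempfile

# Hypothetical sample data, written to a temporary file just for this demo.
sample = [{"text": "hello everyone", "category": "NO"}]

tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False)
try:
    json.dump(sample, tmp)
    tmp.close()

    # json.load parses directly from an open file object.
    with open(tmp.name, "r") as f:
        from_file = json.load(f)

    # json.loads parses a string that has already been read from the file.
    with open(tmp.name, "r") as f:
        raw = f.read()
    from_string = json.loads(raw)
finally:
    os.remove(tmp.name)

print(from_file == from_string)
print("%s -> %s" % (type(raw).__name__, type(from_string).__name__))
```

The raw file contents are a string; only the parsed result is a list or dict that you can index by key.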

You get the error because dataset is a string, so

for n in dataset:
    nTrainer.train(n['text'], n['category'])

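loops over that string character by character, putting each character into a one-element string. Strings can only be indexed by integers, not by other strings, and there is not much point in indexing into a one-element string anyway: if s is a one-element string, then s[0] has the same content as s.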
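To see this for yourself, here is a small sketch using a short made-up string in place of your file contents (the string is an assumption for illustration):

```python
# Stand-in for the raw text returned by ofile.read(); it is still a string.
dataset = '[{"text": "hi", "category": "NO"}]'

# Iterating over a string yields one-character strings.
chars = [n for n in dataset][:3]
print(chars)  # ['[', '{', '"']

# Indexing a one-character string with 0 gives back the same string.
n = chars[0]
print(n[0] == n)  # True

# Indexing a string with another string raises TypeError.
try:
    n['text']
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError: %s" % exc)
```

The exact wording of the TypeError message differs between Python versions, but the cause is the same.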

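Here is the data you added in the comments. I had assumed your data was a list wrapped in a dict, but having a plain list as the top-level JSON value is fine.
FWIW, I used print json.dumps(dataset, indent=4) to format it. Note that there is no comma after the last } in the file: a trailing comma is OK in Python, but it is an error in JSON.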

sai.json

[
    {
        "category": "NO", 
        "text": "hello everyone"
    }, 
    {
        "category": "YES", 
        "text": "dont use words like jerk"
    }, 
    {
        "category": "NO", 
        "text": "what the hell."
    }, 
    {
        "category": "yes", 
        "text": "you jerk"
    }
]
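On the trailing-comma point: a quick sketch of the difference, using a shortened one-entry version of the data. ast.literal_eval is used here only to show that the same text is acceptable as a Python literal; it is not part of the fix.

```python
import ast
import json

good = '[{"category": "NO", "text": "hello everyone"}]'
bad = '[{"category": "NO", "text": "hello everyone"},]'  # trailing comma

print(json.loads(good))

# The json module rejects the trailing comma.
try:
    json.loads(bad)
    rejected = False
except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass in Python 3
    rejected = True
    print("invalid JSON: %s" % exc)

# The same text is a valid Python list literal.
as_python = ast.literal_eval(bad)
print(as_python == json.loads(good))
```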

Now, if we read it using json.load, your code should work correctly. But here is a simple demo that just prints the contents:

import json

with open("sai.json", "r") as f:
    dataset = json.load(f)

for n in dataset:
    print "text: '%s', category: '%s'" % (n['text'], n['category'])

Output

text: 'hello everyone', category: 'NO'
text: 'dont use words like jerk', category: 'YES'
text: 'what the hell.', category: 'NO'
text: 'you jerk', category: 'yes'
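If you want the loop to fail with a clearer message when the data has not been parsed yet, one possible sketch is below. iter_samples is a hypothetical helper of mine, not part of naiveBayesClassifier, and the inline string stands in for the contents of sai.json; each yielded pair is what you would pass to nTrainer.train.

```python
import json


def iter_samples(dataset):
    """Yield (text, category) pairs from a parsed JSON list of dicts."""
    # Guard against the original mistake: passing the raw file contents.
    if isinstance(dataset, str):
        raise TypeError("dataset is still a string; parse it with json.loads first")
    for n in dataset:
        yield n["text"], n["category"]


# Stand-in for the contents of sai.json (shortened to two entries).
raw = ('[{"category": "NO", "text": "hello everyone"}, '
       '{"category": "yes", "text": "you jerk"}]')

pairs = list(iter_samples(json.loads(raw)))
for text, category in pairs:
    print("text: '%s', category: '%s'" % (text, category))
```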
