Question

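For a current research project, I plan to read the JSON field "Main_Text" within a predefined time range in Python / Pandas. However, when counting unique words, the code raises the error TypeError: unhashable type: 'list' indices must be integers on the line if word in d:.

I have been working through troubleshooting threads to solve the problem. Among other things, I tried converting it to a tuple (as some threads suggest), which got past the error but produced empty output. Are there any useful adjustments that would make this work?

The JSON file has the following structure: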

[
{"No":"121","Stock Symbol":"A","Date":"05/11/2017","Text Main":"Sample text"}
]

The relevant code excerpt is as follows:

import string
import json
import csv

import pandas as pd
import datetime

import numpy as np


# Loading and reading dataset
file = open("Glassdoor_A.json", "r")
data = json.load(file)
df = pd.json_normalize(data)
df['Date'] = pd.to_datetime(df['Date'])


# Create an empty dictionary
d = dict()


# Filtering by date
start_date = "01/01/2009"
end_date = "01/01/2015"

after_start_date = df["Date"] >= start_date
before_end_date = df["Date"] <= end_date

between_two_dates = after_start_date & before_end_date
filtered_dates = df.loc[between_two_dates]

print(filtered_dates)


# Processing
for row in filtered_dates:
    line = list(filtered_dates['Text Main'])
    # Remove the leading spaces and newline character

    line = [val.strip() for val in line]

    # Convert the characters in line to
    # lowercase to avoid case mismatch
    line = [val.lower() for val in line]

    # Remove the punctuation marks from the line
    line = [val.translate(val.maketrans("", "", string.punctuation)) for val in line]

    # Split the line into words
    words = [val.split(" ") for val in line]

    # Iterate over each word in line
    for word in words:
        # Check if the word is already in dictionary
        if word in d:
            # Increment count of word by 1
            d[word] = d[word] + 1
        else:
            # Add the word to dictionary with count 1
            d[word] = 1

Answer 1

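Since "d" is a dictionary, the membership test itself is valid, and writing it this way behaves the same:

if word in d.keys()

The necessary change is in your for loop, where words is a list of lists, so each word is itself a list and cannot be hashed for a dictionary lookup:

if word in d: # raises TypeError when word is a list, because lists are unhashable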
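One way to avoid the unhashable list is to loop over the individual text strings and then over the individual words, so that every dictionary key is a plain str. The following is a minimal sketch, not a definitive fix: the sample rows are hypothetical stand-ins for the contents of Glassdoor_A.json, the dates are assumed to be in MM/DD/YYYY format, and collections.Counter is swapped in for the hand-written dictionary update.

```python
import string
from collections import Counter

import pandas as pd

# Hypothetical sample rows mirroring the JSON structure in the question.
data = [
    {"No": "121", "Stock Symbol": "A", "Date": "05/11/2012", "Text Main": "Sample text, sample words."},
    {"No": "122", "Stock Symbol": "A", "Date": "06/15/2013", "Text Main": "More sample text"},
    {"No": "123", "Stock Symbol": "A", "Date": "05/11/2017", "Text Main": "Outside the date range"},
]
df = pd.DataFrame(data)

# Assumption: dates are month/day/year.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Keep only rows inside the date window (both bounds inclusive).
mask = (df["Date"] >= pd.Timestamp("2009-01-01")) & (df["Date"] <= pd.Timestamp("2015-01-01"))
filtered = df.loc[mask]

# Translation table that deletes every punctuation character.
strip_punct = str.maketrans("", "", string.punctuation)

word_counts = Counter()
# Iterate over the texts themselves, not over the DataFrame (which yields column names).
for text in filtered["Text Main"]:
    cleaned = text.strip().lower().translate(strip_punct)
    # split() with no argument also collapses repeated whitespace.
    for word in cleaned.split():
        word_counts[word] += 1  # word is a str, so it is hashable

print(word_counts)
```

Counter behaves like a dictionary whose missing keys start at 0, so the if/else branch from the question is no longer needed. If a plain dict is preferred, d[word] = d.get(word, 0) + 1 inside the inner loop does the same job.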

Python: TypeError: unhashable type: 'list' indices must be integers
