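For my current research project, I plan to read the JSON object "Main_Text" within a predefined time range in Python / Pandas. However, when counting the unique words, the code raises the error TypeError: unhashable type: 'list' on the line
if word in d:
I have been going through troubleshooting threads to resolve the issue. Among other things, I tried converting the value to a tuple (as some threads suggested), which got past the error but led to empty output. Are there any helpful adjustments that would make this work?
The JSON file has the following structure: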
[
{"No":"121","Stock Symbol":"A","Date":"05/11/2017","Text Main":"Sample text"}
]
The relevant code excerpt is as follows:
import string
import json
import csv
import pandas as pd
import datetime
import numpy as np
# Loading and reading dataset
file = open("Glassdoor_A.json", "r")
data = json.load(file)
df = pd.json_normalize(data)
df['Date'] = pd.to_datetime(df['Date'])
# Create an empty dictionary
d = dict()
# Filtering by date
start_date = "01/01/2009"
end_date = "01/01/2015"
after_start_date = df["Date"] >= start_date
before_end_date = df["Date"] <= end_date
between_two_dates = after_start_date & before_end_date
filtered_dates = df.loc[between_two_dates]
print(filtered_dates)
# Processing
for row in filtered_dates:
    line = list(filtered_dates['Text Main'])
    # Remove the leading spaces and newline character
    line = [val.strip() for val in line]
    # Convert the characters in line to
    # lowercase to avoid case mismatch
    line = [val.lower() for val in line]
    # Remove the punctuation marks from the line
    line = [val.translate(val.maketrans("", "", string.punctuation)) for val in line]
    # Split the line into words
    words = [val.split(" ") for val in line]
    # Iterate over each word in line
    for word in words:
        # Check if the word is already in dictionary
        if word in d:
            # Increment count of word by 1
            d[word] = d[word] + 1
        else:
            # Add the word to dictionary with count 1
            d[word] = 1
Answer 0 (score: 1)
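Since "d" is a dictionary, its keys must be hashable, so this check fails whenever word is a list:
if word in d:  # raises TypeError: unhashable type: 'list' when word is a list
In your loop, words is a list of lists (val.split(" ") returns one list per row), so each word is a whole list of tokens rather than a single string. Checking against the keys explicitly does not help, because it performs the same hash lookup:
if word in d.keys():  # still raises TypeError while word is a list
Your for loop therefore needs to iterate over the strings inside each inner list, not over the inner lists themselves.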
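One way to avoid the unhashable-list error is to make sure only individual strings are ever used as dictionary keys. The following is a minimal sketch, not the original poster's code: the helper name count_words and the sample_texts list are hypothetical, and the sample strings stand in for the values of filtered_dates['Text Main'] from the question. It uses collections.Counter from the standard library in place of the hand-written dictionary.

```python
import string
from collections import Counter


def count_words(texts):
    """Count word occurrences across an iterable of strings (hypothetical helper)."""
    # Translation table that deletes every ASCII punctuation character.
    strip_punct = str.maketrans("", "", string.punctuation)
    counts = Counter()
    for text in texts:
        # Trim whitespace, lowercase, and drop punctuation for one row of text.
        cleaned = text.strip().lower().translate(strip_punct)
        # split() without arguments splits on any run of whitespace and never
        # yields empty strings. Each item is a str, so it is hashable and can
        # be used as a dictionary key.
        counts.update(cleaned.split())
    return counts


# Hypothetical sample rows standing in for filtered_dates['Text Main'].
sample_texts = ["Sample text", "  Another sample, with punctuation!\n"]
word_counts = count_words(sample_texts)
print(word_counts)
```

With the DataFrame from the question, the call would presumably be count_words(filtered_dates['Text Main']), since iterating over a pandas Series yields its values one at a time. The outer "for row in filtered_dates" loop is then no longer needed; iterating over a DataFrame directly yields its column names, not its rows.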