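I'll briefly explain my coding assignment:
1) I have to compute the happiness score of tweets from 4 different time zones.
There are two files: one contains keywords, each with an associated sentiment value, and the other contains the actual tweets.
2) I have to read the keyword file first and sort the keywords into 4 lists (based on their sentiment value). Each line of the tweet file has the format [lat, long] value date time text.
3) The "happiness score" of a time zone is simply the total sentiment score of all tweets in that region divided by the number of tweets.
My program should ignore tweets that contain no keywords, and it should also ignore tweets
from outside the time zones, since a tweet may fall outside all four of them.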
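To make step 3 concrete, here is a minimal sketch of the score calculation. It assumes that a tweet's sentiment score is the sum of the values of the keywords it contains (the assignment text above does not spell this out), and the function name `happiness_score` is hypothetical.

```python
def happiness_score(tweet_scores):
    """Average sentiment score over the tweets that count for one time zone.

    tweet_scores holds one number per counted tweet; tweets without keywords
    or outside the zone are expected to be left out by the caller.
    """
    if not tweet_scores:
        return None  # no qualifying tweets, so the score is undefined
    return sum(tweet_scores) / len(tweet_scores)


# Three counted tweets scoring 10, 5 and 6:
print(happiness_score([10, 5, 6]))  # 7.0
```

Returning `None` for an empty zone is one possible choice; it avoids a division by zero when no tweet in a zone qualifies.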
keywordsfile=input("Enter name of keyword file: ")
infile=open(keywordsfile,"r",encoding="utf-8")
depressed=[] #keywords with sentiment value 1
okay=[] #keywords with sentiment value 5
good=[] #keywords with sentiment value 7
happy=[] #keywords with sentiment value 10
for line in infile:
    line=line.rstrip()
    keyWords=line.split(",")
    keyWords[1]=int(keyWords[1])
    if keyWords[1]==1:
        depressed.append(keyWords[0])
    elif keyWords[1]==5:
        okay.append(keyWords[0])
    elif keyWords[1]==7:
        good.append(keyWords[0])
    elif keyWords[1]==10:
        happy.append(keyWords[0])
    else:
        pass
infile.close()
tweetfile=input("Enter name of tweet file: ")
infile2=open(tweetfile,"r",encoding="utf-8")
DEPRESSEDVALUE=1
OKAYVALUE=5
GOODVALUE=7
HAPPYVALUE=10
depressedKeys=0
okayKeys=0
goodKeys=0
happyKeys=0
numOfTweetsEastern=0
numOfTweetsCentral=0
numOfTweetsMountain=0
numOfTweetsPacific=0
for line in infile2:
    line=line.rstrip()
    words=line.split()
    firststriplat=words[0].rstrip(",")
    lat=firststriplat.lstrip("[")
    lat=float(lat)
    long=words[1].rstrip("]")
    long=float(long)
    easternLat= 24.660845 <= lat and lat<=49.189787
    easternLong= -87.518395 <= long <= -67.444574
    centralLat= 24.660845 <= lat and lat<=49.189787
    centralLong= -101.998892 <= long <= -87.518395
    mountainLat=24.660845 <= lat and lat<=49.189787
    mountainLong=-115.236428 <= long <= -101.998892
    pacificLat=24.660845 <= lat and lat<=49.189787
    pacificLong= -125.242264<= long <= -115.236428
    if easternLat and easternLong:
        for word in words:
            if word in depressed:
                depressedKeys=depressedKeys+1
            elif word in okay:
                okayKeys=okayKeys+1
            elif word in good:
                goodKeys=goodKeys+1
            elif word in happy:
                happyKeys=happyKeys+1
            else:
                pass
        numOfTweetsEastern=numOfTweetsEastern+1
        sentimentValueEastern=(depressedKeys*DEPRESSEDVALUE)+(okayKeys*OKAYVALUE)+(goodKeys*GOODVALUE)+(happyKeys*HAPPYVALUE)
    elif centralLat and centralLong:
        for word in words:
            if word in depressed:
                depressedKeys=depressedKeys+1
            elif word in okay:
                okayKeys=okayKeys+1
            elif word in good:
                goodKeys=goodKeys+1
            elif word in happy:
                happyKeys=happyKeys+1
            else:
                pass
        numOfTweetsCentral=numOfTweetsCentral+1
        sentimentValueCentral=(depressedKeys*DEPRESSEDVALUE)+(okayKeys*OKAYVALUE)+(goodKeys*GOODVALUE)+(happyKeys*HAPPYVALUE)
    elif mountainLat and mountainLong:
        for word in words:
            if word in depressed:
                depressedKeys=depressedKeys+1
            elif word in okay:
                okayKeys=okayKeys+1
            elif word in good:
                goodKeys=goodKeys+1
            elif word in happy:
                happyKeys=happyKeys+1
            else:
                pass
        numOfTweetsMountain=numOfTweetsMountain+1
        sentimentValueMountain=(depressedKeys*DEPRESSEDVALUE)+(okayKeys*OKAYVALUE)+(goodKeys*GOODVALUE)+(happyKeys*HAPPYVALUE)
    elif pacificLat and pacificLong:
        for word in words:
            if word in depressed:
                depressedKeys=depressedKeys+1
            elif word in okay:
                okayKeys=okayKeys+1
            elif word in good:
                goodKeys=goodKeys+1
            elif word in happy:
                happyKeys=happyKeys+1
            else:
                pass
        numOfTweetsPacific=numOfTweetsPacific+1
        sentimentValuePacific=(depressedKeys*DEPRESSEDVALUE)+(okayKeys*OKAYVALUE)+(goodKeys*GOODVALUE)+(happyKeys*HAPPYVALUE)
    else:
        pass
happScoreEastern=sentimentValueEastern/numOfTweetsEastern
happScoreCentral=sentimentValueCentral/numOfTweetsCentral
happScoreMountain=sentimentValueMountain/numOfTweetsMountain
happScorePacific=sentimentValuePacific/numOfTweetsPacific
print("The happiness score for the Eastern timezone is",happScoreEastern,"and the total number of tweets were",numOfTweetsEastern)
print("The happiness score for the Central timezone is",happScoreCentral,"and the total number of tweets were",numOfTweetsCentral)
print("The happiness score for the Mountain timezone is",happScoreMountain,"and the total number of tweets were",numOfTweetsMountain)
print("The happiness score for the Pacific timezone is",happScorePacific,"and the total number of tweets were",numOfTweetsPacific)
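However, my happiness scores and the tweet counts for each time zone are clearly off (by a large margin). Where does my code go wrong? I thought I had done it correctly.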
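For reference, a few things in the posted code stand out as likely causes. The four counters `depressedKeys`, `okayKeys`, `goodKeys` and `happyKeys` are shared by all time zones and never reset, so each `sentimentValue...` total includes keywords found in other zones. The `numOfTweets...` counters also appear to be incremented for tweets that contain no keyword, and words are compared without lowercasing or stripping punctuation. The sketch below is one possible restructuring, not a definitive solution. It assumes keyword matching should be case-insensitive and ignore surrounding punctuation, that the tweet text starts at the sixth whitespace-separated field, and that a tweet's score is the sum of its keyword values; the function and variable names are hypothetical.

```python
import string

# Bounds copied from the question; all four zones share one latitude band.
LAT_MIN, LAT_MAX = 24.660845, 49.189787
REGIONS = {
    "Eastern": (-87.518395, -67.444574),
    "Central": (-101.998892, -87.518395),
    "Mountain": (-115.236428, -101.998892),
    "Pacific": (-125.242264, -115.236428),
}


def load_keywords(lines):
    """Map each keyword to its integer sentiment value ("word,value" lines)."""
    keywords = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        word, value = line.split(",")
        keywords[word.strip().lower()] = int(value)
    return keywords


def region_of(lat, long):
    """Return the zone name for a coordinate, or None if outside all four."""
    if not (LAT_MIN <= lat <= LAT_MAX):
        return None
    for name, (west, east) in REGIONS.items():
        if west <= long <= east:
            return name
    return None


def happiness_scores(tweet_lines, keywords):
    """Return {zone: (score, tweet_count)} using separate totals per zone."""
    totals = {name: 0 for name in REGIONS}
    counts = {name: 0 for name in REGIONS}
    for line in tweet_lines:
        words = line.split()
        if len(words) < 2:
            continue  # blank or malformed line
        lat = float(words[0].strip("[,"))
        long = float(words[1].rstrip("]"))
        region = region_of(lat, long)
        if region is None:
            continue  # tweet lies outside every zone
        # Assumption: fields 0-4 are coordinates, value, date and time.
        cleaned = (w.strip(string.punctuation).lower() for w in words[5:])
        hits = [keywords[w] for w in cleaned if w in keywords]
        if not hits:
            continue  # tweets without keywords are not counted
        totals[region] += sum(hits)
        counts[region] += 1
    return {
        name: (totals[name] / counts[name] if counts[name] else None, counts[name])
        for name in REGIONS
    }
```

Both functions accept any iterable of lines, so an open file object can be passed directly, for example `happiness_scores(open(tweetfile, encoding="utf-8"), keywords)`. A zone with no counted tweets gets a score of `None` instead of raising a division-by-zero error.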