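I'm just trying to build a list of district names and a list of District objects from a pandas DataFrame, but for some reason the code never finishes running. I can't see anything that could be an infinite loop, so I can't figure out why it hangs every time I run it. Here is the part where it gets stuck (specifically the for loop that iterates over j):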
import numpy as np
import pandas as pd

#make dataframe
data = pd.read_csv('gun-violence-data_01-2013_03-2018.csv', header=0, delimiter=',')

#drop data points with null congressional district values
data = data[data.congressional_district != 0]
data.dropna(axis=0,how='any',subset=['congressional_district'],inplace= True)

#constructing working table
table = data[['incident_id','state','congressional_district']]

#list of districts. Formatting in original file must be corrected to analyze data
districtNames = ['filler1','filler2']
districts = []
s = table.shape

#loop thru the rows of the table
for i in range(s[0]):
    check = True
    #build strings for each district
    ds = table.iloc[i,1] + str(table.iloc[i,2])
    #testString = str(table.iloc[i,2])
    #append ds to districtNames if it isn't in already
    #make array of District objects
    for j in range(len(districtNames)):
        if(ds == districtNames[j]):
            check = False
    if(check):
        districtNames.append(ds)
        districts.append(District(ds,0))
For reference, here is the District class:
class District:
    def __init__(self, name, count):
        self._name = name
        self._count = count

    def get_name(self):
        #return the stored attribute; a bare `name` would raise NameError
        return self._name

    def get_count(self):
        return self._count

    def updateCount(self,amount):
        self._count += amount
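The initial .csv file is large, and after dropping some data points on lines 8 and 9 of my script (the two filtering steps), I still have 227,312 data points left. I understand that this is a lot, but the code hasn't finished even after running for 5 minutes. What am I doing wrong?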
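Answer 0: (score: 1)
It's not that it doesn't terminate; it's just inefficient in its current state. For each of the 227,312 rows, the loop does two scalar iloc lookups and then scans all of districtNames in plain Python, so the work grows with the number of rows times the number of districts. Building the names with vectorized pandas operations avoids that. Try something like this: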
import numpy as np
import pandas as pd

class District:
    def __init__(self, name, count):
        self._name = name
        self._count = count

    def get_name(self):
        #return the stored attribute; a bare `name` would raise NameError
        return self._name

    def get_count(self):
        return self._count

    def updateCount(self,amount):
        self._count += amount

#make dataframe
data = pd.read_csv('gun-violence-data_01-2013_03-2018.csv', header=0, delimiter=',')

#drop data points with null congressional district values
data = data[data.congressional_district != 0]
#reassign instead of inplace=True to avoid modifying a filtered copy in place
data = data.dropna(axis=0, how='any', subset=['congressional_district'])

#constructing working table
table = data[['incident_id','state','congressional_district']]

#list of districts. Formatting in original file must be corrected to analyze data
#vectorized: concatenate the two columns, then keep each distinct name once
districtNames = (table.state + table.congressional_district.astype(str)).unique()
districts = [District(districtName, 0) for districtName in districtNames]
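If you would rather keep an explicit loop than use the vectorized version above, the main thing to change is the membership test. The following is only a minimal sketch of that idea, not a drop-in replacement: build_districts and the sample rows are hypothetical names and values made up for illustration, and the District class is trimmed down to what the sketch needs. It uses a set so that checking whether a name was already seen does not require scanning a list.

```python
class District:
    """Trimmed-down stand-in for the District class from the question."""

    def __init__(self, name, count):
        self._name = name
        self._count = count

    def get_name(self):
        return self._name


def build_districts(rows):
    """Collect unique district names in first-seen order.

    rows: iterable of (state, congressional_district) pairs.
    (Hypothetical helper, not part of the original code.)
    """
    seen = set()      # set membership is O(1) on average, unlike a list scan
    names = []
    districts = []
    for state, number in rows:
        ds = state + str(number)
        if ds not in seen:
            seen.add(ds)
            names.append(ds)
            districts.append(District(ds, 0))
    return names, districts


# Hypothetical sample rows, not taken from the real CSV
rows = [("Ohio", 9.0), ("Ohio", 9.0), ("Texas", 2.0), ("Ohio", 1.0)]
names, districts = build_districts(rows)
print(names)  # ['Ohio9.0', 'Texas2.0', 'Ohio1.0']
```

With the real data you could pass something like zip(table['state'], table['congressional_district']) as rows, which also avoids the per-row iloc lookups.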
Answer 1: (score: 0)
You can use the tqdm package to see which loop your code is getting stuck in.
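As a rough sketch of what that could look like, you wrap the iterable of the loop you suspect in tqdm(...) and watch how fast the progress bar advances. The sample rows and the district_names variable below are made up for illustration, and the fallback definition is only there so the snippet still runs if tqdm is not installed (in that case no progress bar is shown).

```python
try:
    from tqdm import tqdm
except ImportError:
    # Fallback for illustration only: plain iteration, no progress bar
    def tqdm(iterable, **kwargs):
        return iterable

# Hypothetical sample rows, not taken from the real CSV
rows = [("Ohio", 9.0), ("Ohio", 9.0), ("Texas", 2.0)]

district_names = []
# Wrapping the iterable makes tqdm report progress for this loop
for state, number in tqdm(rows, desc="rows"):
    ds = state + str(number)
    if ds not in district_names:
        district_names.append(ds)

print(district_names)
```

On the full table, a bar that keeps moving slowly tells you the loop is making progress and is just slow, rather than stuck.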