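I'm a beginner at Python. I'm analyzing bus headways at each stop along a bus route. For each stop I have a list of headways, and the number of headways can differ from stop to stop. To visualize the data, I'd like to draw box plots on the same page so that you can observe bus bunching along the route. To do this, I wrote code that reads the bus data from a .csv file into a dictionary of stops, with the stop name as the key and an object as the value (I track some other aspects of each stop, but they are omitted here for brevity). The trouble I'm having is with the box plots. I figured pandas could do this easily, but I've had a lot of trouble setting up the DataFrame because my dictionary contains objects. You may have other ideas. I've pared my code down to the minimum so that you can still follow what I'm doing. As a side note, I'm trying to learn how to use classes while doing this analysis, which is why you see a bunch of classes in my code. In my full code, I handle duplicate vehicles and outliers with their own methods.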
import datetime

stops = {}
stopNamesA = []
headwaysA = []

class Data:
    def __init__(self):
        self.depart = 0
        self.vehicle = 0

class Stop:
    def __init__(self):
        self.vehicles = []
        self.departs = []
        self.headways = []
        self.stopName = ""

    def AddData(self, line):
        fields = line.split(",")
        self.stopName = fields[3]
        self.vehicles.append(fields[0])
        x = fields[4]
        # x[:-1] drops the trailing newline before parsing the timestamp
        self.departs.append(datetime.datetime.strptime(x[:-1], "%m/%d/%y %I:%M:%S %p"))

    def CalcHeadway(self):
        # Headway = seconds between consecutive departures at this stop
        for i in range(len(self.departs)-1):
            dt = self.departs[i]
            dt2 = self.departs[i+1]
            self.headways.append(datetime.timedelta.total_seconds(dt2 - dt))

with open('data.csv','r') as f:
    for line in f:
        fields = line.split(",")
        sid = str(fields[3])
        if (fields[1] == 'X2' and fields[2] == 'WEST'):
            if sid in stops.keys():
                s = stops[sid]
            else:
                s = Stop()
                stops[sid] = s
            s.AddData(line)

for key, value in stops.items():
    value.CalcHeadway()
The data looks like this (again, I've truncated the other columns):
5401 X2 WEST H ST NW + 7TH ST NW 10/3/16 7:58:48 AM
2835 X2 WEST H ST NW + 7TH ST NW 10/3/16 8:16:49 AM
2460 X2 WEST H ST NW + 7TH ST NW 10/3/16 8:20:12 AM
2460 X2 WEST H ST NW + 7TH ST NW 10/3/16 8:20:38 AM
2460 X2 WEST H ST NW + 7TH ST NW 10/3/16 8:20:57 AM
5404 X2 WEST I ST + 14TH ST 10/3/16 8:01:55 AM
2835 X2 WEST I ST + 14TH ST 10/3/16 8:24:01 AM
2853 X2 WEST I ST + 14TH ST 10/3/16 9:27:07 AM
5404 X2 WEST I ST + 14TH ST 10/3/16 9:45:43 AM
2835 X2 WEST I ST + 14TH ST 10/3/16 9:57:31 AM
2831 X2 WEST MINNESOTA AVE NE + BENNING RD NE 10/3/16 8:02:41 AM
2821 X2 WEST MINNESOTA AVE NE + BENNING RD NE 10/3/16 8:17:42 AM
5420 X2 WEST MINNESOTA AVE NE + BENNING RD NE 10/3/16 8:34:43 AM
2853 X2 WEST MINNESOTA AVE NE + BENNING RD NE 10/3/16 8:44:14 AM
5401 X2 WEST MINNESOTA AVE NE + BENNING RD NE 10/3/16 9:02:20 AM
Answer 0 (score: 0)
First, as the error says, 'Series' object has no attribute 'boxplot'. You can draw a box plot from a Series with Series.plot.box().
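As a minimal sketch of the single-Series case, assuming the headways for one stop are already available as a list of seconds (the values are the ones used for "I ST + 14TH ST" in this answer; the axis label and the headless `Agg` backend are my own choices for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to get a window
import pandas as pd

# Headways in seconds for a single stop
headways = pd.Series(
    [1107.0, 1359.0, 1859.0, 1190.0, 1071.0, 904.0],
    name="I ST + 14TH ST",
)

# Series has no .boxplot(), but the .plot accessor offers .box()
ax = headways.plot.box()
ax.set_ylabel("headway (s)")  # label chosen for illustration
```

With an interactive backend, calling `plt.show()` from `matplotlib.pyplot` afterwards displays the single box.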
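However, since you want to show several boxes, it makes sense to use a DataFrame, so you need a DataFrame to draw your boxplot.
If I understand your needs correctly, you need a DataFrame with 26 columns, one per bus stop: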
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame()
df["I ST + 14TH ST"] = [1107.0, 1359.0, 1859.0, 1190.0, 1071.0, 904.0]
df["BENNING RD NE + 19TH ST NE"] = [1132.0, 1503.0, 1448.0, 1344.0, 958.0, 771.0]
#......
df["H ST NW + 5TH ST NW"] = [1182.0, 1315.0, 1691.0, 1193.0, 956.0, 729.0]
df.boxplot(rot=45)
plt.tight_layout()
plt.show()
To get a working DataFrame from the stops dictionary, it seems you can do the following:
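# Wrap each list in pd.Series so stops with different headway counts are padded with NaN
stops_for_drawing = {}
for key, val in stops.items():  # .iteritems() exists only on Python 2; .items() works on both
    stops_for_drawing.update({key: pd.Series(val.headways)})
df = pd.DataFrame(stops_for_drawing)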
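One caveat, since the question notes that the number of headways can differ between stops: passing a dict of plain lists with unequal lengths to pd.DataFrame raises a ValueError. Below is a minimal sketch of one way around that, wrapping each list in pd.Series so that shorter columns are padded with NaN. The dictionary name `headways_by_stop` is hypothetical, and the second list is deliberately shortened from the values above to make the lengths differ.

```python
import pandas as pd

# Hypothetical stand-in for {stop name: stop.headways}
headways_by_stop = {
    "I ST + 14TH ST": [1107.0, 1359.0, 1859.0, 1190.0, 1071.0, 904.0],
    # shortened on purpose to show unequal lengths
    "BENNING RD NE + 19TH ST NE": [1132.0, 1503.0, 1448.0, 1344.0],
}

# Each Series is aligned on its integer index; missing rows become NaN
df = pd.DataFrame(
    {name: pd.Series(values) for name, values in headways_by_stop.items()}
)

print(df.shape)         # rows = length of the longest list
print(df.isna().sum())  # number of padded cells per stop
```

As far as I know, df.boxplot() drops the NaN values column by column, so the padding should not distort the boxes.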