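I am fetching data from an online database. It returns the dates and values as strings in a list, e.g. ['87', '79', '50', 'M', '65']
(these are the y-axis values for the plot; the x-axis values are the years associated with them, e.g. ['2018', '2017', '2016', '2015', '2014']).
Before plotting the values, I first need to convert them to integers. Simply using maxT_int = list(map(int, maxTList))
does this, but a problem remains: as the example above shows, data is sometimes missing, and the missing value is marked with 'M'.
What I would like to do is either remove the 'M' or handle it in some way so that the values can still be plotted.
When the list contains no 'M', the values plot fine. Any suggestions on the best way to handle this?
My full code is listed below.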
import urllib.parse
import datetime
import urllib.request
import ast
from bokeh.plotting import figure
from bokeh.io import output_file, show
import numpy as np
# Get user input for day
# in the format of mm-dd
print("Enter a value for the day that you would like to plot.")
print("The format should be mm-dd")
dayofmonth = input("What day would you like to plot? ")
# testing out a range of years
y = datetime.datetime.today().year
# get starting year
ystart = int(input("What year would you like to start with? "))
# get number of years back
ynum = int(input("How many years would you like to plot? "))
# calculate the number of years back to start from current year
diff = y - ystart
#assign values to the list of years
years = list(range(y-diff,y-(diff+ynum), -1))
start = y - diff
endyear = y - (diff+ynum)
i = 0
dateList=[]
minTList=[]
maxTList=[]
for year in years:
    sdate = (str(year) + '-' + dayofmonth)
    #print(sdate)
    url = "http://data.rcc-acis.org/StnData"
    values = {
        "sid": "KGGW",
        "date": sdate,
        "elems": "maxt,mint",
        "meta": "name",
        "output": "json"
    }
    data = urllib.parse.urlencode(values).encode("utf-8")
    req = urllib.request.Request(url, data)
    response = urllib.request.urlopen(req)
    results = response.read()
    results = results.decode()
    results = ast.literal_eval(results)
    if i < 1:
        n_label = results['meta']['name']
        i = 2
    for x in results["data"]:
        date,maxT,minT = x
        #setting the string of date to datetime
        date = date[0:4]
        date_obj = datetime.datetime.strptime(date,'%Y')
        dateList.append(date_obj)
        minTList.append(minT)
        maxTList.append(maxT)
maxT_int = list(map(int,maxTList))
# setting up the array for numpy
x = np.array(years)
y = np.array(maxT_int)
p = figure(title="Max Temps by Year for the day " + dayofmonth + " " + n_label, x_axis_label='Years',
           y_axis_label='Max Temps', plot_width=1000, plot_height=600)
p.line(x,y, line_width=2)
output_file("temps.html")
show(p)
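Answer 0 (score: 1)
You can use numpy.nan
together with a small helper function:
import numpy as np
lst = ['87', '79', '50', 'M', '65']
def convert(item):
    # Map the missing-data marker to NaN; convert everything else to int.
    if item == 'M':
        return np.nan
    else:
        return int(item)
new_lst = list(map(convert, lst))
print(new_lst)
Or, if you prefer a list comprehension:
new_lst = [int(item) if item != 'M' else np.nan for item in lst]
Both versions produce the same list:
[87, 79, 50, nan, 65]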
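One consequence of the NaN approach is worth spelling out. NaN is a float, so NumPy stores the converted values as a floating-point array, and the missing entries can be masked out with np.isnan if a gap in the line is not wanted. The following is a minimal sketch rather than part of the original answer. The names years, temps and mask are illustrative, and the year values are taken from the example in the question.

```python
import numpy as np

raw = ['87', '79', '50', 'M', '65']
years = np.array([2018, 2017, 2016, 2015, 2014])  # example years from the question

# Convert the strings, mapping the missing-data marker 'M' to NaN.
temps = np.array([np.nan if item == 'M' else int(item) for item in raw])

# Because NaN is a float, the whole array is stored as floats.
print(temps.dtype)  # float64

# Boolean mask that is True wherever a real measurement exists.
mask = ~np.isnan(temps)

# Applying the same mask to both arrays keeps years and values aligned.
print(years[mask])
print(temps[mask])

# NaN-aware reductions skip the missing entries.
print(np.nanmax(temps))  # 87.0
```

Whether to pass the full arrays to the plot (leaving the NaN in place) or the masked ones is a design choice: the first keeps the missing year visible as a break in the data, the second draws a continuous line across it.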
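Answer 1 (score: 0)
Try this:
>>> maxTList = ['87', '79', '50', 'M', '65']
>>> maxT_int = [int(item) for item in maxTList if item.isdigit()]
>>> maxT_int
[87, 79, 50, 65]
This simply discards the non-numeric strings (as asked in the question), which makes maxT_int shorter than maxTList. In that case you have to apply the same filter to the other list so that the corresponding years are excluded as well.
If you want the two lists to stay the same length, you can supply a default value for strings that are not valid ints (note that the if and the for appear in the opposite order):
>>> maxT_int2 = [int(item) if item.isdigit() else -1 for item in maxTList]
>>> maxT_int2
[87, 79, 50, -1, 65]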
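One caveat with the isdigit() test: it returns False for strings with a leading minus sign, so a below-zero temperature such as '-5' would be treated as missing. A possible alternative, sketched here under the assumption that any string int() cannot parse should count as missing, is to attempt the conversion and catch the error. The helper name to_int_or_none and the '-5' sample value are illustrative and not from the original answer.

```python
def to_int_or_none(item):
    """Return int(item), or None when item is not a valid integer string."""
    try:
        return int(item)
    except ValueError:
        return None


# '-5' is added for illustration; the question's sample has no negative values.
maxTList = ['87', '-5', '50', 'M', '65']

# isdigit() rejects the negative reading as well as the 'M' marker.
print([item.isdigit() for item in maxTList])  # [True, False, True, False, True]

# The try/except version keeps the negative reading and flags only 'M'.
converted = [to_int_or_none(item) for item in maxTList]
print(converted)  # [87, -5, 50, None, 65]
```

None is used as the placeholder here instead of -1 because -1 could be mistaken for a real temperature once negative readings are possible.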
Answer 2 (score: 0)
You can use list comprehensions, iterating over the y values twice.
raw_x = ['2018', '2017', '2016', '2015', '2014']
raw_y = ['87', '79', '50', 'M', '65']
clean_x = [x for x, y in zip(raw_x, raw_y) if y != 'M']
clean_y = [y for y in raw_y if y != 'M']
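The same filtering can also be done in a single pass that keeps each year paired with its value and converts both to int along the way. This is a sketch of one possible variant rather than the answer's own code, and the intermediate name pairs is illustrative.

```python
raw_x = ['2018', '2017', '2016', '2015', '2014']
raw_y = ['87', '79', '50', 'M', '65']

# Walk both lists together, drop the pairs whose value is the 'M' marker,
# and convert the surviving strings to integers.
pairs = [(int(x), int(y)) for x, y in zip(raw_x, raw_y) if y != 'M']

# Split the pairs back into two aligned lists ready for plotting.
clean_x = [x for x, _ in pairs]
clean_y = [y for _, y in pairs]

print(clean_x)  # [2018, 2017, 2016, 2014]
print(clean_y)  # [87, 79, 50, 65]
```

Filtering the pairs together means the two lists cannot drift out of alignment, which is the risk when each list is filtered separately.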