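Plotting values by date, labeled by a string

Time: 2017-06-06 02:39:21

Tags: python matplotlib plot

I can't figure out how to plot values over time when each value is identified by a string.

Here is my data.

Input (from CSV):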

Fecha,Pais,count
"20/05/2017",Brazil,1
"20/05/2017",China,821
"20/05/2017",Czechia,31
"20/05/2017",France,1
"20/05/2017","Republic of Korea",1
"21/05/2017",Argentina,5
"21/05/2017",Australia,2
"21/05/2017",China,3043
"21/05/2017",Denmark,1
"21/05/2017",Egypt,1
...
..
.

I have imported the data from the CSV and parsed the dates, the strings, and the integer values:

DatetimeIndex(['2017-05-20', '2017-05-20', '2017-05-20', '2017-05-20',
               '2017-05-20', '2017-05-21', '2017-05-21', '2017-05-21',
               '2017-05-21', '2017-05-21', '2017-05-21', '2017-05-21',
               '2017-05-21', '2017-05-21', '2017-05-21', '2017-05-21',
               '2017-05-21', '2017-05-21', '2017-05-21', '2017-05-21',
               '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22',
               '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22',
               '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22',
               '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22'],
              dtype='datetime64[ns]', freq=None)
['Brazil' 'China' 'Czechia' 'France' 'Republic of Korea' 'Argentina'
 'Australia' 'China' 'Denmark' 'Egypt' 'France' 'Hungary' 'Netherlands'
 'Oman' 'Republic of Korea' 'Russia' 'Slovak Republic' 'Taiwan' 'Ukraine'
 'United Arab Emirates' 'Argentina' 'Brazil' 'China' 'Czechia' 'Ecuador'
 'France' 'Germany' 'India' 'Latvia' 'Liberia' 'Pakistan' 'Peru'
 'Republic of Korea' 'Russia' 'Taiwan' 'Ukraine']
['1' '821' '31' '1' '1' '5' '2' '3043' '1' '1' '1' '1' '1' '1' '1' '1' '1'
 '3' '48' '1' '2' '1' '3759' '79' '2' '1' '3' '1' '192' '1' '1' '1' '1' '2'
 '1' '1']

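I do already get a plot:

see plot figure

However, I can't join the values that belong to the same country and plot each country's history over the dates that have data.

Here is my code: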

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from  matplotlib.dates import DateFormatter, DayLocator, AutoDateLocator, AutoDateFormatter
import datetime


locator = DayLocator()
formatter = AutoDateFormatter(locator)

date, country, count = np.loadtxt("72hcountcountry.csv",
                                  delimiter=',',
                                  unpack=True,
                                  dtype='string',
                                  skiprows=1)

date = np.char.replace (date, '"', '')
country = np.char.replace (country, '"', '')
date2 = pd.to_datetime(date, format="%d/%m/%Y")

print date2
print country 
print count

fig, ax = plt.subplots()

ax.plot_date(date2, count)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
ax.autoscale_view()

ax.grid(True)
fig.autofmt_xdate()

plt.show()

How can I separate the data for each country on each date?

1 answer:

Answer 0: (score: 0)

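If I understand what you are trying to do, you can achieve it with the pandas library: read the input data into a DataFrame, make sure the Fecha column is parsed as dates, and then use the groupby method (see the pandas documentation for groupby).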
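To make the grouping step concrete, here is a minimal sketch. It assumes a few rows copied from the question, held in an inline string (`SAMPLE_CSV`, a name made up for this illustration) instead of the real file, so that the snippet is self-contained. It parses `Fecha` as day/month/year dates and shows what `groupby('Pais')` yields for each country.

```python
import io

import pandas as pd

# A few rows copied from the question's CSV, kept inline for illustration only.
SAMPLE_CSV = '''Fecha,Pais,count
"20/05/2017",Brazil,1
"20/05/2017",China,821
"20/05/2017",France,1
"21/05/2017",Argentina,5
"21/05/2017",China,3043
'''

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# read_csv leaves Fecha as text unless told otherwise, so convert it explicitly.
df["Fecha"] = pd.to_datetime(df["Fecha"], format="%d/%m/%Y")

# Iterating over a GroupBy object yields (key, sub-DataFrame) pairs.
per_country = {country: group for country, group in df.groupby("Pais")}

for country, group in per_country.items():
    print(country, group["count"].tolist())
```

Each sub-DataFrame holds only the rows for one country, which is what lets every country be drawn as its own line.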

Here is a simple example for your CSV file (you may also want to change the tick formatting and so on):

import pandas as pd
import matplotlib.pyplot as plt

infile = "foo.csv"

# Read the file into a pandas DataFrame
df = pd.read_csv(infile)

# read_csv leaves 'Fecha' as text by default, so parse the
# day/month/year strings into real dates
df['Fecha'] = pd.to_datetime(df['Fecha'], format='%d/%m/%Y')

# Group the entries by the content of the Country/Pais column
dfg = df.groupby('Pais')

fig, ax = plt.subplots()

# Loop over the groups (one per country) and plot each one
# separately, using the country name as the legend label
for country, thisdf in dfg:
    ax.plot(thisdf['Fecha'], thisdf['count'], 'o-', label=country)

ax.legend()
fig.autofmt_xdate()

plt.show()

This is the result (for a minimal version of the input file): example plot
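As a possible alternative to looping over groups, the data can be reshaped so that each country becomes a column, and pandas can then draw one line per column. The sketch below is not part of the original answer. It again assumes a few rows from the question held in an inline string (`SAMPLE_CSV`, a hypothetical name) and uses `DataFrame.pivot` followed by `DataFrame.plot`.

```python
import io

import pandas as pd
import matplotlib.pyplot as plt

# A few rows copied from the question's CSV, kept inline for illustration only.
SAMPLE_CSV = '''Fecha,Pais,count
"20/05/2017",Brazil,1
"20/05/2017",China,821
"20/05/2017",France,1
"21/05/2017",Argentina,5
"21/05/2017",China,3043
'''

df = pd.read_csv(io.StringIO(SAMPLE_CSV))
df["Fecha"] = pd.to_datetime(df["Fecha"], format="%d/%m/%Y")

# One row per date, one column per country.
# Date/country combinations with no data become NaN.
wide = df.pivot(index="Fecha", columns="Pais", values="count")

fig, ax = plt.subplots()
# DataFrame.plot draws one line per column and labels it with the column name.
wide.plot(ax=ax, marker="o")
ax.set_ylabel("count")
fig.autofmt_xdate()
# plt.show()  # uncomment to display the figure interactively
```

`pivot` raises an error if the same date and country pair appears more than once. In that case `pivot_table` with `aggfunc="sum"` is one way to combine the duplicates first.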