How do I plot data from different DataFrames in one figure?

Asked: 2019-10-31 17:22:20

Tags: python matplotlib seaborn

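I need some guidance on plotting:

  1. A scatter plot of the df1 data: time vs. y, with the hue taken from the z column
  2. A line plot of the df2 data: time vs. y
  3. A single line at y = c (c is a constant)

The y data in df1 and df2 are different, but they are in the same range.

I don't know where to start. Any guidance is appreciated.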

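More clarification. Part of the data is shown here. I want to plot:

  1. A scatter plot of time vs. CO2
  2. The annual rolling mean of CO2 computed from the hourly data (from 01/01/2016 to 09/30/2019, so the first mean covers "01/01/2016 00" to "12/31/2016 23" and the second covers "01/01/2016 01" to "01/01/2017 00"), like the trend line in the chart below
  3. The maximum of all the data, drawn as a line on the plot (like the straight line in the chart below)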

[image: example chart with a trend line and a horizontal line]

Sample data

import pandas as pd

data = {'Date':['0     01/14/2016 00', '01/14/2016 01','01/14/2016 02','01/14/2016 03','01/14/2016 04','01/14/2016 05','01/14/2016 06','01/14/2016 07','01/14/2016 08','01/14/2016 09','01/14/2016 10','01/14/2016 11','01/14/2016 12','01/14/2016 13','01/14/2016 14','01/14/2016 15','01/14/2016 16','01/14/2016 17','01/14/2016 18','01/14/2016 19'],
        'CO2':[2415.9,2416.5,2429.8,2421.5,2422.2,2428.3,2389.1,2343.2,2444.,2424.8,2429.6,2414.7,2434.9,2420.6,2420.5,2397.1,2415.6,2417.4,2373.2,2367.9],
        'Year':[2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016]} 

# Create DataFrame 
df = pd.DataFrame(data)

# DataFrame view
                Date     CO2  Year
 0     01/14/2016 00  2415.9  2016
       01/14/2016 01  2416.5  2016
       01/14/2016 02  2429.8  2016
       01/14/2016 03  2421.5  2016
       01/14/2016 04  2422.2  2016

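2 Answers:

Answer 0 (score: 1)

You can use a dual-axis chart. Ideally it will look like yours, since both axes use the same scale. You can plot directly from the pandas DataFrames.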

import matplotlib.pyplot as plt
import pandas as pd

# df1, df2 and the constant c are assumed to be defined already

# map each value of the z column to a color
color_map = {'z_val1':'red', 'z_val2':'blue', 'z_val3':'green', 'z_val4':'yellow'}

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()  # second y-axis sharing the same x-axis

# scatter plot of df1, colored by the z column
df1.plot.scatter(x = 'date',
                 y = 'CO2',
                 ax = ax1,
                 c = df1['z'].map(color_map))

# line plot of df2
df2.plot.line(x = 'date',
              y = 'MA_CO2',  # moving average in dataframe 2
              ax = ax2)

# horizontal line at y = c (constant value)
ax1.axhline(y = c, color='r', linestyle='-')

# adjust the layout so labels are not clipped, then display
plt.tight_layout()
plt.show()
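Because the question says the two y ranges coincide, a single shared axes may be enough, which avoids having to keep two y scales in sync. The following is a minimal sketch of that alternative, not the answer's twin-axis approach. The frames `df1` and `df2`, their column names `time`, `y`, `z`, and the constant `c` are synthetic stand-ins invented for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-ins for df1, df2 and c (all hypothetical)
rng = np.random.default_rng(0)
time = pd.date_range('2016-01-01', periods=48, freq='h')
df1 = pd.DataFrame({'time': time,
                    'y': rng.uniform(2300, 2500, size=48),
                    'z': rng.choice(['a', 'b', 'c'], size=48)})
df2 = pd.DataFrame({'time': time,
                    'y': rng.uniform(2350, 2450, size=48)})
c = 2400  # constant for the horizontal line

fig, ax = plt.subplots(figsize=(8, 4))

# One scatter call per z category, so each group gets its own
# color from the default cycle and its own legend entry
for z_value, group in df1.groupby('z'):
    ax.scatter(group['time'], group['y'], label=f'z = {z_value}')

# Line plot of df2 and the constant line, drawn on the same axes
ax.plot(df2['time'], df2['y'], color='black', label='df2')
ax.axhline(y=c, color='red', linestyle='--', label=f'y = {c}')

ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

Calling `plt.show()` afterwards displays the figure. If the two series ever need different scales, the twin-axis version above is the better fit.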

Answer 1 (score: 1)

Using matplotlib.pyplot

  • plt.hlines adds a horizontal line at a constant value
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# with synthetic data
np.random.seed(365)
dates = pd.bdate_range(start='1/1/2016', end='1/1/2019')  # business days only
data = {'CO2': np.random.randint(2000, 2500, size=len(dates)),
        'Date': dates}

# create the dataframe:
df = pd.DataFrame(data)

# verify Date is in datetime format
df['Date'] = pd.to_datetime(df['Date'])

# set Date as index so .rolling can be used
df.set_index('Date', inplace=True)

# add rolling mean
df['rolling'] = df['CO2'].rolling('365D').mean()

# plot the data
plt.figure(figsize=(8, 8))
plt.scatter(df.index, df['CO2'], label='data')
plt.plot(df.index, df['rolling'], color='black', label='365 day rolling mean')
plt.hlines(df['CO2'].max(), xmin=df.index.min(), xmax=df.index.max(), color='red', linestyles='dashed', label='Max')
plt.hlines(df['CO2'].mean(), xmin=df.index.min(), xmax=df.index.max(), color='green', linestyles='dashed', label='Mean')
plt.xticks(rotation=45)  # angle in degrees
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.show()

Plot with the synthetic data:

[image: plot of the synthetic data]
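The synthetic data above is daily (business days), while the question describes hourly readings and wants the first average to cover a full year. A rough sketch of how that could look with hourly data follows. The values are random, the name `hourly` is made up, and a fixed 365-day window is assumed as an approximation of "one year" (2016 is a leap year, so it is not an exact calendar year).

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic hourly series over the question's date range (values are made up)
idx = pd.date_range('2016-01-01 00:00', '2019-09-30 23:00', freq='h')
rng = np.random.default_rng(365)
hourly = pd.DataFrame({'CO2': rng.integers(2000, 2500, size=len(idx))}, index=idx)

# A 365-day time window holds 365 * 24 hourly readings. Setting min_periods
# to that count leaves NaN until the window is full, so no partial-year
# averages appear at the start of the series.
hours_per_window = 365 * 24
hourly['rolling'] = hourly['CO2'].rolling('365D', min_periods=hours_per_window).mean()

fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(hourly.index, hourly['CO2'], s=2, alpha=0.3, label='hourly CO2')
ax.plot(hourly.index, hourly['rolling'], color='black', label='365-day rolling mean')
ax.axhline(hourly['CO2'].max(), color='red', linestyle='--', label='max')
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
fig.autofmt_xdate()
```

Here `ax.axhline` is used instead of `plt.hlines` because it spans the full width of the axes without needing `xmin` and `xmax`; either works for a constant line.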

Issue with the date format in the OP's data:

  • Use a regular expression to fix Date
  • Place the fixing code before df['Date'] = pd.to_datetime(df['Date'])
import re

# your data
                Date     CO2  Year
 0     01/14/2016 00  2415.9  2016
       01/14/2016 01  2416.5  2016
       01/14/2016 02  2429.8  2016
       01/14/2016 03  2421.5  2016
       01/14/2016 04  2422.2  2016

df['Date'] = df['Date'].apply(lambda x: (re.findall(r'\d{2}/\d{2}/\d{4}', x)[0]))

# fixed Date column
       Date     CO2  Year
 01/14/2016  2415.9  2016
 01/14/2016  2416.5  2016
 01/14/2016  2429.8  2016
 01/14/2016  2421.5  2016
 01/14/2016  2422.2  2016
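Note that the regular expression above keeps only the date part, so the hour in each entry is dropped. For hourly data, a variant along these lines would keep the hour. The `raw` series and the `'%m/%d/%Y %H'` format string are assumptions based on the sample rows shown in the question.

```python
import pandas as pd

# A few entries in the same shape as the question's Date column,
# including the stray "0     " prefix on the first one
raw = pd.Series(['0     01/14/2016 00', '01/14/2016 01', '01/14/2016 02'])

# Capture "MM/DD/YYYY HH" and discard anything before it
cleaned = raw.str.extract(r'(\d{2}/\d{2}/\d{4} \d{2})', expand=False)

# Parse with an explicit format so the trailing two digits are read as the hour
dates = pd.to_datetime(cleaned, format='%m/%d/%Y %H')
```

With the hour preserved, the resulting timestamps can be set as the index and used with a time-based `.rolling` window on the hourly readings.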