Question

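I think this must have something to do with missing units, or with the fact that my units are not numbered consecutively.

I have a df containing BatchIDs measured in different years, but many BatchIDs were only measured in one year, maybe two.

I would still like to plot the development of the samples over time.

Any ideas how to solve this? The output of df.to_dict() is here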

import pandas as pd
import numpy as np
import seaborn as sns  
import matplotlib.pyplot as plt  
sns.tsplot(df, time='Year', value='Quality', unit='BatchID', condition='Vegetable')

Answer 1

Yes, the problem is that rows are missing for some of your time points.
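A quick way to see the gaps is to pivot the data into one row per BatchID and one column per Year. The sketch below is only an illustration: it assumes the three sample rows printed further down in this answer stand in for the full df, and the names wide and missing_per_batch are made up for the example.

```python
import pandas as pd

# Assumption: only the three sample rows shown in this answer, not the full df.
df = pd.DataFrame({
    "BatchID": [31, 31, 95],
    "Farmer": ["Pepperidge", "Pepperidge", "Johnson"],
    "Quality": [0.55, 0.80, 0.50],
    "Vegetable": ["Potato", "Potato", "Carrot"],
    "Year": [9, 10, 6],
})

# One row per BatchID, one column per Year; NaN marks a missing measurement.
wide = df.pivot(index="BatchID", columns="Year", values="Quality")
print(wide)

# Number of years without a measurement for each batch.
missing_per_batch = wide.isna().sum(axis=1)
print(missing_per_batch)
```

Every NaN in wide is a (BatchID, Year) combination that has no row in df.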

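One possible solution is to groupby BatchID and reindex each group by the range created from the min and max Year. The missing values are then filled with ffill and bfill, which behave the same as the corresponding method values in fillna: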

So from:

print (df.head(3))
   BatchID      Farmer  Quality Vegetable  Year
0       31  Pepperidge     0.55    Potato     9
1       31  Pepperidge     0.80    Potato    10
2       95     Johnson     0.50    Carrot     6

you get these 24 rows:

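# The outer parentheses are needed so the chained calls can span several lines.
# df.Year.min() / df.Year.max() are taken over the whole df, so every batch
# gets the same range of years; gaps are filled forward, then backward.
df1 = (df.groupby('BatchID')
         .apply(lambda x: x.set_index('Year')
                           .reindex(range(df.Year.min(), df.Year.max() + 1))
                           .ffill()
                           .bfill())
         .reset_index(level=1)
         .reset_index(drop=True))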

print (df1.head(24))
    Year  BatchID      Farmer  Quality Vegetable
0      1     31.0  Pepperidge     0.55    Potato
1      2     31.0  Pepperidge     0.55    Potato
2      3     31.0  Pepperidge     0.55    Potato
3      4     31.0  Pepperidge     0.55    Potato
4      5     31.0  Pepperidge     0.55    Potato
5      6     31.0  Pepperidge     0.55    Potato
6      7     31.0  Pepperidge     0.55    Potato
7      8     31.0  Pepperidge     0.55    Potato
8      9     31.0  Pepperidge     0.55    Potato
9     10     31.0  Pepperidge     0.80    Potato
10    11     31.0  Pepperidge     0.80    Potato
11    12     31.0  Pepperidge     0.80    Potato
12     1     95.0     Johnson     0.50    Carrot
13     2     95.0     Johnson     0.50    Carrot
14     3     95.0     Johnson     0.50    Carrot
15     4     95.0     Johnson     0.50    Carrot
16     5     95.0     Johnson     0.50    Carrot
17     6     95.0     Johnson     0.50    Carrot
18     7     95.0     Johnson     0.50    Carrot
19     8     95.0     Johnson     0.50    Carrot
20     9     95.0     Johnson     0.50    Carrot
21    10     95.0     Johnson     0.50    Carrot
22    11     95.0     Johnson     0.50    Carrot
23    12     95.0     Johnson     0.50    Carrot
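For experimenting without the full df, here is a self-contained sketch of the same idea. It is a variant, not the exact code above: it builds the complete BatchID x Year grid with MultiIndex.from_product and fills within each batch via groupby(...).transform, which also keeps BatchID as an integer. It assumes only the three sample rows shown earlier, so the year range is 6 to 10 rather than 1 to 12; full_index and years are names chosen for the example.

```python
import pandas as pd

# Assumption: only the three sample rows shown in this answer, not the full df.
df = pd.DataFrame({
    "BatchID": [31, 31, 95],
    "Farmer": ["Pepperidge", "Pepperidge", "Johnson"],
    "Quality": [0.55, 0.80, 0.50],
    "Vegetable": ["Potato", "Potato", "Carrot"],
    "Year": [9, 10, 6],
})

# Every year between the overall min and max (here 6..10).
years = range(df["Year"].min(), df["Year"].max() + 1)

# All (BatchID, Year) combinations, whether measured or not.
full_index = pd.MultiIndex.from_product(
    [df["BatchID"].unique(), years], names=["BatchID", "Year"]
)

df1 = (
    df.set_index(["BatchID", "Year"])
      .reindex(full_index)                      # missing combinations become NaN rows
      .groupby(level="BatchID")
      .transform(lambda s: s.ffill().bfill())   # fill only within the same batch
      .reset_index()
)
print(df1)
```

Keep in mind that the filled rows are copies of the nearest real measurement for that batch, not new observations.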


sns.tsplot(df1, time='Year', value='Quality', unit='BatchID', condition='Vegetable')

