I have a dataset of about 2,400 rows in which some rows are missing values, as in the following example:
+------+------+-------+-------+--------+
| Name | Year | Value | Name2 | Value2 |
+------+------+-------+-------+--------+
| A | 2010 | 10 | | ... |
| A | 2011 | 10 | AA | ... |
| A | 2012 | 12 | AA | ... |
| A | 2013 | 14 | AA | ... |
| A | 2014 | 9 | AA | ... |
| A | 2015 | 13 | | ... |
| B | 2010 | 8 | | ... |
| B | 2011 | 10 | BB | ... |
| B | 2012 | 11 | BB | ... |
| B | 2013 | 12 | BB | ... |
| B | 2014 | 10 | | ... |
| B | 2015 | 11 | | ... |
| C | 2010 | 11 | CC | ... |
| C | 2011 | 10 | CC | ... |
| C | 2012 | 9 | CC | ... |
| C | 2013 | 8 | CC | ... |
| C | 2014 | 10 | CC | ... |
| C | 2015 | 10 | | ... |
| ... | ... | ... | ... | ... |
+------+------+-------+-------+--------+
I want to fill the missing values in the Name2 column with the correct value for each Name, so that the result looks like this:
+------+------+-------+-------+--------+
| Name | Year | Value | Name2 | Value2 |
+------+------+-------+-------+--------+
| A | 2010 | 10 | AA | ... |
| A | 2011 | 10 | AA | ... |
| A | 2012 | 12 | AA | ... |
| A | 2013 | 14 | AA | ... |
| A | 2014 | 9 | AA | ... |
| A | 2015 | 13 | AA | ... |
| B | 2010 | 8 | BB | ... |
| B | 2011 | 10 | BB | ... |
| B | 2012 | 11 | BB | ... |
| B | 2013 | 12 | BB | ... |
| B | 2014 | 10 | BB | ... |
| B | 2015 | 11 | BB | ... |
| C | 2010 | 11 | CC | ... |
| C | 2011 | 10 | CC | ... |
| C | 2012 | 9 | CC | ... |
| C | 2013 | 8 | CC | ... |
| C | 2014 | 10 | CC | ... |
| C | 2015 | 10 | CC | ... |
| ... | ... | ... | ... | ... |
+------+------+-------+-------+--------+
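I have already tried the fill() command, without success. Filling with the most recent non-NA value does not work, because it sometimes inserts the wrong value (for example, B 2010 would be filled with AA, carried over from the A rows above it)!
Can someone tell me how to do this?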
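To make the intended behaviour concrete, here is a minimal sketch in plain Python of filling within each Name group instead of across the whole column. It assumes every Name maps to exactly one Name2 value and that blanks are stored as None or an empty string. The function name fill_name2_by_group and the sample rows are hypothetical and only mirror a few lines of the example table.

```python
def fill_name2_by_group(rows, key="Name", target="Name2"):
    """Fill blank `target` cells with the value found in the same `key` group."""
    # First pass: remember the first non-blank target value seen for each key.
    known = {}
    for row in rows:
        value = row.get(target)
        if value:  # skips None and empty strings
            known.setdefault(row[key], value)

    # Second pass: fill blanks from the lookup; a group with no known
    # value keeps whatever it had (None or "").
    return [
        {**row, target: row.get(target) or known.get(row[key], row.get(target))}
        for row in rows
    ]


# Hypothetical sample taken from the example table above.
rows = [
    {"Name": "A", "Year": 2014, "Value": 9, "Name2": "AA"},
    {"Name": "A", "Year": 2015, "Value": 13, "Name2": None},
    {"Name": "B", "Year": 2010, "Value": 8, "Name2": None},
    {"Name": "B", "Year": 2011, "Value": 10, "Name2": "BB"},
]

filled = fill_name2_by_group(rows)
for row in filled:
    print(row["Name"], row["Year"], row["Name2"])
```

Because the lookup is keyed by Name, B 2010 picks up BB from the later B rows and never inherits AA from the A rows above it. If fill() here refers to tidyr's fill() (an assumption, since the question does not name the package), the same idea would be to group by Name first and fill in both directions, for example with .direction = "downup".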