Question

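I have a very long spreadsheet (CSV) in which over-long URLs have been moved from the "Website" column into one of several "Notes" columns. Unfortunately, the convention for marking such cells is not consistent (some say "see over", others might say "see over-long URL", but I believe all of them include "see over" in varying letter case). I'm using a very complicated pandas script to build a troublesome pivot table, but I'm still trying to get the hang of all the pandas shortcuts. How can I create a condition that moves a given row's Notes cell into its Website cell when the latter contains "see over"? In that case I'd also like to clear the Notes cell (in all other cases I want that cell left intact).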

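On a similar note, how can I create a condition so that if the value in column "Foo" says "Bar", I can skip writing that to the output and instead add a value in column "Foo" that says "Yes"?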

Answer 1

I don't think I'd try anything too cute for this: just figure out which rows need to be shifted, and then shift them. Starting from a frame like

>>> df
                      Website                               Notes Other
0    http://stackoverflow.com                 home away from home     a
1  http://mapleleafs.nhl.com/                                1967     b
2                    see over  http://www.example.com/not_so_long     c
3       http://www.colts.com/            the Luck of the Hoosiers     d

I'd do something like

>>> to_shift_over = df.Website.str.lower().str.contains("see over")
>>> df.loc[to_shift_over, "Website"] = df["Notes"]
>>> df.loc[to_shift_over, "Notes"] = ""

producing

>>> df
                              Website                     Notes Other
0            http://stackoverflow.com       home away from home     a
1          http://mapleleafs.nhl.com/                      1967     b
2  http://www.example.com/not_so_long                               c
3               http://www.colts.com/  the Luck of the Hoosiers     d
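
For anyone who wants to run this outside the interpreter, the session above can be reproduced as a standalone script. This is only a sketch: the frame is rebuilt by hand from the sample rows shown (all values assumed to be strings), whereas the real data would presumably come from `pd.read_csv`.

```python
import pandas as pd

# Sample frame rebuilt by hand from the rows shown above (all strings).
df = pd.DataFrame({
    "Website": [
        "http://stackoverflow.com",
        "http://mapleleafs.nhl.com/",
        "see over",
        "http://www.colts.com/",
    ],
    "Notes": [
        "home away from home",
        "1967",
        "http://www.example.com/not_so_long",
        "the Luck of the Hoosiers",
    ],
    "Other": ["a", "b", "c", "d"],
})

# Boolean mask: True for rows whose Website cell contains "see over" in any case.
to_shift_over = df["Website"].str.lower().str.contains("see over")

# Copy Notes into Website for the flagged rows only (pandas aligns on the index),
# then blank out the Notes cell for those same rows.
df.loc[to_shift_over, "Website"] = df["Notes"]
df.loc[to_shift_over, "Notes"] = ""

print(df)
```

Rows that are not flagged by the mask keep both their Website and Notes values untouched.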

Using .str on a Series is a convenient way to perform vectorized string operations on it:

>>> df["Website"].str
<pandas.core.strings.StringMethods object at 0xa9dcfac>
>>> df["Website"].str.lower()
0      http://stackoverflow.com
1    http://mapleleafs.nhl.com/
2                      see over
3         http://www.colts.com/
Name: Website, dtype: object
>>> df["Website"].str.lower().str.contains("see over")
0    False
1    False
2     True
3    False
Name: Website, dtype: bool
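
A possible variant, not part of the answer itself: `str.contains` also accepts `case` and `na` arguments, so the lowercasing step can be folded into the call, and empty cells can be treated as non-matches. The sample values below are made up for illustration, including the missing cell, which a real CSV could plausibly contain.

```python
import pandas as pd

# Made-up Website values: mixed-case markers plus one missing cell.
websites = pd.Series(
    ["http://stackoverflow.com", "See Over-long URL", "see over", None],
    name="Website",
)

# case=False ignores letter case; na=False makes missing cells count as
# "no match" instead of producing a missing value in the mask;
# regex=False treats the pattern as a literal substring.
mask = websites.str.contains("see over", case=False, na=False, regex=False)

print(mask)
```

Without `na=False`, the missing cell would come through as a missing value in the mask, which cannot be used directly for boolean indexing.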

Then we can use that boolean array to index into df with .loc:

>>> df.loc[to_shift_over]
    Website                               Notes Other
2  see over  http://www.example.com/not_so_long     c
>>> df.loc[to_shift_over, "Website"]
2    see over
Name: Website, dtype: object
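
The same mask-then-assign pattern might also cover the second part of the question. The sketch below assumes the intent is simply to replace "Bar" with "Yes" in the "Foo" column; the frame and its values are hypothetical, and if the "Yes" is meant to land in a different column, only the column name in the `.loc` assignment would change.

```python
import pandas as pd

# Hypothetical frame; only the "Foo" column name comes from the question.
df = pd.DataFrame({"Foo": ["Bar", "Baz", "Bar"], "Other": ["a", "b", "c"]})

# Boolean mask: True where Foo is exactly "Bar".
is_bar = df["Foo"] == "Bar"

# Overwrite only the flagged cells, leaving every other row as it was.
df.loc[is_bar, "Foo"] = "Yes"

print(df)
```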

Conditional cell replacement in Pandas, Python 2.7
