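How to multiply (duplicate) rows under a certain condition with Pandas?

Date: 2017-11-01 06:07:39

Tags: python-3.x pandas

How can I multiply rows under a certain condition with pandas? The condition is simply that the name ends with Pref. The sort order does not matter.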

import pandas as pd

if __name__ == '__main__':

    df = pd.DataFrame({"area": ["Aomori Pref.", "Saitama", "GifuPref."],
                       "x": [30, 40, 55],
                       "y": ["l", "m", "n"]})

# I want to get:
#    area         x     y
# 0  Aomori       30    l
# 1  Aomori Pref. 30    l
# 2  Saitama      40    m
# 3  Gifu         55    n
# 4  GifuPref.    55    n


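2 answers:

Answer 0: (score: 3)

Use replace to strip the trailing Pref. from the values, then add a new column b with mask, which leaves NaN for the values that did not match: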

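# regex=True is required on recent pandas versions, where str.replace is literal by default
df1 = df['area'].str.replace(r'\s*(Pref.$)', '', regex=True).to_frame('a')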
df1['b'] = df['area'].mask(df1['a'] == df['area'])

print (df1)
         a             b
0   Aomori  Aomori Pref.
1  Saitama           NaN
2     Gifu     GifuPref.
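
The role of mask in this step can be seen in isolation. The following is a minimal sketch, assuming two hand-written Series (original and stripped are illustrative names, not part of the answer) that stand in for df['area'] and df1['a']:

```python
import pandas as pd

# Hypothetical stand-ins for df['area'] and the stripped column df1['a'].
original = pd.Series(["Aomori Pref.", "Saitama", "GifuPref."])
stripped = pd.Series(["Aomori", "Saitama", "Gifu"])

# mask() replaces entries with NaN wherever the condition is True,
# so only the names that were actually shortened keep their value.
changed_only = original.mask(stripped == original)
print(changed_only)
```

Rows whose name did not change become NaN, which is what lets the next step skip them.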

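Then create a Series with stack, use rename to give the Series the name of the new column, and finally remove the second level of the MultiIndex with reset_index. The explicit dropna discards the NaN of the non-matching row in case stack keeps it, which depends on the pandas version:

s = df1.stack().dropna().rename('area').reset_index(level=1, drop=True)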
print (s)
0          Aomori
0    Aomori Pref.
1         Saitama
2            Gifu
2       GifuPref.
Name: area, dtype: object

Finally, drop the original column area, join s, and add reset_index for a unique index:

df2 = df.drop(columns='area').join(s).reset_index(drop=True)[df.columns]
print (df2)
           area   x  y
0        Aomori  30  l
1  Aomori Pref.  30  l
2       Saitama  40  m
3          Gifu  55  n
4     GifuPref.  55  n
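
The three steps above can be collected into one reusable function. This is only a sketch under a few assumptions: the function name expand_pref_rows and its parameters are hypothetical, the pattern escapes the period (a slightly stricter variant of the one used above), and dropna is called explicitly because stack does not drop NaN in every pandas version.

```python
import pandas as pd


def expand_pref_rows(df, column="area", pattern=r"\s*Pref\.$"):
    """Return df with one extra row (suffix stripped) per matching row."""
    stripped = df[column].str.replace(pattern, "", regex=True)
    pairs = pd.DataFrame({
        "a": stripped,
        # Keep the original name only where stripping changed something.
        "b": df[column].mask(stripped == df[column]),
    })
    # One entry per non-missing cell, indexed by the original row label.
    s = (pairs.stack()
              .dropna()
              .rename(column)
              .reset_index(level=1, drop=True))
    out = df.drop(columns=column).join(s).reset_index(drop=True)
    return out[df.columns]


df = pd.DataFrame({"area": ["Aomori Pref.", "Saitama", "GifuPref."],
                   "x": [30, 40, 55],
                   "y": ["l", "m", "n"]})
result = expand_pref_rows(df)
print(result)
```

Wrapping the steps this way keeps the intermediate frame out of the caller's namespace and makes the column name and pattern easy to swap.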

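The regex \s*(Pref.$) means: \s* matches whitespace zero or more times, the string inside () is matched next, and $ anchors the match at the end of the string. The unescaped . matches any character, so Pref\. is the stricter way to require a literal period.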
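
To check what the pattern does outside pandas, here is a small sketch with Python's re module. The name "Osaka Prefx" is a made-up input, and the strict pattern with an escaped period is an assumption about what was intended, not part of the answer:

```python
import re

loose = re.compile(r"\s*(Pref.$)")   # "." matches any single character
strict = re.compile(r"\s*Pref\.$")   # "\." matches only a literal period

names = ["Aomori Pref.", "GifuPref.", "Saitama", "Osaka Prefx"]
for name in names:
    # Show the input next to the result of stripping with each pattern.
    print(repr(name), "->", repr(loose.sub("", name)), repr(strict.sub("", name)))
```

Both patterns agree on the question's data; they differ only on inputs such as the made-up "Osaka Prefx", where the last character is not a period.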

Answer 1: (score: 1)

# Raw string avoids invalid-escape warnings; regex=True is needed on recent pandas
pattern = r"\s?Pref\.$"
m = df.area.str.contains(pattern, regex=True)
tmp = df.copy()
tmp.loc[m,"area"] = tmp.area.str.replace(pattern, "", regex=True)
(pd.concat([df, tmp])
   .sort_values("area")
   .drop_duplicates()
   .reset_index(drop=True))

           area   x  y
0        Aomori  30  l
1  Aomori Pref.  30  l
2          Gifu  55  n
3     GifuPref.  55  n
4       Saitama  40  m
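
The expected output in the question keeps each shortened name directly above its original row. A minimal sketch of the same concat idea that reproduces that order, assuming a stable sort on the original index is acceptable (the mergesort choice and the name ordered are my own, not from the answer):

```python
import pandas as pd

df = pd.DataFrame({"area": ["Aomori Pref.", "Saitama", "GifuPref."],
                   "x": [30, 40, 55],
                   "y": ["l", "m", "n"]})

pattern = r"\s?Pref\.$"
m = df["area"].str.contains(pattern, regex=True)
tmp = df.copy()
tmp.loc[m, "area"] = tmp["area"].str.replace(pattern, "", regex=True)

# tmp goes first so that, after a stable sort on the original index,
# each shortened name lands directly above the row it came from.
ordered = (pd.concat([tmp, df])
             .sort_index(kind="mergesort")  # stable: ties keep concat order
             .drop_duplicates()
             .reset_index(drop=True))
print(ordered)
```

Since the question says the order does not matter, this is purely cosmetic; sort_values("area") as in the answer is equally valid.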