Pandas: replace

Time: 2016-07-25 10:00:54

Tags: python pandas replace dataframe

I have a DataFrame df and I want to replace the values in it.

import pandas as pd

df = pd.read_csv('visits_april.csv')

df2 = pd.read_csv('participants.csv')

url, date
aliexpress.com/payment/payment-result.htm...  2016-04-30 22:57:41   
shoppingcart.aliexpress.com/order/confirm_orde...  2016-04-30 23:05:21   
shoppingcart.aliexpress.com/order/confirm_orde...  2016-04-30 23:05:33   
shoppingcart.aliexpress.com/order/confirm_orde...  2016-04-30 23:11:37   
aliexpress.com/payment/payment-result.htm...  2016-04-30 23:12:07   
shoppingcart.aliexpress.com/order/confirm_orde...  2016-04-30 23:15:31   
shoppingcart.aliexpress.com/order/confirm_orde...  2016-04-30 23:15:37   
aliexpress.com/payment/payment-result.htm...  2016-04-30 23:16:13   
            kupivip.ru/shop/checkout/confirmation  2016-04-30 23:18:15

If a string from df contains a string from df2, I want to replace it with the string from df2.

visits
aliexpress.com
ozon.ru
wildberries.ru
lamoda.ru
mvideo.ru
eldorado.ru
ulmart.ru
ebay.com
kupivip.ru

Desired output

url
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
kupivip.ru

I tried df[df['url'].isin(df2['visits'])], but it returns an Empty DataFrame.

1 answer:

Answer 0 (score: 2)

You can use str.extract with the values of column visits converted by tolist and joined with | (regex or):

L = df2.visits.tolist()
print (L)
['aliexpress.com', 'ozon.ru', 'wildberries.ru', 'lamoda.ru', 
'mvideo.ru', 'eldorado.ru', 'ulmart.ru', 'ebay.com', 'kupivip.ru']

print (df.url.str.extract('(' + '|'.join(L) + ')', expand=False))
0    aliexpress.com
1    aliexpress.com
2    aliexpress.com
3    aliexpress.com
4    aliexpress.com
5    aliexpress.com
6    aliexpress.com
7    aliexpress.com
8        kupivip.ru
Name: url, dtype: object
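
A possible refinement, sketched under a few assumptions: the two small frames below are hypothetical stand-ins for the CSV files (the path "/order/example" and the unlisted "example.org" row are made up for illustration), and re.escape is my addition, not part of the answer above. It shows why isin returns nothing (it tests for exact equality, not substrings), escapes each domain so that "." matches only a literal dot, and writes the result back to the url column.

```python
import re

import pandas as pd

# Hypothetical stand-ins for visits_april.csv and participants.csv.
df = pd.DataFrame({
    "url": [
        "aliexpress.com/payment/example",            # made-up path
        "shoppingcart.aliexpress.com/order/example", # made-up path
        "kupivip.ru/shop/checkout/confirmation",
        "example.org/unlisted",                      # domain not in df2
    ]
})
df2 = pd.DataFrame({"visits": ["aliexpress.com", "ozon.ru", "kupivip.ru"]})

# isin() compares whole values, so a full URL never equals a bare domain.
exact = df[df["url"].isin(df2["visits"])]
print(exact.empty)

# Escape each domain, then join the alternatives with "|" (regex or).
pattern = "(" + "|".join(re.escape(v) for v in df2["visits"]) + ")"

# extract() returns the first matching domain, or NaN when none matches.
df["url"] = df["url"].str.extract(pattern, expand=False)
print(df)
```

Rows whose URL contains none of the listed domains end up as NaN, so you may want to handle them with fillna or dropna depending on what the rest of your analysis expects.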