I have a DataFrame df and I want to replace the values in it.
import pandas as pd
df = pd.read_csv('visits_april.csv')
df2 = pd.read_csv('participants.csv')
url, date
aliexpress.com/payment/payment-result.htm... 2016-04-30 22:57:41
shoppingcart.aliexpress.com/order/confirm_orde... 2016-04-30 23:05:21
shoppingcart.aliexpress.com/order/confirm_orde... 2016-04-30 23:05:33
shoppingcart.aliexpress.com/order/confirm_orde... 2016-04-30 23:11:37
aliexpress.com/payment/payment-result.htm... 2016-04-30 23:12:07
shoppingcart.aliexpress.com/order/confirm_orde... 2016-04-30 23:15:31
shoppingcart.aliexpress.com/order/confirm_orde... 2016-04-30 23:15:37
aliexpress.com/payment/payment-result.htm... 2016-04-30 23:16:13
kupivip.ru/shop/checkout/confirmation 2016-04-30 23:18:15
If a string from df contains a string from df2, I want to replace it with the string from df2.
visits
aliexpress.com
ozon.ru
wildberries.ru
lamoda.ru
mvideo.ru
eldorado.ru
ulmart.ru
ebay.com
kupivip.ru
Desired output:
url
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
aliexpress.com
kupivip.ru
I tried df[df['url'].isin(df2['visits'])], but it returns an Empty DataFrame.
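A likely explanation for the empty result, sketched under assumptions: isin checks whether each whole value equals one of the listed values, so it never matches a substring. The two small frames below are made-up stand-ins for the CSV files (the path 'aliexpress.com/item/1' is hypothetical), used only to show the difference between an exact match and a substring check.

```python
import pandas as pd

# Hypothetical stand-ins for the real CSV contents.
df = pd.DataFrame({'url': ['aliexpress.com/item/1',
                           'kupivip.ru/shop/checkout/confirmation']})
df2 = pd.DataFrame({'visits': ['aliexpress.com', 'kupivip.ru']})

# isin compares whole values: no url is exactly equal to a bare domain.
exact = df['url'].isin(df2['visits'])

# A substring check against one domain; regex=False treats '.' literally.
contains = df['url'].str.contains('aliexpress.com', regex=False)

print(df[exact])   # no rows, since no url equals a domain exactly
print(contains)
```

Because every url carries a path after the domain, the exact comparison is False for every row, which is consistent with the Empty DataFrame described above.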
Answer 0 (score: 2)
You can use extract with a list created by tolist from the visits column, joined by | (regex or):
L = df2.visits.tolist()
print (L)
['aliexpress.com', 'ozon.ru', 'wildberries.ru', 'lamoda.ru',
'mvideo.ru', 'eldorado.ru', 'ulmart.ru', 'ebay.com', 'kupivip.ru']
print (df.url.str.extract('(' + '|'.join(L) + ')', expand=False))
0 aliexpress.com
1 aliexpress.com
2 aliexpress.com
3 aliexpress.com
4 aliexpress.com
5 aliexpress.com
6 aliexpress.com
7 aliexpress.com
8 kupivip.ru
Name: url, dtype: object
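One possible refinement, offered as a sketch rather than as part of the original answer: in a regular expression an unescaped '.' matches any character, so a pattern built from raw domain names could in principle match more than intended. Escaping each domain with re.escape avoids that. The frames below are made-up stand-ins for the CSV files, and the paths 'aliexpress.com/item/1', 'shoppingcart.aliexpress.com/cart' and 'example.org/page' are hypothetical.

```python
import re
import pandas as pd

# Hypothetical stand-ins for the real CSV contents.
df = pd.DataFrame({'url': [
    'aliexpress.com/item/1',
    'shoppingcart.aliexpress.com/cart',
    'kupivip.ru/shop/checkout/confirmation',
    'example.org/page',
]})
df2 = pd.DataFrame({'visits': ['aliexpress.com', 'ozon.ru', 'kupivip.ru']})

# Escape each domain so '.' matches a literal dot, then join with '|' (regex OR).
pattern = '(' + '|'.join(re.escape(s) for s in df2['visits']) + ')'

# Replace each url with the first domain found in it; rows without a match become NaN.
df['url'] = df['url'].str.extract(pattern, expand=False)
result = df['url'].tolist()
print(df)
```

Rows whose url contains none of the domains end up as NaN, so a follow-up such as dropna or fillna may be needed depending on how unmatched visits should be handled.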