How to replace part of a string value in one column using another column.
My dataset is:
ID Product Name Size ID Size Name
1 24 Mantra Ancient Grains Foxtail Millet 500 gm 1 500 gm
2 24 Mantra Ancient Grains Little Millet 500 gm 2 500 gm
3 24 Mantra Naturals Almonds 100 gm 3 100 gm
4 24 Mantra Naturals Kismis 100 gm 4 100 gm
5 24 Mantra Organic Ajwain 100 gm 5 100 gm
6 24 Mantra Organic Apple Blast Drink 250 ml 6 250 ml
7 24 Mantra Organic Apple Juice 1 Ltr Tetra Pack 7 1000 ml
8 24 Mantra Organic Apple Juice 200 ml 8 200 ml
9 24 Mantra Organic Assam Tea 100 gm 9 100 gm
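The requirement is this: when the Product Name column holds 24 Mantra Ancient Grains Foxtail Millet 500 gm and the Size Name column holds 500 Gm, my output should be 24 Mantra Ancient Grains Foxtail Millet.
If the Size Name string is contained in Product Name, remove it from Product Name, ignoring case; otherwise no action is needed.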
Answer 0 (score: 1)
Assuming you want to replace the Size Name value with None whenever it is a substring of Product Name:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Product Name': ['24 Mantra Ancient Grains Foxtail Millet 500 gm', '24 Mantra Ancient Grains Little Millet 500 gm ', '24 Mantra Naturals Kismis 100 gm'],
    'Size ID': [1, 2, 3],
    'Size Name': ['500 gm', '500 gm', '200 gm']
})
# Flag rows whose Size Name occurs inside Product Name (case-sensitive test).
df['same'] = df.apply(lambda x: x['Size Name'] in x['Product Name'], axis=1)
# Replace Size Name with None on the flagged rows, then drop the helper column.
df['Size Name'] = np.where(df['same'], None, df['Size Name'])
df.drop(columns=['same'], inplace=True)
df
Product Name Size ID Size Name
0 24 Mantra Ancient Grains Foxtail Millet 500 gm 1 None
1 24 Mantra Ancient Grains Little Millet 500 gm 2 None
2 24 Mantra Naturals Kismis 100 gm 3 200 gm
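The substring check above is case-sensitive, so a Size Name of 500 Gm would not match 500 gm inside the product name. Here is a minimal sketch of a case-insensitive variant, assuming plain ASCII text and lower-casing both sides before the test. The two-row frame and the name `contained` are illustrative and not part of the answer.

```python
import pandas as pd

# Illustrative data: the first Size Name differs from the product text only by case.
df = pd.DataFrame({
    'Product Name': ['24 Mantra Ancient Grains Foxtail Millet 500 gm',
                     '24 Mantra Naturals Kismis 100 gm'],
    'Size Name': ['500 Gm', '200 gm'],
})

# Row-wise substring test after lower-casing both strings.
contained = [
    size.lower() in name.lower()
    for name, size in zip(df['Product Name'], df['Size Name'])
]

# Blank out Size Name only on rows where it already appears in Product Name.
df.loc[contained, 'Size Name'] = None
```

Rows where the size does not occur in the product name keep their original Size Name.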
Answer 1 (score: 1)
IIUC, you can use apply() and replace():
df['Product Name'] = df.apply(lambda x: x['Product Name'].replace(x['Size Name'], '').strip(), axis=1)
This yields:
ID Product Name Size ID Size Name
0 1 24 Mantra Ancient Grains Foxtail Millet 1 500 gm
1 2 24 Mantra Ancient Grains Little Millet 2 500 gm
2 3 24 Mantra Naturals Almonds 3 100 gm
3 4 24 Mantra Naturals Kismis 4 100 gm
4 5 24 Mantra Organic Ajwain 5 100 gm
5 6 24 Mantra Organic Apple Blast Drink 6 250 ml
6 7 24 Mantra Organic Apple Juice 1 Ltr Tetra Pack 7 1000 ml
7 8 24 Mantra Organic Apple Juice 8 200 ml
8 9 24 Mantra Organic Assam Tea 9 100 gm
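Note that str.replace() is case-sensitive, while the question asks for the match to ignore case (500 Gm versus 500 gm). One possible sketch uses re.sub with re.IGNORECASE instead. The `strip_size` helper and the two sample rows are hypothetical, chosen to show one row that changes and one that does not.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'Product Name': ['24 Mantra Ancient Grains Foxtail Millet 500 gm',
                     '24 Mantra Organic Apple Juice 1 Ltr Tetra Pack'],
    'Size Name': ['500 Gm', '1000 ml'],
})

def strip_size(row):
    # re.escape treats the size as literal text, so characters such as '.' are not read as regex syntax.
    pattern = re.escape(row['Size Name'])
    # Remove every case-insensitive occurrence, then trim leftover edge whitespace.
    cleaned = re.sub(pattern, '', row['Product Name'], flags=re.IGNORECASE)
    return cleaned.strip()

df['Product Name'] = df.apply(strip_size, axis=1)
```

A product name that does not contain its size, such as the 1 Ltr Tetra Pack row, is left unchanged.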
Answer 2 (score: 0)
Assuming your size name is always the last column, this is what I think you need:
import re
data = '''ID Product Name Size ID Size Name
1 24 Mantra Ancient Grains Foxtail Millet 500 gm 1 500 gm
2 24 Mantra Ancient Grains Little Millet 500 gm 2 500 gm
3 24 Mantra Naturals Almonds 100 gm 3 100 gm
4 24 Mantra Naturals Kismis 100 gm 4 100 gm
5 24 Mantra Organic Ajwain 100 gm 5 100 gm
6 24 Mantra Organic Apple Blast Drink 250 ml 6 250 ml
7 24 Mantra Organic Apple Juice 1 Ltr Tetra Pack 7 1000 ml
8 24 Mantra Organic Apple Juice 200 ml 8 200 ml
9 24 Mantra Organic Assam Tea 100 gm 9 100 gm
'''
def cleaner(txt):
    # The first line is the header; the remaining lines are products.
    temp = txt.strip().split('\n')
    fixed_products = [temp[0]]
    for p in temp[1:]:
        # The size name is assumed to be the trailing "<number> <unit>" pair.
        res = re.search(r'(\d+\s\w*)$', p)
        if res is None:
            fixed_products.append(p)  # no trailing size: keep the line as-is
            continue
        match = res.group(0)
        ignore_from = len(match)
        # Look for the same size text earlier in the line.
        found_at = p[:-ignore_from].find(match)
        if found_at > -1:  # we found a duplicate
            p = p.replace(match, '', 1)
        fixed_products.append(p)
    return '\n'.join(fixed_products)

# Example
# print(cleaner(data))
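To see what the regular expression picks out, here is a small sketch run on two lines taken from the data above. The helper name `has_duplicate_size` is illustrative and does not appear in the answer.

```python
import re

def has_duplicate_size(line):
    # Capture the trailing "<digits> <word>" pair, e.g. "200 ml".
    match = re.search(r'(\d+\s\w*)$', line).group(0)
    # Search for the same text in the part of the line before that pair.
    head = line[:-len(match)]
    return match, head.find(match) > -1

dup = has_duplicate_size('8 24 Mantra Organic Apple Juice 200 ml 8 200 ml')
no_dup = has_duplicate_size('7 24 Mantra Organic Apple Juice 1 Ltr Tetra Pack 7 1000 ml')
```

Here `dup` is `('200 ml', True)` and `no_dup` is `('1000 ml', False)`, so only the first line would have its size text removed.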