I have a pandas DataFrame like this:
Index | Line Item | Insertion Order | Creative Type
_________________________________________________________________________________________________
1 | blbl 33 dEs '300x600' Q3 | hello 444 | UNKNOWN
2 | QQQ4 Hello trueview Apple | something 68793274 | UNKNOWN
3 | A useless string | pre-roll Video <10 tttt 89 CASIO | UNKNOWN
4 | Something not in dict | Neither here | UNKNOWN
And a dictionary like this:
dct = {
'RISING STARS': ['300x600', 'Box 300x600', '300x250', 'Box 300x250', 'Classic Skin', 'Main Banner', 'Half Banner', 'Masthead', 'Push Bar', 'Strip', 'In Image', 'Mix formati display rising'],
'VIDEO': ['trueview', 'Video Banner', 'Video in Picture', 'Videobox', 'Mid-roll Video', 'Pre-roll+Inread', 'Pre-roll Video <10', 'Pre-roll Video =10', 'Pre-roll Video =15', 'Pre-roll Video =20', 'Pre-roll Video =30' ,'Pre-roll Video >30','Inread / Intext / Outstream','Mix formati video','Post-roll Video','Inread XXX (Landscape/Vertical/Square)', 'Pre-roll Video Sponsored Session' ,'Pre-roll Video Viewmax' ,'Pre-roll Video Takeover']}
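I want to replace the values in the Creative Type column of the DataFrame: if the value in the Line Item or Insertion Order column matches one of the dictionary's values, the corresponding row of Creative Type should take the name of the dictionary key. If nothing matches, the corresponding row of Creative Type should receive the value NaN.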
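Expected output:
Index | Line Item | Insertion Order | Creative Type
_________________________________________________________________________________________________
1 | blbl 33 dEs '300x600' Q3 | hello 444 | RISING STARS
2 | QQQ4 Hello trueview Apple | something 68793274 | VIDEO
3 | A useless string | pre-roll Video <10 tttt 89 CASIO | VIDEO
4 | Something not in dict | Neither here | NaN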
What is the simplest way to do this? (Ideally with as little computational overhead as possible.)
Answer 0 (score: 1)
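Use a dictionary comprehension to build a replacement dictionary by inverting the key-value pairs of dct, i.e. map each value in every list to its corresponding key. Then use Series.replace to replace the strings from the combined Line Item and Insertion Order columns with the matching value from the replacement dictionary, and finally mask the strings that could not be replaced: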
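# Invert dct: one case-insensitive, whole-string regex per term, mapped to its key.
r = {rf'(?i).*?\b{z}\b.*':x for x, y in dct.items() for z in y}
# Combine both text columns so a single pass covers either of them.
s = df['Line Item'].add(':' + df['Insertion Order'])
# Replace matching strings with the key; rows left unchanged become NaN.
df['Creative Type'] = s.replace(r, regex=True).mask(lambda x: x.eq(s))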
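
As a side note, the same inverted-dictionary idea can be sketched with the standard re module alone, which makes the matching rule easy to test in isolation. This is only an illustrative sketch: the shortened dct and the names term_to_key, pattern and classify are assumptions for the example, not part of the answer above. It uses re.escape so that characters such as + or < in the terms are matched literally, and lookarounds instead of \b so that terms ending in punctuation can still match.

```python
import re

# Shortened copy of the question's dictionary, for illustration only.
dct = {
    'RISING STARS': ['300x600', 'Box 300x600', '300x250', 'Main Banner'],
    'VIDEO': ['trueview', 'Pre-roll+Inread', 'Pre-roll Video <10', 'Mid-roll Video'],
}

# Invert the dictionary: each lower-cased term maps to its category key.
term_to_key = {term.lower(): key for key, terms in dct.items() for term in terms}

# One case-insensitive alternation over all escaped terms.
# Longer terms come first so they win over shorter terms starting at the same position.
terms_by_length = sorted(term_to_key, key=len, reverse=True)
pattern = re.compile(
    r'(?<!\w)(' + '|'.join(re.escape(t) for t in terms_by_length) + r')(?!\w)',
    re.IGNORECASE,
)

def classify(text):
    """Return the category key of the first matching term, or None if nothing matches."""
    match = pattern.search(text)
    return term_to_key[match.group(1).lower()] if match else None
```

Assuming s is the combined column from the snippet above, something like df['Creative Type'] = s.map(classify) should apply this row by row; unmatched rows come back as None, which pandas treats as missing. Whether this is faster than the replace approach has not been measured here.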