Match dictionary values against the values of two DataFrame columns and replace a third column's values with the dictionary keys

Time: 2020-09-21 13:58:55

Tags: python pandas dataframe dictionary

I have a pandas DataFrame like this:

Index | Line Item                   |        Insertion Order                    | Creative Type
_________________________________________________________________________________________________
1     | blbl 33 dEs '300x600' Q3    | hello 444                                 | UNKNOWN
2     | QQQ4 Hello trueview Apple   | something 68793274                        | UNKNOWN
3     |   A useless  string         | pre-roll Video <10 tttt 89 CASIO          | UNKNOWN
4     | Something not in dict       | Neither here                              | UNKNOWN

And a dictionary like this:

 dct = {
'RISING STARS': ['300x600', 'Box 300x600', '300x250', 'Box 300x250', 'Classic Skin', 'Main Banner', 'Half Banner', 'Masthead', 'Push Bar', 'Strip', 'In Image', 'Mix formati display rising'],
'VIDEO': ['trueview', 'Video Banner', 'Video in Picture', 'Videobox', 'Mid-roll Video', 'Pre-roll+Inread', 'Pre-roll Video <10', 'Pre-roll Video =10', 'Pre-roll Video =15', 'Pre-roll Video =20', 'Pre-roll Video =30' ,'Pre-roll Video >30','Inread / Intext / Outstream','Mix formati video','Post-roll Video','Inread XXX (Landscape/Vertical/Square)', 'Pre-roll Video Sponsored Session' ,'Pre-roll Video Viewmax' ,'Pre-roll Video Takeover']}

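I want to replace the values in the Creative Type column of the DataFrame: if the value of column Line Item or Insertion Order matches one of the dictionary's values, the corresponding row of Creative Type should take the name of the dictionary key. If there is no match, the corresponding row of Creative Type should receive the value NaN.

Expected output:

Index | Line Item                   |        Insertion Order                    | Creative Type
_________________________________________________________________________________________________
1     | blbl 33 dEs '300x600' Q3    | hello 444                                 | RISING STARS
2     | QQQ4 Hello trueview Apple   | something 68793274                        | VIDEO
3     |   A useless  string         | pre-roll Video <10 tttt 89 CASIO          | VIDEO
4     | Something not in dict       | Neither here                              | NaN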

What is the simplest way to do this? (Ideally with as little computational overhead as possible.)

1 answer:

Answer 0: (score: 1)

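Create a replacement dictionary by inverting the key-value pairs of the given dictionary, i.e. map each value in the lists to its corresponding key. Then use Series.replace to substitute the strings from the combined columns Line Item and Insertion Order with the corresponding values from the replacement dictionary wherever there is a match. Finally, mask the strings that could not be replaced: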

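# Invert dct: one case-insensitive, whole-word regex per list entry, mapped to its key
r = {rf'(?i).*?\b{z}\b.*': x for x, y in dct.items() for z in y}
# Join both columns so a single pass searches Line Item and Insertion Order
s = df['Line Item'].add(':' + df['Insertion Order'])
# Replace matching rows with the key; rows left unchanged had no match and become NaN
df['Creative Type'] = s.replace(r, regex=True).mask(lambda x: x.eq(s))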
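
To try the steps above end to end, here is a minimal self-contained sketch built on the sample rows from the question. Two things in it are assumptions rather than part of the original answer: the dictionary is shortened to a few entries for brevity, and each entry is passed through re.escape so that characters such as + or ( in the full lists would be matched literally instead of being read as regex syntax. The variable names pattern_to_key, combined and replaced are illustrative only.

```python
import re

import pandas as pd

# Sample rows from the question (whitespace trimmed).
df = pd.DataFrame({
    'Line Item': ["blbl 33 dEs '300x600' Q3", 'QQQ4 Hello trueview Apple',
                  'A useless string', 'Something not in dict'],
    'Insertion Order': ['hello 444', 'something 68793274',
                        'pre-roll Video <10 tttt 89 CASIO', 'Neither here'],
    'Creative Type': ['UNKNOWN'] * 4,
})

# Shortened version of the question's dictionary (assumption: subset only).
dct = {
    'RISING STARS': ['300x600', '300x250'],
    'VIDEO': ['trueview', 'Pre-roll Video <10'],
}

# Invert the dictionary: one case-insensitive, whole-word pattern per entry.
# re.escape is an addition here, not part of the original answer.
pattern_to_key = {
    rf'(?i).*?\b{re.escape(value)}\b.*': key
    for key, values in dct.items()
    for value in values
}

# Combine both columns so one pass covers Line Item and Insertion Order.
combined = df['Line Item'] + ':' + df['Insertion Order']

# Matching strings are replaced by the dictionary key; strings that are
# unchanged afterwards had no match and are masked to NaN.
replaced = combined.replace(pattern_to_key, regex=True)
df['Creative Type'] = replaced.mask(replaced.eq(combined))

print(df[['Line Item', 'Insertion Order', 'Creative Type']])
```

On this sample, rows 1 to 3 should be labelled RISING STARS, VIDEO and VIDEO, and row 4 should be NaN. Note that with \b around an escaped entry, a value that begins or ends with a non-word character (for example one ending in a closing parenthesis) only matches when a word character is adjacent, so the boundary handling may need adjusting for the full lists.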