I have the following in the Message_A and Message_B columns of a pandas DataFrame:
Message_A
"(Live Storage: 20.00 included in Plan for $15.00 - Exceess of 10.0 @ $6.0)"
"(Live Storage: 5.00 included in Plan for $5.00 - Exceess of 11.0 @ $40.0)"
"(Live Storage: 10.0 out of 150.00 included in Plan for $10.00)"
"(Live Storage: 146.0 out of 200.00 included in Plan for $150.00)"
"(Live Storage: 150.0 - Tier 1501 to 2000 @ $350)"
"(PY Solution -Flat Fee- of $30.00 applied)"
"(Live Storage: 17.0 out of 40.00 included in Plan for $20.00)"
"(Live Storage: 67.0 @ $5.00)"
"(Live Storage: 5.00 included in Plan for $55.00 - Exceess of 13.0 @ $6.0)"
"(Live Storage: 741.0 @ $3.00)"
"(Live Storage: 30.00 included in Plan for $150.00 - Exceess of 39.0 @ $6.0)"
"(Live Storage: 65.0 - Tier 51 to 75 @ $250)"
"(Live Storage: 567.0 - Tier 501 to 750 @ $1750)"
Message_B
"(! Price for Live Storage not found in Pricing Plan !)"
"(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 141.0 @ $2.00) (Discount of 10.0% applied to storage amount)"
"(! Price for Live Storage not found in Pricing Plan !)"
"(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 1.0 @ $3.00)"
"( ABC Storage: 137.0 - Tier 1251 to 150 @ $100) (! ABC Storage Limit of 00 Exceeded !) (Local Allocated Storage: 20.00 @ $0.40) (Live Storage: 16.0 @ $??)"
"(Discount of 10.0% applied to storage amount) (! Price for Live Storage not found in Pricing Plan !)"
"(! Live Storage not found in Pricing Plan !) (Discount of 10.0% applied to storage amount)"
"(! Price for Live Storage not found in Pricing Plan !) (Local Allocated Storage: 100.00 @ $0.50)"
"(! Price for Storage not found in Pricing Plan !) (Live Storage: 18.0 @ $??)"
"(! Price for Storage not found in Pricing Plan !)(Live Storage: 69.0 @ $??) ( ABC Storage: 401.0 @ $1.50)"
"(Live Storage: 6.0 @ $??) (! Price for Storage not found in Pricing Plan !)"
"(! Price for Live Storage not found in Pricing Plan !) (Discount of 10.0% applied to storage amount)"
"(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 270.0 - Tier 201 to 300 @ $400)"
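I want to remove the error messages from Message_B. Some of the text in these messages varies, but every error message contains either '!' or '$??'. I then want to join the result to Message_A to get a single message column. For clarity, the intermediate step looks like this: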
Message_B
Nan
"( ABC Storage: 141.0 @ $2.00) (Discount of 10.0% applied to storage amount)"
Nan
"( ABC Storage: 1.0 @ $3.00)"
"( ABC Storage: 137.0 - Tier 1251 to 150 @ $100)(Local Allocated Storage: 20.00 @ $0.40)"
"(Discount of 10.0% applied to storage amount)"
"(Discount of 10.0% applied to storage amount)"
"(Local Allocated Storage: 100.00 @ $0.50)"
Nan
"( ABC Storage: 401.0 @ $1.50)"
Nan
"(Discount of 10.0% applied to storage amount)"
"( ABC Storage: 270.0 - Tier 201 to 300 @ $400)"
The end result is just a single column of strings (with the NaN rows dropped).
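I have been able to split Message_B by removing '(' and replacing ')' with ' |' to act as a delimiter.
I have split Message_B into (new) separate DataFrames, but how do I iterate over the complete DataFrame and remove the unwanted messages? (I don't want to drop the whole row.)
I have tried df[df['Message_B'].str.contains("(Live Storage: 18.0 @ $??)")==False]
but I would need to do this for each type of message, and the numbers in the messages change.
Also, I have now realised that I cannot use .str.contains on the complete DataFrame.
Any help would be greatly appreciated, and apologies for how I have laid out the DataFrame in this post; I found it the easiest to read. Thanks.
Edit: I have been able to take out the standard error message with the following: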
error_msg1 = "(! Price for Live Storage not found in Pricing Plan !)"
replace_with = ''
bumi_output['Message_B'] = [i.replace(error_msg1, replace_with) for i in bumi_output['Message_B']]
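Is there a way to use this approach to take out error messages in which part of the message can change? For example: (Live Storage: 18.0 @ $??) (Live Storage: 69.0 @ $??)
Thanks.
Answer 0 (score: 1)
The following rather ugly list comprehension gets what you want from Message_B: it finds every parenthesised group, excludes the ones containing '!' or '$??', and joins the rest back together.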
import re  # needed for re.findall
new_B = [' '.join([subs for subs in re.findall(r'\(.+?\)', val) if '!' not in subs and '$??' not in subs])
         for val in df['Message_B']]
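To make the filtering step easier to follow, here is a minimal standalone sketch of the same idea applied to three Message_B values copied from the question. The helper name keep_valid_groups and the list messages_b are made up for illustration, and the sketch assumes the messages never contain nested parentheses.

```python
import re

# Three sample Message_B values copied from the question.
messages_b = [
    "(! Price for Live Storage not found in Pricing Plan !)",
    "(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 1.0 @ $3.00)",
    "(! Price for Storage not found in Pricing Plan !)(Live Storage: 69.0 @ $??) ( ABC Storage: 401.0 @ $1.50)",
]

# One parenthesised message; the non-greedy '.+?' stops at the first ')'.
GROUP = re.compile(r"\(.+?\)")


def keep_valid_groups(text):
    """Hypothetical helper: drop every group containing '!' or '$??'."""
    groups = GROUP.findall(text)
    return " ".join(g for g in groups if "!" not in g and "$??" not in g)


new_B = [keep_valid_groups(m) for m in messages_b]
for cleaned in new_B:
    print(repr(cleaned))
```

A row made up entirely of error messages comes back as an empty string rather than NaN, which is convenient for the concatenation that follows.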
Then add it to A:
df['Message_A'] = df['Message_A'] + new_B
To see that this works:
In [26]: df['Message_A'][1]
Out[26]: '(Live Storage: 5.00 included in Plan for $5.00 - Exceess of 11.0 @ $40.0)( ABC Storage: 141.0 @ $2.00) (Discount of 10.0% applied to storage amount)'
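As a possible alternative to the list comprehension, the same clean-up can be sketched with pandas string methods. This is not part of the original answer. It assumes that error groups never contain nested parentheses, and the names ERROR_GROUP, cleaned_b, b_with_nan and combined are illustrative. The sample frame uses rows 0, 1 and 8 of the question's data.

```python
import pandas as pd

# Rows 0, 1 and 8 of the question's sample data.
df = pd.DataFrame({
    "Message_A": [
        "(Live Storage: 20.00 included in Plan for $15.00 - Exceess of 10.0 @ $6.0)",
        "(Live Storage: 5.00 included in Plan for $5.00 - Exceess of 11.0 @ $40.0)",
        "(Live Storage: 5.00 included in Plan for $55.00 - Exceess of 13.0 @ $6.0)",
    ],
    "Message_B": [
        "(! Price for Live Storage not found in Pricing Plan !)",
        "(! Price for Live Storage not found in Pricing Plan !) ( ABC Storage: 141.0 @ $2.00) (Discount of 10.0% applied to storage amount)",
        "(! Price for Storage not found in Pricing Plan !) (Live Storage: 18.0 @ $??)",
    ],
})

# A parenthesised group containing '!' or the literal '$??', plus any
# whitespace that follows it. Assumption: no nested parentheses.
ERROR_GROUP = r"\([^()]*(?:!|\$\?\?)[^()]*\)\s*"

cleaned_b = df["Message_B"].str.replace(ERROR_GROUP, "", regex=True).str.strip()

# Intermediate step from the question: rows with nothing left become NaN.
b_with_nan = cleaned_b.mask(cleaned_b == "")

# Final single column: empty strings add nothing to Message_A.
combined = df["Message_A"] + cleaned_b
print(combined[1])
```

Concatenating with cleaned_b rather than b_with_nan is deliberate: adding NaN to a string would turn the whole row into NaN and lose Message_A.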