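I'm new to Python, so I apologize in advance for the rough code. I'm working on a web scraping project and currently have a DataFrame with a price column (stored as strings). My dilemma is that I want to go through each row and, if the price is weekly (contains "pw"), update it to a monthly price; that is, multiply it by 4. For rows that are already monthly prices, I don't want to do anything.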
mydf = pd.DataFrame({"prices":["350pw", "1000pm", "600pw", "1000pm", "1000pm"], "Column2":["H", "E", "L", "P", "!"]})
This produces:
   prices Column2
0   350pw       H
1  1000pm       E
2   600pw       L
3  1000pm       P
4  1000pm       !
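I was able to find the matching rows and extract just the digits. From there I convert to int and multiply by 4, but I can't use the replace function with an int.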
for x in mydf[mydf['prices'].str.contains('pw')]['prices']:
    weekly_price = int(x[0:3])
    monthly_price_int = weekly_price * 4
I'm not sure where to go from here...
The final result would be:
   prices Column2
0  1400pw       H
1  1000pm       E
2  2400pw       L
3  1000pm       P
4  1000pm       !
Answer 0: (score: 0)
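import pandas as pd

def preprocess(x):
    # 'position' holds the index of 'pw' in the price string, or -1 if absent
    if x['position'] >= 0:
        # take the digits before 'pw', multiply by 4, and relabel as monthly
        x['prices'] = str(int(x['prices'][:x['position']]) * 4) + "pm"
    return x

mydf = pd.DataFrame({"prices":["350pw", "1000pm", "600pw", "1000pm", "1000pm"], "Column2":["H", "E", "L", "P", "!"]})
# helper column: where 'pw' starts in each price string
mydf["position"] = mydf.prices.str.find('pw')
mydf = mydf.apply(preprocess, axis=1)
# the helper column is no longer needed
mydf.drop(['position'], axis=1, inplace=True)
print(mydf)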
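For what it's worth, the position >= 0 check relies on Series.str.find returning the index where the substring starts, or -1 when it is not found. Here is a minimal sketch of just that step, using three of the sample prices (the names prices and position are only for illustration):

```python
import pandas as pd

prices = pd.Series(["350pw", "1000pm", "600pw"])

# str.find gives the index where 'pw' starts, or -1 if it is absent
position = prices.str.find("pw")
print(position.tolist())  # [3, -1, 3]
```

Slicing each string up to that index leaves only the digits, so this should also cope with weekly prices that are not exactly three digits long.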
Answer 1: (score: 0)
This is more of a pandas question, but here is what you should probably do:
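import pandas as pd

# mydf is the DataFrame defined in the question above

# define a function to convert from weekly to monthly
def make_monthly(cell):
    if 'pw' in cell:
        # strip the 'pw' suffix rather than slicing [0:3], so prices
        # with more or fewer than three digits are handled too
        weekly_price = int(cell.replace('pw', ''))
        monthly_price_int = weekly_price * 4
        new_cell = str(monthly_price_int) + 'pm'  # you need to update the period designation as well
        return new_cell
    else:
        return cell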
Finally, modify the values in the 'prices' column where necessary:
mydf['prices'] = mydf['prices'].map(make_monthly)
Output:
   prices Column2
0  1400pm       H
1  1000pm       E
2  2400pm       L
3  1000pm       P
4  1000pm       !
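If you would rather avoid a per-cell function, the same conversion can probably be done with a boolean mask and .loc. This is only a sketch under the question's assumptions: every price ends in either "pw" or "pm", and a month is treated as 4 weeks. The names is_weekly and monthly are mine, not from the question.

```python
import pandas as pd

mydf = pd.DataFrame({
    "prices": ["350pw", "1000pm", "600pw", "1000pm", "1000pm"],
    "Column2": ["H", "E", "L", "P", "!"],
})

# Boolean mask marking the rows that hold weekly prices
is_weekly = mydf["prices"].str.endswith("pw")

# Drop the two-character 'pw' suffix, convert to int, and multiply by 4
monthly = mydf.loc[is_weekly, "prices"].str[:-2].astype(int) * 4

# Write the converted values back as strings with the 'pm' suffix
mydf.loc[is_weekly, "prices"] = monthly.astype(str) + "pm"

print(mydf)
```

Assigning through .loc with the mask changes only the weekly rows, so the monthly rows are left untouched, which is what the question asked for.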