I have a DataFrame containing the following data:
average_x, average_y, average_z, Result
1,2,3,x | y
4,5,6,x | y |z
8,7,9,z
11,12,31,x | z
67,56,43,y | z
I need to replace the keywords in the Result column with values from the corresponding columns, like this:
Result
Average X is 1 | Average Y is 2
Average X is 4 | Average Y is 5 | Average Z is 6
Average Z is 9
Average X is 11 | Average Z is 31
Average Y is 56 | Average Z is 43
I tried the following code, but it fails with an error:
df_test['Result']=np.where(df_test['Result'].str.contains('x'),df_test['Result'].astype(np.str).replace(to_replace='x',"Average X is " + df_test[average_x]),df_test['Result'])
df_test['Result']=np.where(df_test['Result'].str.contains('y'),df_test['Result'].astype(np.str).replace(to_replace='y',"Average Y is " + df_test[average_y]),df_test['Result'])
df_test['Result']=np.where(df_test['Result'].str.contains('z'),df_test['Result'].astype(np.str).replace(to_replace='z',"Average X is " + df_test[average_z]),df_test['Result'])
This is the error message I get:
File "<ipython-input-69-50ca75be0ce5>", line 1
df_test['Result']=np.where(df_test['Result'].str.contains('x'),df_test['Result'].astype(np.str).replace(to_replace='x',"Average X is " + df_test[average_x]),df_test['Result'])
^
SyntaxError: positional argument follows keyword argument
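Please suggest how to solve this. I have nearly 14-15 keywords, and each of them also needs to be replaced with text built from the value in its own column.
Thanks.
Best regards, Saurabh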
Answer 0 (score: 0)
The problem is in this part:
.replace(to_replace='x',"Average X is " + df_test[average_x])
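Assuming this is the pandas replace method (pandas.Series.replace here, which takes the same to_replace and value arguments as pandas.DataFrame.replace), and assuming you want the second positional argument to be used as value, you can either drop the to_replace= keyword, as the exception message suggests, or add value= to the second argument. Basically: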
.replace('x', "Average X is " + df_test[average_x])
or
.replace(to_replace='x', value="Average X is " + df_test[average_x])
Either form should get rid of the SyntaxError in your case.
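To make the two call forms concrete, here is a small sketch on a toy Series rather than the asker's frame (the names s, a, b, raised and unchanged are made up for this example). It also shows a caveat worth checking against your own data: without regex=True, replace compares whole cell values, so a cell such as "x | y" is left as it is.

```python
import pandas as pd

# Toy data, not the frame from the question.
s = pd.Series(["x", "y", "x"])

# Form 1: both arguments positional.
a = s.replace("x", "Average X")

# Form 2: both arguments passed by keyword.
b = s.replace(to_replace="x", value="Average X")

# A positional argument after a keyword argument is rejected at compile
# time, before any pandas code runs.
raised = False
try:
    compile('s.replace(to_replace="x", "Average X")', "<demo>", "eval")
except SyntaxError:
    raised = True

# Whole-value matching: "x | y" is not equal to "x", so nothing is replaced.
unchanged = pd.Series(["x | y"]).replace("x", "Average X")
```

So fixing the call syntax removes the SyntaxError, but substituting keywords inside a longer string needs a different tool, such as string-level replacement per row.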
Answer 1 (score: 0)
Use apply() to split Result on |, then look up the matching average_? column for each piece while building the new Result output:
df.apply(
    lambda row: " | ".join(
        ["Average {} is {}".format(x.upper(), row["average_{}".format(x)])
         for x in row.Result.split("|")]
    ), axis=1)
Output:
0 Average X is 1 | Average Y is 2
1 Average X is 4 | Average Y is 5 | Average Z is 6
2 Average Z is 9
3 Average X is 11 | Average Z is 31
4 Average Y is 56 | Average Z is 43
dtype: object
You can also move the logic into a function, which makes it more readable:
def describe_results(row):
    results = row.Result.split("|")
    updated = ["Average {} is {}".format(x.upper(), row["average_{}".format(x)]) for x in results]
    return " | ".join(updated)

df.apply(describe_results, axis=1)
Data:
df
average_x average_y average_z Result
0 1 2 3 x|y
1 4 5 6 x|y|z
2 8 7 9 z
3 11 12 31 x|z
4 67 56 43 y|z
Note: I ran df.Result = df.Result.str.replace(" ","") on the original data provided, to remove the spaces in Result.
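For anyone who wants to run the apply() approach end to end, here is a self-contained sketch. The frame is rebuilt by hand from the data in the question, and instead of cleaning the column beforehand it strips whitespace from each piece inside the function; that variation is my own assumption, not part of the answer above.

```python
import pandas as pd

# Data copied from the question, including the uneven spacing around "|".
df = pd.DataFrame({
    "average_x": [1, 4, 8, 11, 67],
    "average_y": [2, 5, 7, 12, 56],
    "average_z": [3, 6, 9, 31, 43],
    "Result": ["x | y", "x | y |z", "z", "x | z", "y | z"],
})

def describe_results(row):
    # Split on "|" and strip each piece, so "x | y |z" gives ["x", "y", "z"].
    keys = [piece.strip() for piece in row["Result"].split("|")]
    # Each key k is looked up in the column named "average_" + k.
    parts = ["Average {} is {}".format(k.upper(), row["average_" + k]) for k in keys]
    return " | ".join(parts)

df["Result"] = df.apply(describe_results, axis=1)
print(df["Result"].tolist())
```

Because the column name is derived from the keyword, the same function covers 14-15 keywords without changes, as long as each keyword has a matching average_ column.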
Answer 2 (score: 0)
Thanks everyone, I managed to solve the problem with the following code:
# Note: .ix was removed in pandas 1.0 and np.str in NumPy 1.24, so .loc and str() are used here.
# The loops assume the default 0..n-1 integer index.
for i in range(df_test.shape[0]):
    if "x" in df_test.loc[i, "Result"]:
        df_test.loc[i, "Result"] = df_test.loc[i, "Result"].replace("x", "Average X is " + str(df_test.loc[i, "average_x"]))
for i in range(df_test.shape[0]):
    if "y" in df_test.loc[i, "Result"]:
        df_test.loc[i, "Result"] = df_test.loc[i, "Result"].replace("y", "Average Y is " + str(df_test.loc[i, "average_y"]))
for i in range(df_test.shape[0]):
    if "z" in df_test.loc[i, "Result"]:
        df_test.loc[i, "Result"] = df_test.loc[i, "Result"].replace("z", "Average Z is " + str(df_test.loc[i, "average_z"]))
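With 14-15 keywords, one loop per keyword gets long, and a later replace can also hit letters inside text that an earlier replace inserted (a keyword such as "a" would match inside "Average"). One possible alternative, which is not the code used above, is a single regular-expression pass per row. This sketch assumes every keyword k has a column named average_k; the names prefix, keys, pattern and fill_row are made up for illustration.

```python
import re
import pandas as pd

# A shortened copy of the question's data.
df_test = pd.DataFrame({
    "average_x": [1, 4, 8],
    "average_y": [2, 5, 7],
    "average_z": [3, 6, 9],
    "Result": ["x | y", "x | y |z", "z"],
})

# Derive the keywords from the column names: "average_x" gives "x".
prefix = "average_"
keys = [c[len(prefix):] for c in df_test.columns if c.startswith(prefix)]

# Match any keyword as a whole word only.
pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")

def fill_row(row):
    def expand(match):
        key = match.group(1)
        return "Average {} is {}".format(key.upper(), row[prefix + key])
    # re.sub scans the original string once; inserted text is not rescanned.
    return pattern.sub(expand, row["Result"])

df_test["Result"] = df_test.apply(fill_row, axis=1)
```

Spacing around "|" is left exactly as it was in the input, so a cell like "x | y |z" keeps its missing space before the last item.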
BR // Saurabh