I have a dataframe like this:

import pandas as pd

df1 = pd.DataFrame({'Site': ["S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8", "S9"],
                    'Sitelink': [" ", "S1", "S2", "S6", "S4", " ", "S8", " ", "S7"],
                    'level': ["R", "T", "P", "T", "P", "R", "T", "R", "P"],
                    'Weight': ["55", "55", "55", "85", "85", "80", "150", "190", "200"]})
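The "Site" column is always unique.
The "Sitelink" column records, for each site, the site at the next lower level.
The "level" column has 3 values (R, T, P), where the hierarchy is R < T < P: in the sample data every P site links to a T site, and every T site links to an R site.
The "Weight" column can hold any value.

The output should satisfy the following condition: the weight of a higher-level site should always be less than or equal to the weight of the lower-level site it links to. The expected result dataframe should look like:

I am currently looping over the dataframe and comparing each site with the next level. Is there a better way?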
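Answer 0 (score: 0)
If I understand correctly, you want to check whether each site's weight is less than or equal to the weight of the site named in its Sitelink column.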
The check for a single row can be written as a function:

def is_error(row):
    # A blank Sitelink means there is no lower-level site to compare against.
    if row['Sitelink'] == " ":
        return 'No Error'
    # Look up the row of the linked site.
    site_link = df1.loc[df1['Site'] == row['Sitelink']]
    # Weights are stored as strings, so convert before comparing.
    if int(row['Weight']) <= int(site_link['Weight'].iloc[0]):
        return 'No Error'
    else:
        return 'Higher than lower'

We can then use the apply function to apply it to each row.
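As a minimal sketch of how the function above might be applied, the example below rebuilds the sample dataframe and stores the result in a new column. The column names `Error` and `Error_vectorized` are hypothetical and not from the original post. The function is restated against `df1`, and reads the linked weight with `.iloc[0]`. The second half is an alternative that is not part of the original answer: it avoids the row-wise loop by looking up linked weights with `Series.map`, and it assumes every non-blank Sitelink names an existing Site.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Site': ["S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8", "S9"],
    'Sitelink': [" ", "S1", "S2", "S6", "S4", " ", "S8", " ", "S7"],
    'level': ["R", "T", "P", "T", "P", "R", "T", "R", "P"],
    'Weight': ["55", "55", "55", "85", "85", "80", "150", "190", "200"],
})


def is_error(row):
    # A blank Sitelink means there is nothing to compare against.
    if row['Sitelink'] == " ":
        return 'No Error'
    site_link = df1.loc[df1['Site'] == row['Sitelink']]
    if int(row['Weight']) <= int(site_link['Weight'].iloc[0]):
        return 'No Error'
    return 'Higher than lower'


# Row-wise version: axis=1 passes each row to is_error.
# 'Error' is a hypothetical column name chosen for this sketch.
df1['Error'] = df1.apply(is_error, axis=1)

# Vectorized alternative (an assumption, not from the original answer).
# Build a Site -> Weight lookup, then map each Sitelink to its weight.
weights = df1.set_index('Site')['Weight'].astype(int)
linked_weight = df1['Sitelink'].map(weights)  # NaN where Sitelink is blank
own_weight = df1['Weight'].astype(int)

# Comparisons against NaN evaluate to False, so blank links are never flagged.
violates = own_weight > linked_weight
df1['Error_vectorized'] = violates.map(
    {True: 'Higher than lower', False: 'No Error'}
)

print(df1)
```

With the sample data, both columns flag S4 (85 against S6's 80) and S9 (200 against S7's 150) and mark every other site as 'No Error'. On larger frames the mapped version should avoid the repeated `df1.loc` lookup that the row-wise function performs for each row.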