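From a question I asked here, I got a JSON response similar to the following.
(Note: the ids in the sample data below are numeric strings, but some real ids are alphanumeric.)
data = {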
"state": "active",
"team_size": 20,
"team": {
"id": "12345679",
"name": "Good Guys",
"level": 10,
"attacks": 4,
"destruction_percentage": 22.6,
"members": [
{
"id": "1",
"name": "John",
"level": 12
},
{
"id": "2",
"name": "Tom",
"level": 11,
"attacks": [
{
"attacker_id": "2",
"defender_id": "4",
"damage": 64,
"order": 7
}
]
}
]
},
"opponent": {
"id": "987654321",
"name": "Bad Guys",
"level": 17,
"attacks": 5,
"damage": 20.95,
"members": [
{
"id": "3",
"name": "Betty",
"level": 17,
"attacks": [
{
"attacker_id": "3",
"defender_id": "1",
"damage": 70,
"order": 1
},
{
"attacker_id": "3",
"defender_id": "7",
"damage": 100,
"order": 11
}
],
"opponentAttacks": 0,
"some_useless_data": "Want to ignore, this doesn't show in every record"
},
{
"id": "4",
"name": "Fred",
"level": 9,
"attacks": [
{
"attacker_id": "4",
"defender_id": "9",
"damage": 70,
"order": 4
}
],
"opponentAttacks": 0
}
]
}
}
I loaded it using the following:
import pandas as pd

df = pd.json_normalize([data['team'], data['opponent']],
                       'members',
                       ['id', 'name'],
                       meta_prefix='team.',
                       errors='ignore')
print(df.iloc[3])  # fourth row (Fred); iloc is indexed with [], not called
attacks [{'damage': 70, 'order': 4, 'defender_id': '9'...
id 4
level 9
name Fred
opponentAttacks 0
some_useless_data NaN
team.name Bad Guys
team.id 987654321
Name: 3, dtype: object
I essentially have a three-part question.
1. How do I get a row like the one above using a member tag? I tried:
member = df[df['id']=="1"].iloc[0]
#Now this works, but am I correctly doing this?
#It just feels weird is all.
2. How can I retrieve a member's defenses, given that only attacks are recorded and defenses are not (even though defender_id is given)? I tried:
df.where(df['tag']==df['attacks'].str.get('defender_id'), df['attacks'], axis=0)
#This is totally not working.. Where am I going wrong?
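3. Since I am retrieving new data from the API, I need to check it against the old data in my database to see whether there are any new attacks. I can then loop over the new attacks and display the attack info to the user.
Honestly, I can't figure this one out. I also tried studying this question and this one, which made me feel close to what I need, but I still can't wrap my head around the concept. Essentially, my logic is as follows: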
def get_new_attacks(old_data, new_data):
    '''params
    old_data: Dataframe loaded from JSON in database
    new_data: Dataframe loaded from JSON API response
              hopefully having new attacks
    returns:
    iterator over the new attacks
    '''
    # calculate a dataframe with new attacks listed
    # (df is a placeholder for that not-yet-computed dataframe)
    return df.iterrows()
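I know the function above shows almost no effort beyond the docstring I provided (basically to show the desired input/output), but believe me, this is the part I have been racking my brain over the most. I have been looking into merge-ing all the attacks and then calling reset_index(), but since the attacks are lists this only raises an error. The map() function in the second question I linked above has me stumped.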
Answer 0 (score: 1)
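Answering your questions in order (code below):
1. id looks like a unique index to the data, so you can use df.set_index('id'), which lets you access data by player id via df.loc['1'], for example.
2. As far as I understand your data, the dictionaries listed in each attacks entry are self-contained, in the sense that the corresponding player id is not needed (attacker_id or defender_id seem to be enough to identify the data). So instead of dealing with rows that contain lists, I recommend moving that data into its own data frame, which makes it easy to access.
3. Once attacks is stored in its own data frame, you can simply compare indices to filter out the old data.
Here is some example code to illustrate the various points: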
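A minimal sketch of the three points, under stated assumptions: the helper names load_members and load_attacks are hypothetical, the sample data is a trimmed copy of the question's JSON, and the order field is assumed to identify an attack uniquely (the question does not confirm this, so substitute another key if it does not hold). Here get_new_attacks takes the two attack frames rather than the raw member frames.

```python
import pandas as pd

# Trimmed sample in the same shape as the question's JSON.
data = {
    "team": {
        "id": "12345679", "name": "Good Guys",
        "members": [
            {"id": "1", "name": "John", "level": 12},
            {"id": "2", "name": "Tom", "level": 11, "attacks": [
                {"attacker_id": "2", "defender_id": "4", "damage": 64, "order": 7}]},
        ],
    },
    "opponent": {
        "id": "987654321", "name": "Bad Guys",
        "members": [
            {"id": "3", "name": "Betty", "level": 17, "attacks": [
                {"attacker_id": "3", "defender_id": "1", "damage": 70, "order": 1},
                {"attacker_id": "3", "defender_id": "7", "damage": 100, "order": 11}]},
            {"id": "4", "name": "Fred", "level": 9, "attacks": [
                {"attacker_id": "4", "defender_id": "9", "damage": 70, "order": 4}]},
        ],
    },
}

ATTACK_COLUMNS = ["attacker_id", "defender_id", "damage", "order"]


def load_members(data):
    """Point 1: one row per member, indexed by player id."""
    df = pd.json_normalize([data["team"], data["opponent"]],
                           record_path="members", meta=["id", "name"],
                           meta_prefix="team.")
    return df.set_index("id")


def load_attacks(data):
    """Point 2: one row per attack, in its own frame.

    Assumption: 'order' is unique per attack, so it is used as the index.
    """
    rows = [attack
            for side in ("team", "opponent")
            for member in data[side]["members"]
            for attack in member.get("attacks", [])]  # members may lack 'attacks'
    return pd.DataFrame(rows, columns=ATTACK_COLUMNS).set_index("order").sort_index()


def get_new_attacks(old_attacks, new_attacks):
    """Point 3: rows of new_attacks whose index is not present in old_attacks."""
    return new_attacks[~new_attacks.index.isin(old_attacks.index)]


members = load_members(data)
attacks = load_attacks(data)

john = members.loc["1"]                                   # lookup by player id
defenses_of_john = attacks[attacks["defender_id"] == "1"]  # attacks made on John

# Pretend the database only holds the attacks with order < 7.
old_attacks = attacks[attacks.index < 7]
for order, attack in get_new_attacks(old_attacks, attacks).iterrows():
    print(order, attack["attacker_id"], "->", attack["defender_id"], attack["damage"])
```

Because every attack sits in its own row, a member's defenses become a plain boolean filter on defender_id, and the list-valued attacks column never has to be merged.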