Converting a nested JSON format/file to a flat file using a Pandas DataFrame

Time: 2018-04-05 05:47:16

Tags: python pandas dataframe

I am facing a problem here. I have a nested dataset:

{
    "members": [
      {
        "firstname": "John",
        "lastname": "Doe",
        "orgname": "Anon",
        "phone": "916-555-1234",
        "mobile": ""
      },
      {
        "firstname": "Jane",
        "lastname": "Doe",
        "orgname": "Anon",
        "phone": "916-555-4321",
        "mobile": "916-555-7890"
      }
    ],
    "teamname": "1",
    "team_size": "5",
    "team_status": "low"
}

and another one that is not nested (only one member):

{
"members": [
  {
    "firstname": "John",
    "lastname": "Doe",
    "orgname": "Anon",
    "phone": "916-555-1234",
    "mobile": ""
  }
],
"teamname": "1",
"team_size": "5",
"team_status": "low"
}

I handled the nested one with this code:

import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize

df2 = pd.DataFrame.from_dict(json_normalize(json_file2), orient='columns')

print(df2)

df3 = pd.concat([json_normalize(x) for x in df2['members'].values.tolist()], keys=df2.index)

df3 = df2.drop('members', axis=1).join(df3.reset_index(level=1, drop=True)).reset_index(drop=True)

The error I get is: "TypeError: 'float' object is not subscriptable"

Could you please help me solve this?

{
"teams": [
{
"members": [
  {
    "firstname": "John",
    "lastname": "Doe",
    "orgname": "Anon",
    "phone": "916-555-1234",
    "mobile": ""
  },
  {
    "firstname": "Jane",
    "lastname": "Doe",
    "orgname": "Anon",
    "phone": "916-555-4321",
    "mobile": "916-555-7890"
  }
],
"teamname": "1",
"team_size": "5",
"team_status": "low"
},
{
"members": [
  {
    "firstname": "Mickey",
    "lastname": "Moose",
    "orgname": "Moosers",
    "phone": "916-555-0000",
    "mobile": "916-555-1111"
  }
],
"teamname": "2",
"team_size": "5",
"team_status": "low"
}
]
}

1 answer:

Answer 0 (score: 0)

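This works for me:

from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize

d1 = {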
    "members": [
      {
        "firstname": "John", 
        "lastname": "Doe",
        "orgname": "Anon",
        "phone": "916-555-1234",
        "mobile": "",
      },
      {
        "firstname": "Jane",
        "lastname": "Doe",
        "orgname": "Anon",
        "phone": "916-555-4321",
        "mobile": "916-555-7890",
      }],
    "teamname": "1",
    "team_size": "5",
    "team_status": "low"
    }

d2 = {
"members": [
  {
    "firstname": "John", 
    "lastname": "Doe",
    "orgname": "Anon",
    "phone": "916-555-1234",
    "mobile": "",
  }],
"teamname": "1",
"team_size": "5",
"team_status": "low"
}
df1 = json_normalize(d1, 'members', ['team_size', 'team_status','teamname'])
print (df1)
  firstname lastname        mobile orgname         phone team_size teamname  \
0      John      Doe                  Anon  916-555-1234         5        1   
1      Jane      Doe  916-555-7890    Anon  916-555-4321         5        1   

  team_status  
0         low  
1         low  

df2 = json_normalize(d2, 'members', ['team_size', 'team_status','teamname'])
print (df2)
  firstname lastname mobile orgname         phone team_size teamname  \
0      John      Doe           Anon  916-555-1234         5        1   

  team_status  
0         low
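
For the multi-team file shown in the question, the same call can be pointed at the list under "teams". The following is a minimal sketch, assuming pandas >= 1.0 (where pd.json_normalize is available at the top level). The sample dict, the extra team "3" without members, and the helper name flatten_teams are hypothetical and only there for illustration. If some team has no "members" key, a per-row approach like the one in the question would find NaN (a float) in that cell, which may be what triggers the TypeError; that is a guess, since the real file is not shown.

```python
import io

import pandas as pd

# Hypothetical input mirroring the question's multi-team structure.
# Team "3" has no "members" key, to exercise the guard in flatten_teams.
data = {
    "teams": [
        {
            "members": [
                {"firstname": "John", "lastname": "Doe", "orgname": "Anon",
                 "phone": "916-555-1234", "mobile": ""},
                {"firstname": "Jane", "lastname": "Doe", "orgname": "Anon",
                 "phone": "916-555-4321", "mobile": "916-555-7890"},
            ],
            "teamname": "1", "team_size": "5", "team_status": "low",
        },
        {
            "members": [
                {"firstname": "Mickey", "lastname": "Moose", "orgname": "Moosers",
                 "phone": "916-555-0000", "mobile": "916-555-1111"},
            ],
            "teamname": "2", "team_size": "5", "team_status": "low",
        },
        {"teamname": "3", "team_size": "0", "team_status": "low"},
    ]
}

# Team-level fields to repeat on every member row.
META = ["teamname", "team_size", "team_status"]


def flatten_teams(payload):
    """Return one row per member, with the team-level fields repeated."""
    # Give every team a list under "members" so record_path always resolves;
    # a team without members simply contributes no rows.
    teams = [{**team, "members": team.get("members") or []}
             for team in payload["teams"]]
    return pd.json_normalize(teams, record_path="members", meta=META)


flat = flatten_teams(data)
print(flat)

# Write the flat table as CSV into an in-memory buffer.
buffer = io.StringIO()
flat.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Passing a file name to to_csv instead of the buffer writes the flat file to disk.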