Converting objects to a pandas DataFrame?

Date: 2018-01-30 09:44:23

Tags: python pandas parsing dictionary dataframe

I have a DataFrame with the columns:
  • created_at
  • ID
  • data (this is the column I am having trouble parsing)

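Every object in the data column is a dictionary. I would like each key in the dictionary to become its own column. Any help or pointers would be greatly appreciated.

Here is what one object in the DataFrame looks like.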

pd.Series(data.data[1])

backers_count                                                              37
blurb                       Nano Art will make and market customized piece...
category                    {'id': 21, 'name': 'Digital Art', 'slug': 'art...
converted_pledged_amount                                                 1974
country                                                                    US
created_at                                                         1332823105
creator                     {'id': 300795038, 'name': 'Sameer Walavalkar',...
currency                                                                  USD
currency_symbol                                                             $
currency_trailing_code                                                   True
current_currency                                                          USD
deadline                                                           1337287105
disable_communication                                                   False
fx_rate                                                                     1
goal                                                                     5000
id                                                                  120596924
is_starrable                                                            False
launched_at                                                        1333399105
location                    {'id': 2468964, 'name': 'Pasadena', 'slug': 'p...
name                                                       Nano Art: Reloaded
photo                       {'ed': 'https://ksr-ugc.imgix.net/assets/011/3...
pledged                                                                  1974
profile                     {'id': 118685, 'name': None, 'blurb': None, 's...
slug                                                        nano-art-reloaded
source_url                  https://www.kickstarter.com/discover/categorie...
spotlight                                                               False
staff_pick                                                               True
state                                                                  failed
state_changed_at                                                   1337287105
static_usd_rate                                                             1
urls                        {'web': {'project': 'https://www.kickstarter.c...
usd_pledged                                                            1974.0
usd_type                                                             domestic

data.data[1]
Out[61]: 
{'backers_count': 37,
 'blurb': 'Nano Art will make and market customized pieces, in a variety of materials, featuring etchings smaller than an eyelash.',
 'category': {'color': 16760235,
  'id': 21,
  'name': 'Digital Art',
  'parent_id': 1,
  'position': 3,
  'slug': 'art/digital art',
  'urls': {'web': {'discover': 'http://www.kickstarter.com/discover/categories/art/digital%20art'}}},
 'converted_pledged_amount': 1974,
 'country': 'US',
 'created_at': 1332823105,
 'creator': {'avatar': {'medium': 'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
   'small': 'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
   'thumb': 'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=40&h=40&fit=crop&v=1461381464&auto=format&q=92&s=a4881fe982e2d57b041e2a591cd3e04e'},
  'chosen_currency': None,
  'id': 300795038,
  'is_registered': True,
  'name': 'Sameer Walavalkar',
  'urls': {'api': {'user': 'https://api.kickstarter.com/v1/users/300795038?signature=1515881698.259ca61a8b86731ffbea53f82f4a05e8d1d9f965'},
   'web': {'user': 'https://www.kickstarter.com/profile/300795038'}}},
 'currency': 'USD',
 'currency_symbol': '$',
 'currency_trailing_code': True,
 'current_currency': 'USD',
 'deadline': 1337287105,
 'disable_communication': False,
 'fx_rate': 1,
 'goal': 5000,
 'id': 120596924,
 'is_starrable': False,
 'launched_at': 1333399105,
 'location': {'country': 'US',
  'displayable_name': 'Pasadena, CA',
  'id': 2468964,
  'is_root': False,
  'localized_name': 'Pasadena',
  'name': 'Pasadena',
  'short_name': 'Pasadena, CA',
  'slug': 'pasadena-ca-us',
  'state': 'CA',
  'type': 'Town',
  'urls': {'api': {'nearby_projects': 'https://api.kickstarter.com/v1/discover?signature=1515876022.17e1296a181b009b97c854cfded1e99beeefd9fa&woe_id=2468964'},
   'web': {'discover': 'https://www.kickstarter.com/discover/places/pasadena-ca-us',
    'location': 'https://www.kickstarter.com/locations/pasadena-ca-us'}}},
 'name': 'Nano Art: Reloaded',
 'photo': {'1024x576': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=1024&h=576&fit=crop&v=1463681196&auto=format&q=92&s=4fe4bf0d8a75fa43bbf253f8c1eb5710',
  '1536x864': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=1552&h=873&fit=crop&v=1463681196&auto=format&q=92&s=76cdc06919b3b8df0f3daae78ab57301',
  'ed': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=352&h=198&fit=crop&v=1463681196&auto=format&q=92&s=4446e512c7a07794efcf131e35eb0111',
  'full': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=560&h=315&fit=crop&v=1463681196&auto=format&q=92&s=1d312059cce791c9058255f83c123f47',
  'key': 'assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg',
  'little': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=208&h=117&fit=crop&v=1463681196&auto=format&q=92&s=c386ed08b0c603e1912b1620f9bb58d6',
  'med': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=272&h=153&fit=crop&v=1463681196&auto=format&q=92&s=51a7ae3eadcb187a8dd58c14396b0d8c',
  'small': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=160&h=90&fit=crop&v=1463681196&auto=format&q=92&s=9b1a141c6269c9940f44e273f416b73e',
  'thumb': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=48&h=27&fit=crop&v=1463681196&auto=format&q=92&s=73ef45a6f4df337a942b3a75fec87996'},
 'pledged': 1974,
 'profile': {'background_color': None,
  'background_image_opacity': 0.8,
  'blurb': None,
  'feature_image_attributes': {'image_urls': {'baseball_card': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=560&h=315&fit=crop&v=1463681196&auto=format&q=92&s=1d312059cce791c9058255f83c123f47',
    'default': 'https://ksr-ugc.imgix.net/assets/011/336/473/c6f18bfa9e9933bce11272db919a0bb9_original.jpg?crop=faces&w=1552&h=873&fit=crop&v=1463681196&auto=format&q=92&s=76cdc06919b3b8df0f3daae78ab57301'}},
  'id': 118685,
  'link_background_color': None,
  'link_text': None,
  'link_text_color': None,
  'link_url': None,
  'name': None,
  'project_id': 118685,
  'should_show_feature_image_section': True,
  'show_feature_image': False,
  'state': 'inactive',
  'state_changed_at': 1425915807,
  'text_color': None},
 'slug': 'nano-art-reloaded',
 'source_url': 'https://www.kickstarter.com/discover/categories/art/digital%20art',
 'spotlight': False,
 'staff_pick': True,
 'state': 'failed',
 'state_changed_at': 1337287105,
 'static_usd_rate': 1,
 'urls': {'web': {'project': 'https://www.kickstarter.com/projects/300795038/nano-art-reloaded?ref=category_newest',
   'rewards': 'https://www.kickstarter.com/projects/300795038/nano-art-reloaded/rewards'}},
 'usd_pledged': '1974.0',
 'usd_type': 'domestic'}

I tried transposing the DataFrame and using a for loop to stack the second column generated by pd.Series, but it did not work.

2 answers:

Answer 0 (score: 1)

Use json_normalize:

# In pandas 1.0 and later, use pd.json_normalize instead of this import.
from pandas.io.json import json_normalize

df = json_normalize(data.data[1])
print (df)
   backers_count                                              blurb  \
0             37  Nano Art will make and market customized piece...   

   category.color  category.id category.name  category.parent_id  \
0        16760235           21   Digital Art                   1   

   category.position    category.slug  \
0                  3  art/digital art   

                          category.urls.web.discover  \
0  http://www.kickstarter.com/discover/categories...   

   converted_pledged_amount    ...     \
0                      1974    ...      

                                          source_url  spotlight staff_pick  \
0  https://www.kickstarter.com/discover/categorie...      False       True   

    state state_changed_at static_usd_rate  \
0  failed       1337287105               1   

                                    urls.web.project  \
0  https://www.kickstarter.com/projects/300795038...   

                                    urls.web.rewards usd_pledged  usd_type  
0  https://www.kickstarter.com/projects/300795038...      1974.0  domestic  

[1 rows x 84 columns]
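
The call above flattens a single record. To expand the whole column, one possible sketch is to pass the list of dictionaries to `pd.json_normalize` (the top-level name in pandas 1.0 and later) and join the result back onto the other columns. The small `data` frame below is a made-up stand-in for the real one, and the `data.` prefix is an assumption added here because the dictionaries contain a `created_at` key that would otherwise collide with the outer `created_at` column.

```python
import pandas as pd

# Hypothetical stand-in for the real frame: two rows, each holding a nested dict.
data = pd.DataFrame({
    "created_at": [1515881698, 1515881699],
    "ID": [1, 2],
    "data": [
        {"backers_count": 37, "created_at": 1332823105,
         "category": {"id": 21, "name": "Digital Art"}},
        {"backers_count": 5, "created_at": 1332823200,
         "category": {"id": 1, "name": "Art"}},
    ],
})

# Flatten every dictionary; nested keys become dotted column names.
flat = pd.json_normalize(data["data"].tolist())

# Prefix the new columns so they cannot clash with the outer ones,
# and reuse the original index so the rows line up in the join.
flat = flat.add_prefix("data.")
flat.index = data.index

# Swap the dictionary column for the flattened columns.
result = data.drop(columns="data").join(flat)
print(result.columns.tolist())
```

Without the prefix, `join` would raise an error on the overlapping `created_at` name unless a suffix is supplied.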

Answer 1 (score: 1)

Here is a subset of your example dictionary:

d = {
    'backers_count':
    37,
    'blurb':
    'Nano Art will make and market customized pieces, in a variety of materials, featuring etchings smaller than an eyelash.',
    'category': {
        'color': 16760235,
        'id': 21,
        'name': 'Digital Art',
        'parent_id': 1,
        'position': 3,
        'slug': 'art/digital art',
        'urls': {
            'web': {
                'discover':
                'http://www.kickstarter.com/discover/categories/art/digital%20art'
            }
        }
    },
    'converted_pledged_amount':
    1974,
    'country':
    'US',
    'created_at':
    1332823105,
    'creator': {
        'avatar': {
            'medium':
            'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
            'small':
            'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=160&h=160&fit=crop&v=1461381464&auto=format&q=92&s=24a8cda7b064a8610c1334200a306a2d',
            'thumb':
            'https://ksr-ugc.imgix.net/assets/006/332/754/bd9efceb28856e93ff226c9b853773a9_original.jpg?w=40&h=40&fit=crop&v=1461381464&auto=format&q=92&s=a4881fe982e2d57b041e2a591cd3e04e'
        },
        'chosen_currency': None,
        'id': 300795038,
        'is_registered': True,
        'name': 'Sameer Walavalkar',
        'urls': {
            'api': {
                'user':
                'https://api.kickstarter.com/v1/users/300795038?signature=1515881698.259ca61a8b86731ffbea53f82f4a05e8d1d9f965'
            },
            'web': {
                'user': 'https://www.kickstarter.com/profile/300795038'
            }
        }
    }
}

I created a minimal example:

import pandas as pd

df = pd.DataFrame({"data":[d, d]})

To convert each dictionary into a DataFrame, you can use the map function:

list_df = df.data.map(lambda d : pd.DataFrame.from_dict(d, orient="index").transpose()).tolist()

Then you can concatenate the results:

df_concat = pd.concat(list_df)

Once this is done, you can concatenate your original DataFrame data with df_concat.
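
That last step might look like the sketch below. It uses a shortened, illustrative stand-in for `d`, and a list comprehension in place of the `map` call. The `ignore_index=True` argument is an addition not in the answer: each per-row frame carries index 0, so without renumbering the rows would not align with the original frame.

```python
import pandas as pd

# Shortened stand-in for the dictionary `d` (values are illustrative).
d = {
    "backers_count": 37,
    "country": "US",
    "category": {"id": 21, "name": "Digital Art"},
}
df = pd.DataFrame({"data": [d, d]})

# One single-row frame per dictionary, built the same way as in the answer.
list_df = [
    pd.DataFrame.from_dict(rec, orient="index").transpose()
    for rec in df["data"]
]

# Renumber rows 0..n-1 so they match the index of `df`.
df_concat = pd.concat(list_df, ignore_index=True)

# Put the new columns next to the original ones; rows align by index.
combined = pd.concat([df, df_concat], axis=1)
print(combined.shape)
```

With this approach only the top-level keys become columns; a nested dictionary such as `category` stays in a single cell, whereas json_normalize flattens it.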