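How to combine multiple JSON objects / Python dicts into one DataFrame

Asked: 2020-09-06 02:01:02

Tags: python pandas api csv dataset

I have the following JSON, fetched from API calls, and I would like to combine the data into one DataFrame so that I can write it to a CSV file with pandas.

Raw JSON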

{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Life Sciences Tools & Services', 'ipo': '1999-11-18', 'logo': 'https://static.finnhub.io/logo/5f1f8412-80eb-11ea-bd05-00000000092a.png', 'marketCapitalization': 30719.97, 'name': 'Agilent Technologies Inc', 'phone': '14083458886', 'shareOutstanding': 308.309635, 'ticker': 'A', 'weburl': 'https://www.agilent.com/'}

{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Metals & Mining', 'ipo': '2016-10-18', 'logo': '', 'marketCapitalization': 2727.509, 'name': 'Alcoa Corp', 'phone': '14123152900', 'shareOutstanding': 185.924291, 'ticker': 'AA', 'weburl': 'https://www.alcoa.com/global/en/home.asp'}

{'country': 'CN', 'currency': 'CNY', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'Diversified Consumer Services', 'ipo': '2008-01-29', 'logo': '', 'marketCapitalization': 35.81037, 'name': 'ATA Creativity Global', 'phone': '861065181133', 'shareOutstanding': 47.592384, 'ticker': 'AACG', 'weburl': 'http://www.ata.net.cn'}

{'country': 'US', 'currency': 'USD', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'N/A', 'ipo': '', 'logo': '', 'marketCapitalization': 738.99, 'name': 'Artius Acquisition Inc', 'phone': '12123097668', 'shareOutstanding': 87.54375, 'ticker': 'AACQU', 'weburl': ''}

This is the data I want to end up with

ticker(as index)   country   currency   exchange   finnhubIndustry    ipo      logo   ...
    
A                 'US'      'USD'      'NEW YORK.. 'Life Science..'   1999-11  'http://..
AA                'US'      'USD'      'NYSE'      'Metals & Mi...'   2016-10  'http://..
AACG              'CN'      'CNY'      'NASDAQ'    'Diversified...'   2008-01  'http://..
AADR              'US'      'USD'      'NASDAQ'    'N/A'              ''       'http://..

cols = ['country', 'currency', 'exchange', 'finnhubIndustry', 'ipo', 'logo', 'marketCapitalization', 'name', 'phone', 'shareOutstanding', 'ticker', 'weburl']

I have done something similar in the past:

                import pandas
                import requests
                # url is the API endpoint (not shown in the question)
                datastock = requests.get(url).json()
                cols = ['o', 'h', 'l', 'c', 'v', 't', 's']
                df = pandas.DataFrame(datastock, columns=cols)

But in that case the data came back in this shape, with one list per column:

{'c': [10.35, 10.36, 10.37, 10.36, 10.44, 10.45, 10.4, 10.416, 10.37, 10.43, 10.4, 10.35, 10.3, 10.12, 10.04, 10.23, 10.1, 10.1, 10.13, 10.09, 10.2, 10.15, 10.15, 10.1, 10.15, 10.125, 10.08, 10.055, 10.03, 10.04, 10.01, 10.04, 10.03, 10.03, 10.05, 10.1, 10.2, 10.08, 10.44], 'h': [10.44, 10.41, 10.4, 10.42, 10.45, 10.49, 10.45, 10.46, 10.5, 10.5, 10.45, 10.45, 10.4, 10.28, 10.39, 10.25, 10.2, 10.16, 10.17, 10.15, 10.2, 10.17, 10.18, 10.13, 10.24, 10.22, 10.15, 10.097, 10.07, 10.1, 10.09, 10.08, 10.04, 10.07, 10.1, 10.12, 10.2, 10.2, 10.45], 'l': [10.3, 10.34, 10.33, 10.35, 10.37, 10.425, 10.38, 10.33, 10.35, 10.38, 10.37, 10.34, 10.23, 10.1, 10, 10.042, 10.05, 10.05, 10.05, 10.06, 10.07, 10.11, 10.11, 10.05, 10.03, 10.07, 10.05, 10.02, 9.97, 10, 10, 10.02, 10.02, 10.02, 10.01, 10.01, 10.03, 10.06, 10.18], 'o': [10.42, 10.4, 10.35, 10.35, 10.37, 10.46, 10.41, 10.46, 10.5, 10.38, 10.37, 10.45, 10.365, 10.28, 10.39, 10.05, 10.2, 10.1, 10.1, 10.1, 10.1, 10.15, 10.17, 10.125, 10.24, 10.22, 10.07, 10.09, 10.07, 10.1, 10.09, 10.045, 10.04, 10.07, 10.02, 10.01, 10.1, 10.157, 10.2], 's': 'ok', 't': [1594684800, 1594771200, 1594857600, 1594944000, 1595203200, 1595289600, 1595376000, 1595462400, 1595548800, 1595808000, 1595894400, 1595980800, 1596067200, 1596153600, 1596412800, 1596499200, 1596585600, 1596672000, 1596758400, 1597017600, 1597104000, 1597190400, 1597276800, 1597363200, 1597622400, 1597708800, 1597795200, 1597881600, 1597968000, 1598227200, 1598313600, 1598400000, 1598486400, 1598572800, 1598832000, 1598918400, 1599004800, 1599091200, 1599177600], 'v': [17017800, 2752500, 1143800, 391000, 446900, 484800, 682300, 79600, 1295100, 15616, 537200, 99700, 717200, 682300, 329229, 371700, 939100, 214000, 149700, 461200, 304200, 411900, 37200, 141800, 371200, 488900, 750300, 311800, 443000, 554029, 176300, 152400, 48700, 571900, 136227, 85200, 49300, 200700, 329555]}

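I am not sure whether the best approach is to merge the JSON data into that column-oriented shape first and then convert it, or whether there is an easier way.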

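1 Answer:

Answer 0: (score: 1)

I wonder whether your "raw JSON" is really what you meant it to be. Usually one JSON file contains one top-level object; your example has 4. I would expect your raw JSON file to look like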

[
{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Life Sciences Tools & Services', 'ipo': '1999-11-18', 'logo': 'https://static.finnhub.io/logo/5f1f8412-80eb-11ea-bd05-00000000092a.png', 'marketCapitalization': 30719.97, 'name': 'Agilent Technologies Inc', 'phone': '14083458886', 'shareOutstanding': 308.309635, 'ticker': 'A', 'weburl': 'https://www.agilent.com/'},
{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Metals & Mining', 'ipo': '2016-10-18', 'logo': '', 'marketCapitalization': 2727.509, 'name': 'Alcoa Corp', 'phone': '14123152900', 'shareOutstanding': 185.924291, 'ticker': 'AA', 'weburl': 'https://www.alcoa.com/global/en/home.asp'},
{'country': 'CN', 'currency': 'CNY', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'Diversified Consumer Services', 'ipo': '2008-01-29', 'logo': '', 'marketCapitalization': 35.81037, 'name': 'ATA Creativity Global', 'phone': '861065181133', 'shareOutstanding': 47.592384, 'ticker': 'AACG', 'weburl': 'http://www.ata.net.cn'},
{'country': 'US', 'currency': 'USD', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'N/A', 'ipo': '', 'logo': '', 'marketCapitalization': 738.99, 'name': 'Artius Acquisition Inc', 'phone': '12123097668', 'shareOutstanding': 87.54375, 'ticker': 'AACQU', 'weburl': ''}
]

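that is, a single array of objects. Alternatively, you may have several JSON files, each containing just one object. Depending on your file format, you may be able to use pandas.read_json.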
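As a rough sketch of the pandas.read_json route: the snippet below assumes the input is valid JSON (double-quoted strings, unlike the single-quoted Python dict output shown in the question) and uses shortened, in-memory copies of two records in place of a real file. The variable names are illustrative only.

```python
import io

import pandas as pd

# Two shortened records standing in for the contents of a real file.
# Note the double quotes: the single-quoted text in the question is Python
# dict output, which a JSON parser would reject.
array_text = """[
  {"country": "US", "currency": "USD", "name": "Agilent Technologies Inc", "ticker": "A"},
  {"country": "US", "currency": "USD", "name": "Alcoa Corp", "ticker": "AA"}
]"""

# Case 1: the input is one JSON array of objects.
df_array = pd.read_json(io.StringIO(array_text))

# Case 2: one object per line (JSON Lines); lines=True reads each line as a row.
lines_text = (
    '{"country": "US", "currency": "USD", "name": "Agilent Technologies Inc", "ticker": "A"}\n'
    '{"country": "US", "currency": "USD", "name": "Alcoa Corp", "ticker": "AA"}\n'
)
df_lines = pd.read_json(io.StringIO(lines_text), lines=True)

print(df_array)
print(df_lines)
```

With a file on disk, the same calls take a path instead of the StringIO object. Be aware that read_json infers column types by default, so a numeric-looking string column such as phone may come back as integers unless you pass an explicit dtype.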

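However, if you know how to massage the objects into a Python list of dicts, you can create the DataFrame directly with pandas.DataFrame. This gives exactly the layout you want: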

>>> import pandas
>>> x = [
... {'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Life Sciences Tools & Services', 'ipo': '1999-11-18', 'logo': 'https://static.finnhub.io/logo/5f1f8412-80eb-11ea-bd05-00000000092a.png', 'marketCapitalization': 30719.97, 'name': 'Agilent Technologies Inc', 'phone': '14083458886', 'shareOutstanding': 308.309635, 'ticker': 'A', 'weburl': 'https://www.agilent.com/'},
... {'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Metals & Mining', 'ipo': '2016-10-18', 'logo': '', 'marketCapitalization': 2727.509, 'name': 'Alcoa Corp', 'phone': '14123152900', 'shareOutstanding': 185.924291, 'ticker': 'AA', 'weburl': 'https://www.alcoa.com/global/en/home.asp'},
... {'country': 'CN', 'currency': 'CNY', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'Diversified Consumer Services', 'ipo': '2008-01-29', 'logo': '', 'marketCapitalization': 35.81037, 'name': 'ATA Creativity Global', 'phone': '861065181133', 'shareOutstanding': 47.592384, 'ticker': 'AACG', 'weburl': 'http://www.ata.net.cn'},
... {'country': 'US', 'currency': 'USD', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'N/A', 'ipo': '', 'logo': '', 'marketCapitalization': 738.99, 'name': 'Artius Acquisition Inc', 'phone': '12123097668', 'shareOutstanding': 87.54375, 'ticker': 'AACQU', 'weburl': ''}
... ]
>>> pandas.DataFrame(x)
  country currency  ... ticker                                    weburl
0      US      USD  ...      A                  https://www.agilent.com/
1      US      USD  ...     AA  https://www.alcoa.com/global/en/home.asp
2      CN      CNY  ...   AACG                     http://www.ata.net.cn
3      US      USD  ...  AACQU                                          

[4 rows x 12 columns]
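
To connect this to the question's goal (ticker as the index, output as CSV), here is a minimal sketch of collecting one dict per API call into a list first. `fetch_profile` is a hypothetical stand-in for the real API call; it returns canned, shortened data so the example runs offline, and none of these names come from the original post.

```python
import pandas as pd

# Canned, shortened responses standing in for the real API (assumption).
_FAKE_API = {
    "A": {"country": "US", "currency": "USD", "name": "Agilent Technologies Inc", "ticker": "A"},
    "AA": {"country": "US", "currency": "USD", "name": "Alcoa Corp", "ticker": "AA"},
    "AACG": {"country": "CN", "currency": "CNY", "name": "ATA Creativity Global", "ticker": "AACG"},
}


def fetch_profile(symbol):
    """Hypothetical helper: real code would call requests.get(url).json() here."""
    return _FAKE_API[symbol]


# Collect one dict per symbol into a plain Python list.
records = [fetch_profile(symbol) for symbol in ["A", "AA", "AACG"]]

# A list of dicts becomes one row per dict; the dict keys become the columns.
df = pd.DataFrame(records).set_index("ticker")

# With no path, to_csv returns the CSV text; pass a file name to write to disk.
csv_text = df.to_csv()
print(csv_text)
```

Building the list first and constructing the DataFrame once is usually simpler than appending to a DataFrame row by row inside the loop.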