Question

I have the following dataframe (it may grow in both rows and "Info" columns):

City    Country   Info1  Info2
BCN      Spain    3      5.6   
Moscow   Russia   4      7

I am trying to split the information as follows:

[
{Info1: 3,
 City: BCN,
 Country: Spain},

{Info2: 5.6,
 City: BCN,
 Country: Spain},

{Info1: 4,
 City: Moscow,
 Country: Russia},

{Info2: 7,
 City: Moscow,
 Country: Russia}
]

This works:

import pandas as pd

# Named "data" rather than "dict" so the built-in dict type is not shadowed
data = {'city': ["BCN", "Moscow"],
        'country': ["Spain", "Russia"],
        'inf_1': [3, 5],
        'inf_2': [4, 7]}

# We make the dict a dataframe
df = pd.DataFrame(data)

# We make a list of the indicators
columns = list(df)[2:]
j=0
i=0


for rows in df.itertuples():
    for col in columns:
        print(" ")
        print("city: " + str(rows.city) )
        print("country: " + str(rows.country))
        print("ind_id: "+ str(columns[j]))
        print("value: "+ str(df[col][i]))
        print(" ")
        j=j+1
    j=0
    i=i+1

However, this code doesn't look very nice to me. Since I'm new to Pandas, is there a more elegant way to get the same result?

Answer 1

If you can accept a slight change to the output format, you can use melt and to_dict directly to get a separate dictionary for each info value:

>>> df.melt(['City', 'Country']).to_dict('records')

[{'City': 'BCN', 'Country': 'Spain', 'variable': 'Info1', 'value': 3.0},
 {'City': 'Moscow', 'Country': 'Russia', 'variable': 'Info1', 'value': 4.0},
 {'City': 'BCN', 'Country': 'Spain', 'variable': 'Info2', 'value': 5.6},
 {'City': 'Moscow', 'Country': 'Russia', 'variable': 'Info2', 'value': 7.0}]
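
To get the exact shape asked for in the question, where each dictionary is keyed by the info column's own name, the melted records can be re-keyed with a comprehension. This is only a minimal sketch, assuming the column names from the question's table; the names `id_cols`, `long_df`, `records`, and the `ind_id`/`value` labels are illustrative choices, not part of the answer above.

```python
import pandas as pd

# Sample frame mirroring the table in the question.
df = pd.DataFrame(
    {
        "City": ["BCN", "Moscow"],
        "Country": ["Spain", "Russia"],
        "Info1": [3, 4],
        "Info2": [5.6, 7],
    }
)

id_cols = ["City", "Country"]  # columns repeated in every record

# melt picks up every non-id column, so extra Info columns need no code changes.
long_df = df.melt(id_vars=id_cols, var_name="ind_id", value_name="value")

# Re-key each record so the info column's name becomes the dictionary key.
records = [
    {row["ind_id"]: row["value"], **{c: row[c] for c in id_cols}}
    for row in long_df.to_dict("records")
]

for record in records:
    print(record)
```

The records come out grouped by info column rather than by row. If row order matters, sorting `long_df` by `id_cols` before calling `to_dict` should group them by city instead.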

Answer 2

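For a solution that is not pandas-specific, this split_rows function works on any iterable of namedtuples (or on an iterable of anything else, if you change the rd = ... line).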

import pandas as pd


def split_rows(namedtuple_iterable, cols):
    for row in namedtuple_iterable:
        rd = row._asdict()
        cs = [(col, rd.pop(col)) for col in cols]
        for key, value in cs:
            yield {**rd, key: value}


df = pd.DataFrame(
    {
        "city": ["BCN", "Moscow"],
        "country": ["Spain", "Russia"],
        "inf_1": [3, 5],
        "inf_2": [4, 7],
    }
)


for sr in split_rows(df.itertuples(), ("inf_1", "inf_2")):
    print(sr)

Output

{'Index': 0, 'city': 'BCN', 'country': 'Spain', 'inf_1': 3}
{'Index': 0, 'city': 'BCN', 'country': 'Spain', 'inf_2': 4}
{'Index': 1, 'city': 'Moscow', 'country': 'Russia', 'inf_1': 5}
{'Index': 1, 'city': 'Moscow', 'country': 'Russia', 'inf_2': 7}
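
If the Index key is unwanted, `itertuples(index=False)` leaves it out, and the columns to split can be derived from the frame itself so that new info columns are picked up automatically. The following is a sketch under the assumption that the first two columns are always the identifying ones; `info_cols` and `rows` are illustrative names, and `split_rows` is repeated so the block runs on its own.

```python
import pandas as pd


def split_rows(namedtuple_iterable, cols):
    # Same generator as above, repeated so this block is self-contained.
    for row in namedtuple_iterable:
        rd = row._asdict()
        cs = [(col, rd.pop(col)) for col in cols]
        for key, value in cs:
            yield {**rd, key: value}


df = pd.DataFrame(
    {
        "city": ["BCN", "Moscow"],
        "country": ["Spain", "Russia"],
        "inf_1": [3, 5],
        "inf_2": [4, 7],
    }
)

# Assumption: everything after the first two columns is an info column.
info_cols = list(df.columns[2:])

# index=False omits the "Index" field from each namedtuple.
rows = list(split_rows(df.itertuples(index=False), info_cols))

for row in rows:
    print(row)
```

Note that itertuples falls back to positional field names when a column name is not a valid Python identifier, so this approach fits best when the column names are plain identifiers like the ones here.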
