I have a DataFrame that has been flattened on one property, so that column holds lists:
id    property_a    properties_b
id_1  property_a_1  [property_b_11, property_b_12]
id_2  property_a_2  [property_b_21, property_b_22, property_b_23]
..................
I want to expand the properties_b column to get back to a DataFrame like this:
id    property_a    property_b
id_1  property_a_1  property_b_11
id_1  property_a_1  property_b_12
id_2  property_a_2  property_b_21
id_2  property_a_2  property_b_22
id_2  property_a_2  property_b_23
..................
I suspect this is very simple in pandas, but as a Python newcomer I am struggling to find an elegant way to do it.
Answer 0 (score: 3)
Here is another approach, using to_records, some tuple mapping, and from_records.
import itertools

import pandas as pd


def expand_column(df, col_id):
    # to_records() puts the index at position 0 of each record, so col_id is
    # the 1-based position of the list column (L is column 2 in the example).
    records = map(
        lambda r: [r[1:col_id] + (l,) + r[col_id + 1:] for l in r[col_id]],
        map(tuple, df.to_records()),
    )
    # Flatten the per-row lists of tuples into a single stream of records.
    return pd.DataFrame.from_records(itertools.chain.from_iterable(records), columns=df.columns)


df = pd.DataFrame([['a', [1, 2, 3], 'a'], ['b', [4, 5], 'b']], columns=['C1', 'L', 'C2'])
print(df)
print(expand_column(df, 2))
#   C1          L C2
# 0  a  [1, 2, 3]  a
# 1  b     [4, 5]  b
#
#   C1  L C2
# 0  a  1  a
# 1  a  2  a
# 2  a  3  a
# 3  b  4  b
# 4  b  5  b
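For comparison, the same expansion can be sketched with numpy.repeat instead of record tuples. This is not the code from the answer above: expand_by_repeat is a hypothetical helper name, and it assumes the column is passed by label and that every cell in it is a list.

```python
import itertools

import numpy as np
import pandas as pd


def expand_by_repeat(df, col):
    """Hypothetical helper: expand the list column `col` with np.repeat."""
    # Number of elements in each row's list.
    lengths = df[col].map(len).to_numpy()
    # Row positions repeated once per list element, e.g. [0, 0, 0, 1, 1].
    positions = np.repeat(np.arange(len(df)), lengths)
    out = df.drop(columns=[col]).iloc[positions].reset_index(drop=True)
    # Flatten all lists into one column, in row order.
    out[col] = list(itertools.chain.from_iterable(df[col]))
    # Restore the original column order.
    return out[list(df.columns)]


df = pd.DataFrame([['a', [1, 2, 3], 'a'], ['b', [4, 5], 'b']], columns=['C1', 'L', 'C2'])
expanded = expand_by_repeat(df, 'L')
print(expanded)
```

Unlike expand_column, this sketch takes the column name, so there is no offset to account for the index.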
Answer 1 (score: 2)
This question has already been addressed here and here. If you find those questions and answers useful, feel free to upvote them.
import pandas as pd

df = pd.DataFrame([
    ['id_1', 'property_a_1', ['property_b_11', 'property_b_12']],
    ['id_2', 'property_a_2', ['property_b_21', 'property_b_22', 'property_b_23']],
], columns=['id', 'property_a', 'properties_b'])
df

rows = []
for i, row in df.iterrows():
    for a in row.properties_b:
        # Copy the row for each element; appending the same Series object
        # repeatedly would leave every entry holding the last value.
        new = row.copy()
        new['properties_b'] = a
        rows.append(new)
pd.DataFrame(rows, columns=df.columns)
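One detail worth checking in any loop of this shape: a row Series is a mutable object, so appending the same object several times after reassigning one of its values stores several references to a single Series. A minimal sketch of the difference, using a made-up two-field row rather than the data above:

```python
import pandas as pd

# Appending the same object after mutating it: both entries alias one Series.
row = pd.Series({"id": "id_1", "property_b": "unset"})
aliased = []
for value in ["x", "y"]:
    row["property_b"] = value
    aliased.append(row)

# Copying first: each entry is an independent Series.
row = pd.Series({"id": "id_1", "property_b": "unset"})
copied = []
for value in ["x", "y"]:
    new = row.copy()
    new["property_b"] = value
    copied.append(new)

aliased_values = pd.DataFrame(aliased)["property_b"].tolist()
copied_values = pd.DataFrame(copied)["property_b"].tolist()
print(aliased_values)  # ['y', 'y']
print(copied_values)   # ['x', 'y']
```

Only the copied version keeps one distinct value per output row.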
def loc_expand(df, loc):
    # Expand the list column selected by label.
    rows = []
    for i, row in df.iterrows():
        vs = row.at[loc]
        for v in vs:
            new = row.copy()  # fresh copy for every element
            new.at[loc] = v
            rows.append(new)
    return pd.DataFrame(rows)


def iloc_expand(df, iloc):
    # Expand the list column selected by integer position.
    rows = []
    for i, row in df.iterrows():
        vs = row.iat[iloc]
        for v in vs:
            new = row.copy()  # fresh copy for every element
            new.iat[iloc] = v
            rows.append(new)
    return pd.DataFrame(rows)
Both of these should return the same result as above.
loc_expand(df, 'properties_b')
iloc_expand(df, 2)
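As a side note that is not part of the answers above: pandas 0.25 and later ship DataFrame.explode, which performs this expansion directly. A minimal sketch, assuming a recent enough pandas version and that the renamed column should be called property_b as in the question:

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["id_1", "property_a_1", ["property_b_11", "property_b_12"]],
        ["id_2", "property_a_2", ["property_b_21", "property_b_22", "property_b_23"]],
    ],
    columns=["id", "property_a", "properties_b"],
)

exploded = (
    df.explode("properties_b")                          # one row per list element
      .rename(columns={"properties_b": "property_b"})  # match the desired header
      .reset_index(drop=True)                           # drop the repeated index labels
)
print(exploded)
```

Without reset_index, the rows keep the index label of the row they came from, so labels repeat.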