This question really has two parts:
1. The DataFrame part
Given this DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Company': ['Huawei', 'Huawei', 'Huawei', 'Apple', 'Apple', 'Samsung', 'Samsung'],
                   'Year': [2011, 2011, 2018, 2011, 2019, 2018, 2019],
                   'Product': ['H1', 'H2', 'H3', 'A1', 'A2', 'S1', 'S2']})
df = df.sort_values(by=['Company', 'Year'])
df
which gives

   Company  Year Product
3    Apple  2011      A1
4    Apple  2019      A2
0   Huawei  2011      H1
1   Huawei  2011      H2
2   Huawei  2018      H3
5  Samsung  2018      S1
6  Samsung  2019      S2
What I need is for mergeCell(df, on = ['Company'])
to return

   Company  Year Product
3    Apple  2011      A1
4           2019      A2
0   Huawei  2011      H1
1           2011      H2
2           2018      H3
5  Samsung  2018      S1
6           2019      S2
and for mergeCell(df, on = ['Company', 'Year'])
to return

   Company  Year Product
3    Apple  2011      A1
4    Apple  2019      A2
0   Huawei  2011      H1
1                     H2
2   Huawei  2018      H3
5  Samsung  2018      S1
6  Samsung  2019      S2
I wrote one myself, but it is clearly not concise, and it has bugs:
def mergeCell(df, on):
    import copy
    dfMerged = df[on]
    dfTmp = np.empty((df.shape[0], len(on)), dtype=object)
    lastRow = ()
    idx = 0
    for row in dfMerged.itertuples():
        if idx == 0:
            lastRow = row[1:]
            dfTmp[idx, :] = lastRow
        else:
            if row[1:] != lastRow:
                lastRow = row[1:]
                dfTmp[idx, :] = lastRow
            else:
                dfTmp[idx, :] = np.empty((1, len(on)), dtype=object)
        idx += 1
    dfTmp = pd.DataFrame(dfTmp)
    dfTmp.columns = on
    dfCopied = copy.deepcopy(df)
    for idxRow in range(df.shape[0]):
        for idxCol in on:
            dfCopied.loc[idxRow, idxCol] = dfTmp.loc[idxRow, idxCol]
    return dfCopied
So, is there a built-in way to do this?
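2. Saving the resulting DataFrame to Excel with merged cells and the text vertically centered
For this part, I have nothing so far beyond running the mergeCell function above.
Thanks.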
Answer 0 (score: 1)
So, is there a built-in way to do this?
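Yes, you can use duplicated. Note, however, that an "empty cell" in pandas can mean one of two things: NaN or the empty string ''. Since you are concerned with presentation, I assume you want the latter.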
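To make the difference concrete, here is a minimal sketch. The three-row frame `demo` is made up for illustration and is not the data from the question. Calling mask without a replacement value leaves NaN in the duplicate rows, while passing '' leaves an empty string, which is what prints as a blank cell.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the two kinds of "empty".
demo = pd.DataFrame({'Company': ['Apple', 'Apple', 'Huawei']})

# True for every value that has already appeared earlier in the column.
dup = demo['Company'].duplicated()

with_nan = demo['Company'].mask(dup)        # duplicates become NaN
with_blank = demo['Company'].mask(dup, '')  # duplicates become ''

print(with_nan.isna().tolist())   # [False, True, False]
print(with_blank.tolist())        # ['Apple', '', 'Huawei']
```

NaN is usually the better choice if the column will be processed further; '' is the one that looks like a merged cell in printed output.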
Example 1: pd.Series.duplicated
col = 'Company'
df[col] = df[col].mask(df[col].duplicated(), '')
print(df)
#    Company  Year Product
# 3    Apple  2011      A1
# 4           2019      A2
# 0   Huawei  2011      H1
# 1           2011      H2
# 2           2018      H3
# 5  Samsung  2018      S1
# 6           2019      S2
Example 2: pd.DataFrame.duplicated
cols = ['Company', 'Year']
df[cols] = df[cols].mask(df[cols].duplicated(), '')
print(df)
#    Company  Year Product
# 3    Apple  2011      A1
# 4    Apple  2019      A2
# 0   Huawei  2011      H1
# 1                     H2
# 2   Huawei  2018      H3
# 5  Samsung  2018      S1
# 6  Samsung  2019      S2
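Both examples modify df in place, and duplicated flags a repeated value wherever it occurs, so they rely on the frame being sorted first. If a reusable function in the shape the question asks for is preferred, one possible sketch follows. The name `merge_cells` is hypothetical and not a pandas built-in, and this variant swaps in a comparison against the previous row (via shift) in place of duplicated, so only consecutive repeats are blanked. It returns a copy and leaves the input untouched.

```python
import pandas as pd


def merge_cells(df, on):
    """Return a copy of df with consecutive repeats in the `on` columns blanked.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    keys = df[on]
    # A row is a repeat when every key column equals the row just above it.
    # The first row is compared against NaN, so it is never a repeat.
    repeated = keys.eq(keys.shift()).all(axis=1)
    for col in on:
        # Cast to object so '' can sit alongside numeric values.
        out[col] = out[col].astype(object).mask(repeated, '')
    return out


df = pd.DataFrame({
    'Company': ['Huawei', 'Huawei', 'Huawei', 'Apple', 'Apple', 'Samsung', 'Samsung'],
    'Year': [2011, 2011, 2018, 2011, 2019, 2018, 2019],
    'Product': ['H1', 'H2', 'H3', 'A1', 'A2', 'S1', 'S2'],
}).sort_values(by=['Company', 'Year'])

print(merge_cells(df, on=['Company']))
print(merge_cells(df, on=['Company', 'Year']))
```

As in the question's expected output, a row is blanked only when all of the `on` columns match the previous row, so with on=['Company', 'Year'] only the second Huawei 2011 row loses its Company and Year. Rows whose key values are NaN never compare equal and are therefore left as they are.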