Building a DataFrame with pandas

Time: 2019-07-22 22:47:36

Tags: python pandas numpy

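I am reading an Excel file in which the product names and their metric labels (per day, per month, etc.) all sit in the same column. I want to create a new column that carries the product name onto every row belonging to that product. Can anyone help? Thanks in advance! :)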

Current state:

8HP70 
Production/Day
Production/Month
Cum.Production
8HP70X 
Production/Day
Production/Month
Cum.Production
8HP75 
Production/Day
Production/Month
Cum.Production
**Expected result:**
Column A | Column B

8HP70 | Production/Day
8HP70 | Production/Month
8HP70 | Cum.Production
8HP70X | Production/Day
8HP70X | Production/Month
8HP70X | Cum.Production
8HP75 | Production/Day
8HP75 | Production/Month
8HP75 | Cum.Production

1 answer:

Answer 0 (score: 3)

One example of how to approach this:

import pandas as pd
l = [
    ['8HP70'],
    ['Production/Day'],
    ['Production/Month'],
    ['Cum.Production'],
    ['8HP70X'],
    ['Production/Day'],
    ['Production/Month'],
    ['Cum.Production'],
    ['8HP75'],
    ['Production/Day'],
    ['Production/Month'],
    ['Cum.Production'],
]

df = pd.DataFrame(l, columns=['Column B'])

## selecting the product label rows (assumes every product block is exactly 4 rows long)
products = df[df['Column B'].index % 4 == 0]

## repeating each product label 4 times to fill a new column
df['Column A'] = products.values.repeat(4)

## dropping the rows that hold only the product label
df = df[df['Column A']!=df['Column B']]

Out[3]: 
            Column B Column A
1     Production/Day    8HP70
2   Production/Month    8HP70
3     Cum.Production    8HP70
5     Production/Day   8HP70X
6   Production/Month   8HP70X
7     Cum.Production   8HP70X
9     Production/Day    8HP75
10  Production/Month    8HP75
11    Cum.Production    8HP75
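
The approach above depends on every product having exactly four rows. If the blocks could differ in length, a possible variation is to mark the product rows and forward-fill them. This is only a sketch under one assumption that is not stated in the question: that any value other than the three known metric labels is a product name. The names `metric_labels`, `is_product` and `result` are made up for illustration.

```python
import pandas as pd

rows = [
    '8HP70', 'Production/Day', 'Production/Month', 'Cum.Production',
    '8HP70X', 'Production/Day', 'Production/Month', 'Cum.Production',
    '8HP75', 'Production/Day', 'Production/Month', 'Cum.Production',
]
df = pd.DataFrame({'Column B': rows})

# Strip stray whitespace, which cells read from Excel sometimes carry
df['Column B'] = df['Column B'].str.strip()

# Assumption: anything that is not a known metric label is a product name
metric_labels = {'Production/Day', 'Production/Month', 'Cum.Production'}
is_product = ~df['Column B'].isin(metric_labels)

# Keep the value only on product rows, then carry it down to the rows below
df['Column A'] = df['Column B'].where(is_product).ffill()

# Drop the product-only rows and put the columns in the requested order
result = df.loc[~is_product, ['Column A', 'Column B']].reset_index(drop=True)
print(result)
```

Because the product rows are detected by value rather than by position, this does not need `index % 4` and should keep working if a product has more or fewer metric rows, as long as the set of metric labels is complete.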