I'm new to Python. I searched StackOverflow for my situation but couldn't find a technical answer. I have a large number of BS rows.
My problem is this: I have a dataframe:
df
BS         N
BS1 - BS5  1
BS2 - BS7  2
BS1 - BS9  2
BS9 - BS1  1
I want to generate new data from it automatically. My expected result looks like this:
New_BS  BS1 - BS5  BS2 - BS7  BS1 - BS9  BS9 - BS1  Total
BS1-2           1                     2                 3
BS2-3           1          2          2                 5
BS3-4           1          2          2                 5
BS4-5           1          2          2                 5
BS5-6                      2          2                 4
BS6-7                      2          2                 4
BS7-8                                 2                 2
BS8-9                                 2                 2
BS9-8                                            1      1
BS8-7                                            1      1
BS7-6                                            1      1
BS6-5                                            1      1
BS5-4                                            1      1
BS4-3                                            1      1
BS3-2                                            1      1
BS2-1                                            1      1
Thanks in advance for your help.
Answer 0 (score: 1)
OK, this is a complete hack, but it was fun...
import pandas as pd
import numpy as np

df = df_flat = pd.DataFrame({"BS": ['BS1 - BS5', 'BS2 - BS7', 'BS1 - BS9', 'BS9 - BS1'],
                             "N": [1, 2, 2, 1]})

# Wide frame: one column per route. Its original rows (0..3) are dropped at the end.
df = df.pivot(columns='BS', values='N')
# Single-row frame (index 'N') used to look up each route's N value.
df_flat = df_flat.pivot_table(columns='BS', values='N')

for column in df.columns:
    # Fixed-position slices: this assumes single-digit stop numbers, e.g. 'BS1 - BS5'.
    start, end = int(column[2:3]), int(column[8:9])
    # Walk forward (BS1 -> BS5) or backward (BS9 -> BS1) one stop at a time.
    step = 1 if start < end else -1
    for stop in range(start, end, step):
        index = "BS" + str(stop) + "-" + str(stop + step)
        if index not in df.index:
            df.loc[index] = np.nan
        df.loc[index, column] = df_flat.loc['N', column]

df['Total'] = df.sum(axis=1)
# Drop the leftover rows from the initial pivot (one per route).
df = df.iloc[len(list(df_flat)):]
print(df.fillna(''))
$ python bus.py
BS      BS1 - BS5  BS1 - BS9  BS2 - BS7  BS9 - BS1  Total
BS1-2           1          2                          3.0
BS2-3           1          2          2               5.0
BS3-4           1          2          2               5.0
BS4-5           1          2          2               5.0
BS5-6                      2          2               4.0
BS6-7                      2          2               4.0
BS7-8                      2                          2.0
BS8-9                      2                          2.0
BS9-8                                            1    1.0
BS8-7                                            1    1.0
BS7-6                                            1    1.0
BS6-5                                            1    1.0
BS5-4                                            1    1.0
BS4-3                                            1    1.0
BS3-2                                            1    1.0
BS2-1                                            1    1.0
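There are probably 1,000 ways to improve this, but it's a good start...
Note that the slicing is a very hard constraint on the dataset: it only works while every stop number is a single digit at a fixed position in the label, so you would really have to rework it to make it dynamic.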
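One possible way to lift the slicing constraint is sketched below. It is not the code from the answer above but a variation on it: the stop numbers are parsed with a regular expression instead of fixed positions, so labels such as 'BS10 - BS12' would also work. The helper name `expand_segments` is hypothetical, and the sketch assumes every label contains exactly two integers and that `DataFrame.explode` is available (pandas 0.25 or later).

```python
import re
import pandas as pd

def expand_segments(label):
    """Split a route label such as 'BS1 - BS5' into hops like 'BS1-2', 'BS2-3', ...

    Hypothetical helper: assumes the label contains exactly two integers.
    """
    start, end = (int(n) for n in re.findall(r"\d+", label))
    step = 1 if start < end else -1
    return [f"BS{s}-{s + step}" for s in range(start, end, step)]

df = pd.DataFrame({"BS": ["BS1 - BS5", "BS2 - BS7", "BS1 - BS9", "BS9 - BS1"],
                   "N": [1, 2, 2, 1]})

# One row per (route, hop) pair, each carrying the route's N value.
long = df.assign(New_BS=df["BS"].map(expand_segments)).explode("New_BS")

# Keep hops and routes in order of first appearance rather than sorted order.
hop_order = long["New_BS"].drop_duplicates().tolist()
route_order = df["BS"].drop_duplicates().tolist()

result = (long.pivot_table(index="New_BS", columns="BS", values="N", aggfunc="sum")
              .reindex(index=hop_order, columns=route_order))
result["Total"] = result.sum(axis=1)

print(result.fillna(""))
```

Because the hops are generated from the parsed integers, there are no leftover rows to slice away afterwards, and the route columns stay in the same order as in the input frame.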