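For formatting purposes, I need to add blank cells to a DataFrame based on a column's value. This is the equivalent of what you get in Excel with Insert Cells and "Shift cells right".
I did this with openpyxl and a loop; these are the steps I used: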
def check_pair(x = "", y = "", swap = True):
    if x == "A" and y == "C":
        return True
    elif x == "G" and y == "T":
        return True
    else:
        return swap and check_pair(y, x, False)

def check_valid(a = [], b = [], i = 0):
    if i >= min(len(a), len(b)):
        return
    elif check_pair(a[i], b[i]):
        yield (a[i], b[i])
        yield from check_valid(a, b, i + 1)
    else:
        yield from check_valid(a, b, i + 1)

a = ['A', 'A', 'T', 'C', 'G', 'C', 'T', 'A']
b = ['C', 'G', 'G', 'A', 'C', 'A', 'C', 'T']
print(list(check_valid(a,b)))
# [('A', 'C'), ('T', 'G'), ('C', 'A'), ('C', 'A')]
What I want is a simple way to achieve this in pandas.
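import openpyxl

wb = openpyxl.load_workbook('file.xlsx')
ws = wb['sheet']
if 'SheetArranged' not in wb.sheetnames:
    wb.create_sheet('SheetArranged')
    wb.save('file.xlsx')
ws3 = wb['SheetArranged']  # get_sheet_by_name is deprecated
b = 1
for i in range(1, ws.max_row + 1):  # + 1 so the last row is included
    # NOTE: the original snippet never shows where lvl is read from;
    # taking it from column B is an assumption.
    lvl = ws.cell(row = i, column = 2).value
    # if lvl data is absent, fall back to level 1
    if lvl is None:
        lvl = 1
    # indented data: copy columns 4-7, shifted right by lvl
    try:
        for j in range(1, 5):
            ws3.cell(row = b, column = lvl+j+2).value = ws.cell(row = i, column = 3+j).value
    except TypeError:
        pass  # skip rows whose lvl is not a number
    b += 1  # assumption: one output row per input row (b was never advanced)
wb.save('file.xlsx')  # persist the rearranged sheet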
Expected result
df>>
 A   B  C
 P1  1  C1
 P2  3  C2
 P3  2  C3
Any help would be greatly appreciated, thanks.
Answer 0 (score: 2)
I think in pandas you can use pivot:
pd.concat([df.assign(C=np.nan),df.pivot(columns='B',values='C')],axis=1)
Out[89]:
    A  B   C    1    2    3
0  P1  1 NaN   C1  NaN  NaN
1  P2  3 NaN  NaN  NaN   C2
2  P3  2 NaN  NaN   C3  NaN
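To make the pivot idea easier to try out, here is a minimal self-contained sketch. It assumes the sample frame from the question (columns A, B, C); the names `wide` and `result` are only for illustration. Pivoting on B creates one column per distinct level, and each row keeps its C value only under its own level, which is what produces the "shifted" look.

```python
import numpy as np
import pandas as pd

# Sample frame taken from the question
df = pd.DataFrame({"A": ["P1", "P2", "P3"],
                   "B": [1, 3, 2],
                   "C": ["C1", "C2", "C3"]})

# One new column per distinct value of B; a row's C value lands
# in the column that matches its own B, everything else is NaN
wide = df.pivot(columns="B", values="C")

# Blank out the original C and attach the spread-out columns
result = pd.concat([df.assign(C=np.nan), wide], axis=1)
print(result)
```

Note that this only creates columns for levels that actually occur in B, so a level nobody uses will not get an empty column of its own.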
Update
s=pd.DataFrame([[np.nan]*x+y for x,y in zip(df.B,df.loc[:,'C':].values.tolist())],index=df.index)
df=pd.concat([df,s],axis=1)
df
Out[1007]:
    A  B   C    0    1     2     3
0  P1  1  C1  NaN   C1  None  None
1  P2  3  C2  NaN  NaN   NaN    C2
2  P3  2  C3  NaN  NaN    C3  None
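The padding approach in the update can also be wrapped in a small helper. This is only a sketch under assumptions: `shift_right`, `level_col` and `value_cols` are hypothetical names, not part of pandas, and the level column is assumed to hold non-negative integers. It pads every row to the same width first, so the gaps come out as NaN throughout instead of the NaN/None mix shown in the output above.

```python
import numpy as np
import pandas as pd

def shift_right(df, level_col, value_cols):
    """Return df plus new columns in which each row's values are
    pushed right by the number of cells given in level_col."""
    rows = [
        [np.nan] * int(level) + list(values)
        for level, values in zip(df[level_col], df[value_cols].values.tolist())
    ]
    width = max(len(r) for r in rows)
    # Pad the short rows so every row has the same number of cells
    padded = [r + [np.nan] * (width - len(r)) for r in rows]
    shifted = pd.DataFrame(padded, index=df.index)
    return pd.concat([df, shifted], axis=1)

# Sample frame taken from the question
df = pd.DataFrame({"A": ["P1", "P2", "P3"],
                   "B": [1, 3, 2],
                   "C": ["C1", "C2", "C3"]})
out = shift_right(df, "B", ["C"])
print(out)
```

Unlike pivot, this keeps an empty column 0 and works with more than one value column, since the whole list of values is moved as a block. It will fail on an empty frame because `max` has nothing to compare.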