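I'm trying to shift specific strings in a pandas df up one row. These strings sit in the same or adjacent columns.
The df below is an example. The specified strings are Cat and Dog. I want to shift these values up one row. They appear in Column C and Column D.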
import pandas as pd
d = ({
'A' : ['A','Yy','A','Xy','A','Zy','Yy'],
'B' : ['Big','X','Big','X','Very','X','X'],
'C' : ['','Cat','YY','Dog','Big','XY','YY'],
'D' : ['','','Xy','Yy','','Cat','Yy'],
'E' : ['','','Xy','XX','','','Xy'],
})
df = pd.DataFrame(data=d)
My expected output is:
    A     B    C    D   E
0   A   Big  Cat
1  Yy     X
2   A   Big  Dog   Xy  Xy
3  Xy     X        Yy  XX
4   A  Very  Big  Cat
5  Zy     X   XY
6  Yy     X   YY   Yy  Xy
I tried:
df['C'] = df['C'].shift(-1)
But this shifts every value in the column up. I only want to select specific values (e.g. Cat, Dog) in certain columns and move those up one row.
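To make the problem concrete, here is a minimal sketch using only column C from the example above. The names s and shifted are illustrative and not part of the original code. It shows that shift(-1) moves every entry up, not just the listed ones, and leaves a missing value in the last row.

```python
import pandas as pd

# Column C from the example data
s = pd.Series(["", "Cat", "YY", "Dog", "Big", "XY", "YY"], name="C")

# shift(-1) moves *every* value up one row; the last row becomes NaN
shifted = s.shift(-1)
print(shifted.tolist())
```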
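I was thinking of listing the specified values and then shifting them up. Something like:
val = ['Cat','Dog']
# pseudocode: if a value from val appears in df[['C', 'D']], shift it up one row
Note: I can't sort this out based on the surrounding strings. My actual df contains a wide variety of strings, and going through them would take a long time.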
Answer 0 (score: 1)
In this case, do the following:
df.loc[0, 'C'], df.loc[1, 'C'] = df.loc[1, 'C'], df.loc[0, 'C']  # swap the values in rows 0 and 1 of column C
df['D'] = df['D'].shift(-1).fillna('X')
print(df)
Output:
     A    B       C      D  E
0    A  Big     Cat
1    X    X
2    X    X       X      X  X
3    X    X       X      X  X
4  Foo  Bar  Foobar  Fubur
5    X    X       X
6    X    X       X      X  X
Answer 1 (score: 0)
For a general solution, combine pandas eq() with np.where():
import numpy as np
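def shift_value(df, value):
    # Locate every cell equal to value; only the first match is moved
    row, col = np.where(df.eq(value))
    old_row = row[0]
    old_col = col[0]
    new_row = old_row - 1
    new_col = old_col
    # Copy the value one row up, then overwrite its old cell with the filler "X"
    df.iat[new_row, new_col] = value
    df.iat[old_row, old_col] = "X"

for v in ["Cat", "Foobar"]:
    shift_value(df, v)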
df
     A    B       C      D  E
0    A  Big     Cat
1    X    X       X
2    X    X       X      X  X
3    X    X  Foobar      X  X
4  Foo  Bar       X
5    X    X       X  Fubur
6    X    X       X      X  X
Original OP data:
d = ({
'A' : ['A','X','X','X','Foo','X','X'],
'B' : ['Big','X','X','X','Bar','X','X'],
'C' : ['','Cat','X','X','Foobar','X','X'],
'D' : ['','','X','X','','Fubur','X'],
'E' : ['','','X','X','','','X'],
})
df = pd.DataFrame(data=d)
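The shift_value function above moves only the first match of a single value, and a match in row 0 would wrap around to the last row because new_row becomes -1. As a hedged sketch of one way to extend the same idea, the hypothetical helper below (shift_values_up and its fill parameter are my own names, not from the answer) moves every occurrence of any listed value and skips matches in the first row. It is shown on columns C and D of the question's example data.

```python
import numpy as np
import pandas as pd


def shift_values_up(df, values, fill=""):
    """Move every cell equal to one of `values` up one row, in place.

    Hypothetical helper: `fill` is what is left behind in the old cell.
    """
    # np.where returns matches in row-major order, so rows are visited top-down
    rows, cols = np.where(df.isin(values).to_numpy())
    for r, c in zip(rows, cols):
        if r == 0:
            continue  # nothing above the first row; leave the cell alone
        df.iat[r - 1, c] = df.iat[r, c]  # overwrites whatever was above
        df.iat[r, c] = fill
    return df


# Columns C and D from the question's example data
df = pd.DataFrame({
    "C": ["", "Cat", "YY", "Dog", "Big", "XY", "YY"],
    "D": ["", "", "Xy", "Yy", "", "Cat", "Yy"],
})
shift_values_up(df, ["Cat", "Dog"])
print(df)
```

Note that the moved value overwrites the cell above it, which matches the question's expected output (Dog replaces YY in column C).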
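Answer 2 (score: 0)
If what you need is to shift up the rows that contain exactly one meaningful word (anything other than 'X' or an empty string), then this should work: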
In [36]: import pandas as pd
...: d = ({
...: 'A' : ['A','X','X','X','Foo','X','X'],
...: 'B' : ['Big','X','X','X','Bar','X','X'],
...: 'C' : ['','Cat','X','X','Foobar','X','X'],
...: 'D' : ['','','X','X','','Fubur','X'],
...: 'E' : ['','','X','X','','','X'],
...: })
...: df = pd.DataFrame(data=d)
...:
...: index = ((df!='X') & (df!='') & df.notna()).sum(axis=1) == 1
...: for row in df[index].index.values:
...:     for col in df.columns.values:
...:         if df.loc[row, col] != 'X' and bool(df.loc[row, col]):
...:             df.loc[row-1, col] = df.loc[row, col]
...:             df.loc[row, col] = ''
...:
In [37]: df
Out[37]:
     A    B       C      D  E
0    A  Big     Cat
1    X    X
2    X    X       X      X  X
3    X    X       X      X  X
4  Foo  Bar  Foobar  Fubur
5    X    X       X
6    X    X       X      X  X
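Answer 3 (score: 0)
So, if the data isn't too large, you can try a for loop:
for row in range(1, len(df)):
    for col in df.columns.values:
        # Move a non-empty value up only when the cell above it is empty
        if (df.loc[row, col] != '') and (df.loc[row-1, col] == ''):
            df.loc[row-1, col] = df.loc[row, col]
            # Temporary placeholder so the next row does not cascade into this cell
            df.loc[row, col] = '######'
df = df.replace('######', '')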
Answer 4 (score: 0)
I think you need df.combine_first:
mylist=['Cat','Dog']
a=df[df.isin(mylist)].shift(-1)
df[df.isin(mylist)]=""
out_df=a.combine_first(df)
print(out_df)
    A     B    C    D   E
0   A   Big  Cat
1  Yy     X
2   A   Big  Dog   Xy  Xy
3  Xy     X        Yy  XX
4   A  Very  Big  Cat
5  Zy     X   XY
6  Yy     X   YY   Yy  Xy
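Since the question limits the search to Column C and Column D, the combine_first approach can be wrapped so that it only touches chosen columns and leaves the input untouched. The following is a sketch under that assumption. The function name shift_listed_up and its parameters are hypothetical and not part of the answer.

```python
import pandas as pd


def shift_listed_up(df, values, columns, fill=""):
    """Return a copy of df with listed values in `columns` moved up one row."""
    out = df.copy()
    sub = out[columns]
    mask = sub.isin(values)              # True where a listed value sits
    moved = sub.where(mask).shift(-1)    # listed values one row higher, NaN elsewhere
    cleared = sub.mask(mask, fill)       # blank out the original positions
    # Prefer the moved values; fall back to the cleared frame everywhere else.
    # A listed value in row 0 has nowhere to go and is simply blanked.
    out[columns] = moved.combine_first(cleared)[columns]
    return out


df = pd.DataFrame({
    "A": ["A", "Yy", "A", "Xy", "A", "Zy", "Yy"],
    "B": ["Big", "X", "Big", "X", "Very", "X", "X"],
    "C": ["", "Cat", "YY", "Dog", "Big", "XY", "YY"],
    "D": ["", "", "Xy", "Yy", "", "Cat", "Yy"],
    "E": ["", "", "Xy", "XX", "", "", "Xy"],
})
result = shift_listed_up(df, ["Cat", "Dog"], ["C", "D"])
print(result)
```

Returning a copy keeps the original df available for comparison; drop the copy() call if in-place modification is preferred.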