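I'm pulling my hair out over this. Any help is appreciated.
I'm working with a DataFrame, and part of the job involves merging data that lives on multiple rows into one. I'm trying to use df.loc to do this: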
df.loc[df['foo'] == 1, 'Output Column'] = df.loc[df['bar'] == 2, 'Desired Column']
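So what I want is for any row where 'foo' = 1 to go and find where 'bar' = 2, then put the value from 'Desired Column' into the original row. Essentially, this merges the rows to create cleaner output. As a toy example...
(Edited to show where my code goes wrong.) This is what I want. Before:
idx  foo  bar  Desired Column  Output Column
0    1
1         2    Hi there!
2    1
3    6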
After:
idx  foo  bar  Desired Column  Output Column
0    1                         Hi there!
1         2    Hi there!
2    1                         Hi there!
3    6
However, this is what I actually get. Before:
idx  foo  bar  Desired Column  Output Column
0    1
1         2    Hi there!
2    1
3    6
After:
idx  foo  bar  Desired Column  Output Column
0    1
1         2    Hi there!       Hi there!
2    1
3    6
Thanks for your help!
Answer 0 (score: 0)
Try using where:
df['Output Column']=df['Output Column'].where(df['bar']==2,'Hi There!')
print(df)
Output:
   idx  foo  bar Desired Column Output Column
0    0    1  NaN            NaN     Hi There!
1    1  NaN    2      Hi there!           NaN
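A point that is easy to trip over here: Series.where keeps the existing value on rows where the condition is True and writes the replacement on rows where it is False. The following is a minimal sketch of that behaviour, assuming a four-row toy frame shaped like the one in the question (the data and the name keep are made up for illustration):

```python
import numpy as np
import pandas as pd

# Illustrative toy frame; the values are assumptions based on the question.
df = pd.DataFrame({
    "foo": [1, np.nan, 1, 6],
    "bar": [np.nan, 2, np.nan, np.nan],
    "Desired Column": [None, "Hi there!", None, None],
    "Output Column": [None, None, None, None],
})

# where() KEEPS the current value where the condition is True
# and substitutes the second argument where it is False.
keep = df["foo"] != 1  # False only on the rows where foo == 1
df["Output Column"] = df["Output Column"].where(keep, "Hi there!")

print(df["Output Column"].tolist())
```

So the condition passed to where describes the rows to leave alone, not the rows to change; Series.mask is the counterpart that replaces where the condition is True.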
To replace the NaNs with '', run the following after the where call:
df=df.fillna('')
Then:
print(df)
will give:
   idx foo bar Desired Column Output Column
0    0   1                        Hi There!
1    1       2      Hi there!
Or, less arbitrarily (without hard-coding the string), do the following:
df['Output Column']=df['Output Column'].where(df['bar']==2,df.loc[df['bar']==2,'Desired Column'].tolist())
print(df)
Then you can do the same thing to replace the NaNs with ''.
df['Output Column']=df['Output Column'].where(df['foo']!=1,'Hi There!')
print(df)
Output:
  Desired Column Output Column  bar  foo  idx
0            NaN     Hi There!  NaN  1.0    0
1      Hi There!           NaN  2.0  NaN    1
2            NaN     Hi There!  NaN  1.0    2
3            NaN           NaN  NaN  6.0    3
df['Output Column']=df['Output Column'].where(df['foo'].notnull(),'Hi There!')
print(df)
Output:
  Desired Column Output Column  bar  foo  idx
0            NaN           NaN  NaN  1.0    0
1      Hi There!     Hi There!  2.0  NaN    1
2            NaN           NaN  NaN  1.0    2
3            NaN           NaN  NaN  6.0    3
You can do the same thing to replace the NaNs with ''.
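As a possible way to avoid hard-coding the string, here is a minimal sketch that pulls the value out of 'Desired Column' as a plain scalar and assigns it with df.loc. It assumes exactly one row has bar == 2; the toy data and the names matches and value are illustrative, not taken from the question. The scalar matters because assigning a Series through .loc aligns on index labels, so rows whose labels do not appear on the right-hand side receive NaN.

```python
import numpy as np
import pandas as pd

# Illustrative toy frame; the values are assumptions based on the question.
df = pd.DataFrame({
    "foo": [1, np.nan, 1, 6],
    "bar": [np.nan, 2, np.nan, np.nan],
    "Desired Column": [None, "Hi there!", None, None],
    "Output Column": [None, None, None, None],
})

# Assumption: exactly one row satisfies bar == 2, so take its value as a scalar.
matches = df.loc[df["bar"] == 2, "Desired Column"]
value = matches.iloc[0]

# A scalar is broadcast to every selected row; no index alignment is involved.
df.loc[df["foo"] == 1, "Output Column"] = value

print(df)
```

If several rows could match bar == 2, a rule for picking one of them would be needed first; this sketch does not attempt that.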
Answer 1 (score: 0)
This works... not sure if it's the most pythonic solution ever, but here it is:
df.loc[df['foo'] == 1, 'Output Column'] = df.loc[df['bar'] == 2, 'Desired Column']
df['Output Column'] = df.groupby(['foo'])['Output Column'].transform('max')
In my toy example, this fills in the single number corresponding to bar = 2.
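For readers unfamiliar with the second line, here is a small sketch of what groupby(...).transform('max') does in isolation: it computes one maximum per group, skipping NaN, and broadcasts it back to every row of that group. The frame and the column names group, value and filled are hypothetical, chosen only to show the mechanism with numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each group has one known number and one missing entry.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [np.nan, 5.0, 7.0, np.nan],
})

# transform returns a result with the same index as the input, so it can be
# assigned straight back as a new column.
df["filled"] = df.groupby("group")["value"].transform("max")

print(df)
```

Note that rows whose grouping key is NaN are left out of the groups by default, so transform yields NaN for them.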