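I want to merge the rows of a DataFrame that share a common column value, combining the remaining column values either into a comma-separated string (for string values) or into an array/list (for int values).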
A  B     C    D
1  one   100  value
4  four  400  value
5  five  500  value
2  two   200  value
The expected result is:
A          B                  C                  D
[1,4,5,2]  one,four,five,two  [100,400,500,200]  value
I can use groupby on column D, but how can I, in a single step, use apply(np.array) for columns A and C and apply(','.join) for column B?
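To make the example reproducible, the sample frame can be built roughly as follows. This is a sketch: it assumes A and C are integer columns and B and D are plain string columns, which the question implies but does not state.

```python
import pandas as pd

# Sample data copied from the question; the dtypes are an assumption
# (A and C as integers, B and D as strings).
df = pd.DataFrame({
    "A": [1, 4, 5, 2],
    "B": ["one", "four", "five", "two"],
    "C": [100, 400, 500, 200],
    "D": ["value"] * 4,
})

print(df)
```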
Answer 0: (score: 3)
Dynamic solution: join the string columns and convert the numeric columns to lists with GroupBy.agg (alternatively, a function can be specified for each column separately):
import numpy as np

# numeric columns become lists; all other (string) columns are joined with commas
f = lambda x: x.tolist() if np.issubdtype(x.dtype, np.number) else ','.join(x)
# similar check that tests for string dtypes instead - https://stackoverflow.com/a/37727662
#f = lambda x: ','.join(x) if np.issubdtype(x.dtype, np.flexible) else x.tolist()
df1 = df.groupby('D').agg(f).reset_index().reindex(columns=df.columns)
print(df1)
              A                  B                     C      D
0  [1, 4, 5, 2]  one,four,five,two  [100, 400, 500, 200]  value
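The per-column variant mentioned above can be sketched by passing agg a dict that maps each column to its own function. This is a minimal sketch, not necessarily the answerer's exact code: the name df2 and the inline sample frame are assumptions added for illustration.

```python
import pandas as pd

# Sample frame from the question, rebuilt here so the snippet is self-contained.
df = pd.DataFrame({
    "A": [1, 4, 5, 2],
    "B": ["one", "four", "five", "two"],
    "C": [100, 400, 500, 200],
    "D": ["value"] * 4,
})

# One aggregation function per column: lists for the numeric columns,
# a comma-joined string for the text column.
df2 = (
    df.groupby("D")
      .agg({
          "A": lambda x: x.tolist(),
          "B": ",".join,
          "C": lambda x: x.tolist(),
      })
      .reset_index()                 # bring the group key D back as a column
      .reindex(columns=df.columns)   # restore the original column order
)

print(df2)
```

Compared with the dtype-based lambda, this form is more verbose but makes the handling of each column explicit, which may be preferable when a frame mixes dtypes in ways a single test does not cover.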
Answer 1: (score: 2)
df = df.groupby('D').apply(lambda x: pd.Series([list(x.A),','.join(x.B),list(x.C)])).reset_index().rename({0:'A',1:'B',2:'C'}, axis=1)
df = df[['A','B','C','D']]
Output:
              A                  B                     C      D
0  [1, 4, 5, 2]  one,four,five,two  [100, 400, 500, 200]  value
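The same apply-based idea can be written so that the returned Series carries the column names itself, which makes the rename step unnecessary. This is a sketch under the assumption that the frame looks like the sample in the question; the helper name combine and the result name out are hypothetical.

```python
import pandas as pd

# Sample frame from the question, rebuilt here so the snippet is self-contained.
df = pd.DataFrame({
    "A": [1, 4, 5, 2],
    "B": ["one", "four", "five", "two"],
    "C": [100, 400, 500, 200],
    "D": ["value"] * 4,
})

def combine(group):
    # Hypothetical helper: returns one labelled row per group, so the
    # resulting columns are already named A, B and C.
    return pd.Series({
        "A": list(group["A"]),
        "B": ",".join(group["B"]),
        "C": list(group["C"]),
    })

# Only columns A, B and C of each group are used; D is restored by reset_index.
out = df.groupby("D").apply(combine).reset_index()
out = out[["A", "B", "C", "D"]]  # restore the original column order

print(out)
```

Depending on the pandas version, groupby.apply may emit a deprecation warning about operating on the grouping column; the helper here does not read D, so the result should be unaffected.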
Answer 2: (score: 0)
Why not a one-line agg:
>>> df.groupby('D', as_index=False).agg(lambda x: x.tolist() if x.dtype != object else ','.join(x))[df.columns]
              A                  B                     C      D
0  [1, 4, 5, 2]  one,four,five,two  [100, 400, 500, 200]  value