Pandas DataFrame with a concatenated column

Time: 2019-10-01 17:28:46

Tags: python pandas

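I have a Pandas DataFrame like the one built in the code below. I need to add a dynamic column that concatenates every value in the sequence up to and including the given row. A loop sounds like the logical solution, but it is extremely inefficient on very large DataFrames (1M+ rows).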

import pandas as pd

user_id=[1,1,1,1,2,2,2,3,3,3,3,3]
variable=["A","B","C","D","A","B","C","A","B","C","D","E"]
sequence=[0,1,2,3,0,1,2,0,1,2,3,4]
# Build the frame from the three lists (the original passed an undefined name, ID)
df=pd.DataFrame(list(zip(user_id,variable,sequence)),columns=['User_ID','Variables','Seq'])

# Expected result; this column needs to be generated dynamically
df['dynamic_column']=["A","AB","ABC","ABCD","A","AB","ABC","A","AB","ABC","ABCD","ABCDE"]

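I need to be able to create the dynamic column efficiently, based on user_id and the sequence number. I have experimented with the pandas shift function, but that only led to having to write a loop. I am looking for a simple, efficient way to create the dynamic concatenated column.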

2 answers:

Answer 0: (score: 2)

This is a job for cumsum:

df['dynamic_column'] = df.groupby('User_ID').Variables.apply(lambda x: x.cumsum())
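For reference, here is a self-contained sketch of the same grouped running concatenation. It rebuilds the sample frame from the question and uses `transform` with `itertools.accumulate` instead of `apply` with `cumsum`. That swap is an assumption made for this sketch, not part of the answer: it avoids depending on how a particular pandas version indexes the result of `groupby(...).apply` or handles `cumsum` on string data. The sort on `User_ID` and `Seq`, and the helper name `running_concat`, are likewise assumptions.

```python
from itertools import accumulate

import pandas as pd

user_id = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
variable = ["A", "B", "C", "D", "A", "B", "C", "A", "B", "C", "D", "E"]
sequence = [0, 1, 2, 3, 0, 1, 2, 0, 1, 2, 3, 4]
df = pd.DataFrame({"User_ID": user_id, "Variables": variable, "Seq": sequence})

# Assumption: values should be concatenated in Seq order within each user.
df = df.sort_values(["User_ID", "Seq"])


def running_concat(values: pd.Series) -> pd.Series:
    """Hypothetical helper: running string concatenation within one group."""
    # accumulate() adds the strings left to right: "A", "AB", "ABC", ...
    return pd.Series(list(accumulate(values)), index=values.index)


# transform() returns a result aligned with the original index,
# so it can be assigned straight back as a new column.
df["dynamic_column"] = df.groupby("User_ID")["Variables"].transform(running_concat)

print(df)
```

Each group is still processed in Python here, so on frames with 1M+ rows it is worth timing this against the `cumsum` version before settling on either.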

Answer 1: (score: 0)

Your question is a bit vague, but would something like this work?

df['DynamicColumn'] = df['User_ID'] + df['Seq']