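Most pandas apply() examples return an augmented version of the calling row/Series after applying the passed function, but I want to return an entirely new row/Series that is unrelated to the rows of the calling DataFrame. This means I need to explicitly allocate memory for the returned object: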
import pandas as pd

def newrowgen(oldrow: pd.Series) -> pd.Series:
    # with axis=1, apply() passes each row in as a pd.Series
    # allocate memory for each new row first
    newrowindex = ['A', 'B', 'C', 'D']
    newrowdata = [0 for index in newrowindex]
    newrow = pd.Series(data=newrowdata, index=newrowindex)
    newrow['A'] = ...  # do A stuff
    newrow['B'] = ...  # do B stuff
    newrow['C'] = ...  # do C stuff
    newrow['D'] = ...  # do D stuff
    return newrow
I then use it as:
newdf = olddf.apply(newrowgen, axis=1)
This works, but it is very slow. Are there any efficiency tricks, such as better memory allocation? Thanks.
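One direction that may be worth comparing, sketched under assumptions: skip building a pd.Series for every row, return a plain tuple instead, and construct the new DataFrame once at the end. The input columns `x` and `y`, the helper name `newrowgen_fast`, and the four arithmetic expressions are all hypothetical stand-ins for the real A/B/C/D work, which is not shown in the question.

```python
import pandas as pd

# Hypothetical input frame; the column names x and y are assumptions for illustration.
olddf = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

def newrowgen_fast(oldrow) -> tuple:
    # oldrow is a namedtuple produced by itertuples(), so no Series is built per row.
    # The four expressions below are placeholders for the real A/B/C/D computations.
    a = oldrow.x + oldrow.y
    b = oldrow.x * oldrow.y
    c = oldrow.y - oldrow.x
    d = oldrow.x ** 2
    return (a, b, c, d)

# Collect plain tuples, then build the result with a single DataFrame constructor call.
newdf = pd.DataFrame(
    [newrowgen_fast(row) for row in olddf.itertuples(index=False)],
    columns=["A", "B", "C", "D"],
    index=olddf.index,
)
```

Whether this helps depends on how expensive the per-column work is. If each of A to D can be written as an operation on whole columns, assigning them as vectorized column expressions would likely avoid the Python-level loop entirely.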