I have a DataFrame:
datetime city state country shape duration (seconds) duration (hours/min) comments date posted latitude longitude
10/10/1949 20:30 san marcos tx us cylinder 2700 45 minutes This event took place in early fall around 1949-50. It occurred after a Boy Scout meeting in the Baptist Church. The Baptist Church sit 4/27/2004 29.8830556 -97.9411111
10/10/1949 21:00 lackland afb tx light 7200 1-2 hrs 1949 Lackland AFB, TX. Lights racing across the sky & making 90 degree turns on a dime. 12/16/2005 29.38421 -98.581082
10/10/1955 17:00 chester (uk/england) gb circle 20 20 seconds Green/Orange circular disc over Chester, England 1/21/2008 53.2 -2.916667
10/10/1956 21:00 edna tx us circle 20 1/2 hour My older brother and twin sister were leaving the only Edna theater at about 9 PM,...we had our bikes and I took a different route home 1/17/2004 28.9783333 -96.6458333
10/10/1960 20:00 kaneohe hi us light 900 15 minutes AS a Marine 1st Lt. flying an FJ4B fighter/attack aircraft on a solo night exercise, I was at 50,000' in a "clean" aircraft (no ordinan 1/22/2004 21.4180556 -157.8036111
I am trying to group by state, using:
result = df.groupby("state").\
agg({"state": pd.Series.nunique, "duration (seconds)": np.sum}).\
rename(columns={"state": "frequency", "duration (seconds)": "whole time"}).\
reset_index()
but it raises TypeError: must be str, not float.
I tried converting the duration (seconds) column, but it returns duration (seconds).
How can I track down this problem?
Answer 0 (score: 0)
Do something like this:
# Group df by state, then sum "duration (seconds)" within each group
df.groupby('state')['duration (seconds)'].apply(lambda x: x.sum())
Or, if you want a rolling sum:
df.groupby('state')['duration (seconds)'].apply(lambda x:x.rolling(center=False,window=2).sum())
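Note that both snippets assume duration (seconds) is already numeric. The TypeError in the question suggests the column holds strings mixed with floats, so it may help to find the unparseable entries first. The following is a minimal sketch under that assumption: the sample data, including the malformed value "20`", is made up for illustration, and the output column is named whole_time rather than "whole time" because keyword-style aggregation needs valid identifiers.

```python
import pandas as pd

# Hypothetical sample mimicking the question's columns; the value "20`"
# stands in for whatever malformed entry the real file may contain.
df = pd.DataFrame({
    "state": ["tx", "tx", None, "tx", "hi"],
    "duration (seconds)": ["2700", "7200", "20", "20`", "900"],
})

# Coerce to numbers; anything unparseable becomes NaN instead of raising.
seconds = pd.to_numeric(df["duration (seconds)"], errors="coerce")

# Rows whose original value was present but could not be parsed.
bad_rows = df[seconds.isna() & df["duration (seconds)"].notna()]
print(bad_rows)

df["duration (seconds)"] = seconds

# Per state: number of rows and total duration (NaN is skipped by sum).
# Rows with a missing state are dropped by groupby by default.
result = (
    df.groupby("state")["duration (seconds)"]
      .agg(frequency="size", whole_time="sum")
      .reset_index()
)
print(result)
```

Inspecting bad_rows before overwriting the column shows exactly which values need cleaning, which is safer than silently turning them into NaN.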