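I have a pandas DataFrame loaded from a CSV file. I need to take the mean of 3 columns and add the result as a new column. The data looks like this: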
0 week 12 exp exp exp
1 Subject Group 1 2 3
2 255 HD 0 117.4 104.8 87.0
3 418 WT 0 61.2 56.1 97.9
4 300 HD 0 111.7 126.9 118.4
5 299 HD 0 50.7 37.8 30.6
6 258 WT 0 56.0 67.9 58.5
7 173 HD 0 76.2 131.7 119.5
My code is:
with open('final results.csv', 'r') as frame:
    date_again = csv.reader(frame)
    frame = []
    for line in date_again:
        frame = frame + [line]
panda_file = pd.DataFrame(frame)
panda_file['average'] = frame[3:].mean(axis=1)
The error I get is AttributeError: 'list' object has no attribute 'mean'
How can I fix this?
Thanks
Answer 0 (score: 0)
First, create the DataFrame with read_csv and the parameter header=[0,1], because the CSV has 2 header rows; they become a MultiIndex in the columns of the DataFrame. The mean is then taken over the last 3 columns. Reading the sample data:
import pandas as pd
from io import StringIO

temp=u"""week,12,exp,exp,exp
Subject,Group,1,2,3
255,HD,0,117.4,104.8,87.0
418,WT,0,61.2,56.1,97.9
300,HD,0,111.7,126.9,118.4
299,HD,0,50.7,37.8,30.6
258,WT,0,56.0,67.9,58.5
173,HD,0,76.2,131.7,119.5"""
# after testing, replace StringIO(temp) with 'filename.csv'
# (pd.compat.StringIO no longer exists in current pandas, so io.StringIO is used)
df = pd.read_csv(StringIO(temp), header=[0,1])
print (df)
week 12 exp
Subject Group 1 2 3
255 HD 0 117.4 104.8 87.0
418 WT 0 61.2 56.1 97.9
300 HD 0 111.7 126.9 118.4
299 HD 0 50.7 37.8 30.6
258 WT 0 56.0 67.9 58.5
173 HD 0 76.2 131.7 119.5
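To store the result as a new column, the name has to be given as a tuple, because the columns are a MultiIndex; to simplify, the columns can be flattened first. The mean of the last 3 columns, taken along axis=1, is: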
df1 = df.iloc[:, -3:].mean(axis=1)
print (df1)
255 103.066667
418 71.733333
300 119.000000
299 39.700000
258 60.800000
173 109.133333
dtype: float64
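The two ways of storing the mean as a new column, a tuple name on the MultiIndex or flattened column names, might look like the sketch below. It assumes a frame shaped like the printed output above, built by hand from the first three rows instead of read from the file. The column name "average" and the underscore used to join the header levels are illustrative choices, not something the answer specifies.

```python
import pandas as pd

# Hand-built stand-in for the frame read with header=[0,1] (assumption:
# same two-level column layout as the printed output, first three rows only).
columns = pd.MultiIndex.from_tuples(
    [("week", "Subject"), ("12", "Group"), ("exp", "1"), ("exp", "2"), ("exp", "3")]
)
data = [
    ["HD", 0, 117.4, 104.8, 87.0],
    ["WT", 0, 61.2, 56.1, 97.9],
    ["HD", 0, 111.7, 126.9, 118.4],
]
df = pd.DataFrame(data, index=[255, 418, 300], columns=columns)

# Option 2 is prepared first, from a copy that still has only the original columns:
# flatten the two header levels into single strings, then use a plain column name.
flat = df.copy()
flat.columns = ["_".join(col) for col in flat.columns]
flat["average"] = flat.iloc[:, -3:].mean(axis=1)

# Option 1: keep the MultiIndex and name the new column with a 2-tuple.
df[("exp", "average")] = df.iloc[:, -3:].mean(axis=1)

print(df)
print(flat)
```

The tuple form keeps the two-level header intact, while the flattened form allows ordinary single-string column access such as flat["average"].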