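This question is related to many previously asked questions about adding columns to a DataFrame, but I could not find one that solves my problem.
I have 2 lists, and I want to create a DataFrame from them in which each list is a column and the index is taken from a previous DataFrame.
When I try: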
STNAME = Filter3['STNAME'].tolist() #first list to be converted to column
CTYNAME = Filter3['CTYNAME'].tolist() #second list to be converted to column
ORIG_INDEX = Filter3.index #index pulled from previous dataframe
FINAL = pd.Series(STNAME, CTYNAME, index=ORIG_INDEX)
return FINAL
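I get an error saying the index was already given: TypeError: __init__() got multiple values for argument 'index'
So I tried it with just the two lists and no index declaration. It turns out that FINAL = pd.Series(STNAME, CTYNAME) makes CTYNAME the index: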
STNAME = Filter3['STNAME'].tolist()
CTYNAME = Filter3['CTYNAME'].tolist()
ORIG_INDEX = Filter3.index
FINAL = pd.Series(STNAME, CTYNAME)
return FINAL
Washington County            Iowa
Washington County       Minnesota
Washington County    Pennsylvania
Washington County    Rhode Island
Washington County       Wisconsin
dtype: object
How can I create a DataFrame that takes the 2 lists as columns and a third object (of matching length) as the index?
Thank you very much
Answer 0 (score: 3)
I think you need a DataFrame rather than a Series if you want to work with lists:
FINAL = pd.DataFrame({'STNAME':STNAME, 'CTYNAME': CTYNAME},
index=ORIG_INDEX,
columns = ['STNAME', 'CTYNAME'])
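To make the difference concrete, here is a minimal sketch of why the Series call fails and how the DataFrame call avoids it. The small Filter3 frame is a hypothetical stand-in for the real data, and series_error is just an illustrative variable name.

```python
import pandas as pd

# Hypothetical stand-in for the question's Filter3 frame.
Filter3 = pd.DataFrame(
    {"STNAME": ["Iowa", "Minnesota", "Pennsylvania"],
     "CTYNAME": ["Washington County"] * 3},
    index=[10, 20, 3],
)

STNAME = Filter3["STNAME"].tolist()
CTYNAME = Filter3["CTYNAME"].tolist()
ORIG_INDEX = Filter3.index

# The second positional argument of pd.Series is the index, so CTYNAME and
# index=ORIG_INDEX both try to fill the same parameter.
try:
    pd.Series(STNAME, CTYNAME, index=ORIG_INDEX)
    series_error = None
except TypeError as exc:
    series_error = exc
print(type(series_error).__name__)

# A DataFrame accepts a dict of column name -> list, plus the index to reuse.
FINAL = pd.DataFrame(
    {"STNAME": STNAME, "CTYNAME": CTYNAME},
    index=ORIG_INDEX,
    columns=["STNAME", "CTYNAME"],
)
print(FINAL)
```

The lists and the index must have the same length, otherwise the DataFrame constructor raises an error.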
Or better, just select the subset with a list of columns, and add DataFrame.copy to avoid a possible SettingWithCopyWarning:
FINAL = Filter3[['STNAME', 'CTYNAME']].copy()
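As a rough illustration of what the copy buys you, the sketch below modifies FINAL afterwards and checks that Filter3 is left alone. The three-row frame and the STATE_UPPER column are made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for the question's Filter3 frame.
Filter3 = pd.DataFrame(
    {"COL": ["a", "b", "s"],
     "STNAME": ["Iowa", "Minnesota", "Pennsylvania"],
     "CTYNAME": ["Washington County"] * 3},
    index=[10, 20, 3],
)

# Selecting with a list of columns keeps the original index; copy() makes
# FINAL an independent object.
FINAL = Filter3[["STNAME", "CTYNAME"]].copy()

# Changes to the copy do not propagate back to Filter3.
FINAL["STATE_UPPER"] = FINAL["STNAME"].str.upper()
FINAL.loc[10, "STNAME"] = "IA"

print(FINAL)
print(Filter3)
```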
Sample:
import pandas as pd

d = {'COL': ['a', 'b', 's', 'b', 'b'],
     'STNAME': ['Iowa', 'Minnesota', 'Pennsylvania', 'Rhode Island', 'Wisconsin'],
     'CTYNAME': ['Washington County', 'Washington County', 'Washington County',
                 'Washington County', 'Washington County']}
Filter3 = pd.DataFrame(d, index=[10, 20, 3, 50, 40])
print (Filter3)
   COL            CTYNAME        STNAME
10   a  Washington County          Iowa
20   b  Washington County     Minnesota
3    s  Washington County  Pennsylvania
50   b  Washington County  Rhode Island
40   b  Washington County     Wisconsin
FINAL = Filter3[['STNAME', 'CTYNAME']].copy()
print (FINAL)
          STNAME            CTYNAME
10          Iowa  Washington County
20     Minnesota  Washington County
3   Pennsylvania  Washington County
50  Rhode Island  Washington County
40     Wisconsin  Washington County
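If you want to convince yourself that the two approaches agree, a quick check along these lines should work. It is only a sketch on the sample data, and by_subset and from_lists are illustrative names.

```python
import pandas as pd

d = {'COL': ['a', 'b', 's', 'b', 'b'],
     'STNAME': ['Iowa', 'Minnesota', 'Pennsylvania', 'Rhode Island', 'Wisconsin'],
     'CTYNAME': ['Washington County'] * 5}
Filter3 = pd.DataFrame(d, index=[10, 20, 3, 50, 40])

# Approach 1: subset by a list of columns and copy.
by_subset = Filter3[['STNAME', 'CTYNAME']].copy()

# Approach 2: rebuild from lists, reusing the original index.
from_lists = pd.DataFrame(
    {'STNAME': Filter3['STNAME'].tolist(),
     'CTYNAME': Filter3['CTYNAME'].tolist()},
    index=Filter3.index,
    columns=['STNAME', 'CTYNAME'],
)

# Both frames hold the same values under the same, unsorted index.
print(by_subset.equals(from_lists))
print(by_subset.index.tolist())
```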