I have two DataFrames.
results:
0 2211 E Winston Rd Ste B, 92806, CA 33.814547 -117.886028 4
1 P.O. Box 5601, 29304, SC 34.945855 -81.930035 6
2 4113 Darius Dr, 17025, PA 40.287768 -76.967292 8
acctypeDF:
0 rooftop
1 place
2 rooftop
I want to combine these two DataFrames into one, so I did:
import pandas as pd
resultsfinal = pd.concat([results, acctypeDF], axis=1)
But the output is:
resultsfinal
Out[155]:
0 1 2 3 0
0 2211 E Winston Rd Ste B, 92806, CA 33.814547 -117.886028 4 rooftop
1 P.O. Box 5601, 29304, SC 34.945855 -81.930035 6 place
2 4113 Darius Dr, 17025, PA 40.287768 -76.967292 8 rooftop
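You can see that the output repeats the index number 0. Why does this happen? My goal is to drop the first index (the first column, which holds the addresses), but I get this error: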
resultsfinal.drop(columns='0')
raise KeyError('{} not found in axis'.format(labels))
KeyError: "['0'] not found in axis"
I also tried:
resultsfinal = pd.concat([results, acctypeDF], axis=1,ignore_index=True)
resultsfinal
Out[158]:
0 1 ... 4 5
0 2211 E Winston Rd Ste B, 92806, CA 33.814547 ... rooftop rooftop
1 P.O. Box 5601, 29304, SC 34.945855 ... place place
But as you can see above, even though the repeated index 0 is gone, it now creates a duplicate column (5).
If I do this:
resultsfinal = results[results.columns[1:]]
resultsfinal
Out[161]:
1 2 ... 0 0
0 33.814547 -117.886028 ... 2211 E Winston Rd Ste B, 92806, CA rooftop
1 34.945855 -81.930035 ... P.O. Box 5601, 29304, SC place
print(resultsfinal.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 5 columns):
0 10 non-null object
1 10 non-null float64
2 10 non-null float64
3 10 non-null int64
4 10 non-null object
dtypes: float64(2), int64(1), object(2)
memory usage: 480.0+ bytes
Answer 0 (score: 1)
Use ignore_index=True:
resultsfinal = pd.concat([results, acctypeDF], axis=1, ignore_index=True)
or relabel the columns after concatenating:
resultsfinal = pd.concat([results, acctypeDF], axis=1)
resultsfinal.columns = range(len(resultsfinal.columns))
print(resultsfinal)
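As for why 0 shows up twice: with axis=1, pd.concat keeps each frame's own column labels, and both frames here seem to have a column labeled 0. A minimal sketch of that, assuming both frames use default integer column labels as the output in the question suggests. The two small frames below are stand-ins built from the rows shown in the question, not the asker's real data:

```python
import pandas as pd

# Stand-in frames built from the rows shown in the question (assumed layout).
results = pd.DataFrame([
    ["2211 E Winston Rd Ste B, 92806, CA", 33.814547, -117.886028, 4],
    ["P.O. Box 5601, 29304, SC", 34.945855, -81.930035, 6],
    ["4113 Darius Dr, 17025, PA", 40.287768, -76.967292, 8],
])
acctypeDF = pd.DataFrame(["rooftop", "place", "rooftop"])

# Plain concat keeps each frame's own labels, so the label 0 appears twice.
dup = pd.concat([results, acctypeDF], axis=1)
dup_labels = list(dup.columns)

# ignore_index=True relabels the concatenated axis as 0..n-1.
resultsfinal = pd.concat([results, acctypeDF], axis=1, ignore_index=True)
new_labels = list(resultsfinal.columns)

print(dup_labels)
print(new_labels)
```

Once the labels are unique, selecting or dropping a column by label is no longer ambiguous.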
To delete the first column:
resultsfinal[resultsfinal.columns[1:]]
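On the KeyError from the question: it most likely comes from the label type, since the columns appear to be labeled with the integer 0 rather than the string '0'. A sketch under that assumption, using a hypothetical two-row frame with integer labels. Selecting by position with iloc is shown as an alternative that does not depend on the labels at all:

```python
import pandas as pd

# Hypothetical frame with default integer column labels 0..4.
resultsfinal = pd.DataFrame([
    ["2211 E Winston Rd Ste B, 92806, CA", 33.814547, -117.886028, 4, "rooftop"],
    ["P.O. Box 5601, 29304, SC", 34.945855, -81.930035, 6, "place"],
])

# The labels are integers, so the string "0" is not among them.
try:
    resultsfinal.drop(columns="0")
    string_label_found = True
except KeyError:
    string_label_found = False

# Dropping by the integer label works when that label is unique.
by_label = resultsfinal.drop(columns=0)

# Dropping by position works even when labels are duplicated.
by_position = resultsfinal.iloc[:, 1:]

print(string_label_found)
print(list(by_label.columns))
```

Neither call modifies resultsfinal in place, so assign the result back if you want to keep it.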