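I'm reading in some data on distances between cities with pandas, and I only need the distances as numbers for matrix calculations. pandas imports everything fine, but I still get the city names as headers. The result will be used for classical multidimensional scaling.
My CSV (abbreviated) looks like this: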
"","Athens","Barcelona","Brussels"
"Athens",0,3313,2963
"Barcelona",3313,0,1318
"Brussels",2963,1318,0
That's fine, but in my function I only need the values, like this:
0,3313,2963
3313,0,1318
2963,1318,0
I can't work out how to get just this matrix from the CSV above. What should I do?
Answer 0 (score: 1)
You can use:
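import pandas as pd
from io import StringIO  # pd.compat.StringIO was removed in newer pandas versions

data = """"","Athens","Barcelona","Brussels"
"Athens",0,3313,2963
"Barcelona",3313,0,1318
"Brussels",2963,1318,0"""
# index_col=0 moves the city names into the index; replace StringIO(data) with your filename
df = pd.read_csv(StringIO(data), index_col=0)
df.to_numpy()  # or df.values on older pandas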
array([[   0, 3313, 2963],
       [3313,    0, 1318],
       [2963, 1318,    0]], dtype=int64)
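Since the question mentions classical multidimensional scaling, here is a minimal sketch of how the numeric array might feed into it. The helper name `classical_mds` and the variable names are hypothetical, and the code assumes the textbook formulation (double-centering the squared distances, then an eigendecomposition); it is not taken from the original question or from any particular library.

```python
from io import StringIO

import numpy as np
import pandas as pd

csv_text = '''"","Athens","Barcelona","Brussels"
"Athens",0,3313,2963
"Barcelona",3313,0,1318
"Brussels",2963,1318,0'''

# index_col=0 moves the city names into the index, leaving only numbers.
df = pd.read_csv(StringIO(csv_text), index_col=0)
D = df.to_numpy(dtype=float)


def classical_mds(D, k=2):
    """Hypothetical helper: embed an n x n distance matrix in k dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)     # eigh because B is symmetric
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


coords = classical_mds(D, k=2)
print(pd.DataFrame(coords, index=df.index, columns=["x", "y"]))
```

Keeping the city names in the index rather than discarding them makes it easy to label the resulting coordinates afterwards.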
Answer 1 (score: 1)
OK, here is what we have:
from io import StringIO
import pandas as pd

a = StringIO(""""","Athens","Barcelona","Brussels"
"Athens",0,3313,2963
"Barcelona",3313,0,1318
"Brussels",2963,1318,0""")
df = pd.read_csv(a, sep=',', engine='python')
print(df)
  Unnamed: 0  Athens  Barcelona  Brussels
0     Athens       0       3313      2963
1  Barcelona    3313          0      1318
2   Brussels    2963       1318         0
df.loc[:,'Athens':].values
Output:
array([[   0, 3313, 2963],
       [3313,    0, 1318],
       [2963, 1318,    0]])
Using to_csv (note that the row index is included as the first field of each line):
[i for i in df.loc[:,'Athens':].to_csv(header=None).split('\n') if i ]
['0,0,3313,2963', '1,3313,0,1318', '2,2963,1318,0']
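The to_csv output above still carries the row index as the first field of every line. If the goal is text rows that contain only the distances, one possible variation is to pass `index=False` as well. This is a sketch under the assumption that the data are read the same way as in this answer; the variable names `numeric`, `text` and `rows` are made up for illustration.

```python
from io import StringIO

import pandas as pd

a = StringIO('''"","Athens","Barcelona","Brussels"
"Athens",0,3313,2963
"Barcelona",3313,0,1318
"Brussels",2963,1318,0''')

df = pd.read_csv(a)

# Keep only the numeric columns, from 'Athens' onwards.
numeric = df.loc[:, 'Athens':]

# header=False drops the column names, index=False drops the row index.
text = numeric.to_csv(header=False, index=False)

# splitlines() copes with either '\n' or '\r\n' line endings.
rows = text.splitlines()
print(rows)
```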
Answer 2 (score: 1)
First, we read your data as CSV, convert it to an array, and slice off the first column:
df = pd.read_csv(a).to_numpy()[:, 1:]
array([[0, 3313, 2963],
       [3313, 0, 1318],
       [2963, 1318, 0]], dtype=object)
Note that I read your CSV from a, which is defined as follows:
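from io import StringIO
import pandas as pd

# The leading blank line is skipped by read_csv (skip_blank_lines=True by default).
a = StringIO('''
"","Athens","Barcelona","Brussels"
"Athens",0,3313,2963
"Barcelona",3313,0,1318
"Brussels",2963,1318,0
''')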
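The array in this answer comes out with dtype=object because the whole DataFrame, including the column of city names, is converted before slicing. For matrix calculations a numeric dtype is usually preferable. The sketch below shows two possible ways to get one; the variable names are hypothetical, and it assumes every remaining cell is an integer, as in the sample data.

```python
from io import StringIO

import numpy as np
import pandas as pd

csv_text = '''"","Athens","Barcelona","Brussels"
"Athens",0,3313,2963
"Barcelona",3313,0,1318
"Brussels",2963,1318,0
'''

# Option 1: drop the label column while still in pandas, so only the
# integer columns are converted to an array.
numeric = pd.read_csv(StringIO(csv_text)).iloc[:, 1:].to_numpy()

# Option 2: convert first (which gives an object array), then cast.
mixed = pd.read_csv(StringIO(csv_text)).to_numpy()[:, 1:]
cast = mixed.astype(np.int64)

print(numeric.dtype, mixed.dtype, cast.dtype)
```

Option 1 avoids the intermediate object array entirely, which may matter for larger distance tables.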