What I have so far:
import numpy as np
# 1) Read CSV with headers
data = np.genfromtxt("big.csv", delimiter=',', names=True)
# 2) Get absolute values for column in a new ndarray
new_ndarray = np.absolute(data["target_column_name"])
# 3) Append column in new_ndarray to data
# I'm having trouble here: I can't get hstack, concatenate, append, etc. to work
# 4) Sort by new column and obtain a new ndarray
data.sort(order="target_column_name_abs")
What I want:
Answer 0 (score: 3)
Here is one way to do it.
First, let's create an example array:
In [39]: a = (np.arange(12).reshape(4, 3) - 6)
In [40]: a
Out[40]:
array([[-6, -5, -4],
       [-3, -2, -1],
       [ 0,  1,  2],
       [ 3,  4,  5]])
OK, let's say
In [41]: col = 1
is the column we want to sort by, and here is the sorting code, using Python's built-in sorted:
In [42]: b = sorted(a, key=lambda row: np.abs(row[col]))
Converting b from a list back to an array gives:
In [43]: np.array(b)
Out[43]:
array([[ 0,  1,  2],
       [-3, -2, -1],
       [ 3,  4,  5],
       [-6, -5, -4]])
which is the original array with its rows sorted by the absolute value of column 1.
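As an alternative that stays entirely in NumPy, the same ordering can probably be obtained with np.argsort on the absolute values, which avoids building a Python list. This is a minimal sketch and not part of the original answer. The CSV contents and the column name x are made up for illustration, while target_column_name is taken from the question. It also suggests that a structured array from genfromtxt can be reordered this way without appending a new column first.

```python
import io

import numpy as np

# Plain 2-D array, same data as the example above.
a = np.arange(12).reshape(4, 3) - 6
col = 1

# Indices that would sort column `col` by absolute value.
order_a = np.argsort(np.abs(a[:, col]), kind="stable")
a_sorted = a[order_a]  # fancy indexing reorders whole rows

# Structured array, as produced by genfromtxt(..., names=True) in the question.
# The CSV text here is hypothetical sample data.
csv_text = io.StringIO("x,target_column_name\n1,-3\n2,2\n3,-1\n4,4\n")
data = np.genfromtxt(csv_text, delimiter=",", names=True)

# No extra "_abs" column is needed: sort the indices, then index the records.
order_data = np.argsort(np.abs(data["target_column_name"]), kind="stable")
data_sorted = data[order_data]

print(a_sorted)
print(data_sorted)
```

Passing kind="stable" keeps rows with equal absolute values in their original relative order.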
Answer 1 (score: 1)
Here is a solution using pandas:
In [117]: import pandas as pd
In [118]: df = pd.read_csv('test.csv')
In [119]: df
Out[119]:
   a  b
0  1 -3
1  2  2
2  3 -1
3  4  4
In [120]: df['c'] = abs(df['b'])
In [121]: df
Out[121]:
   a  b  c
0  1 -3  3
1  2  2  2
2  3 -1  1
3  4  4  4
In [122]: df.sort_values(by='c')
Out[122]:
   a  b  c
2  3 -1  1
1  2  2  2
0  1 -3  3
3  4  4  4
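If the helper column c is not wanted in the result, newer pandas versions (1.1 and later, as far as I know) accept a key argument in sort_values that is applied to the column before sorting. This is a sketch under that assumption and not part of the original answer. The DataFrame is built inline with the same values as test.csv above.

```python
import pandas as pd

# Same values as the test.csv example above, built inline for the sketch.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [-3, 2, -1, 4]})

# `key` receives the column as a Series and must return a Series of sort keys.
result = df.sort_values(by="b", key=lambda s: s.abs())

print(result)
```

The rows come out in the same order as with the explicit c column, but df itself is left unchanged.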