Question

I used np.concatenate to join two CSV datasets of the same size (that is, with the same number of columns) row-wise.

combined = np.concatenate((price1,price2))

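How can I use numpy to join two CSV datasets of different sizes row-wise? They share common headers, but one dataset has an additional column.

dataset1 headers: a, b, c, d, e, f, g, h, i, k

dataset2 headers: a, b, c, d, e, f, g, h, i, j (an extra column not needed for the analysis), k

Thank you very much.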

Answer 1

You can use np.delete to remove the extra column, then use np.concatenate:

import numpy as np

headers = list('abcdefghik')
a = np.arange(len(headers)).reshape(1, -1)
#Output: array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

headers_2 = list('abcdefghijk')
b = np.arange(len(headers_2)*2).reshape(2,-1)
#Output: array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
#       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]])

col_to_remove = headers_2.index('j')
np.delete(b, col_to_remove, axis = 1) #note that this does not modify original array, returns a copy.
#Output: array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8, 10],
#       [11, 12, 13, 14, 15, 16, 17, 18, 19, 21]])

result = np.concatenate((a, np.delete(b, col_to_remove, axis = 1)))
#Output: array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
#       [ 0,  1,  2,  3,  4,  5,  6,  7,  8, 10],
#       [11, 12, 13, 14, 15, 16, 17, 18, 19, 21]])
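If the column position is not known in advance, the same idea can be driven by the header names read from the files themselves. The following is only a minimal sketch under some assumptions: each file has exactly one header row, all remaining values are numeric, and every header of dataset1 also appears in dataset2. The helper load_csv and the tiny in-memory CSV strings are hypothetical stand-ins for the real files.

```python
import io
import numpy as np

def load_csv(source):
    """Return (headers, 2-D float array) from a comma-separated source with one header row."""
    headers = source.readline().strip().split(",")
    # loadtxt continues reading from the current position, i.e. after the header line
    data = np.loadtxt(source, delimiter=",", ndmin=2)
    return headers, data

# Hypothetical stand-ins for the two CSV files (shortened header lists)
csv1 = io.StringIO("a,b,k\n1,2,3\n4,5,6\n")
csv2 = io.StringIO("a,b,j,k\n7,8,99,9\n")

headers1, data1 = load_csv(csv1)
headers2, data2 = load_csv(csv2)

# Positions of dataset1's columns inside dataset2, in dataset1's order;
# any column of dataset2 not listed in headers1 (here 'j') is dropped
keep = [headers2.index(name) for name in headers1]

combined = np.concatenate((data1, data2[:, keep]))
```

Selecting columns by name rather than deleting one fixed index also tolerates a different column order in the second file, at the cost of raising a ValueError if a header from dataset1 is missing from dataset2.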
