Question

Python variables hold references to lists (effectively pointers), so I can do the following:

a = []
b = a

b.append(1)

>>> print(a, b)
[1] [1]

How can I get this behavior with numpy? numpy's append creates a new array. That is:

a = np.array([])
b = a

b = np.append(b, 1)
>>> print(a, b)
[] [1.]
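
A minimal sketch of the difference, assuming a small non-empty integer array (the values are made up for illustration): in-place operations such as item assignment or `+=` are visible through both names, because `a` and `b` refer to the same array object, whereas `np.append` builds a new array and assigning its result to `b` only rebinds that name.

```python
import numpy as np

a = np.array([1, 2, 3])
b = a                      # b is another name for the same array object

b[0] = 10                  # in-place item assignment: visible through a
b += 1                     # in-place arithmetic: also visible through a
shared_after_inplace = b is a

b = np.append(b, 4)        # returns a NEW array; b is rebound, a is untouched
shared_after_append = b is a
```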

Edit: what I am trying to accomplish:

I have a large text file that I am parsing with re: depending on markers in the file, I want to switch which array I append to. For example:

import re
import numpy as np

x = np.array([])
y = np.array([])

with open("./data.txt", "r") as f:
    for line in f:
        if re.match('x values', line):
            print("reading x values")
            array = x
        elif re.match('y', line):
            print("reading y values")
            array = y
        else:
            values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
            if values:
                np.append(array, values.groups()[0].split())

Answer 1

Based on your updated question, it looks like you can easily solve the problem by keeping a dictionary of numpy arrays:

import re
import numpy as np

x = np.array([])
y = np.array([])
Arrays = {"x": x, "y": y}

with open("./data.txt", "r") as f:
    for line in f:
        if re.match('x values', line):
            print("reading x values")
            key = "x"
        elif re.match('y', line):
            print("reading y values")
            key = "y"
        else:
            values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
            if values:
                # np.append returns a new array, so store it back under the key
                Arrays[key] = np.append(Arrays[key], values.groups()[0].split())

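As Sven Marnach points out in the comments here and on your question, this is an inefficient use of numpy arrays, because every np.append call allocates a new array and copies all the existing data into it.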

A better approach (again, as Sven points out) would be:

Arrays = {"x": [], "y": []}

with open("./data.txt", "r") as f:
    for line in f:
        if re.match('x values', line):
            print("reading x values")
            key = "x"
        elif re.match('y', line):
            print("reading y values")
            key = "y"
        else:
            values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
            if values:
                # each data line is appended as one sub-list of strings
                Arrays[key].append(values.groups()[0].split())

# convert each list to an array once, after all the appending is done
Arrays = {key: np.array(Arrays[key]) for key in Arrays}
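
If flat 1-D numeric arrays are wanted instead of one sub-list per line, a variant of the same idea is to use list.extend with a float conversion. This is only a sketch under assumptions: the in-memory `sample` text stands in for data.txt and is made up for illustration, as are the names `collected` and `arrays`, and each data line is assumed to hold whitespace-separated numbers.

```python
import io
import re

import numpy as np

# Hypothetical stand-in for data.txt, made up for this example.
sample = io.StringIO(
    "x values\n"
    "  1.0 2.0 3.0\n"
    "  4.0\n"
    "y values\n"
    "  5.0E+0 6.0\n"
)

collected = {"x": [], "y": []}
key = None
for line in sample:
    if re.match("x values", line):
        key = "x"
    elif re.match("y", line):
        key = "y"
    else:
        values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
        if values and key is not None:
            # extend (not append) keeps the list flat; float() converts the text
            collected[key].extend(float(v) for v in values.group(1).split())

# Convert each list to an array once, after all the appending is done.
arrays = {k: np.array(v) for k, v in collected.items()}
```

The lists grow in place while parsing, and each array is built only once at the end.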

Answer 2

So a simple switch to list append can be written as:

x, y = [], []
with open("./data.txt", "r") as f:
    for line in f:
        if re.match('x values', line):
            print("reading x values")
            alist = x
        elif re.match('y', line):
            print("reading y values")
            alist = y
        else:
            values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
            if values:
                alist.append(values.groups()[0].split())

Now x and y are both lists of sub-lists. If the sub-lists are all the same size, then

x_array = np.array(x)

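produces a 2d array. But if the sub-lists differ in size, this produces a 1d array with dtype=object, which is just a list with array overhead. (On NumPy 1.24 and later you have to pass dtype=object explicitly for the ragged case; otherwise np.array raises a ValueError.) For example: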

In [98]: np.concatenate([[1,2,3],[1,2]])
Out[98]: array([1, 2, 3, 1, 2])

In [99]: np.array([[1,2,3],[1,2]])
Out[99]: array([[1, 2, 3], [1, 2]], dtype=object)

In [100]: np.array([[1,2,3],[1,2,4]])
Out[100]: 
array([[1, 2, 3],
       [1, 2, 4]])

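I wouldn't expect much of a timing difference between using these two global variables with lists and the {"x": [], "y": []} approach. Global variables are kept in a dictionary as well.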

The real question is whether you collect the intermediate values in lists or in arrays.
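
A rough sketch of the two collection strategies, using made-up data (the names `values`, `from_list` and `from_array` are illustrative only). Both loops give the same result; the difference is that list.append modifies the list in place, while each np.append call allocates a new array and copies everything collected so far.

```python
import numpy as np

values = [float(i) for i in range(1000)]   # made-up data

# Collect in a list, convert once at the end.
collected = []
for v in values:
    collected.append(v)          # modifies the list in place
from_list = np.array(collected)

# Collect in an array: every np.append allocates and copies the whole array.
from_array = np.array([])
for v in values:
    from_array = np.append(from_array, v)
```

To see how much this matters for a given file size, the two loops can be timed with the standard timeit module.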

Answer 3

Take a look at numpy.hstack:

http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.hstack.html

import numpy as np
a = np.arange(0, 10, 1)
b = np.array([5])
np.hstack((a,b))

This returns array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 5]).
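
Note that, like np.append, np.hstack builds a new array rather than extending `a` in place. A small check of that, as a sketch (the names `c`, `same_object`, `shares_memory` and `a_length` are illustrative):

```python
import numpy as np

a = np.arange(0, 10, 1)
b = np.array([5])
c = np.hstack((a, b))

same_object = c is a                       # c is a different array object
shares_memory = np.shares_memory(a, c)     # the data was copied, not shared
a_length = a.shape[0]                      # a itself still has 10 elements
```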

numpy - appending to an array without copying
