Storing multiple numpy arrays in a .npy file while preserving array order

Date: 2019-04-26 02:48:44

Tags: python numpy

The "Consuming NumPy arrays" section of the TensorFlow documentation shows the following code:

# Load the training data into two NumPy arrays, for example using `np.load()`.
with np.load("/var/data/training_data.npy") as data:
  features = data["features"]
  labels = data["labels"]

# Assume that each row of `features` corresponds to the same row as `labels`.
assert features.shape[0] == labels.shape[0]

dataset = tf.data.Dataset.from_tensor_slices((features, labels))

Evidently, training_data.npy consists of two arrays: features and labels. Now suppose I have two numpy arrays, features and labels, which have the same 0th dimension and are ordered so that each corresponding feature and label share the same index. How do I save them in a single .npy file from which the arrays can be accessed with a simple key (as in the code shown above), while preserving the array order?

1 answer:

Answer 0: (score: 0)

with np.load("/var/data/training_data.npy") as data:
  features = data["features"]
  labels = data["labels"]

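In this code, data is indexed like a dictionary. But it is not clear what exactly is stored in the npy file. It could be:

  • a zip archive created by np.savez (usually given the npz extension), in which case features and labels are arrays loaded from their respective files inside the archive.

  • a structured array with two fields, "features" and "labels". In that case data.shape and data.dtype would be informative.

  • I was also going to say it could be a real dictionary with two keys and values. But np.save would wrap that in a 1-element object dtype array, which has to be accessed as data.item()['features'] and so on.

So find out more about data and the npy file.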
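To make the three possibilities above concrete, here is a minimal sketch that writes each kind of file and inspects what np.load returns. The small arrays, the temporary directory and the file names are made up for illustration; they are not from the TensorFlow example. Note that loading a pickled dictionary requires allow_pickle=True on recent NumPy versions.

```python
import os
import tempfile

import numpy as np

# Hypothetical data: 4 samples with 3 features each, plus one label per sample.
features = np.arange(12, dtype=np.float32).reshape(4, 3)
labels = np.array([0, 1, 0, 1])
tmpdir = tempfile.mkdtemp()

# Case 1: a zip archive written by np.savez; np.load returns an NpzFile.
npz_path = os.path.join(tmpdir, "training_data.npz")
np.savez(npz_path, features=features, labels=labels)
with np.load(npz_path) as data:
    archive_type = type(data).__name__
    archive_keys = sorted(data.files)

# Case 2: a structured array with two fields; np.load returns an ndarray.
structured = np.zeros(4, dtype=[("features", np.float32, (3,)),
                                ("labels", np.int64)])
structured["features"] = features
structured["labels"] = labels
struct_path = os.path.join(tmpdir, "structured.npy")
np.save(struct_path, structured)
loaded_struct = np.load(struct_path)
struct_fields = loaded_struct.dtype.names

# Case 3: a dict wrapped by np.save in a 0-d object array (needs pickling).
dict_path = os.path.join(tmpdir, "dict.npy")
np.save(dict_path, {"features": features, "labels": labels})
loaded_obj = np.load(dict_path, allow_pickle=True)
dict_shape = loaded_obj.shape
dict_features = loaded_obj.item()["features"]
```

Only the first case returns an object that can be used in a with statement the way the quoted snippet does; a plain ndarray is not a context manager.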

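Saving the arrays comes with the same two options: use savez to create a zip archive, or build a structured array and save that, which comes down to np.save(filename, data).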
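A minimal sketch of the savez option, assuming the goal is a file that the snippet from the question can read back by key. The random data and the temporary path are placeholders. One detail worth knowing: when np.savez is given a string path it appends .npz if that extension is missing, so the sketch passes an open file handle to keep the .npy name; np.load recognises the zip archive from the file contents rather than from the extension.

```python
import os
import tempfile

import numpy as np

# Hypothetical arrays sharing the same 0th dimension.
features = np.random.rand(5, 3)
labels = np.arange(5)

path = os.path.join(tempfile.mkdtemp(), "training_data.npy")

# An open file handle is written as-is; a string path would become
# "training_data.npy.npz" because np.savez appends the extension.
with open(path, "wb") as f:
    np.savez(f, features=features, labels=labels)

# Same access pattern as in the question: keys come from the keyword names.
with np.load(path) as data:
    loaded_features = data["features"]
    loaded_labels = data["labels"]
```

Each array is stored as its own entry in the archive, so row order within features and labels is exactly what was passed in.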

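Maintaining the order and indexing of the two arrays is easy: saving and loading leave the order unchanged. As long as you slice, index and/or shuffle both arrays in the same way, the pairing of elements between them is preserved.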
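As an illustration of shuffling both arrays "in the same way", the sketch below applies a single permutation to both. The toy arrays and the fixed seed are assumptions, chosen so that the pairing is easy to verify.

```python
import numpy as np

# Toy data: row i of features is [2*i, 2*i + 1] and its label is 10*i.
features = np.arange(10).reshape(5, 2)
labels = np.arange(5) * 10

# Draw one permutation and index both arrays with it.
rng = np.random.default_rng(0)
perm = rng.permutation(len(features))

shuffled_features = features[perm]
shuffled_labels = labels[perm]
```

Shuffling each array with a separate call would draw two different permutations and break the correspondence, which is why the index array is shared.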