Question

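I am trying to load several files into a pipeline. Each file contains 3 signals, and the 3 signals are arranged in 10-minute intervals. When I load the first file, it has the shape (86, 75000, 3). I am using TensorFlow 1.14.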

I tried the following code. So that you can run it yourself, it simulates the loading with zeros:

import numpy as np
import tensorflow as tf


def my_func(x):
    p = np.zeros([86, 75000, 3])
    return p

def load_sign(path):
    sign = tf.compat.v1.numpy_function(my_func, [path], tf.float64)
    return sign

s = [1, 2]  # list of filenames; these would be paths, simulated here with numbers

AUTOTUNE = tf.data.experimental.AUTOTUNE  
ds = tf.data.Dataset.from_tensor_slices(s)
ds = ds.map(load_sign, num_parallel_calls=AUTOTUNE)

itera = tf.data.make_one_shot_iterator(ds)
x = itera.get_next()

with tf.Session() as sess:
    # sess.run(itera.initializer)
    va_sign = sess.run([x])
    va = np.array(va_sign)
    print(va.shape)

I get this shape: (1, 86, 75000, 3), whereas I want to obtain 3 separate variables, each with this shape: (, 75000)

How can I do that? I also tried this code, but I get an error:

import numpy as np
import tensorflow as tf


def my_func(x):
    p = np.zeros([86, 75000, 3])
    x = p[:,:,0]
    y = p[:, :, 1]
    z = p[:, :, 2]
    return x, y, z

# load the signals, in my example it creates the signals using zeros
def load_sign(path):
    a, b, c = tf.compat.v1.numpy_function(my_func, [path], tf.float64)
    return tf.data.Dataset.zip((a,b,c))

s = [1, 2]  # list of filenames; these would be paths, simulated here with numbers

AUTOTUNE = tf.data.experimental.AUTOTUNE  
ds = tf.data.Dataset.from_tensor_slices(s)
ds = ds.map(load_sign, num_parallel_calls=AUTOTUNE)

itera = tf.data.make_one_shot_iterator(ds)
x, y, z = itera.get_next()

with tf.Session() as sess:
    # sess.run(itera.initializer)
    va_sign = sess.run([x])
    va = np.array(va_sign)
    print(va.shape)

Here I would expect x to have the shape (86, 75000), but instead I get the error below. How can I make this work? Even better, can I get x with the shape (, 75000)?

TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.

Answer 1

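numpy_function:
tf.compat.v1.numpy_function(my_func, [path], tf.float64) wraps my_func so that it can run inside the graph; the values themselves come from whatever my_func returns. With a single dtype as the third argument it returns a single tensor, so unpacking it with a, b, c = ... tries to iterate over a tensor, which graph mode does not allow. That is the TypeError above. Since my_func returns three arrays, pass a list of three dtypes instead. The code should then look like this: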

def my_func(x):
    p = np.zeros([86, 75000, 3])
    x = p[:,:,0]
    y = p[:, :, 1]
    z = p[:, :, 2]
    return x, y, z

def load_sign(path):
    func = tf.compat.v1.numpy_function(my_func, [path], [tf.float64, tf.float64, tf.float64])
    return func

The rest is almost the same, with a few minor tweaks:

s = [1, 2]  

AUTOTUNE = tf.data.experimental.AUTOTUNE  
ds = tf.data.Dataset.from_tensor_slices(s)
ds = ds.map(load_sign, num_parallel_calls=AUTOTUNE)

itera = tf.data.make_one_shot_iterator(ds)
output = itera.get_next() # Returns tuple of 3: x,y,z from my_func

with tf.Session() as sess:
    va_sign = sess.run([output])[0] # Unnest single-element list
    for entry in va_sign:
        print(entry.shape)

This yields 3 elements, each of shape (86, 75000).
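The shapes involved can be checked without TensorFlow. Below is a minimal NumPy sketch, assuming a small stand-in array (the shape (4, 6, 3) is hypothetical and only keeps the example light), of how my_func splits the last axis and why wrapping the fetched value in a list, as sess.run([x]) does in the question, adds a leading dimension of 1:

```python
import numpy as np

# Hypothetical small stand-in for the real (86, 75000, 3) file contents.
p = np.zeros([4, 6, 3])

# Slicing the last axis yields one 2-D array per signal, as in my_func.
x, y, z = p[:, :, 0], p[:, :, 1], p[:, :, 2]
print(x.shape, y.shape, z.shape)

# Fetching a one-element list returns a one-element list, so converting
# the result to an array prepends an axis of length 1.
wrapped = np.array([p])
print(wrapped.shape)

# Indexing with [0] removes that extra nesting again.
print(wrapped[0].shape)
```

This is only an analogy for the shapes; in the pipeline itself the slicing happens inside my_func and the fetch happens in sess.run.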

To preprocess the data further and get to (75000,), you can use tf.data.Dataset.unbatch():

AUTOTUNE = tf.data.experimental.AUTOTUNE  
ds = tf.data.Dataset.from_tensor_slices(s)
ds = ds.map(load_sign, num_parallel_calls=AUTOTUNE).unbatch()

itera = tf.data.make_one_shot_iterator(ds)
output = itera.get_next() # Returns tuple of 3: x,y,z from my_func

The same iteration as above will now give you three elements of shape (75000,).
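Conceptually, unbatch() splits every component of a dataset element along its first axis and emits the slices one at a time, so each file then contributes 86 dataset elements instead of one. A rough NumPy analogy follows; it is not TensorFlow's implementation, and n_rows and n_samples are hypothetical stand-ins for 86 and 75000:

```python
import numpy as np

# Hypothetical small stand-ins for the real sizes 86 and 75000.
n_rows, n_samples = 4, 6

# Distinct values so each slice can be traced back to its source row.
p = np.arange(n_rows * n_samples * 3, dtype=np.float64)
p = p.reshape(n_rows, n_samples, 3)
x, y, z = p[:, :, 0], p[:, :, 1], p[:, :, 2]

# Analogy for unbatch(): walk the first axis of all three components in step,
# producing one (x_row, y_row, z_row) tuple per row.
rows = list(zip(x, y, z))

print(len(rows))                   # number of tuples produced per file
print([a.shape for a in rows[0]])  # shapes of the three 1-D arrays in a tuple
```

Each get_next() call on the unbatched dataset then corresponds to one such tuple, with the three signals still aligned row by row.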
