Question

I am using TensorFlow 2.0 for text classification.

The data is structured roughly as follows:

First approach:

x: List[List[int]] # list of sentences consisting of a list of word IDs for each word in the sentence
y: List[int] # binary truth indicator

However, when calling model.fit(...), I get the following error message:

Failed to find data adapter that can handle input: (<class 'list'> containing values of types {'(<class \'list\'> containing values of types {"<class \'int\'>"})', "(<class 'list'> containing values of types set())"}), <class 'numpy.ndarray'>

This happens even though no set is used anywhere.

Second approach:

I tried using NumPy arrays for the inner lists, like this:

x: List[np.ndarray[np.int32]]
y: np.ndarray[np.int32]

But then I get the following error:

Input arrays should have the same number of samples as target arrays. Found 32 input samples and 479 target samples.

Third approach:

This led me to change the data structure to:

x: np.ndarray[np.ndarray[np.int32]]
y: np.ndarray[np.int32]

This results in the following error:

Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Fourth approach:

Trying

x: np.ndarray[List[int]]
y: np.ndarray[int]

results in a similar error message:

Failed to convert a NumPy array to a Tensor (Unsupported object type list).

TL;DR

So the question is: what is going on? Why does model.fit(...) not accept these arguments?

Please see my answer below.

Answer 1

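I am documenting this confusion because the underlying problem has nothing to do with the error messages.

The underlying problem is that the input data (x) needs to be padded.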

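Sentences naturally have different lengths, and TensorFlow's model.fit(...) does not like that. To make it play nicely, I had to pad the sentences so that every sentence in the list contains the same number of words. (I simply zero-padded them.)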
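A minimal sketch of the zero-padding described above, using only NumPy. The helper name zero_pad and the sample word IDs are made up for illustration, and padding on the right with ID 0 is an assumption (the answer only says the sentences were zero-padded). Keras also ships tf.keras.preprocessing.sequence.pad_sequences, which serves the same purpose.

```python
import numpy as np

def zero_pad(sentences, pad_id=0):
    """Right-pad each list of word IDs with pad_id up to the longest sentence.

    Hypothetical helper, not part of TensorFlow.
    """
    max_len = max((len(s) for s in sentences), default=0)
    # Start from a rectangular array filled with the padding ID ...
    padded = np.full((len(sentences), max_len), pad_id, dtype=np.int32)
    # ... then copy each sentence into the front of its row.
    for row, sentence in enumerate(sentences):
        padded[row, :len(sentence)] = sentence
    return padded

# Made-up data: four sentences of different lengths, one of them empty.
x = [[4, 7, 2], [9], [], [3, 5]]
x_padded = zero_pad(x)
print(x_padded.shape)  # (4, 3)
```

The result is a single rectangular int32 array with one row per sentence, so the number of input samples matches the number of labels in y.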

If you pad the input, both the third approach and the fourth approach should work.
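To illustrate why padding makes the difference, the sketch below compares what NumPy builds from ragged and from padded sentences. The word IDs are made up, and the variable names are only for illustration. It does not call model.fit; it only shows the array shapes and dtypes involved.

```python
import numpy as np

# Made-up word IDs: the same three sentences, before and after zero-padding.
ragged = [[4, 7, 2], [9], [3, 5]]
padded = [[4, 7, 2], [9, 0, 0], [3, 5, 0]]

# Ragged rows cannot form a rectangle, so NumPy can only store them as a
# 1-D array of Python objects. This is the kind of array behind the
# "Unsupported object type" errors.
x_ragged = np.array(ragged, dtype=object)
print(x_ragged.dtype, x_ragged.shape)  # object (3,)

# Third approach after padding: equal-length int32 rows combine into one
# 2-D int32 array.
x_third = np.array([np.array(row, dtype=np.int32) for row in padded])
print(x_third.dtype, x_third.shape)  # int32 (3, 3)

# Fourth approach after padding: equal-length lists also become a 2-D
# integer array.
x_fourth = np.array(padded)
print(x_fourth.shape)  # (3, 3)
```

A rectangular numeric array of shape (samples, sentence_length) is something TensorFlow can convert to a tensor, whereas an object array of lists or arrays is not.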
