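I am working through the TensorFlow 2.0 tutorial on Google's site that covers the Feature Columns API. In the second section, which discusses numeric columns, the sample code produces the warning below. The warning seems to be about casting some data, but the message does not say exactly how to resolve it, that is, where the user should cast the data explicitly to avoid the warning.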
WARNING:tensorflow:Layer dense_features is casting an input tensor from dtype
float64 to the layer's dtype of float32, which is new behavior in TensorFlow
2. The layer has dtype float32 because it's dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this
warning. If in doubt, this warning is likely only an issue if you are porting
a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call
`tf.keras.backend.set_floatx('float64')`. To change just this layer, pass
dtype='float64' to the layer constructor. If you are the author of this layer,
you can disable autocasting by passing autocast=False to the base Layer
constructor.
I am trying to work out how to resolve this warning, because it also shows up in some of my own code. The code that produces the warning is:
from __future__ import absolute_import, division, print_function, unicode_literals
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
URL = 'https://storage.googleapis.com/applied-dl/heart.csv'
dataframe = pd.read_csv(URL)
train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')
def df_to_dataset(dataframe, shuffle=True, batch_size=32):
    dataframe = dataframe.copy()
    labels = dataframe.pop('target')
    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(dataframe))
    ds = ds.batch(batch_size)
    return ds
batch_size = 5  # A small batch size is used for demonstration purposes
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)
# We will use this batch to demonstrate several types of feature columns
example_batch = next(iter(train_ds))[0]
# A utility method to create a feature column
# and to transform a batch of data
def demo(feature_column):
    feature_layer = layers.DenseFeatures(feature_column)
    print(feature_layer(example_batch).numpy())
age = feature_column.numeric_column("age")
demo(age) # <-- SHOULD TRIGGER OR DISPLAY THE WARNING
Any suggestions on how to resolve this?
Answer 0 (score: 0)
tl;dr: to silence this warning, manually cast your inputs to float32:
X = tf.cast(X, tf.float32)
y = tf.cast(y, tf.float32)
Or, using numpy:
X = np.array(X, dtype=np.float32)
y = np.array(y, dtype=np.float32)
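In the question's code the data starts out as a pandas DataFrame, so one possible place for the cast is before the frame is handed to df_to_dataset. The following is only a minimal sketch, assuming that the float64 columns of the frame are what trigger the warning; the helper name cast_float64_to_float32 and the small sample frame are hypothetical stand-ins, not part of the tutorial.

```python
import numpy as np
import pandas as pd


def cast_float64_to_float32(df):
    """Return a copy of df with every float64 column downcast to float32."""
    float64_cols = df.select_dtypes(include=["float64"]).columns
    # astype with a dict only touches the listed columns and returns a new frame.
    return df.astype({col: np.float32 for col in float64_cols})


# Hypothetical stand-in for the heart dataset; the values are made up.
sample = pd.DataFrame({
    "age": [63, 67],                # integer column, left untouched
    "oldpeak": [2.3, 1.5],          # float64 column, downcast
    "thal": ["fixed", "normal"],    # string column, left untouched
})

converted = cast_float64_to_float32(sample)
print(converted.dtypes)
```

In the question's script this would be applied to the DataFrame before the train/validation/test split, so that all three datasets are built from float32 data.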
Explanation
By default, TensorFlow's Keras layers use floatx, which defaults to float32 because that precision is sufficient for deep learning. You can verify this:
import tensorflow as tf
tf.keras.backend.floatx()
Out[3]: 'float32'
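The input you provided has dtype float64, so there is a mismatch between TensorFlow's default dtype for the weights and the dtype of the input. TensorFlow does not like this, because casting (changing the dtype) is costly. When operating on tensors of different dtypes (for example, comparing float32 logits with float64 labels), TensorFlow will generally raise an error.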
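The mismatch described above can be reproduced outside of any layer. This is a small sketch, assuming eager execution in TensorFlow 2; the exact exception type may differ between versions, so the example catches two plausible ones rather than relying on a single class.

```python
import tensorflow as tf

a = tf.constant([1.0, 2.0], dtype=tf.float32)
b = tf.constant([1.0, 2.0], dtype=tf.float64)

try:
    # Mixing float32 and float64 operands is not cast implicitly here.
    a + b
except (tf.errors.InvalidArgumentError, TypeError) as err:
    print("dtype mismatch:", type(err).__name__)

# An explicit cast makes both operands float32, so the addition succeeds.
total = a + tf.cast(b, tf.float32)
print(total.dtype)
```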
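The new behavior the warning refers to:
Layer my_model_1 is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2
is that the layer now casts the input to float32 automatically. In this situation TensorFlow 1.X probably raised an exception, although I cannot say I ever used it.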
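The warning text itself names two ways to make the dtypes agree: cast the input to the layer's float32, or give the layer a float64 dtype through its constructor. The sketch below contrasts the two on a plain Dense layer instead of the tutorial's DenseFeatures layer, which is an assumption made to keep the example small; the input values are arbitrary.

```python
import numpy as np
import tensorflow as tf

# NumPy creates float64 arrays by default, much like the pandas columns in the question.
x64 = np.array([[1.0, 2.0, 3.0]], dtype=np.float64)

# Option A: keep the layer at the default float32 and cast the input explicitly.
default_layer = tf.keras.layers.Dense(1)
out32 = default_layer(tf.cast(x64, tf.float32))

# Option B: leave the input as float64 and build the layer with a matching dtype.
float64_layer = tf.keras.layers.Dense(1, dtype="float64")
out64 = float64_layer(x64)

print(out32.dtype, out64.dtype)
```

Option A keeps the whole model in float32, which is the default and uses less memory; Option B is mainly relevant when float64 precision is actually required.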