Can someone show me an example of tf.data.experimental.group_by_reducer? I find the documentation tricky and can't fully understand it.
How can I use it to compute an average?
Answer 0 (score: 3)
Suppose we are given a dataset of ['ids', 'features'], and we want to group the data by summing the 'features' values that correspond to the same 'ids'. We can use tf.contrib.data.group_by_reducer(key_func, reducer) to achieve this.
Original data:
ids | features
--------------
1 | 1
2 | 2.2
3 | 7
1 | 3.0
2 | 2
3 | 3
Desired data:
ids | features
--------------
1 | 4
2 | 4.2
3 | 10
TensorFlow code:
import tensorflow as tf
tf.enable_eager_execution()
ids = [1, 2, 3, 1, 2, 3]
features = [1, 2.2, 7, 3.0, 2, 3]
# Define reducer
# Reducer requires 3 functions - init_func, reduce_func, finalize_func.
# init_func - to define initial value
# reduce_func - operation to perform on values with same key
# finalize_func - value to return in the end.
def init_func(_):
    return 0.0
def reduce_func(state, value):
    return state + value['features']
def finalize_func(state):
    return state
reducer = tf.contrib.data.Reducer(init_func, reduce_func, finalize_func)
# Group by reducer
# Group the data by id
def key_f(row):
    return tf.to_int64(row['ids'])
t = tf.contrib.data.group_by_reducer(
    key_func=key_f,
    reducer=reducer)
ds = tf.data.Dataset.from_tensor_slices({'ids':ids, 'features' : features})
ds = ds.apply(t)
ds = ds.batch(6)
iterator = ds.make_one_shot_iterator()
data = iterator.get_next()
print(data)
Consider id == 1. We set the initial value to 0 using init_func. reduce_func will perform the operations 0 + 1 and 1 + 3.0, and finalize_func will return 4.0.
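The same reduction, traced in plain Python for the id == 1 group (purely illustrative; the helper names are mine, not part of the answer's code):
from functools import reduce
values_for_id_1 = [1.0, 3.0]  # the 'features' rows whose id is 1
total = reduce(lambda state, v: state + v, values_for_id_1, 0.0)  # (0.0 + 1.0) + 3.0
print(total)  # 4.0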
In the group_by_reducer function, key_func is a function that returns the key of a given row of data. The key must be an int64. In our case, we use 'ids' as the key.
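Since the question asks about computing an average, here is a minimal sketch of how the reducer above could be changed to average, rather than sum, the 'features' per id. It assumes the reducer state may be a tuple of tensors (sum, count); mean_reducer is an illustrative name:
def init_func(_):
    return (0.0, 0.0)  # state = (running sum, running count), assumed tuple state
def reduce_func(state, value):
    total, count = state
    return (total + value['features'], count + 1.0)
def finalize_func(state):
    total, count = state
    return total / count  # mean = sum / count
mean_reducer = tf.contrib.data.Reducer(init_func, reduce_func, finalize_func)
With the sample data above, this should yield 2.0 for id 1, 2.1 for id 2, and 5.0 for id 3.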
Answer 1 (score: 0)
I adjusted @Illuminati0x5B's code to work with TF2.0. Thanks @Illuminati0x5B, your sample code was really helpful.
TensorFlow code (adjusted):
import tensorflow as tf
ids = [1, 2, 3, 1, 2, 3]
features = [1, 2.2, 7, 3.0, 2, 3]
# Define reducer
# Reducer requires 3 functions - init_func, reduce_func, finalize_func.
# init_func - to define initial value
# reduce_func - operation to perform on values with same key
# finalize_func - value to return in the end.
def init_func(_):
    return 0.0
def reduce_func(state, value):
    return state + value['features']
def finalize_func(state):
    return state
reducer = tf.data.experimental.Reducer(init_func, reduce_func, finalize_func)
# Group by reducer
# Group the data by id
def key_f(row):
    return tf.dtypes.cast(row['ids'], tf.int64)
t = tf.data.experimental.group_by_reducer(
    key_func=key_f,
    reducer=reducer)
ds = tf.data.Dataset.from_tensor_slices({'ids':ids, 'features' : features})
ds = ds.apply(t)
ds = ds.batch(6)
iterator = tf.compat.v1.data.make_one_shot_iterator(ds)
data = iterator.get_next()
print(data)
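One usage note: in TF 2.x, eager execution is enabled by default and a tf.data.Dataset is directly iterable, so (assuming eager mode) the compat.v1 one-shot iterator can be replaced with a plain loop:
for batch in ds:
    print(batch)  # one batch holding the per-id reduced values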