Training data for a chatbot: how do I build this kind of dataset with TensorFlow?

Asked: 2018-01-10 02:52:13

Tags: python tensorflow dataset chatbot corpus

I want to build a training source for my chatbot with TensorFlow. My corpus file looks like this:

hello!
hello,nice to meet you!
nice to meet you too!
goodbye
bye

I read the corpus file like this:

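import tensorflow as tf

# Read one line per element, lowercase it, then wrap it in bos_/_eos markers.
ds = tf.data.TextLineDataset("./corpus.txt")
ds = ds.map(lambda x: tf.py_func(lambda s: s.lower(), [x], tf.string, stateful=False))
ds = ds.map(lambda x: tf.constant("bos_ ") + x + tf.constant(" _eos"))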

This gives me a dataset like this:

bos_ hello! _eos
bos_ hello,nice to meet you! _eos
bos_ nice to meet you too! _eos
bos_ goodbye _eos
bos_ bye _eos

But how can I build a dataset of (line, next line) pairs like this:

('bos_ hello! _eos', 'bos_ hello,nice to meet you! _eos')
('bos_ hello,nice to meet you! _eos', 'bos_ nice to meet you too! _eos')
('bos_ nice to meet you too! _eos', 'bos_ goodbye _eos')
('bos_ goodbye _eos', 'bos_ bye _eos')
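
One possible way to get these pairs, offered as a sketch rather than a confirmed answer: zip the dataset with a copy of itself shifted by one element. The helper names `make_pairs` and `make_pair_dataset` are hypothetical. The tf.data version assumes a TensorFlow release that provides `tf.strings.lower` and `tf.strings.join`, which is newer than what was current when this question was asked; the pure-Python function shows the same logic without TensorFlow.

```python
def make_pairs(lines):
    """Pure-Python reference: pair each wrapped line with the line that follows it."""
    wrapped = ["bos_ " + line.lower() + " _eos" for line in lines]
    # zip stops at the shorter sequence, so N lines give N - 1 pairs.
    return list(zip(wrapped, wrapped[1:]))


def make_pair_dataset(path="./corpus.txt"):
    """tf.data version of the same idea (hypothetical helper)."""
    # Imported lazily so make_pairs works without TensorFlow installed.
    # Assumption: a TensorFlow version that has tf.strings.lower / tf.strings.join.
    import tensorflow as tf

    ds = tf.data.TextLineDataset(path)
    ds = ds.map(lambda x: tf.strings.join(["bos_ ", tf.strings.lower(x), " _eos"]))
    # ds.skip(1) drops the first line, so zipping aligns line i with line i + 1.
    return tf.data.Dataset.zip((ds, ds.skip(1)))


if __name__ == "__main__":
    corpus = ["hello!", "hello,nice to meet you!", "nice to meet you too!",
              "goodbye", "bye"]
    for pair in make_pairs(corpus):
        print(pair)
```

Because zipping stops at the shorter input, the last line never appears as an input on its own, which matches the four pairs listed above for a five-line corpus.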

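In addition, how can I build a dataset like the following, where each input accumulates all of the preceding lines: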

('bos_ hello! _eos', 'bos_ hello,nice to meet you! _eos')
('bos_ hello! hello,nice to meet you! _eos', 'bos_ nice to meet you too! _eos')
('bos_ hello! hello,nice to meet you! nice to meet you too! _eos', 'bos_ goodbye _eos')
('bos_ hello! hello,nice to meet you! nice to meet you too! goodbye _eos', 'bos_ bye _eos')
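
For the accumulating variant, a plain Python generator that keeps a running history may be the simplest route, since each element depends on everything before it. The sketch below is only one option under stated assumptions: `cumulative_pairs` and `make_context_dataset` are hypothetical names, and the wrapper assumes a TensorFlow version where `tf.data.Dataset.from_generator` accepts `output_signature` (TensorFlow 2.4 or later).

```python
def cumulative_pairs(lines):
    """Yield (all lines so far, next line), both wrapped in bos_/_eos markers."""
    history = []
    for current, reply in zip(lines, lines[1:]):
        history.append(current.lower())
        context = "bos_ " + " ".join(history) + " _eos"
        target = "bos_ " + reply.lower() + " _eos"
        yield context, target


def make_context_dataset(lines):
    """Wrap the generator in a tf.data.Dataset (hypothetical helper)."""
    # Imported lazily so cumulative_pairs works without TensorFlow installed.
    # Assumption: TensorFlow 2.4+ for the output_signature argument.
    import tensorflow as tf

    spec = tf.TensorSpec(shape=(), dtype=tf.string)
    return tf.data.Dataset.from_generator(
        lambda: cumulative_pairs(lines),
        output_signature=(spec, spec),
    )


if __name__ == "__main__":
    corpus = ["hello!", "hello,nice to meet you!", "nice to meet you too!",
              "goodbye", "bye"]
    for context, target in cumulative_pairs(corpus):
        print((context, target))
```

The context string grows with every line, so for a long corpus it would probably need to be truncated or reset at conversation boundaries; the corpus format shown here does not mark those boundaries.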

0 answers:

No answers yet.