Here is my code:
import sys

import tensorflow as tf


def parse_csv(line):
    # Both columns are read as strings; an empty string is the default value.
    example_defaults = [[''], ['']]
    parsed_line = tf.decode_csv(line, example_defaults)
    features = tf.reshape(parsed_line[0:1], shape=())
    label = tf.reshape(parsed_line[1:2], shape=())
    return features, label


def read_data(input_fname):
    train_path = input_fname
    # Eager execution (TF 1.x) has to be enabled before any other TensorFlow op runs.
    tf.enable_eager_execution()
    # Skip the header row, then parse, shuffle, and batch the remaining lines.
    train_dataset = tf.data.TextLineDataset(train_path).skip(1)
    train_dataset = train_dataset.map(parse_csv)
    train_dataset = train_dataset.shuffle(buffer_size=1000)
    train_dataset = train_dataset.batch(32)
    # View a single example entry from a batch
    features, label = next(iter(train_dataset))
    print("example features:", features[0])
    print("example label:", label[0])


if __name__ == "__main__":
    read_data(sys.argv[1])
Sample CSV file:
col1,col2
"test_email@gmail.com is my email", "label1"
content1, label1
content2, label2
The output is:
example features: "test_email@gmail.com is my email"
example label: "label1"
example features: content1
example label: label1
example features: content2
example label: label2
But I want this result:
example features: "#Email is my email"
example label: "label1"
example features: content1
example label: label1
example features: content2
example label: label2
I want to change the CSV content, but the features are tensors. How can I change the CSV values?
For ordinary text, the conversion would be: re.sub(r"(\w+[\w.]*)@(\w+[\w.]*)\.([A-Za-z]+)", "#Email", csv_content)
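
To make the intended transformation concrete, here is a minimal sketch in plain Python (no TensorFlow involved). It assumes the pattern reads as written in the code below, and that the address in the sample row contains a dot before the domain suffix (test_email@gmail.com), since the pattern requires one. The names EMAIL_PATTERN and mask_email are hypothetical and only used for illustration.

```python
import re

# Assumed reading of the pattern: local part, "@", domain, ".", suffix.
EMAIL_PATTERN = re.compile(r"(\w+[\w.]*)@(\w+[\w.]*)\.([A-Za-z]+)")


def mask_email(text):
    """Replace every email-like substring in text with the token #Email."""
    return EMAIL_PATTERN.sub("#Email", text)


# Feature column values from the sample CSV (header row skipped).
rows = [
    "test_email@gmail.com is my email",
    "content1",
    "content2",
]
masked = [mask_email(row) for row in rows]
for row in masked:
    print(row)
```

Rows without an email-like substring pass through unchanged, which is the behavior I am after for the tensor-valued features as well.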