I have a CSV file, Decoded.csv:
Query,Doc,article_id,data_source
5000,how to get rid of serve burn acne,1 Rose water and sandalwood: Make a paste of rose water and sandalwood and gently apply it on your acne scars.
2 Leave the paste on your skin overnight then wash it with cold water the next morning.
3 Do this regularly together with other natural treatments for acne scars to get rid of the scars as quickly as possible.,459,random
5001,what is hypospadia,A birth defect of the male urethra.,409,dummy
5002,difference between alimentary canal and accessory organs,The alimentary canal is the tube going from the mouth to the anus. The accessory organs are the organs located along that canal which produce enzymes to aid the digestion process.,461,nytimes
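There are 3 queries: 5000, 5001 and 5002. Query 5000 has a Doc value that spans multiple lines, which confuses pandas. (1 Rose water and sandalwood: Make a paste of rose water and sandalwood and gently apply it on your acne scars. 2 Leave the paste on your skin overnight then wash it with cold water the next morning. 3 Do this regularly together with other natural treatments for acne scars to get rid of the scars as quickly as possible.)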
My Python code is below:
def main():
    import pandas as pd
    dataframe = pd.read_csv("Decoded.csv")
    queries, docs = dataframe['Query'], dataframe['Doc']
    for idx in range(len(queries)):
        print("idx: ", idx, " ", queries[idx], " <-> ", docs[idx])
        query_doc_appended = (queries[idx] + " " + docs[idx])
        print(query_doc_appended)

if __name__ == '__main__':
    main()
It fails. Please point out how to remove the newline characters so that Query 5000 gets the complete set of sentences for its Doc.
Answer 0 (score: 0)
Your query 5001 row contains too many fields: it has 5 columns instead of the 4 declared in the header.
5001,what is hypospadia,A birth defect of the male urethra.,409,dummy
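One way to confirm this kind of mismatch is to count the fields on each line and compare them with the header. The following is only a minimal sketch using the standard csv module; the `sample` text is a shortened, hypothetical stand-in for the real file, not its actual contents.

```python
import csv
import io

# Hypothetical, shortened sample with the same shape as the question's file:
# a 4-column header, unquoted commas, and a Doc value split over two lines.
sample = (
    "Query,Doc,article_id,data_source\n"
    "5000,some query,first line of doc\n"
    "second line of doc,459,random\n"
    "5001,what is hypospadia,A birth defect of the male urethra.,409,dummy\n"
)

rows = list(csv.reader(io.StringIO(sample)))
expected = len(rows[0])  # number of columns declared in the header

# Collect (line_number, field_count) for every line that disagrees with the header.
mismatches = [
    (line_no, len(row))
    for line_no, row in enumerate(rows[1:], start=2)
    if len(row) != expected
]
print(mismatches)
```

Any line listed in `mismatches` has either too few fields (a record broken by a newline) or too many (an unquoted comma inside a value).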
You can fix this by wrapping your Doc content in double quotes in Decoded.csv.
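As a rough illustration of why quoting helps, the sketch below reads a small quoted sample with pandas and then flattens the embedded newline. The `csv_text` data is hypothetical and simplified to exactly four columns; the `astype(str)` call is an assumption added because pandas parses Query as an integer, which cannot be concatenated with a string directly.

```python
import io
import pandas as pd

# Hypothetical, shortened sample: Doc is wrapped in double quotes,
# so the embedded commas and newlines stay inside one field.
csv_text = '''Query,Doc,article_id,data_source
5000,"step 1: make a paste, then apply it.
step 2: leave it on overnight.",459,random
5001,"A birth defect of the male urethra.",409,dummy
'''

dataframe = pd.read_csv(io.StringIO(csv_text))
print(dataframe.shape)

# Replace each newline (plus surrounding whitespace) with a single space.
flat_docs = dataframe["Doc"].str.replace(r"\s*\n\s*", " ", regex=True)

# Query is read as an integer column, so convert it before concatenating.
appended = dataframe["Query"].astype(str) + " " + flat_docs
print(appended[0])
```

With the real file, `pd.read_csv("Decoded.csv")` would take the place of the `io.StringIO` wrapper once the Doc values are quoted.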
Answer 1 (score: 0)
Two problems:
So the CSV should look like this:
Query,Doc,article_id,data_source
5000,"how to get rid of serve burn acne,1 Rose water and sandalwood: Make a paste of rose water and sandalwood and gently apply it on your acne scars.
2 Leave the paste on your skin overnight then wash it with cold water the next morning.
3 Do this regularly together with other natural treatments for acne scars to get rid of the scars as quickly as possible.",459,random
5001,"what is hypospadia,A birth defect of the male urethra.",409,dummy
5002,"difference between alimentary canal and accessory organs,The alimentary canal is the tube going from the mouth to the anus. The accessory organs are the organs located along that canal which produce enzymes to aid the digestion process.",461,nytimes
If any of these fields contain double quotes, each one must be escaped with another double quote.
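If the file is generated by a script, the standard csv module can apply this quoting and escaping automatically. This is a minimal sketch with a made-up row (the field values are hypothetical), relying on the writer's default dialect, which quotes fields when needed and doubles embedded quotes.

```python
import csv
import io

# Hypothetical row: the Doc field contains a comma, a newline and double quotes.
header = ["Query", "Doc", "article_id", "data_source"]
row = [5000, 'Apply the "paste", then wait.\nWash it off.', 459, "random"]

buffer = io.StringIO(newline="")
writer = csv.writer(buffer)  # default dialect: quotes when needed, doubles embedded quotes
writer.writerow(header)
writer.writerow(row)
written = buffer.getvalue()
print(written)

# Reading the text back recovers the original field, newline and quotes included.
parsed = list(csv.reader(io.StringIO(written, newline="")))
doc = parsed[1][1]
print(repr(doc))
```

In the written text the Doc field appears wrapped in double quotes, with each inner quote doubled, and the reader turns it back into the original string.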