Hello everyone!
I have a number of txt files and a script that concatenates them. Every txt file starts with:
Export Type: by LAI\GCI\SAI
LAI\GCI\SAI: fjdfkj
HLR NUMBER: NA
Routing Category: NA
Telephone Service: NA
Export User Scope: Attached & Detached User
Task Name: lfl;sfd
Data Type: col1/col2
Begin Time of Exporting data: 2019-4-14 19:41
=================================
col1 col2
401885464645645 54634565754
401884645645564 54545454564
401087465836453 54545454565
401885645656567 53434343435
401084569498484 54342340788
401088465836453 56767686334
401439569345656 64545467558
401012993933334 55645342352
401034545566463 34353463464
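I only want to merge the rows that follow the col1 and col2 line (without the column names), but the script also merges the header text at the top of each file. Could you update this script?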
import fileinput
import glob
file_list = glob.glob("*.txt")
with open('resultfile.txt', 'w') as file:
    input_lines = fileinput.input(file_list)
    file.writelines(input_lines)
A second question: I would also like to strip the leading 5 from the values in col2, and drop every row that starts with 40108/40188 / 401088e. Thanks!
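Answer 0 (score: 0)
Import selectively by telling read_csv which line holds the column names. Everything above that line is skipped, so only the data under col1/col2 ends up in each dataframe. The dataframes can then be concatenated and written back out as CSV.
Given the tags on the question, I am assuming you want to do this with pandas.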
import glob
import pandas as pd

# Raw string so the backslashes in LAI\GCI\SAI are kept literally.
csvdata = r"""Export Type: by LAI\GCI\SAI
LAI\GCI\SAI: fjdfkj
HLR NUMBER: NA
Routing Category: NA
Telephone Service: NA
Export User Scope: Attached & Detached User
Task Name: lfl;sfd
Data Type: col1/col2
Begin Time of Exporting data: 2019-4-14 19:41
=================================
col1 col2
401885464645645 54634565754
401884645645564 54545454564
401087465836453 54545454565
401885645656567 53434343435
401084569498484 54342340788
401088465836453 56767686334
401439569345656 64545467558
401012993933334 55645342352
401034545566463 34353463464"""

# Write three identical sample files to work with.
files = ["file{}.txt".format(i) for i in range(3)]
for fn in files:
    with open(fn, "w") as f:
        f.write(csvdata)

file_list = glob.glob("file*.txt")
dfs = []
for f in file_list:
    # Line 10 (0-based) holds the column names; the metadata above it is skipped.
    df = pd.read_csv(f, sep=r"\s+", header=10)
    dfs.append(df)

df = pd.concat(dfs)
# Keeps each row's position within its source file as an "index" column.
df.reset_index(inplace=True)
df.to_csv("resultfile.txt")
This produces:
,index,col1,col2
0,0,401885464645645,54634565754
1,1,401884645645564,54545454564
2,2,401087465836453,54545454565
3,3,401885645656567,53434343435
4,4,401084569498484,54342340788
5,5,401088465836453,56767686334
6,6,401439569345656,64545467558
7,7,401012993933334,55645342352
8,8,401034545566463,34353463464
9,0,401885464645645,54634565754
10,1,401884645645564,54545454564
11,2,401087465836453,54545454565
12,3,401885645656567,53434343435
...snip...
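For the second part of the question (stripping the leading 5 from col2 and dropping rows by their col1 prefix), here is a minimal plain-Python sketch that does not need pandas. It assumes every file has exactly one "col1 col2" line and that all data rows come after it. The names clean_rows, merge_files and DROP_PREFIXES are made up for illustration. I am also reading "401088e" as a typo for 401088, which the 40108 prefix already covers. The output contains no column names and no index column.

```python
import glob

# Assumption: prefixes taken from the question; "401088" is already covered by "40108".
DROP_PREFIXES = ("40108", "40188", "401088")


def clean_rows(lines, header="col1 col2"):
    """Yield cleaned 'col1 col2' rows that appear after the header line."""
    in_data = False
    for line in lines:
        fields = line.split()
        if not in_data:
            # Skip the metadata block, up to and including the column-name line.
            in_data = fields == header.split()
            continue
        if len(fields) != 2:
            continue  # ignore blank or malformed lines
        col1, col2 = fields
        if col1.startswith(DROP_PREFIXES):
            continue  # drop rows whose col1 starts with an unwanted prefix
        if col2.startswith("5"):
            col2 = col2[1:]  # strip a single leading 5
        yield f"{col1} {col2}\n"


def merge_files(pattern, out_path):
    """Merge the cleaned rows of every file matching pattern into out_path."""
    # Build the input list first and leave out the output file itself,
    # so an existing result file is never read back in as input.
    paths = [p for p in sorted(glob.glob(pattern)) if p != out_path]
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as src:
                out.writelines(clean_rows(src))
```

Calling merge_files("*.txt", "resultfile.txt") would then replace the original fileinput script. With the sample data from the question, only the last three rows survive the prefix filter, and 55645342352 becomes 5645342352.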