Question

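I have a lot of files whose names are just numbers (from 1 up to however many there are). Each of these files resembles the others in its "tags" (ObjectID =, X =, Y =, etc.), but the values after those tags are not the same at all.

I wanted to make my job easier by not having to copy/paste the data manually from one file to another, so I wrote a small script in Python (since I have a little experience with it).

Here is the full script: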

import os

# Raw string, so the backslashes in the Windows path are not read as escapes
BASE_DIRECTORY = r'C:\Users\Tom\Desktop\TheServer\scriptfiles\Objects'
output_file = open('output.txt', 'w')
output = {}
file_list = []

for (dirpath, dirnames, filenames) in os.walk(BASE_DIRECTORY):
    for f in filenames:
        if 'txt' in str(f):
            e = os.path.join(str(dirpath), str(f))
            file_list.append(e)

for f in file_list:
    print(f)
    txtfile = open(f, 'r')
    output[f] = []
    for line in txtfile:
        if 'ObjectID =' in line:
            output[f].append(line)
        elif 'X =' in line:
            output[f].append(line)
        elif 'Y =' in line:
            output[f].append(line)
tabs = []
for tab in output:
    tabs.append(tab)

tabs.sort()
for tab in tabs:
    for row in output[tab]:
        output_file.write(row)
output_file.close()

Now, this all works fine, and the output file looks like this:

ObjectID = 1216
X = -1480.500610
Y = 2610.885742
ObjectID = 970
X = -1517.210693
Y = 2522.842285
ObjectID = 3802
X = -1512.156616
Y = 2521.116210
etc.

But I don't want it like that (with each value on a new line). I need it to do this for every file:

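Read the file.
Remove the tags in front of the values.
Format a single line that will hold those values in the output file. (Let's say I want it to look like this: "(1216,-1480.500610,2522.842285)")
Write that line to the output file.
Repeat for every file.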

Any help, please?

Answer 1

In your loop, keep track of whether you are 'in' a record:

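records = []
in_record = False
object_id, x, y = 0, 0, 0  # named object_id because `id` would shadow the built-in
for line in txtfile:
    if not in_record:
        if 'ObjectID =' in line:
            in_record = True
            # Keep only the text after '=', without the space or the newline
            object_id = line.split('=', 1)[1].strip()
    elif 'X =' in line:
        x = line.split('=', 1)[1].strip()
    elif 'Y =' in line:
        # Y is the last tag of a record, so the record is complete here
        y = line.split('=', 1)[1].strip()
        records.append((object_id, x, y))
        in_record = False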

You will then have a list of tuples, which you can easily write out using the csv module.
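As a rough sketch of that last step, assuming `records` holds (ObjectID, X, Y) string tuples like the ones collected in the loop. The `write_records` helper and the in-memory buffer are made up for illustration and are not part of the original answer:

```python
import csv
import io


def write_records(records, stream):
    """Write each (object_id, x, y) tuple as one comma-separated row."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerows(records)


# Sample tuples taken from the output shown in the question
records = [
    ("1216", "-1480.500610", "2610.885742"),
    ("970", "-1517.210693", "2522.842285"),
]

buffer = io.StringIO()  # stands in for a real file so the sketch runs anywhere
write_records(records, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file, something like `open("output.csv", "w", newline="")` could be passed in place of the buffer; the file name is only an example.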

Answer 2

Hope this helps.

data = open('sam.txt', 'r').read()

>>> print(data)
ObjectID = 1216
X = -1480.500610
Y = 2610.885742
ObjectID = 970
X = -1517.210693
Y = 2522.842285
ObjectID = 3802
X = -1512.156616
Y = 2521.116210
>>>

Now let's do some string replacement :)

>>> data = data.replace('ObjectID = ', '').replace('\nX = ', ',').replace('\nY = ', ',')
>>> print(data)
1216,-1480.500610,2610.885742
970,-1517.210693,2522.842285
3802,-1512.156616,2521.116210
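To get the parenthesised form the question asks for, the same replacements could be wrapped in a small function. This is only a minimal sketch: it assumes every record consists of exactly the `ObjectID = `, `X = ` and `Y = ` lines in that order, with single spaces around `=`, and `to_rows` is a hypothetical helper name.

```python
def to_rows(data):
    """Collapse ObjectID/X/Y triples into '(id,x,y)' rows by string replacement."""
    flat = (data.replace("ObjectID = ", "")
                .replace("\nX = ", ",")
                .replace("\nY = ", ","))
    # One record per remaining line; skip blanks and wrap each in parentheses
    return ["(" + row.strip() + ")" for row in flat.splitlines() if row.strip()]


sample = (
    "ObjectID = 1216\nX = -1480.500610\nY = 2610.885742\n"
    "ObjectID = 970\nX = -1517.210693\nY = 2522.842285\n"
)
rows = to_rows(sample)
print("\n".join(rows))
```

Any extra tag between the three lines would end up inside the row unchanged, so this approach only suits files with a fixed layout.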

Answer 3

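This does what you need. I didn't have enough time to write the code that appends the result to a new file, so it just prints it instead, but you get the idea.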

import os.path

path = "path"

# Count the files in your folder
num_files = len([f for f in os.listdir(path)
                if os.path.isfile(os.path.join(path, f))])

# Return your desired output for a given file
def file_head_ext(file_path, file_num):
    with open(os.path.join(file_path, file_num)) as myfile:
        # Split each of the first three lines into [tag, value]
        head = [next(myfile).split("=") for x in range(3)]
        formatted_head = [elm[1].replace("\n",'').replace(" ","") for elm in head]
    return(",".join(formatted_head))


# Files are named 1..num_files, so the upper bound has to be num_files + 1
for filnum in range(1, num_files + 1):
    print(file_head_ext(path, str(filnum)))
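The missing "append to a new file" step might look roughly like the sketch below. It assumes, as the code above does, that the files are named with bare numbers and no extension and that the three tags are the first three lines. The names `first_three_values` and `write_summary`, and the throwaway demo folder, are made up for illustration.

```python
import os
import tempfile


def first_three_values(file_path):
    """Return the values of the first three 'Tag = value' lines, comma-joined."""
    with open(file_path) as handle:
        head = [next(handle).split("=", 1) for _ in range(3)]
    return ",".join(parts[1].strip() for parts in head)


def write_summary(folder, output_path):
    """Append one '(id,x,y)' line per numbered file in `folder` to `output_path`."""
    names = [n for n in os.listdir(folder)
             if n.isdigit() and os.path.isfile(os.path.join(folder, n))]
    with open(output_path, "a") as out:
        for name in sorted(names, key=int):  # numeric order: 1, 2, ..., 10
            out.write("(" + first_three_values(os.path.join(folder, name)) + ")\n")


# Demo on a throwaway folder so the sketch does not touch real data
demo_dir = tempfile.mkdtemp()
for name, body in [("1", "ObjectID = 1216\nX = -1480.500610\nY = 2610.885742\n"),
                   ("2", "ObjectID = 970\nX = -1517.210693\nY = 2522.842285\n")]:
    with open(os.path.join(demo_dir, name), "w") as handle:
        handle.write(body)

summary_path = os.path.join(tempfile.mkdtemp(), "output.txt")
write_summary(demo_dir, summary_path)
with open(summary_path) as handle:
    summary = handle.read()
print(summary)
```

Sorting with `key=int` keeps file 10 after file 9, which a plain string sort would not.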

Answer 4

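Below is a version of your loop that I rewrote so that the ObjectID, X and Y contents end up on the same line.

It looks like this is what you are trying to do: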

for f in file_list:
    print(f)
    values = []
    with open(f, 'r') as txtfile:
        for line in txtfile:
            for tag in ('ObjectID =', 'X =', 'Y ='):
                if tag in line:
                    pos = line.rfind(tag) + len(tag)
                    rest = line[pos:]
                    # Here you set the delimiter after the value. Can be ",".
                    # split() with no argument splits on whitespace and also
                    # drops the leading space and the trailing newline.
                    numbers = rest.split()
                    if len(numbers) > 0:
                        values.append(numbers[0])
                    break

    # One comma-separated row per file: ObjectID,X,Y
    output[f] = [','.join(values) + '\n']

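Note that you need to know which character (the delimiter in the code) separates the name you are looking for, such as ObjectID =, from the actual value you want to grab from the line.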
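As a small illustration of that point, here is a sketch that treats `=` as the delimiter between tag and value. It assumes each tag line holds exactly one `Tag = value` pair; `split_tag` is a hypothetical helper, not part of the code above.

```python
def split_tag(line, delimiter="="):
    """Split 'Tag = value' into (tag, value); value is None without a delimiter."""
    tag, found, value = line.partition(delimiter)
    if not found:
        return line.strip(), None
    return tag.strip(), value.strip()


parsed = [split_tag(text) for text in
          ("ObjectID = 1216\n", "X = -1480.500610\n", "no delimiter here\n")]
print(parsed)
```

Because `partition` splits at the first delimiter only, a value that itself contains `=` would be kept whole.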

Extracting data from a text file to an output file
