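Splitting text with Python

Time: 2016-12-13 00:58:25

Tags: python text split

I have a Python script that reads a list of text lines and writes the values to four separate files. Is it possible to write a single text file instead, in which each line holds the joint number, x coordinate, y coordinate and z coordinate separated by spaces? I would greatly appreciate any help.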

import os
os.chdir('/Users/JP/DevEnv/ASA')


import re

# regex to extract data line    
r = re.compile(r"\s*(\d+)\s+X=(\S+)\s+Y=(\S+)\s+Z=(\S+)")

a="""SYSTEM

    DOF=UY,UZ,RX  LENGTH=FT  FORCE=Kip

    JOINT
    1  X=0  Y=-132.644  Z=0
    2  X=0  Y=-80  Z=0
    3  X=0  Y=-40  Z=0
    4  X=0  Y=0  Z=0
    5  X=0  Y=40  Z=0
    6  X=0  Y=80  Z=0
    7  X=0  Y=132.644  Z=0""".splitlines().__iter__()

# open all 4 files with a meaningful name
files=[open("file_{}.txt".format(x),"w") for x in ["J","X","Y","Z"]]
for line in a:
    m = r.match(line)
    if m:
        # line matches: write in all 4 files (using zip to avoid doing
        # it one by one)
        for f,v in zip(files,m.groups()):
            f.write(v+"\n")

# close all output files now that it's done
for f in files:
    f.close()

The first line of the output text file should look like this:     1 0 -132.644 0

2 answers:

Answer 0 (score: 1)

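Strip every character except digits, '-', '.' and spaces, split the result into tokens, and group the tokens in fours: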

import re
a="""SYSTEM

    DOF=UY,UZ,RX  LENGTH=FT  FORCE=Kip

    JOINT
    1  X=0  Y=-132.644  Z=0
    2  X=0  Y=-80  Z=0
    3  X=0  Y=-40  Z=0
    4  X=0  Y=0  Z=0
    5  X=0  Y=40  Z=0
    6  X=0  Y=80  Z=0
    7  X=0  Y=132.644  Z=0"""
# remove every character except digits, '-', '.' and ' ' (space), which leaves
# only the values you need, then split them into a flat list
b = re.sub(r'[^\d. -]', '', a).split()
# split the flat list into sublists of four elements each
lines = [b[i:i+4] for i in range(0, len(b), 4)]
for line in lines:
    print(line)

Output:

['1', '0', '-132.644', '0']
['2', '0', '-80', '0']
['3', '0', '-40', '0']
['4', '0', '0', '0']
['5', '0', '40', '0']
['6', '0', '80', '0']
['7', '0', '132.644', '0']

Or write to a file:

with open('yourfilename', 'a') as f:
    for line in lines:
        print(' '.join(line), file=f)

Or:

with open('filename', 'w') as f:
    for line in lines:
        f.write(' '.join(line) + '\n')
    # or, instead of the loop above:
    # f.writelines(' '.join(line) + '\n' for line in lines)
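
Note that grouping the tokens in fours only works if the header lines contribute no digits, dots or hyphens of their own. Below is a minimal sketch that makes this assumption explicit by checking the token count. The function name `joint_rows`, the shortened sample text and the error message are illustrative and not part of the answer above.

```python
import re


def joint_rows(text):
    """Return rows of [joint, x, y, z] strings from the raw text.

    Assumption: only the joint lines contain digits, '.', or '-'.
    """
    # Keep digits, '.', '-' and spaces; everything else (including newlines) is dropped.
    tokens = re.sub(r'[^\d. -]', '', text).split()
    if len(tokens) % 4:
        # A stray number in a header line would shift every following row.
        raise ValueError("token count is not a multiple of 4")
    return [tokens[i:i + 4] for i in range(0, len(tokens), 4)]


sample = """JOINT
    1  X=0  Y=-132.644  Z=0
    2  X=0  Y=-80  Z=0"""

for row in joint_rows(sample):
    print(' '.join(row))
# prints:
# 1 0 -132.644 0
# 2 0 -80 0
```

The check only catches a miscount; a header that happens to contain exactly four stray numbers would still slip through, so matching line by line is the safer choice for less regular input.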

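Answer 1 (score: 0)

It looks like you want to drop everything except the numbers and the spaces. Here are two solutions, one that does not use a regular expression and one that does.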

a="""SYSTEM

    DOF=UY,UZ,RX  LENGTH=FT  FORCE=Kip

    JOINT
    1  X=0  Y=-132.644  Z=0
    2  X=0  Y=-80  Z=0
    3  X=0  Y=-40  Z=0
    4  X=0  Y=0  Z=0
    5  X=0  Y=40  Z=0
    6  X=0  Y=80  Z=0
    7  X=0  Y=132.644  Z=0""".splitlines()

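# Solution 1, no regex: keep only the data lines (those starting with a digit),
# then drop every character that is not a digit, '-', ' ' or '.'
for line in a:
    line = line.strip()
    if line and line[0].isdecimal():
        print((''.join(c for c in line if c.isdecimal() or c in '- .')
                .replace('  ', ' ')))

print()

# Solution 2, regex: capture the joint number and the X, Y and Z values
import re
r = re.compile(r"\s*(\d+)\s+X=(\S+)\s+Y=(\S+)\s+Z=(\S+)")
for line in a:
    m = r.match(line)
    if m:
        print(' '.join(m.groups()))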

This prints the following twice:

1 0 -132.644 0
2 0 -80 0
3 0 -40 0
4 0 0 0
5 0 40 0
6 0 80 0
7 0 132.644 0
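
To get these lines into a single file rather than onto the screen, the regex solution can be wrapped in a small helper. This is only a sketch: the names `JOINT_RE`, `format_joints` and `write_joints`, the shortened sample text and the file name `joints.txt` are assumptions made for illustration, and the temporary directory is used only so that the example cleans up after itself.

```python
import os
import re
import tempfile

# Same pattern as in the question: joint number, then the X=, Y= and Z= values.
JOINT_RE = re.compile(r"\s*(\d+)\s+X=(\S+)\s+Y=(\S+)\s+Z=(\S+)")


def format_joints(text):
    """Return one 'joint x y z' string for each line that matches JOINT_RE."""
    rows = []
    for line in text.splitlines():
        m = JOINT_RE.match(line)
        if m:
            rows.append(' '.join(m.groups()))
    return rows


def write_joints(text, path):
    """Write the formatted joint lines to a single file, one per line."""
    with open(path, 'w') as f:
        for row in format_joints(text):
            f.write(row + '\n')


sample = """SYSTEM
    JOINT
    1  X=0  Y=-132.644  Z=0
    2  X=0  Y=-80  Z=0"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'joints.txt')  # placeholder file name
    write_joints(sample, path)
    with open(path) as f:
        print(f.read(), end='')
# prints:
# 1 0 -132.644 0
# 2 0 -80 0
```

In a real script you would pass your own output path to `write_joints` instead of using a temporary directory.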