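Text processing - split the first column into two columns based on a character

Time: 2019-06-04 11:13:09

Tags: shell awk sed text-processing cut

Split the first column of a file into two columns based on a character. The data inside the parentheses () should be moved to a new column, and the parentheses themselves removed.

Given this CSV file: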

Col1(col2),col3,col4,col5
a(23),12,test(1),test2
b(30),15,test1(2),test3

Expected file:

Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

I tried the code below. I cannot extract the data between the parentheses, and the "()" still shows up in the output every time.

awk -F"(" '$1=$1' OFS="," filename

3 answers:

Answer 0: (score: 2)

Some options:

$ sed 's/(\([^)]*\))/,\1/' file
Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

$ sed 's/(/,/; s/)//' file
Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

$ awk '{sub(/\(/,","); sub(/\)/,"")} 1' file
Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

$ awk 'match($0,/\([^)]*\)/){$0= substr($0,1,RSTART-1) "," substr($0,RSTART+1,RLENGTH-2) substr($0,RSTART+RLENGTH) } 1' file
Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

$ awk 'BEGIN{FS=OFS=","} split($1,a,/[()]/) > 1{$1=a[1] "," a[2]} 1' file
Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

$ gawk '{$0=gensub(/\(([^)]*)\)/,",\\1",1)} 1' file
Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

$ gawk 'match($0,/([^(]*)\(([^)]*)\)(.*)/,a){$0=a[1] "," a[2] a[3]} 1' file
Col1,col2,col3,col4,col5
a,23,12,test(1),test2
b,30,15,test1(2),test3

The last 2 require GNU awk: the first for gensub(), the second for the third argument to match(). There are other options too.
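One difference between these options is easy to miss: most of them rewrite the first "(...)" found anywhere on the line, while the split($1,...) variant only looks inside the first field. A minimal sketch of that difference, using a hypothetical row (not part of the question's data) whose first column has no parentheses; the function names are illustrative only:

```shell
#!/bin/sh
# Line-wide: rewrites the first "(...)" found anywhere on the line.
whole_line() {
    sed 's/(\([^)]*\))/,\1/'
}

# Field-restricted: only looks for parentheses inside column 1.
first_field_only() {
    awk 'BEGIN{FS=OFS=","} split($1,a,/[()]/) > 1{$1=a[1] "," a[2]} 1'
}

# Hypothetical row (assumption for illustration): column 1 has no parentheses.
row='c,15,test1(2),test3'

printf '%s\n' "$row" | whole_line        # prints c,15,test1,2,test3
printf '%s\n' "$row" | first_field_only  # prints c,15,test1(2),test3
```

If every row is known to have parentheses in the first column, as in the sample data, all of the options give the same result.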

Answer 1: (score: 0)

For completeness...

POSIX shell

while IFS= read -r line; do
    car=${line%%)*}
    caar=${car%%(*}
    cdar=${car##*(}
    cdr=${line#*)}
    printf '%s\n' "$caar,$cdar$cdr"
done < file

I don't think you can solve this with cut alone.
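To see what each parameter expansion in the loop above contributes, this sketch runs them on a single sample row and prints the intermediate values. The variable names come from the loop; the explain_row wrapper is a hypothetical helper added for illustration:

```shell
#!/bin/sh
# Show each intermediate value the loop computes for one row.
explain_row() {
    line=$1
    car=${line%%)*}    # drop the longest suffix that starts at a ")"
    caar=${car%%(*}    # keep the text before the first "("
    cdar=${car##*(}    # keep the text after the last "(" in car
    cdr=${line#*)}     # drop the shortest prefix that ends at a ")"
    printf 'car=%s caar=%s cdar=%s cdr=%s\n' "$car" "$caar" "$cdar" "$cdr"
}

explain_row 'a(23),12,test(1),test2'
# prints car=a(23 caar=a cdar=23 cdr=,12,test(1),test2
```

The loop assumes every row contains a parenthesized group. On a row without one, none of the patterns match, all four variables end up holding the whole line, and the output is garbled.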

Answer 2: (score: 0)

Could you please try one more sed solution, sed 's/\([^(]*\)(\([^)]*\))\(.*\)/\1,\2\3/' Input_file; it may help you.
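The three capture groups in that command may be easier to follow with extended regular expressions. The sketch below is an equivalent form, assuming a sed that accepts -E (GNU and BSD sed do); the function name split_first_col and the added ^ and $ anchors are not part of the original answer:

```shell
#!/bin/sh
# \1 = text before the first "(", \2 = text inside the parentheses,
# \3 = the rest of the line, which is copied through unchanged.
split_first_col() {
    sed -E 's/^([^(]*)\(([^)]*)\)(.*)$/\1,\2\3/'
}

printf '%s\n' 'Col1(col2),col3,col4,col5' 'a(23),12,test(1),test2' | split_first_col
# prints:
# Col1,col2,col3,col4,col5
# a,23,12,test(1),test2
```

Lines that contain no parentheses do not match the pattern and are printed unchanged.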

# Python Script to Call Ruby
import os
import subprocess
from subprocess import PIPE, STDOUT

slave_process = None
var = None
read = False
command = "ruby"

def slaveRead():  # Read the output from the child process.
    global var
    # communicate() waits for the child to exit and returns (stdout, stderr).
    # stderr is None here because it is merged into stdout via STDOUT.
    out, _ = slave_process.communicate()
    var = out.decode('utf-8')
    print(var)

def runCommand():  # Create the child process.
    global slave_process
    slave_process = subprocess.Popen(
        ['ruby', 'C:/_ddm/TEST_SUITE/test_exec/XXX.rb'],
        shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT)

rootpath = "C:/_ddm/TEST_SUITE"
os.environ['TEST_SUITE_DIR'] = rootpath
os.environ['RUBYLIB'] = rootpath
filepath = "C:/_ddm/TEST_SUITE/XXX.bat"

print('launching slave process...')

runCommand()
# communicate() collects all output until the child exits, so one call is enough.
slaveRead()