我使用Python来解析来自以下csv
文件的数据 -
{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\nTry Again,1,0,12
{::[name]str1_str2_str3[0]},2,U0.01 - Sensor2 Not Ready\nTry Again,1,0,12
{::[name]str1_str2_str3[0]},3,U0.02 - \n,1,0,12
{::[name]str1_str2_str3[0]},4,U0.03 - Sensor4 Not Ready\nTry Again,1,0,12
{::[name]str1_str2_str3[0]},5,U0.04 - \n,1,0,12
在column1中,我正在解析0
中的值[ ]
。然后是column2和column3中的值,我正在解析子字符串“Sensor1 Not Ready
”,然后按如下方式打印到另一个文件 -
SENSOR1_NOT_READY 0,1
SENSOR2_NOT_READY 0,2
依旧......
现在,当我打印解析后的值时,我得到以下内容 -
SENSOR1_NOT_READY 0,1
SENSOR2_NOT_READY 0,2
SENSOR2_NOT_READY 0,3
SENSOR4_NOT_READY 0,4
SENSOR4_NOT_READY 0,5
我想跳过第3列中没有数据的行(例如 - 3
文件中的行5
和csv
。我该怎么做?
预期产出 -
SENSOR1_NOT_READY 0,1
SENSOR2_NOT_READY 0,2
SENSOR4_NOT_READY 0,4
以下是我的Python脚本 -
with open('filename.csv','rb') as csvfile:
reader = csv.reader(csvfile)
for row in reader:
tag_name = row[0]
bit_num = row[1]
error_name = row[2]
# Regular expressions
term0 = '\[(\d)\].*'
term1 = '(\d+)'
term2 = r'.*-\s([\w\s]+)\\n'
capture0 = list(re.search(term0, tag_name).groups())
capture1 = list(re.search(term1, bit_num).groups())
temp = re.search(term2, error_name)
if temp:
result = list(temp.groups())
else:
None
result[-1] = '_'.join(result[-1].split()).upper()
capture2 = ','.join(result)
tp = (capture0[0], capture1[0], capture2) # Tuple
f.write('{2} {0},{1},\n'.format(tp[0], tp[1], tp[2]))
答案 0 :(得分:0)
构建一个搜索“正常”行的正则表达式。也许像"^U0.0[1-5] - \n$"
这样的东西?然后在打印错误之前使用类似if not re.search(x):
的内容。
with open('filename.csv','rb') as csvfile:
reader = csv.reader(csvfile)
for row in reader:
tag_name = row[0]
bit_num = row[1]
error_name = row[2]
# Regular expressions
term0 = '\[(\d)\].*'
term1 = '(\d+)'
term2 = r'.*-\s([\w\s]+)\\n'
term3 = '^U0.0[1-5] - \n$'
capture0 = list(re.search(term0, tag_name).groups())
capture1 = list(re.search(term1, bit_num).groups())
temp = re.search(term2, error_name)
if temp:
result = list(temp.groups())
else:
None
result[-1] = '_'.join(result[-1].split()).upper()
capture2 = ','.join(result)
tp = (capture0[0], capture1[0], capture2) # Tuple
if not re.search(temp3,error_name):
f.write('{2} {0},{1},\n'.format(tp[0], tp[1], tp[2])) #I assume this is the print line?