My input is as follows:
MSG1 .STRINGZ "This is my sample string : "
MEMORYSPACE .BLKW 9
NEWLINE .FILL #10
NEG48 .FILl #-48
.END
I currently have code that splits each line of the input file into words:
['MSG1', '.STRINGZ', '"This', 'is', 'a' , 'sample' , 'string','"']
['MEMORYSPACE', '.BLKW', '9']
['NEWLINE', '.FILL', '#10']
['NEG48', '.FILl', '#-48']
[]
['.END']
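The first line of my input file contains a string, and I want the whole string treated as a single element so that I can compute its length in my code. Is there a way to do this? Here is my code: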
f = open('testLC31.txt', 'r')
line_count = 0
to_ignore = ["AND", "ADD", "LEA", "PUTS", "JSR", "LD", "JSRR", "NOT", "LDI",
             "LDR", "ST", "STI", "STR", "BR", "JMP", "TRAP", "JMP", "RTI",
             "BR", "ST", "STI", "STR", "BRz", "BRn", "HALT"]
label = []
instructions = []
for line in f:
    elem = line.split() if line.split() else ['']
    if len(elem) > 1 and elem[0] not in to_ignore:
        label.append(elem[0])
        instructions.append(elem[1])
        line_count += 1
    elif elem[0] in to_ignore:
        line_count += 1
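Answer 0 (score: 1)
The str.split method has an optional maxsplit argument that limits the number of splits performed, so the resulting list has at most maxsplit + 1 elements: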
>>> 'MSG1 .STRINGZ "This is my sample string : "'.split(None, 2)
['MSG1', '.STRINGZ', '"This is my sample string : "']
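As a minimal sketch of how maxsplit could feed the length calculation, assuming every line of interest has the form label, directive, operand (the helper name split_asm_line is made up for illustration, and a line without a label would be split incorrectly by it):

```python
def split_asm_line(line):
    """Split a line into at most three fields: label, directive, operand."""
    return line.split(None, 2)

fields = split_asm_line('MSG1 .STRINGZ "This is my sample string : "')
operand = fields[2]

# Drop the surrounding double quotes, if present, before measuring the string.
if len(operand) >= 2 and operand.startswith('"') and operand.endswith('"'):
    text = operand[1:-1]
else:
    text = operand

string_length = len(text)
print(fields)
print(string_length)
```

Whitespace inside the quotes is preserved because the third field is never split again.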
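If you need something more sophisticated than taking the first two words and keeping the rest, shlex.split may suit you. It splits the string using shell-like syntax, so a quoted string is treated as a single element. You can tune the format precisely by creating a shlex object instance and changing its attributes. See the documentation for details.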
>>> import shlex
>>> shlex.split('MSG1 .STRINGZ "This is my sample string : "')
['MSG1', '.STRINGZ', 'This is my sample string : ']
>>> shlex.split('MSG1 .STRINGZ "This is my sample string : "', posix=False)
['MSG1', '.STRINGZ', '"This is my sample string : "']
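Applied to the whole input, the shlex approach might look like the sketch below. It assumes the program text is available as a string (SOURCE here is just the sample from the question) and that a .STRINGZ line always carries a label; the names tokenize and string_lengths are illustrative only.

```python
import shlex

SOURCE = '''MSG1 .STRINGZ "This is my sample string : "
MEMORYSPACE .BLKW 9
NEWLINE .FILL #10
NEG48 .FILL #-48

.END'''

def tokenize(source):
    """Return one token list per non-blank line, keeping quoted text together."""
    rows = []
    for line in source.splitlines():
        tokens = shlex.split(line)  # POSIX mode: surrounding quotes are removed
        if tokens:
            rows.append(tokens)
    return rows

rows = tokenize(SOURCE)

# Map each .STRINGZ label to the length of its string operand.
string_lengths = {
    row[0]: len(row[2])
    for row in rows
    if len(row) == 3 and row[1].upper() == '.STRINGZ'
}
print(string_lengths)
```

Because shlex.split leaves comment handling off by default, operands such as #10 come through as ordinary tokens.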
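If that is still not enough, the remaining option is to write a full parser for your format, for example with the pyparsing library.
Answer 1 (score: 1)
You could try this crude approach of joining the pieces of the string back together by hand: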
tags = ['MSG1', '.STRINGZ', '"This', 'is', 'a', 'sample', 'string', '"']
FirstOccurance = 0  # 1 while we are inside a quoted string
longtag = ""
for tag in tags:
    if FirstOccurance == 1:
        if tag == "\"":
            # A lone closing quote is appended without a separating space.
            longtag += tag
        else:
            longtag += " " + tag
    if ("\"" in tag) and (FirstOccurance == 0):
        # Opening quote: start collecting.
        longtag += tag
        FirstOccurance = 1
    elif ("\"" in tag) and (FirstOccurance == 1):
        # Closing quote: stop collecting.
        FirstOccurance = 0
print(longtag)
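Hope this helps.
Answer 2 (score: 0)
This works on the assumption that a .STRINGZ directive and its string always sit on a single line.
Result:
"This is my sample string : "
len(stringz_): 32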
text_ = """
MSG1 .STRINGZ "This is my sample string : "
MEMORYSPACE .BLKW 9
NEWLINE .FILL #10
NEG48 .FILl #-48
.END
"""
STRINGZ_ = '.STRINGZ'
line_count_ = 0
lines_ = text_.split('\n')
to_ignore = ["AND", "ADD", "LEA", "PUTS", "JSR", "LD", "JSRR", "NOT", "LDI",
             "LDR", "ST", "STI", "STR", "BR", "JMP", "TRAP", "JMP", "RTI",
             "BR", "ST", "STI", "STR", "BRz", "BRn", "HALT"]
label = []
instructions = []
for line in lines_:
    if STRINGZ_ in line:
        # Everything after the directive is the operand; the quotes and the
        # whitespace in front of them are still included in the length.
        stringz_ = line.split(STRINGZ_)[1]
        print(stringz_)
        print('len(stringz_): ' + str(len(stringz_)))
    elem = line.split() if line.split() else ['']
    if len(elem) > 1 and elem[0] not in to_ignore:
        label.append(elem[0])
        instructions.append(elem[1])
        line_count_ += 1
    elif elem[0] in to_ignore:
        line_count_ += 1
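If the length should cover only the characters between the quotes, one possible variation (a sketch, not part of the answer above) is to cut the line at the directive with str.partition and then strip the whitespace and quotes. The function name stringz_operand is hypothetical, and the sketch assumes the string itself neither starts nor ends with a double quote.

```python
def stringz_operand(line, directive=".STRINGZ"):
    """Return the text between the quotes after the directive, or None."""
    head, sep, tail = line.partition(directive)
    if not sep:
        return None  # the directive does not occur on this line
    # Remove the whitespace around the operand, then the enclosing quotes.
    return tail.strip().strip('"')

operand = stringz_operand('MSG1 .STRINGZ "This is my sample string : "')
print(repr(operand))
print(len(operand))
```

The space before the closing quote survives because the second strip call removes only quote characters.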
Answer 3 (score: 0)
with open("filename") as f:
    rd = f.readlines()
print(rd[0].split("\n")[0].split())
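This splits the first line on \n and then on whitespace, and prints the resulting tokens. readlines() returns a list, which is easier to work with. The with open() form is also preferable, because the file is closed automatically.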
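To process every line instead of only the first, the same idea could be wrapped in a small function. This is only a sketch: read_token_rows is a made-up name, and the temporary file exists solely so the example runs on its own.

```python
import os
import tempfile

def read_token_rows(path):
    """Return one list of whitespace-separated tokens per non-blank line."""
    with open(path) as f:  # the file is closed automatically on exit
        return [line.split() for line in f if line.strip()]

# Demo with a throwaway file standing in for the real input file.
with tempfile.NamedTemporaryFile("w", suffix=".asm", delete=False) as tmp:
    tmp.write("NEWLINE .FILL #10\n\n.END\n")
demo_rows = read_token_rows(tmp.name)
os.remove(tmp.name)
print(demo_rows)
```

Plain split() still breaks a quoted string into several tokens, so this only covers the file-reading part of the problem.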
Answer 4 (score: 0)
A simple assembler? Here is a rough pass using pyparsing:
from pyparsing import Word, alphas, alphanums, Regex, Combine, quotedString, Optional

code = """
MSG1 .STRINGZ "This is my sample string : "
MEMORYSPACE .BLKW 9
NEWLINE .FILL #10
NEG48 .FILL #-48
.END"""

# Grammar: an optional label, a dot-prefixed directive, and an optional argument.
identifier = Word(alphas, alphanums + '_')
command = Word('.', alphanums)
integer = Regex(r'[+-]?\d+')
byte_literal = Combine('#' + integer)
command_arg = quotedString | integer | byte_literal
codeline = Optional(identifier)("label") + command("instruction") + Optional(command_arg("arg"))

for line in code.splitlines():
    line = line.strip()
    if not line:
        continue
    print(line)
    assemline = codeline.parseString(line)
    print(assemline.dump())
    print()
Prints:
MSG1 .STRINGZ "This is my sample string : "
['MSG1', '.STRINGZ', '"This is my sample string : "']
- arg: "This is my sample string : "
- instruction: .STRINGZ
- label: MSG1
MEMORYSPACE .BLKW 9
['MEMORYSPACE', '.BLKW', '9']
- arg: 9
- instruction: .BLKW
- label: MEMORYSPACE
NEWLINE .FILL #10
['NEWLINE', '.FILL', '#10']
- arg: #10
- instruction: .FILL
- label: NEWLINE
NEG48 .FILL #-48
['NEG48', '.FILL', '#-48']
- arg: #-48
- instruction: .FILL
- label: NEG48
.END
['.END']
- instruction: .END