Question

Split on \n, a wildcard, and /. I have the following text:

wiring /(cid:3)(cid:9)waərŋ/ noun 1. a network of wires
wisdom tooth /(cid:3)(cid:9)wzdəm tu(cid:11)θ/ noun one of
the four teeth in the back of the jaw
witch  hazel  /(cid:3)(cid:9)wtʃ  (cid:4)hez(ə)l/ noun  a  lotion
made  from  the  bark  of  a  tree

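I want to split it to get the words being defined, so I'd like to split on \n./ (a newline, any character, then a slash). But when I use

  txt.split('\n./')

or

  txt.split('\\n./')

it just returns txt.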

Answer 1

str.split() is not the same as re.split(). In str.split(), . is a plain literal dot, not a wildcard.

s = "I like dogs"
print(s.split('.'))   # Prints ['I like dogs']
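To make the difference concrete, here is a minimal sketch comparing the two calls on a small made-up string (`sample` is a hypothetical value chosen for illustration, not the text from the question):

```python
import re

sample = "a\nb/c"  # hypothetical sample: newline, one character, slash

# str.split treats the separator as literal text: newline, dot, slash
literal_parts = sample.split("\n./")

# re.split treats it as a pattern: newline, any one character, slash
regex_parts = re.split(r"\n./", sample)

print(literal_parts)  # ['a\nb/c']
print(regex_parts)    # ['a', 'c']
```

Note that in the question's text no line has exactly one character before a slash, so even re.split() with this pattern would likely leave that text unsplit; a different pattern is needed.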

To extract only the "words", such as 'wiring', 'wisdom tooth', 'witch hazel', you can use regular expressions:

import re
words = re.findall(r'(.+?)\s*/.*?\n', txt)  # txt is the text from the question

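findall() returns a list containing all matches.

. matches any character except a newline, and + matches one or more of the preceding item. () is a capturing group (the part it matches is "stored"). * means zero or more of the preceding item. \s is a whitespace character. The ? after + and * makes them lazy, so they match as little as possible.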
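As a quick check, the pattern can be run against the text from the question. This is only a sketch; the name `headwords` is chosen here for illustration.

```python
import re

txt = '''wiring /(cid:3)(cid:9)waərŋ/ noun 1. a network of wires
wisdom tooth /(cid:3)(cid:9)wzdəm tu(cid:11)θ/ noun one of
the four teeth in the back of the jaw
witch  hazel  /(cid:3)(cid:9)wtʃ  (cid:4)hez(ə)l/ noun  a  lotion
made  from  the  bark  of  a  tree'''

# Capture everything before the first slash on each line that contains one.
# Lines without a slash (continuation lines) produce no match.
headwords = re.findall(r'(.+?)\s*/.*?\n', txt)
print(headwords)  # ['wiring', 'wisdom tooth', 'witch  hazel']
```

Because the pattern ends in \n, an entry sitting on the very last line with no trailing newline would be missed. Here every entry line is followed by a continuation line, so that case does not arise.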

Answer 2

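Here is another option, although I think a regular expression is the best way.

You can first split on \n, loop over the resulting list, look for lines containing /, and split those on / to take the first item: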

txt = '''wiring /(cid:3)(cid:9)waərŋ/ noun 1. a network of wires
wisdom tooth /(cid:3)(cid:9)wzdəm tu(cid:11)θ/ noun one of
the four teeth in the back of the jaw
witch  hazel  /(cid:3)(cid:9)wtʃ  (cid:4)hez(ə)l/ noun  a  lotion
made  from  the  bark  of  a  tree'''

for line in txt.split('\n'):
    if '/' in line:
        print(line.split('/')[0].strip())

wiring
wisdom tooth
witch  hazel

Or use a list comprehension to do it all at once:

print([line.split('/')[0].strip() for line in txt.split('\n') if '/' in line])

['wiring', 'wisdom tooth', 'witch  hazel']
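The source text contains doubled spaces, so 'witch  hazel' comes back with two spaces inside it. If single spaces are wanted, one possible follow-up step is to collapse whitespace runs. This is a sketch; `raw_words` and `clean_words` are illustrative names.

```python
raw_words = ['wiring', 'wisdom tooth', 'witch  hazel']

# str.split() with no argument splits on any run of whitespace,
# so joining the pieces with one space collapses the doubled spaces.
clean_words = [' '.join(word.split()) for word in raw_words]
print(clean_words)  # ['wiring', 'wisdom tooth', 'witch hazel']
```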

Answer 3

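To answer the question as asked, the .split() method has to be called on an actual variable. When you type txt.split(...), you are splitting the variable txt. So define the text above as a string, then split that string.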

textarray = 'wiring...'
textarray.split('\n./')
