Question

What is the Python equivalent of Perl's chomp function, which removes the last character of a string if it is a newline?

Answer 1

Try the rstrip() method (see the Python 2 and Python 3 documentation):

>>> 'test string\n'.rstrip()
'test string'

Python's rstrip() method strips all kinds of trailing whitespace by default, not just one newline as Perl does with chomp.

>>> 'test string \n \r\n\n\r \n\n'.rstrip()
'test string'

To strip only newlines:

>>> 'test string \n \r\n\n\r \n\n'.rstrip('\n')
'test string \n \r\n\n\r '

There are also the methods lstrip() and strip():

>>> s = "   \n\r\n  \n  abc   def \n\r\n  \n  "
>>> s.strip()
'abc   def'
>>> s.lstrip()
'abc   def \n\r\n  \n  '
>>> s.rstrip()
'   \n\r\n  \n  abc   def'

Answer 2

I would say the "pythonic" way to get lines without trailing newline characters is splitlines().

>>> text = "line 1\nline 2\r\nline 3\nline 4"
>>> text.splitlines()
['line 1', 'line 2', 'line 3', 'line 4']
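
As a rough illustration (an addition, not part of the original answer), the sketch below shows that splitlines() treats '\n', '\r\n' and '\r' alike, and that passing keepends=True keeps the terminators attached. The sample text is made up for the example.

```python
# Hypothetical sample text mixing Unix, Windows and old Mac line endings.
text = "line 1\nline 2\r\nline 3\rline 4"

# Without arguments the line terminators are dropped.
lines = text.splitlines()

# keepends=True keeps each terminator attached to its line.
lines_with_ends = text.splitlines(keepends=True)

print(lines)
print(lines_with_ends)
```

The keepends form is handy when only the terminator of the last line should be removed and the others must survive.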

Answer 3

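The canonical way to strip end-of-line (EOL) characters is to use the string rstrip() method, removing any trailing \r or \n. Here are examples for Mac, Windows, and Unix EOL characters.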

>>> 'Mac EOL\r'.rstrip('\r\n')
'Mac EOL'
>>> 'Windows EOL\r\n'.rstrip('\r\n')
'Windows EOL'
>>> 'Unix EOL\n'.rstrip('\r\n')
'Unix EOL'

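Using '\r\n' as the parameter to rstrip means that it will strip out any trailing combination of '\r' or '\n'. That's why it works in all three cases above.

This nuance matters in rare cases. For example, I once had to process a text file that contained an HL7 message. The HL7 standard requires a trailing '\r' as its EOL character. The Windows machine on which I was using this message had appended its own '\r\n' EOL character, so the end of each line looked like '\r\r\n'. Using rstrip('\r\n') would have taken off the entire '\r\r\n', which is not what I wanted. In that case, I simply sliced off the last two characters instead.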
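
To make that case concrete, here is a small sketch. The line content is invented for illustration; the only point is that rstrip('\r\n') removes the whole run of trailing '\r' and '\n' characters, while slicing removes a fixed number of characters.

```python
# Hypothetical line that ends with a '\r' terminator of its own,
# followed by the '\r\n' appended on Windows.
line = "MSH|demo segment\r\r\n"

# rstrip removes every trailing '\r' and '\n', including the first '\r'.
stripped = line.rstrip("\r\n")

# Slicing drops only the two-character Windows EOL and keeps the first '\r'.
sliced = line[:-2] if line.endswith("\r\n") else line

print(repr(stripped))
print(repr(sliced))
```

The endswith guard is an extra precaution of this sketch, so that a line without the Windows EOL is left untouched.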

Note that unlike Perl's chomp function, rstrip strips all of the specified characters at the end of the string, not just one:

>>> "Hello\n\n\n".rstrip("\n")
'Hello'

Answer 4

Note that rstrip doesn't act exactly like Perl's chomp() because it doesn't modify the string. That is, in Perl:

$x="a\n";

chomp $x

results in $x being "a".

But in Python:

x="a\n"

x.rstrip()

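means that the value of x is still "a\n". Even x=x.rstrip() doesn't always give the same result, because it strips all whitespace from the end of the string, not just one newline at most.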
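
A minimal sketch of that difference, using made-up variable names: the call returns a new string, and the name has to be rebound to see the change.

```python
x = "a\n"

# rstrip returns a new string; the original object is left untouched.
result = x.rstrip("\n")
unchanged = x

print(repr(unchanged))  # still ends with the newline
print(repr(result))

# Rebinding the name is the usual way to "modify" the variable.
x = x.rstrip("\n")
print(repr(x))
```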

Answer 5

I might use something like this:

import os
s = s.rstrip(os.linesep)

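I think the problem with rstrip("\n") is that you'll probably want to make sure the line separator is portable. (Some antiquated systems are rumored to use "\r\n".) The other gotcha is that rstrip will strip out repeated whitespace. Hopefully os.linesep will contain the right characters. The above works for me.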

Answer 6

You can use line = line.rstrip('\n'). This strips all newlines from the end of the string, not just one.

Answer 7

s = s.rstrip()

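will remove all trailing whitespace, including every newline, at the end of the string s. The assignment is needed because rstrip returns a new string instead of modifying the original string.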

Answer 8

This replicates Perl's chomp exactly (minus its behavior on arrays) for "\n" line terminators:

def chomp(x):
    if x.endswith("\r\n"): return x[:-2]
    if x.endswith("\n") or x.endswith("\r"): return x[:-1]
    return x

(Note: it does not modify the string 'in place'; it does not strip extra trailing whitespace; it takes \r\n into account.)
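
For readers on Python 3.9 or later, the same idea might be written with str.removesuffix. This is only a sketch under that version assumption, and the name chomp_once is an illustrative label, not something from the answer above.

```python
def chomp_once(s):
    # Remove at most one trailing line ending: CRLF first, then LF, then CR.
    for ending in ("\r\n", "\n", "\r"):
        if s.endswith(ending):
            # removesuffix (Python 3.9+) strips the suffix exactly once.
            return s.removesuffix(ending)
    return s


print(repr(chomp_once("a\r\n")))
print(repr(chomp_once("a\n\n")))
print(repr(chomp_once("a")))
```

Checking "\r\n" before the single-character endings matters; otherwise a Windows line ending would lose only its "\n".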

Answer 9

You can use strip:

line = line.strip()

Demo:

>>> "\n\n hello world \n\n".strip()
'hello world'

Answer 10

"line 1\nline 2\r\n...".replace('\n', '').replace('\r', '')
# result: 'line 1line 2...'

Or you could always use regular expressions :)

Have fun!

Answer 11

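Careful with "foo".rstrip(os.linesep): that will only chomp the newline characters of the platform where your Python is being executed. Imagine you're chomping the lines of a Windows file under Linux, for instance: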

$ python
Python 2.7.1 (r271:86832, Mar 18 2011, 09:09:48) 
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys
>>> sys.platform
'linux2'
>>> "foo\r\n".rstrip(os.linesep)
'foo\r'
>>>

Use "foo".rstrip("\r\n") instead, as Mike says above.
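
The platform dependence can be reproduced on any machine by spelling out the two separator values instead of reading os.linesep. A small sketch, with variable names chosen only for this example:

```python
windows_line = "foo\r\n"

unix_linesep = "\n"        # what os.linesep holds on Linux and macOS
windows_linesep = "\r\n"   # what os.linesep holds on Windows

# With the Unix separator only the '\n' is removed; the '\r' is left behind.
on_unix = windows_line.rstrip(unix_linesep)

# With the Windows separator both characters are in the strip set.
on_windows = windows_line.rstrip(windows_linesep)

print(repr(on_unix))
print(repr(on_windows))
```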

Answer 12

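An example in Python's documentation simply uses line.strip().

Perl's chomp function removes one linebreak sequence from the end of a string, and only if it is actually there.

Here is how I plan to do that in Python, if process is conceptually the function that I need in order to do something useful with each line from this file: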

import os
sep_pos = -len(os.linesep)
with open("file.txt") as f:
    for line in f:
        if line[sep_pos:] == os.linesep:
            line = line[:sep_pos]
        process(line)

Answer 13

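rstrip doesn't do the same thing as chomp, on many levels. Read http://perldoc.perl.org/functions/chomp.html and see that chomp is very complex indeed.

However, my main point is that chomp removes at most one line ending, whereas rstrip will remove as many as it can.

Here you can see rstrip removing all the newlines: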

>>> 'foo\n\n'.rstrip(os.linesep)
'foo'

A much closer approximation of typical Perl chomp usage can be achieved with re.sub, like this:

>>> re.sub(os.linesep + r'\Z','','foo\n\n')
'foo\n'

Answer 14

I don't program in Python, but I came across an FAQ at python.org advocating S.rstrip("\r\n") for Python 2.2 or later.

Answer 15

import re

r_unwanted = re.compile("[\n\t\r]")
r_unwanted.sub("", your_text)

Answer 16

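If your question is how to clean up all the line breaks in a multi-line str object (oldstr), you can split it into a list on the delimiter '\n' and then join this list into a new str (newstr).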

newstr = "".join(oldstr.split('\n'))

Answer 17

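A workaround for a special case:

If the newline character is the last character (as is the case with most file inputs), then for any element in the collection you can index as follows: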

foobar= foobar[:-1]

to slice off your newline character.
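
Since the slice removes the last character whatever it is, a guarded variant may be safer when some strings lack the newline. A minimal sketch; the helper name drop_last_newline is hypothetical.

```python
def drop_last_newline(s):
    # Slice only when the last character really is a newline.
    return s[:-1] if s.endswith("\n") else s


with_newline = drop_last_newline("foobar\n")
without_newline = drop_last_newline("foobar")
unguarded = "foobar"[:-1]  # the plain slice eats a real character here

print(repr(with_newline), repr(without_newline), repr(unguarded))
```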

Answer 18

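I find it convenient to be able to get the chomped lines via an iterator, parallel to the way you can get the un-chomped lines from a file object. You can do so with the following code: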

import operator

def chomped_lines(it):
    return map(operator.methodcaller('rstrip', '\r\n'), it)

Sample usage:

with open("file.txt") as infile:
    for line in chomped_lines(infile):
        process(line)

Answer 19

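It looks like there is no perfect analog for Perl's chomp. In particular, rstrip cannot handle multi-character newline delimiters like \r\n. However, splitlines does, as pointed out here. Following my answer on a different question, you can combine join and splitlines to remove/replace all newlines from a string s: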

''.join(s.splitlines())

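The following removes exactly one trailing newline (as chomp would, I believe). Passing True as the keepends argument to splitlines retains the delimiters. Then splitlines is called again to remove the delimiter on just the last "line":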

def chomp(s):
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''
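
One thing to keep in mind with this approach, sketched below as an addition: str.splitlines recognizes more line boundaries than '\n' and '\r\n', for example the form feed and the Unicode line separator, so a splitlines-based chomp would remove those as well. The sample strings are made up.

```python
# Form feed and the Unicode line separator count as line boundaries.
form_feed_split = "abc\x0c".splitlines()
line_sep_split = "a\u2028b".splitlines()

# str.rstrip("\r\n"), by contrast, leaves a trailing form feed alone.
rstrip_result = "abc\x0c".rstrip("\r\n")

print(form_feed_split)
print(line_sep_split)
print(repr(rstrip_result))
```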

Answer 20

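I posted my regular-expression-based answer earlier in the comments of another answer. I think using re is a clearer, more explicit solution to this problem than str.rstrip.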

>>> import re

If you want to remove one or more trailing newline chars:

>>> re.sub(r'[\n\r]+$', '', '\nx\r\n')
'\nx'

If you want to remove newline chars everywhere (not just trailing):

>>> re.sub(r'[\n\r]+', '', '\nx\r\n')
'x'

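If you want to remove only 1-2 trailing newline chars (i.e., \r, \n, \r\n, \n\r, \r\r, \n\n):

>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r\n')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n')
'\nx'

I have a feeling that what most people really want here is to remove just one occurrence of a trailing newline, either \r\n or \n, and nothing more.

>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n\n', count=1)
'\nx\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n\r\n', count=1)
'\nx\r\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n', count=1)
'\nx'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n', count=1)
'\nx'

(The ?: creates a non-capturing group.)

(By the way, this is not what '...'.rstrip('\n').rstrip('\r') does, which may not be clear to others stumbling upon this thread. str.rstrip strips as many of the trailing characters as possible, so a string like foo\n\n\n would be reduced to foo, whereas you may have wanted to preserve the other newlines after stripping a single trailing one.)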
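
A side note on the anchors, offered as a sketch rather than as part of the original answer: in Python's re module, $ also matches just before a newline at the very end of the string, while \Z matches only at the very end. The helper name chomp_re is hypothetical.

```python
import re

# '$' matches at the end of the string and also just before a final '\n'.
dollar_match = re.search(r"x$", "x\n")

# '\Z' matches only at the very end of the string.
z_match = re.search(r"x\Z", "x\n")


def chomp_re(s):
    # Remove exactly one trailing '\r\n', '\n' or '\r', anchored with \Z.
    return re.sub(r"(?:\r\n|\n|\r)\Z", "", s, count=1)


print(dollar_match is not None, z_match is not None)
print(repr(chomp_re("foo\r\n\r\n")))
```

Anchoring with \Z makes it explicit that only the final line ending is touched.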

Answer 21

Just use:

line = line.rstrip("\n")

or

line = line.strip("\n")

You don't need any of this complicated stuff.

Answer 23

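We normally encounter three types of line endings: \n, \r and \r\n. A rather simple regular expression in re.sub, namely r"\r?\n?$", is able to catch them all.

(And we gotta catch 'em all, am I right?)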

import re

re.sub(r"\r?\n?$", "", the_text, 1)

With the last argument, we limit the number of replacements to one, mimicking chomp to some extent. Example:

import re

text_1 = "hellothere\n\n\n"
text_2 = "hellothere\n\n\r"
text_3 = "hellothere\n\n\r\n"

a = re.sub(r"\r?\n?$", "", text_1, 1)
b = re.sub(r"\r?\n?$", "", text_2, 1)
c = re.sub(r"\r?\n?$", "", text_3, 1)

... where a == b == c is True.

Answer 24

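If you are concerned about speed (say you have a very long list of strings) and you know the nature of the newline char, string slicing is actually faster than rstrip. A little test to illustrate this: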

import time

loops = 50000000

def method1(loops=loops):
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):
        out_string = test_string[:-1]
    t1 = time.time()
    print('Method 1: ' + str(t1 - t0))

def method2(loops=loops):
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):
        out_string = test_string.rstrip()
    t1 = time.time()
    print('Method 2: ' + str(t1 - t0))

method1()
method2()

Output:

Method 1: 3.92700004578
Method 2: 6.73000001907
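
The same comparison can be sketched with the standard timeit module, which handles the loop and the clock. The iteration count here is an arbitrary, much smaller number than in the test above, and the timings will differ by machine and Python version.

```python
import timeit

test_string = "num\n"
number = 100_000  # far fewer iterations than the original test, to keep it quick

# timeit calls each function `number` times and returns the total seconds.
slice_seconds = timeit.timeit(lambda: test_string[:-1], number=number)
rstrip_seconds = timeit.timeit(lambda: test_string.rstrip(), number=number)

print("slice :", slice_seconds)
print("rstrip:", rstrip_seconds)
```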

Answer 25

This works on both Windows and Linux (though re.sub is a bit expensive if this is all you need it for):

import re 
if re.search("(\\r|)\\n$", line):
    line = re.sub("(\\r|)\\n$", "", line)

Answer 26

s = '''Hello  World \t\n\r\tHi There'''
# import the string module
import string
# use the translate method to delete every whitespace character
s.translate({ord(c): None for c in string.whitespace})
>>'HelloWorldHiThere'

Using regular expressions

s = '''  Hello  World 
\t\n\r\tHi '''
print(re.sub(r"\s+", "", s), sep='')  # \s matches all white spaces
>HelloWorldHi

Replacing \n, \t, \r

s.replace('\n', '').replace('\t','').replace('\r','')
>'  Hello  World Hi '

Using regular expressions

s = '''Hello  World \t\n\r\tHi There'''
regex = re.compile(r'[\n\r\t]')
regex.sub("", s)
>'Hello  World Hi There'

Using join

s = '''Hello  World \t\n\r\tHi There'''
' '.join(s.split())
>'Hello World Hi There'

Answer 27

First split the lines, then join them with any separator you like.

  x = ' '.join(x.splitlines())

Should work like a charm.

Answer 28

A catch-all:

line = line.rstrip('\r\n')
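
Keep in mind that str.rstrip takes a set of characters, not a pattern, so a '|' placed between '\r' and '\n' in the argument would itself be stripped. A short sketch with a made-up input:

```python
line = "a|b|\n"

# '|' is treated as one more character to strip, not as "or".
with_pipe = line.rstrip("\r|\n")

# Listing only '\r' and '\n' leaves the trailing '|' in place.
without_pipe = line.rstrip("\r\n")

print(repr(with_pipe))
print(repr(without_pipe))
```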

How can I remove a trailing newline in Python?

28 answers: