Truncate a long string in Python

Time: 2010-05-20 09:37:04

Tags: python

How do I truncate a string to 75 characters in Python?

This is how it is done in JavaScript:

var data="saddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddsaddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddsadddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd"
var info = (data.length > 75) ? data.substring(0, 75) + '..' : data;

18 answers:

Answer 0 (score: 333)

info = (data[:75] + '..') if len(data) > 75 else data
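If the same truncation is needed in several places, the conditional expression can be wrapped in a small helper. This is only a sketch: the function name `truncate` and the parameters `limit` and `marker` are illustrative and do not come from the answer.

```python
def truncate(text, limit=75, marker=".."):
    """Return text unchanged if it fits, else its first `limit` characters plus marker."""
    # The conditional expression binds looser than '+', so this reads as
    # (text[:limit] + marker) if len(text) > limit else text.
    return text[:limit] + marker if len(text) > limit else text


print(len(truncate("x" * 100)))   # 77: 75 characters plus the two-character marker
print(truncate("short string"))   # short string
```

Note that the result can be up to `limit + len(marker)` characters long, exactly as in the one-liner.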

Answer 1 (score: 106)

Shorter still:

info = data[:75] + (data[75:] and '..')
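The trick relies on how `and` evaluates: it returns its left operand when that operand is falsy, and its right operand otherwise. A short sketch with made-up sample strings shows both cases:

```python
data_long = "a" * 80
data_short = "a" * 10

# A non-empty remainder is truthy, so `and` evaluates to its right operand.
print(repr(data_long[75:] and ".."))    # '..'

# An empty remainder is falsy, so `and` short-circuits and returns it.
print(repr(data_short[75:] and ".."))   # ''

info = data_long[:75] + (data_long[75:] and "..")
print(len(info))                        # 77
```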

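Answer 2 (score: 81)

Even more concise:

data = data[:75]

If the string has 75 characters or fewer, nothing changes. Note that this version does not append the '..' marker.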
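Slicing is safe here because out-of-range slice bounds are clamped to the length of the string, whereas indexing is range-checked. A minimal sketch, using a made-up sample string:

```python
data = "short"

print(data[:75])         # short (the out-of-range bound is clamped)
print(data[75:] == "")   # True (slicing past the end gives an empty string)

try:
    data[75]             # indexing, unlike slicing, is range-checked
except IndexError as exc:
    print("IndexError:", exc)
```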

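Answer 3 (score: 57)

If you are using Python 3.4+, you can use textwrap.shorten from the standard library:

Collapse and truncate the given text to fit in the given width.

First the whitespace in text is collapsed (all whitespace is replaced by single spaces). If the result fits in the width, it is returned. Otherwise, enough words are dropped from the end so that the remaining words plus the placeholder fit within width: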

>>> textwrap.shorten("Hello  world!", width=12)
'Hello world!'
>>> textwrap.shorten("Hello  world!", width=11)
'Hello [...]'
>>> textwrap.shorten("Hello world", width=10, placeholder="...")
'Hello...'
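To make the difference from plain slicing concrete, the sketch below applies both to the same sentence. The sample text and the width of 20 are arbitrary choices for illustration.

```python
import textwrap

text = "The quick brown fox jumps over the lazy dog"

# Plain slicing cuts at a fixed character offset, wherever that falls.
sliced = text[:20] + ".."

# shorten() drops whole words and counts the placeholder against the width.
shortened = textwrap.shorten(text, width=20)

print(repr(sliced))      # 'The quick brown fox ..'
print(repr(shortened))   # 'The quick [...]'
print(len(shortened))    # 15
```

Because the default placeholder ' [...]' takes six characters, fewer words survive than the width alone would suggest; passing a shorter `placeholder` leaves more room for text.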

Answer 4 (score: 26)

For a Django solution (which the question does not mention):

from django.utils.text import Truncator

value = Truncator(value).chars(75)

Have a look at Truncator's source code to appreciate the problem: https://github.com/django/django/blob/master/django/utils/text.py#L66

More on truncation with Django: Django HTML truncation

Answer 5 (score: 9)

You can use this one-liner:

data = (data[:75] + '..') if len(data) > 75 else data

Answer 6 (score: 7)

With a regular expression:

re.sub(r'^(.{75}).*$', r'\g<1>...', data)

Long strings are truncated:

>>> data="11111111112222222222333333333344444444445555555555666666666677777777778888888888"
>>> re.sub(r'^(.{75}).*$', r'\g<1>...', data)
'111111111122222222223333333333444444444455555555556666666666777777777788888...'

Shorter strings are never truncated:

>>> data="11111111112222222222333333"
>>> re.sub(r'^(.{75}).*$', r'\g<1>...', data)
'11111111112222222222333333'

This way, you can also "cut" the middle part of the string, which is nicer in some cases:

re.sub(r'^(.{5}).*(.{5})$', r'\g<1>...\g<2>', data)

>>> data="11111111112222222222333333333344444444445555555555666666666677777777778888888888"
>>> re.sub(r'^(.{5}).*(.{5})$', r'\g<1>...\g<2>', data)
'11111...88888'
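One caveat with the regular-expression approach is that `.` does not match a newline by default, so a string with a line break inside its first 75 characters is left untouched. The sketch below, using a made-up multi-line string, shows the effect and how the `re.DOTALL` flag changes it.

```python
import re

data = "a" * 10 + "\n" + "b" * 100   # 111 characters, newline at index 10

# Default flags: '.' cannot cross the newline, so nothing matches and
# the string comes back unchanged.
unchanged = re.sub(r'^(.{75}).*$', r'\g<1>...', data)
print(unchanged == data)             # True

# With re.DOTALL, '.' also matches '\n' and the truncation applies.
truncated = re.sub(r'^(.{75}).*$', r'\g<1>...', data, flags=re.DOTALL)
print(len(truncated))                # 78
```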

Answer 7 (score: 3)

This method does not use any if:

data[:75] + bool(data[75:]) * '..'

Answer 8 (score: 3)

limit = 75
info = data[:limit] + '..' * (len(data) > limit)
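Multiplying a string by a comparison works because `bool` is a subclass of `int` in Python: `True` counts as 1 and `False` as 0. A short sketch with made-up sample strings:

```python
print(isinstance(True, int))   # True
print(repr(".." * True))       # '..'
print(repr(".." * False))      # ''

limit = 75
lengths = []
for data in ("x" * 80, "x" * 10):
    # The comparison yields True or False, which repeats '..' once or zero times.
    info = data[:limit] + ".." * (len(data) > limit)
    lengths.append(len(info))

print(lengths)                 # [77, 10]
```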

Answer 9 (score: 2)

Yet another solution. With True and False as dictionary keys, you get a little feedback about the test at the end.

data = {True: data[:75] + '..', False: data}[len(data) > 75]

Answer 10 (score: 1)

>>> info = lambda data: len(data) > 10 and data[:10] + '...' or data
>>> info('sdfsdfsdfsdfsdfsdfsdfsdfsdfsdfsdf')
'sdfsdfsdfs...'
>>> info('sdfsdf')
'sdfsdf'

Answer 11 (score: 1)

Here it is:


Answer 12 (score: 0)

A simple and short helper function:

def truncate_string(value, max_length=255, suffix='...'):
    string_value = str(value)
    # Return the string unchanged when it already fits.
    if len(string_value) <= max_length:
        return string_value
    # Reserve room for the suffix (assumes max_length >= len(suffix)).
    keep = max(max_length - len(suffix), 0)
    return string_value[:keep] + suffix

Usage examples:

# Example 1 (default):

long_string = ""
for number in range(1, 1000): 
    long_string += str(number) + ','    

result = truncate_string(long_string)
print(result)


# Example 2 (custom length):

short_string = 'Hello world'
result = truncate_string(short_string, 8)
print(result) # > Hello... 


# Example 3 (not truncated):

short_string = 'Hello world'
result = truncate_string(short_string)
print(result) # > Hello world

Answer 13 (score: 0)

info = data[:min(len(data), 75)]

Answer 14 (score: 0)

info = data[:75] + ('..' if len(data) > 75 else '')

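Answer 15 (score: 0)

Here is a function I wrote as part of a new String class. It can append a suffix when the string is truncated, provided the maximum length leaves enough room for it; enforcing the absolute maximum length is optional.

I was in the middle of changing a few things, so it carries some logic that is no longer needed (the later checks of _truncate, for instance, are redundant given the early return at the top).

It is still a good function for truncating data, though.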

##
## Truncate characters of a string after the _max_len'th char, if necessary... If _max_len is less than 0, don't truncate anything... Note: If you attach a suffix, and you enable absolute max length then the suffix length is subtracted from max length... Note: If the suffix length is longer than the output then no suffix is used...
##
## Usage: Where _text = 'Testing', _width = 4
##      _data = String.Truncate( _text, _width )                        == Test
##      _data = String.Truncate( _text, _width, '..', True )            == Te..
##
## Equivalent Alternates: Where _text = 'Testing', _width = 4
##      _data = String.SubStr( _text, 0, _width )                       == Test
##      _data = _text[  : _width ]                                      == Test
##      _data = ( _text )[  : _width ]                                  == Test
##
def Truncate( _text, _max_len = -1, _suffix = False, _absolute_max_len = True ):
    ## Length of the string we are considering for truncation
    _len            = len( _text )

    ## Whether or not we have to truncate ( a negative _max_len disables truncation )
    _truncate       = ( _max_len >= 0 and _len > _max_len )

    ## Note: If we don't need to truncate, there's no point in proceeding...
    if ( not _truncate ):
        return _text

    ## The suffix in string form
    _suffix_str     = ( '',  str( _suffix ) )[ _truncate and _suffix != False ]

    ## The suffix length
    _len_suffix     = len( _suffix_str )

    ## Whether or not we add the suffix
    _add_suffix     = ( False, True )[ _truncate and _suffix != False and _max_len > _len_suffix ]

    ## Suffix Offset
    _suffix_offset = _max_len - _len_suffix
    _suffix_offset  = ( _max_len, _suffix_offset )[ _add_suffix and _absolute_max_len != False and _suffix_offset > 0 ]

    ## The truncate point.... If not necessary, then length of string.. If necessary then the max length with or without subtracting the suffix length... Note: It may be easier ( less logic cost ) to simply add the suffix to the calculated point, then truncate - if point is negative then the suffix will be destroyed anyway.
    ## If we don't need to truncate, then the length is the length of the string.. If we do need to truncate, then the length depends on whether we add the suffix and offset the length of the suffix or not...
    _len_truncate   = ( _len, _max_len )[ _truncate ]
    _len_truncate   = ( _len_truncate, _max_len )[ _len_truncate <= _max_len ]

    ## If we add the suffix, add it... Suffix won't be added if the suffix is the same length as the text being output...
    if ( _add_suffix ):
        _text = _text[ 0 : _suffix_offset ] + _suffix_str + _text[ _suffix_offset: ]

    ## Return the text after truncating...
    return _text[ : _len_truncate ]

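Answer 16 (score: 0)

You can't actually "truncate" a Python string the way you can a dynamically allocated C string. Strings in Python are immutable. What you can do is slice a string as described in the other answers, which yields a new string containing only the characters defined by the slice offsets and step. In some (non-practical) cases this can be a little annoying, such as when you choose Python as your interview language and the interviewer asks you to remove duplicate characters from a string in place. D'oh.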
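A minimal sketch of that immutability, using a made-up sample string: assigning to a slice fails, while taking a slice builds a new string and leaves the original alone.

```python
data = "hello world"

try:
    data[5:] = ""         # strings do not support slice assignment
except TypeError as exc:
    print("TypeError:", exc)

short = data[:5]          # slicing builds a new string instead
print(short)              # hello
print(data)               # hello world (the original is unchanged)
```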

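Answer 17 (score: -1)

There is no need for a regular expression, but you do want to use string formatting rather than the string concatenation in the accepted answer.

This is probably the most canonical, Pythonic way to truncate the string data at 75 characters.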

>>> data = "11111111112222222222333333333344444444445555555555666666666677777777778888888888"
>>> info = "{}..".format(data[:75]) if len(data) > 75 else data
>>> info
'111111111122222222223333333333444444444455555555556666666666777777777788888..'