Question

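I have looked everywhere but can't seem to find a simple way to get a substring in Python.

I am using tweepy, and I have stored the tweets in an array of TextBlob objects and assigned one of those TextBlobs to a string variable.

Example: "RT @Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"

This is one tweet, and I want the "@Acosta" (or Acosta) part. How would I go about extracting that part as a substring?

I tried using the re library, and while it works fine on other strings, it does not work on the tweet.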


Answer 1

You can do something like this using split:

>>> test = '"RT @Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"'
>>> mention = test.split('@')
>>> mention
['"RT ', 'Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"']
>>> person = mention[1].split(':')
>>> person
['Acosta', ' Trump defends his “very fine people” comments on Charlottesville', ' “People were there protesting the taking down of the monument…"']
>>> person[0]
'Acosta'

Putting it all together:

>>> person = test.split('@')[1].split(':')[0]
>>> person
'Acosta'

You should add some error checking to make sure you actually found something before you split it.
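The error checking mentioned above could look roughly like the sketch below. It uses str.partition rather than split, because partition always returns three parts and reports a missing separator as an empty string. The function name first_mention is made up for this illustration and is not part of tweepy or TextBlob.

def first_mention(tweet):
    """Return the text between the first '@' and the next ':', or None."""
    # partition returns (before, separator, after); separator is '' if absent
    _, at, rest = tweet.partition("@")
    if not at:
        return None  # no '@' in the tweet
    name, colon, _ = rest.partition(":")
    if not colon:
        return None  # '@' was found, but no ':' follows it
    return name

print(first_mention("RT @Acosta: Trump defends ..."))  # Acosta
print(first_mention("no mention here"))  # None

Returning None keeps the caller in charge of what to do with tweets that have no mention, instead of raising an IndexError the way test.split('@')[1] would.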

Answer 2

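I cannot reproduce your problem. After fixing

SyntaxError: Non-ASCII character '\xe2' in file main.py on line 3, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

which is caused by the “, ” and … characters in the data, it works fine: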

randTweet = """RT @Acosta: Trump defends his "very fine people" comments on Charlottesville: "People were there protesting the taking down of the monument..." """

import re

# Non-greedy capture of everything between "@" and the next ":"
match = re.search(r"@(.*?):", randTweet).group(1)
print(match)  # Acosta
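Note that re.search returns None when nothing matches, so calling .group(1) directly fails on tweets without a mention. A slightly more defensive sketch, assuming Python 3 (where source files are UTF-8 by default, so the curly quotes need no encoding declaration) and assuming handles consist only of letters, digits and underscores, might be the following. The name extract_mentions is hypothetical.

import re

def extract_mentions(text):
    """Return every @handle in text, without the leading '@'."""
    # \w+ matches a run of letters, digits and underscores
    return re.findall(r"@(\w+)", text)

tweet = "RT @Acosta: Trump defends his “very fine people” comments…"
print(extract_mentions(tweet))  # ['Acosta']
print(extract_mentions("no mention here"))  # []

Because findall returns an empty list when there is no match, there is no None to guard against, and it also picks up tweets that mention more than one account.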

Best way to get a substring between two characters in Python?
