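I've looked everywhere, but I can't seem to find a simple way to get a substring in Python.
I'm using tweepy. I've stored the tweepy tweets in an array of TextBlob objects and assigned one of those TextBlobs to a string variable.
Example: "RT @Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"
That is a tweet, and I want the "@Acosta" (or Acosta) part. How would I extract that part as a substring?
I tried using the re library. It works fine on other strings, but it doesn't work on the tweets.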
Answer 0: (score: 0)
You can use split to do something like this:
>>> test = '"RT @Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"'
>>> mention = test.split('@')
>>> mention
['"RT ', 'Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"']
>>> person = mention[1].split(':')
>>> person
['Acosta', ' Trump defends his “very fine people” comments on Charlottesville', ' “People were there protesting the taking down of the monument…"']
>>> person[0]
'Acosta'
Putting it all together:
>>> person = test.split('@')[1].split(':')[0]
>>> person
'Acosta'
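You should add some error checking to make sure the '@' and ':' are actually present before you index into the split results; otherwise a tweet without a mention raises an IndexError.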
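One way to add that check is sketched below. It swaps split for str.partition, which reports whether the separator was found instead of raising. The function name extract_mention and the sample strings are hypothetical, chosen only for illustration.

```python
def extract_mention(tweet):
    """Return the text between the first '@' and the next ':', or None."""
    # partition() returns (before, sep, after); sep is '' if not found.
    _, at_sign, rest = tweet.partition('@')
    if not at_sign:
        return None  # no '@' in the tweet
    name, colon, _ = rest.partition(':')
    if not colon:
        return None  # '@' found, but no ':' after it
    return name


print(extract_mention('RT @Acosta: Trump defends his comments'))  # Acosta
print(extract_mention('a tweet with no mention'))                 # None
```

Returning None rather than raising lets the caller skip tweets that do not start with a retweet-style mention.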
Answer 1: (score: 0)
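I can't reproduce your problem. After fixing
SyntaxError: Non-ASCII character '\xe2' in file main.py on line 3, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
which was caused by the data containing “ and ” and …, it works fine: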
import re
randTweet = """RT @Acosta: Trump defends his "very fine people" comments on Charlottesville: "People were there protesting the taking down of the monument..." """
# Raw string: '@' and ':' need no escaping. The non-greedy group stops at the first ':'.
match = re.search(r"@(.*?):", randTweet).group(1)
print(match) # Acosta
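Note that re.search returns None when nothing matches, so calling .group(1) directly fails on tweets without a mention. A minimal sketch that guards against this follows. The pattern @(\w+) is an assumption about what a handle looks like (letters, digits, underscores), and the helper names first_mention and all_mentions are hypothetical.

```python
import re

# Assumed handle shape: '@' followed by letters, digits, or underscores.
MENTION_RE = re.compile(r"@(\w+)")


def first_mention(tweet):
    """Return the first handle without the '@', or None if there is none."""
    m = MENTION_RE.search(tweet)
    return m.group(1) if m else None


def all_mentions(tweet):
    """Return every handle in the tweet, in order of appearance."""
    return MENTION_RE.findall(tweet)


print(first_mention("RT @Acosta: Trump defends his comments"))  # Acosta
print(first_mention("no mention here"))                         # None
print(all_mentions("cc @alice and @bob_smith"))                 # ['alice', 'bob_smith']
```

Because the pattern does not depend on a trailing ':', it also picks up mentions in the middle of a tweet, though it will also match the part after '@' in an email address.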