I am reading a file from the web line by line, and each line comes back as a list. Each list holds three columns separated by this pattern: +++$+++
I tried to split the list in Python 3.6 with the code below, but it does not work. Any suggestions would be appreciated.
Here is my code:
with closing(requests.get(url, stream=True)) as r:
    reader = csv.reader(codecs.iterdecode(r.iter_lines(), 'latin-1'))
    for i, row in enumerate(reader):
        if i < 5:
            t = row[0].split('(\s\+{3}\$\+{3}\s)+')
            print(t)
Here is the list:
['m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html']
['m1 +++$+++ 1492: conquest of paradise +++$+++ http://www.hundland.org/scripts/1492-ConquestOfParadise.txt']
['m2 +++$+++ 15 minutes +++$+++ http://www.dailyscript.com/scripts/15minutes.html']
['m3 +++$+++ 2001: a space odyssey +++$+++ http://www.scifiscripts.com/scripts/2001.txt']
['m4 +++$+++ 48 hrs. +++$+++ http://www.awesomefilm.com/script/48hours.txt']
Each row has only one component. This is the call with my regular expression: row[0].split('(\s\+{3}\$\+{3}\s)+')
When I print the result, the line is not split.
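Answer 0 (score: 1)
Doing
row[0].split(' +++$+++ ')
should give you exactly what you want without a regular expression. str.split() treats its argument as a literal string, not as a pattern, which is why the regex-style argument never matches.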
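To make the difference concrete, here is a minimal sketch using one row copied from the question. The variable names unsplit and fields are chosen here for illustration only.

```python
# One row copied from the question; each row is a one-element list.
row = ['m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html']

# str.split() searches for its argument literally, so a regex-looking
# string is never found and the whole line comes back as a single item.
unsplit = row[0].split(r'(\s\+{3}\$\+{3}\s)+')

# Splitting on the literal separator yields the three columns.
fields = row[0].split(' +++$+++ ')

print(len(unsplit))
print(fields)
```

The first call returns a list with one element (the untouched line), which matches what the question reports; the second returns three strings.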
Answer 1 (score: 0)
Assuming you don't want to use split(), this may help if you would rather get each row back as a tuple.
Input:
import re
# Sample rows from the question, kept as one multi-line string.
# Named text rather than input to avoid shadowing the built-in input().
text = '''['m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html']
['m1 +++$+++ 1492: conquest of paradise +++$+++ http://www.hundland.org/scripts/1492-ConquestOfParadise.txt']
['m2 +++$+++ 15 minutes +++$+++ http://www.dailyscript.com/scripts/15minutes.html']
['m3 +++$+++ 2001: a space odyssey +++$+++ http://www.scifiscripts.com/scripts/2001.txt']
['m4 +++$+++ 48 hrs. +++$+++ http://www.awesomefilm.com/script/48hours.txt']'''
# Raw string so the backslashes reach the regex engine unchanged.
# Three groups capture the columns on either side of the two +++$+++ separators.
output = re.findall(r"\['([\S\s]+?)[\s]+[\+]{3}\$[\+]{3}[\s]+([\S\s]+?)[\s][\+]{3}\$[\+]{3}[\s]+([\S\s]+?)'\]", text)
print(output)
Output:
[('m0', '10 things i hate about you', 'http://www.dailyscript.com/scripts/10Things.html'), ('m1', '1492: conquest of paradise', 'http://www.hundland.org/scripts/1492-ConquestOfParadise.txt'), ('m2', '15 minutes', 'http://www.dailyscript.com/scripts/15minutes.html'), ('m3', '2001: a space odyssey', 'http://www.scifiscripts.com/scripts/2001.txt'), ('m4', '48 hrs.', 'http://www.awesomefilm.com/script/48hours.txt')]
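If the rows are already available as one-element lists, as in the question, a shorter pattern passed to re.split() may be enough to get the same tuples. This is only a sketch: SEPARATOR, split_row and the two sample rows are names and data picked here for illustration, and the pattern is the separator from the question without its outer capturing group (with a capturing group, re.split() would also return the separators).

```python
import re

# Separator from the question, written as a raw string and without a
# capturing group so the separators are dropped from the result.
SEPARATOR = re.compile(r'\s\+{3}\$\+{3}\s')

# Two rows copied from the question, each a one-element list.
rows = [
    ['m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html'],
    ['m4 +++$+++ 48 hrs. +++$+++ http://www.awesomefilm.com/script/48hours.txt'],
]

def split_row(row):
    """Return the columns of a one-element row as a tuple."""
    return tuple(SEPARATOR.split(row[0]))

parsed = [split_row(row) for row in rows]
print(parsed)
```

Each entry of parsed is a three-item tuple of id, title and URL, the same shape as the findall() output above.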
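I was also trying a regular expression that uses alternation, but for the life of me I could not get the pattern to work. I will post it later, but I hope the above helps.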