I have the following records in my database:
[662] #<ChapterSolution:0x000055ec31cdfb40> {
    :id => 5071,
    :chapter_id => 221,
    :created_at => Tue, 19 Sep 2017 18:24:57 IST +05:30,
    :updated_at => Sat, 02 Dec 2017 10:24:53 IST +05:30,
    :question => "11",
    :part => "i",
    :answer => "See Explanation",
    :solution => "<img src='//cdn.google.in/editor/pictures/573/content.jpeg' />"
}
[663] #<ChapterSolution:0x000055ec31cdfb40> {
    :id => 5071,
    :chapter_id => 221,
    :created_at => Tue, 19 Sep 2017 18:24:57 IST +05:30,
    :updated_at => Sat, 02 Dec 2017 10:24:53 IST +05:30,
    :question => "11",
    :part => "i",
    :answer => "See Explanation",
    :solution => "<img src='//cdn.google.in/editor/pictures/574/content.jpeg' />"
}
[664] #<ChapterSolution:0x000055ec31cdfb40> {
    :id => 5071,
    :chapter_id => 221,
    :created_at => Tue, 19 Sep 2017 18:24:57 IST +05:30,
    :updated_at => Sat, 02 Dec 2017 10:24:53 IST +05:30,
    :question => "11",
    :part => "i",
    :answer => "See Explanation",
    :solution => "<img src='//cdn.google.in/editor/pictures/575/content.jpeg' />"
}
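In the solution value of every record, I want to insert https: between img src=' and //cdn.google... so that the URL in each record is correct,
but I don't know how to do it. Any help or suggestions would be much appreciated.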
Answer 0 (score: 1)
A cheap way is to use gsub with a regular expression:
object[:solution].gsub(/img src='\/\//,"img src='https://")
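Note that gsub returns a new string and does not save anything by itself. A minimal sketch of the substitution on one sample value from the question, in plain Ruby with no database involved (the helper name add_https is made up for illustration):

```ruby
# Hypothetical helper: prefix protocol-relative img src values with https:
def add_https(solution)
  solution.gsub(/img src='\/\//, "img src='https://")
end

sample = "<img src='//cdn.google.in/editor/pictures/573/content.jpeg' />"
fixed = add_https(sample)
puts fixed
```

In a Rails app the returned string would still have to be written back to the record, for example by assigning it to solution and saving.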
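Answer 1 (score: 1)
Fetching all the records and then updating them one by one with a regular expression in Ruby is not very efficient.
You can instead use the pattern matching functions in Postgres to run a single UPDATE query that updates all the records at once: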
ChapterSolution.where("solution ~* 'src=[\\''|\"]//'").update_all(
"solution = regexp_replace(chapter_solutions.solution, '//(.*)[\\''|\"]', 'https://\\1')"
)
The result:
irb(main):014:0> ChapterSolution.all.map(&:solution)
ChapterSolution Load (0.8ms) SELECT "chapter_solutions".* FROM "chapter_solutions"
=> ["<img src='https://cdn.google.in/editor/pictures/573/content.jpeg />", "<img src='https://cdn.google.in/editor/pictures/574/content.jpeg />", "<img src='https://cdn.google.in/editor/pictures/575/content.jpeg />"]
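In the output above the closing quote after content.jpeg is gone: the pattern also matches the closing quote, and the replacement https://\1 does not put it back. The behaviour can be reproduced in plain Ruby, whose regex syntax is close enough to Postgres for this case. The second pattern below is one possible variant (an assumption, not part of the original answer) that anchors on the opening quote instead, so the rest of the tag is left untouched:

```ruby
sample = "<img src='//cdn.google.in/editor/pictures/573/content.jpeg' />"

# Same shape as the pattern in the answer: the trailing quote is consumed.
lossy = sample.sub(/\/\/(.*)['|"]/, 'https://\1')

# Variant: capture the opening quote and re-emit it before the scheme.
kept = sample.sub(/src=(['"])\/\//, 'src=\1https://')

puts lossy
puts kept
```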
If you also want to strip the HTML, change the regular expression:
ChapterSolution.where("solution ~* 'src=[\\''|\"]//'").update_all(
"solution = regexp_replace(chapter_solutions.solution, '.*\\ssrc=[\''|\"]\/\/(.*)[\\''|\"].*', 'https://\\1')"
)
Answer 2 (score: 0)
If you want to update the solution column data, and every value has the same shape as '<img src='//cdn......':
update [table name]
set solution = substring(solution, 1, 10)||'https:'||substring(solution, 11);
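The same fixed-offset idea, sketched in Ruby to make the arithmetic visible: <img src=' is exactly 10 characters, so the scheme goes in right after character 10. This only holds when every value starts with that exact prefix. The method name insert_scheme is hypothetical.

```ruby
PREFIX = "<img src='" # 10 characters, matching substring(solution, 1, 10)

# Hypothetical helper mirroring the SQL: first 10 chars + 'https:' + the rest.
# SQL substring is 1-indexed, so substring(solution, 11) is Ruby index 10.
def insert_scheme(solution)
  solution[0, 10] + "https:" + solution[10..-1]
end

sample = "<img src='//cdn.google.in/editor/pictures/574/content.jpeg' />"
puts PREFIX.length
puts insert_scheme(sample)
```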
Answer 3 (score: 0)
Another, higher-level solution that avoids regular expressions is to use an HTML parser (Nokogiri) and the URI module:
# find_each loads the records in batches instead of all at once
ChapterSolution.find_each do |cs|
  frag = Nokogiri::HTML.fragment(cs.solution)
  img = frag.children.first
  uri = URI.parse(img["src"])
  uri.scheme = 'https'
  # Use this if you want to keep the HTML tag
  img["src"] = uri.to_s
  cs.update_column(:solution, frag.to_s)
  # Or use this instead if you want to strip the HTML and keep only the URI
  # cs.update_column(:solution, uri.to_s)
end
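Compared with doing the update in the database with a regular expression, the advantage of this approach is that it can handle much more variation in the input HTML and ensures that the output contains a valid URI.
The downside is that it is much slower.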
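The URI step can be tried on its own, without Nokogiri or a database. A small sketch using only Ruby's standard library, with a value taken from the question:

```ruby
require "uri"

# A protocol-relative reference parses as a generic URI with no scheme.
uri = URI.parse("//cdn.google.in/editor/pictures/575/content.jpeg")
puts uri.scheme.inspect
puts uri.host

# Assigning a scheme turns it into an absolute https URL.
uri.scheme = "https"
puts uri.to_s
```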
Answer 4 (score: 0)
I don't know if this is the best way, but hey, it works for me.
object = ChapterSolution.where("solution ~* 'src=[\\''|\"]//'")
object.each do |cs|
  cs.solution = cs.solution.split("src='").join("src='https:")
  cs.save
end
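One thing to watch with this approach: the where clause matches both single- and double-quoted src attributes, but splitting on src=' only touches the single-quoted form. A quick plain-Ruby check of that, using one value from the question and one made-up double-quoted value (the helper name prefix_https is hypothetical):

```ruby
# Same transformation as in the loop above, pulled out for illustration.
def prefix_https(solution)
  solution.split("src='").join("src='https:")
end

single = "<img src='//cdn.google.in/editor/pictures/573/content.jpeg' />"
# Made-up variant with double quotes; not taken from the question's data.
double = '<img src="//cdn.google.in/editor/pictures/573/content.jpeg" />'

puts prefix_https(single)
puts prefix_https(double)
```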