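OK, what I need is a bit unusual, as follows:
I create the clues array by hand and build the data array dynamically. The XPath lookup takes each clue as input, and the results are mapped into data to build the array.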
clues = Array.new
clues << 'Power supply type'
clues << 'Slots'
clues << 'Software included'
selector = "//td[text()='%s']/following-sibling::td"
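# map already returns a new array, so no separate Array.new is needed
data = clues.map do |clue|
  xpath = selector % clue            # substitute the clue for %s
  [clue, doc.at(xpath).text.strip]   # doc is the already-parsed page
end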
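The code that builds the data array uses two inputs, clues and selector.
Each item in clues is substituted for the %s in selector, so the template on the first line below becomes the three XPath expressions that follow it: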
//td[text()='%s']/following-sibling::td
//td[text()='Power supply type']/following-sibling::td
//td[text()='Slots']/following-sibling::td
//td[text()='Software included']/following-sibling::td
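The substitution above is ordinary Ruby string formatting with String#%. A minimal sketch, using only the clues and selector from the question, that prints the expanded expressions without touching any web page:

```ruby
clues = ['Power supply type', 'Slots', 'Software included']
selector = "//td[text()='%s']/following-sibling::td"

# String#% replaces the %s placeholder with the clue.
xpaths = clues.map { |clue| selector % clue }

puts xpaths
```

Printing the expressions like this is a quick way to check them before handing them to doc.at.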
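XPath then goes off and uses these stored expressions to scrape the information from the web page, and each result is stored as an element of the data array, data[0] ... data[2].
data[2] looks like this, one big block of information: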
Symantec Norton Internet Security (60 days live update); Recovery partition (including possibility to recover system; applications and drivers separately); Optional re-allocation of recovery partition;
I want each piece of software listed here stored in its own element, for example:
data[2]Symantec Norton Internet Security (60 days live update);
data[3]Recovery partition (including possibility to recover system;
data[4]Optional re-allocation of recovery partition;
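So I assume I need to split data[2] somehow and add the pieces back into the data array?
I am trying to isolate this particular index because I need its contents on multiple lines in my final output to a spreadsheet.
Final desired output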
Answer 0 (score: 2)
To clarify, you have an array like this:
data << 'Power supply type'
data << 'Slots'
data << 'Symantec Norton Internet Security (60 days live update); Recovery partition (including possibility to recover system; applications and drivers separately); Optional re-allocation of recovery partition;'
data << 'Something else'
And you want it to become this?
data << 'Power supply type'
data << 'Slots'
data << 'Symantec Norton Internet Security (60 days live update);'
data << 'Recovery partition (including possibility to recover system;'
data << 'applications and drivers separately);'
data << 'Optional re-allocation of recovery partition;'
data << 'Something else'
You can do that as follows:
temp = []
data[2].split(/(;)/).each_slice(2){ |s| temp << s.join.strip }
data[2] = temp
data.flatten!
Or, if you want to do this for every item in the data array:
data.each_with_index do |x, i|
  temp = []
  data[i].split(/(;)/).each_slice(2){ |s| temp << s.join.strip }
  data[i] = temp
end
data.flatten!
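Basically, this takes the string, splits it on ';', puts each ';' back where it was removed, replaces the original slot in the data array with the array of split strings, and then flattens the whole data array into a single flat array.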
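The steps described above can be traced one at a time. This is a sketch on a shortened sample entry (the sample strings are made up for illustration; the real text comes from the scraped page), with the intermediate names pieces, pairs and items introduced only for readability:

```ruby
# Shortened sample entry; the real text comes from the scraped page.
data = ['Power supply type', 'Slots',
        'Norton (60 days); Recovery partition; Optional re-allocation;',
        'Something else']

# The capture group in the regex keeps each ";" as its own element.
pieces = data[2].split(/(;)/)

# Group the pieces two at a time: [text, ";"].
pairs = pieces.each_slice(2).to_a

# Glue each ";" back onto its text and trim the surrounding whitespace.
items = pairs.map { |pair| pair.join.strip }

# Put the array of items in the original slot, then flatten in place.
data[2] = items
data.flatten!

p data
```

Note that flatten! with no argument flattens every level of nesting, so this approach fits a data array of plain strings. If data holds [clue, text] pairs, as in the question's own code, those pairs would be flattened as well.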
Answer 1 (score: 0)
data = Array.new
clues.each do |clue|
  xpath = selector % clue
  text = doc.at(xpath).text.strip
  if clue == 'Software included'
    values = text.scan(/.+?;/)
    values << text if values.empty? # text did not contain a semicolon
    data << [clue, values.shift.strip]
    values.each do |value|
      data << ['', value.strip]
    end
  else
    data << [clue, text]
  end
end
Output (indented for readability):
[
  ["Power supply type", "400w"],
  ["Slots", "2"],
  ["Software included", "Symantec Norton Internet Security (60 days live update);"],
  ["", "Recovery partition (including possibility to recover system;"],
  ["", "applications and drivers separately);"],
  ["", "Optional re-allocation of recovery partition;"]
]
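For trying the row layout above without a parsed page, here is a self-contained sketch. The scraped hash is a hypothetical stand-in for what doc.at(xpath).text.strip would return, and rows_for is a hypothetical helper, not part of the answer. Unlike the answer's code, it splits every value that contains semicolons rather than only 'Software included'.

```ruby
# Hypothetical stand-in for the text that doc.at(xpath).text.strip would return.
scraped = {
  'Power supply type' => '400w',
  'Slots' => '2',
  'Software included' => 'Symantec Norton Internet Security (60 days live update); ' \
                         'Recovery partition; Optional re-allocation of recovery partition;'
}

# Hypothetical helper: one row per semicolon-terminated item,
# with the clue shown only on the first row.
def rows_for(clue, text)
  values = text.scan(/.+?;/).map(&:strip)
  values = [text] if values.empty? # no semicolon: keep the text as one value
  values.each_with_index.map { |value, i| [i.zero? ? clue : '', value] }
end

# flat_map joins the per-clue row lists into one list of [clue, value] rows.
data = scraped.flat_map { |clue, text| rows_for(clue, text) }

data.each { |row| p row }
```

Each element of data is a two-column row, which maps directly onto two spreadsheet columns.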
Answer 2 (score: 0)
data = data[0..1] + data[2].scan(/.*?;/) + data[3..-1]
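A sketch of the same one-liner on a shortened sample array (the sample strings are made up for illustration). scan with /.*?;/ keeps the space that follows each semicolon at the start of the next item, so this version adds a strip, which is not in the line above:

```ruby
# Shortened sample data; the real text comes from the scraped page.
data = ['Power supply type', 'Slots',
        'Norton (60 days); Recovery partition;',
        'Something else']

# scan returns every shortest run of characters ending in ";".
# map(&:strip) is an addition that removes the leading space on later items.
items = data[2].scan(/.*?;/).map(&:strip)

# Rebuild the array: everything before index 2, the items, everything after.
data = data[0..1] + items + data[3..-1]

p data
```

This assumes the entry to split is always at index 2 and that data is a flat array of strings.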