I am parsing XML records using Ruby. The XML file has the following data structure:
<row Id="27" PostTypeId="2" ParentId="11" CreationDate="2008-08-01T12:17:19.357" Score="13" Body="<p>@jeff</p>

<p>IMHO yours seems a little long. However it does seem a little more robust with support for "yesterday" and "years". But in my experience when this is used the person is most likely to view the content in the first 30 days. It is only the really hardcore people that come after that. So that is why I usually elect to keep this short and simple.</p>

<p>This is the method I am currently using on one of my websites. This only returns a relative day, hour, time. And then the user has to slap on "ago" in the output.</p>

<pre><code>public static string ToLongString(this TimeSpan time)<br>{<br> string output = String.Empty;<br><br> if (time.Days &gt; 0)<br> output += time.Days + " days ";<br><br> if ((time.Days == 0 || time.Days == 1) &amp;&amp; time.Hours &gt; 0)<br> output += time.Hours + " hr ";<br><br> if (time.Days == 0 &amp;&amp; time.Minutes &gt; 0)<br> output += time.Minutes + " min ";<br><br> if (output.Length == 0)<br> output += time.Seconds + " sec";<br><br> return output.Trim();<br>}<br>
</code></pre>" OwnerUserId="17" LastEditorUserId="17" LastEditorDisplayName="Nick Berardi" LastEditDate="2008-08-01T13:16:49.127" LastActivityDate="2008-08-01T13:16:49.127" CommentCount="1" CommunityOwnedDate="2009-09-04T13:15:59.820" />
However, some records do not have all of the attributes:
<row Id="29" PostTypeId="2" ParentId="13" CreationDate="2008-08-01T12:19:17.417" Score="18" Body="<p>There are no HTTP headers that will report the clients timezone so far although it has been suggested to include it in the HTTP specification.</p>

<p>If it was me, I would probably try to fetch the timezone using clientside JavaScript and then submit it to the server using Ajax or something.</p>" OwnerUserId="19" LastActivityDate="2008-08-01T12:19:17.417" CommentCount="0" />
My Ruby parser iterates over these XML records and inserts them into a MySQL database:
def on_start_element(element, attributes)
  if element == 'row'
    @post_st.execute(attributes['Id'], attributes['PostTypeId'], attributes['AcceptedAnswerId'], attributes['ParentId'], attributes['Score'], attributes['ViewCount'],
      attributes['Body'], attributes['OwnerUserId'] == nil ? -1 : attributes['OwnerUserId'], attributes['LastEditorUserId'], attributes['LastEditorDisplayName'],
      DateTime.parse(attributes['LastEditDate']).to_time.strftime("%F %T"), DateTime.parse(attributes['LastActivityDate']).to_time.strftime("%F %T"), attributes['Title'] == nil ? '' : attributes['Title'],
      attributes['AnswerCount'] == nil ? 0 : attributes['AnswerCount'], attributes['CommentCount'] == nil ? 0 : attributes['CommentCount'],
      attributes['FavoriteCount'] == nil ? 0 : attributes['FavoriteCount'], DateTime.parse(attributes['CreationDate']).to_time.strftime("%F %T"))
    post_id = attributes['Id']
    tags = attributes['Tags'] == nil ? '' : attributes['Tags']
    tags.scan(/<(.*?)>/).each do |tag_name|
      tag_id = insert_or_find_tag(tag_name[0])
      @post_ot_tag_insert_st.execute(post_id, tag_id)
    end
  end
end
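But it fails while processing the second record. I say the second because the last record inserted into my database is the row with Id = 27. I get the following error: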
/format.rb:1031:in `dup': can't dup NilClass (TypeError)
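I am wondering whether this is related to the missing attributes. If a record lacks some of the attributes I expect in the database, how should I handle that, or set them to some default value? For example, if the date is missing, set it to some default date value.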
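To make the "default date" idea concrete, here is a minimal sketch of what such a fallback might look like. The helper name `format_timestamp` and the constant `DEFAULT_TIMESTAMP` are made up for illustration, and the sentinel value is only an assumption about what the schema would accept. It also calls `strftime` on the `DateTime` directly instead of going through `to_time` as my code does, so the result does not depend on the local timezone.

```ruby
require 'date'

# Hypothetical sentinel; choose whatever default suits the table's schema.
DEFAULT_TIMESTAMP = '1970-01-01 00:00:00'

# Hypothetical helper: formats an attribute value as "YYYY-MM-DD HH:MM:SS",
# returning `default` when the attribute is absent (nil) or blank.
def format_timestamp(value, default = DEFAULT_TIMESTAMP)
  return default if value.nil? || value.strip.empty?
  DateTime.parse(value).strftime('%F %T')
end

# Attributes as they would arrive for a record without LastEditDate.
attributes = { 'LastActivityDate' => '2008-08-01T12:19:17.417' }

puts format_timestamp(attributes['LastEditDate'])      # missing, so the default is used
puts format_timestamp(attributes['LastActivityDate'])  # present, so it is parsed and formatted
```

Each `DateTime.parse(...)` argument in the `execute` call could then be replaced by `format_timestamp(attributes['...'])`. Whether a sentinel date or a NULL is the better default probably depends on whether the column is nullable.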
This is the line it complains about:
DateTime.parse(attributes['LastEditDate']).to_time.strftime("%F %T"), DateTime.parse(attributes['LastActivityDate']).to_time.strftime("%F %T"), attributes['Title'] == nil ? '' : attributes['Title'],
I think it is complaining about LastEditDate?
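One way I could check this hunch is to parse each date attribute of the second record in isolation and see which one raises. This is only a sketch: the hash below is hand-copied from the Id = 29 sample rather than produced by the real parser, and it assumes a missing attribute shows up as nil.

```ruby
require 'date'

# Date-related attributes of the second record (Id = 29); it has no LastEditDate.
attributes = {
  'Id'               => '29',
  'CreationDate'     => '2008-08-01T12:19:17.417',
  'LastActivityDate' => '2008-08-01T12:19:17.417'
}

# Try each date attribute on its own and collect the names that cannot be parsed.
failing = %w[CreationDate LastEditDate LastActivityDate].select do |name|
  begin
    DateTime.parse(attributes[name])
    false
  rescue TypeError
    true  # DateTime.parse was handed nil instead of a String
  end
end

puts "Fails on: #{failing.join(', ')}"
```

If only LastEditDate shows up in that list, the TypeError comes from passing nil to `DateTime.parse`, not from the database insert itself.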