I'm having trouble building a dataset from a txt file. It contains news items from different fields and times, as shown below:
court agrees to expedite n.f.l.'s appeal
the decision means a ruling could be made nearly two months before the regular season begins, time for the sides to work out a deal without delaying the season.
http://feeds1.nytimes.com/~r/nyt/rss/sports/~3/nbjo7ygxwpc/04nfl.html
0
04 May 2011 07:39:03
nyt
sport
investing: can you profit in agricultural commodities?
bad weather is one factor behind soaring food prices. can you make hay with farm stocks? possibly: but be prepared to harvest gains on a moment's ...
http://rssfeeds.usatoday.com/~r/usatodaycommoney-topstories/~3/qbhb22sut9y/2011-05-19-can-you-make-gains-in-grains_n.htm
1
20 May 2011 15:13:57
ut
business
no tsunami but fifa's corruption storm rages on
though jack warner's threatened soccer "tsunami" remains stuck in the doldrums, the corruption storm raging around fifa shows no sign of abating after another extraordinary week for the game's governing body.
2
07 Jun 2011 17:54:54
reuters
sport
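I am now trying to use R to read this dataset so that each variable ends up in its own column. The first line of each record is the "topic", followed by the "description", "link", "ID", "date and time", and "city"; the last line is the "field". The file contains thousands of lines, and many of the records are missing some variables (the third record above, for example, has no link).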
I really don't know where or how to start. I hope someone can help me!
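
One possible starting point, as a rough sketch rather than a definitive solution: since records have a varying number of lines, the date line can serve as an anchor for splitting them. The function name `parse_news` and the file name `news.txt` are hypothetical. The sketch assumes that every record has a date line in the format shown above, that exactly two lines ("city" and "field") follow it, and that only the lines before the date can be missing. If a record has a single text line, it is assumed to be the topic.

```r
# Hypothetical helper (not from any package): turn the raw lines into a data frame.
parse_news <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]          # drop blank lines, if any

  # Assumption: every record contains a date line such as "04 May 2011 07:39:03".
  date_pat <- "^[0-9]{2} [A-Za-z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}$"
  date_idx <- grep(date_pat, lines)

  # Assumption: exactly two lines (city, field) follow each date line.
  ends   <- date_idx + 2
  starts <- c(1, head(ends, -1) + 1)

  # Return the k-th element of x, or NA if x is too short.
  pick <- function(x, k) if (length(x) >= k) x[k] else NA_character_

  records <- lapply(seq_along(date_idx), function(i) {
    d <- date_idx[i]
    # Lines of this record that come before the date line.
    idx <- starts[i] + seq_len(max(0, d - starts[i])) - 1
    head_lines <- lines[idx]

    # The ID, when present, is an all-digit line directly before the date.
    id <- NA_character_
    if (length(head_lines) > 0 && grepl("^[0-9]+$", tail(head_lines, 1))) {
      id <- tail(head_lines, 1)
      head_lines <- head(head_lines, -1)
    }

    # The link, when present, starts with http:// or https://.
    is_link <- grepl("^https?://", head_lines)
    text <- head_lines[!is_link]         # remaining lines: topic, description

    data.frame(
      topic       = pick(text, 1),
      description = pick(text, 2),
      link        = pick(head_lines[is_link], 1),
      id          = id,
      datetime    = lines[d],            # kept as text; see note below
      city        = lines[d + 1],        # NA if the file ends early
      field       = lines[d + 2],
      stringsAsFactors = FALSE
    )
  })

  do.call(rbind, records)
}

# Usage ("news.txt" is a placeholder path):
# news <- parse_news(readLines("news.txt"))
```

Missing values come out as `NA`. The `datetime` column is left as text here; it could be converted afterwards with `as.POSIXct(news$datetime, format = "%d %b %Y %H:%M:%S", tz = "UTC")`, keeping in mind that `%b` depends on the locale's month abbreviations.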