Question

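I am extracting data from multiple CSV files and trying to combine it into a single data frame. The source data is formatted oddly, so I have to pull values from specific locations in each source file and then arrange them in a logical pattern in my resulting data frame.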

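I created two vectors of equal length and filled them with data extracted from the source files. The end result is two vectors of length 3 (as expected), but instead of a 3x2 data frame (3 observations of 2 variables) I end up with a 1x6 data frame (1 observation of 6 variables).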

What strikes me as odd is that, although RStudio reports both of them as "List of 3", they are displayed very differently when I print them in the console.
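The symptom can be reproduced without any files. The following is a minimal sketch, not the original data: `row` is a hypothetical stand-in for the one-row data frame that read.csv returns, and the variable names are made up for the illustration. It suggests that assigning a data frame into a vector element turns the vector into a list, and that data.frame() then spreads each list element into its own column.

```r
# Hypothetical stand-in for a one-row result of read.csv(..., nrows = 1, header = FALSE).
row <- data.frame(V1 = "K0A", V2 = "10000", stringsAsFactors = FALSE)

# Same style of pre-allocation as in the question.
codes_list  <- vector(length = 3)
income_list <- vector(mode = "character", length = 3)
codes_chr   <- character(3)
income_chr  <- character(3)

for (i in 1:3) {
  # Single-index bracket subsetting returns a data frame (which is a list),
  # so the target vector is silently converted to a list.
  codes_list[i]  <- row[1]
  income_list[i] <- row[2]

  # Row-and-column subsetting returns the bare value, so the vector stays atomic.
  codes_chr[i]  <- row[1, 1]
  income_chr[i] <- row[1, 2]
}

wide <- data.frame(codes_list, income_list)  # every list element becomes a column
long <- data.frame(codes_chr, income_chr)    # every vector becomes one column

print(dim(wide))  # 1 6
print(dim(long))  # 3 2
```

Checking is.list() on each vector before building the data frame is a quick way to tell which of the two cases applies.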

Source code:

#set the working directory to where the data files are stored
setwd("/foo") 

# identify how many data files are present
files = list.files("/foo")

# create vectors long enough to contain all the postal codes and income data
postalCodeData=vector(length=length(files))
medianIncomeData=vector(mode="character", length=length(files))

# loop through all the files, pulling data from rows 2 and 1585.
for(i in 1:length(files)) {
  x = read.csv(files[i],skip=1,nrows= 1,header=F)
  y = read.csv(files[i], skip = 1584, nrows = 1,header=F)
  postalCodeData[i]=x
  medianIncomeData[i]=y[2]
}

#create the data frame
Results=data.frame(postalCodeData,medianIncomeData)

#name the columns
names(Results)=c("FSA", "Median Income")

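My data frame comes out as a single row with six columns. I want it to look like the data frame produced by the following source code, where the values are entered explicitly: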

setwd("/Users/Perry/Downloads/Postal Code Data/")
files = list.files("/Users/Perry/Downloads/Postal Code Data/")
postalCodeData=c("K0A","K0B","K0C")
medianIncomeData=c("10000","20000","30000")

Results=data.frame(postalCodeData,medianIncomeData)
names(Results)=c("FSA", "Median Income")

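Unfortunately, I cannot specify the values explicitly, because I have several hundred files to pull information from. Any suggestions on how to correct the loop to get the desired result would be greatly appreciated.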

Answer 1

The output of "read.csv" is a data frame, so when you store

medianIncomeData[i]=y[2]

you are storing a column of a data frame. Use

medianIncomeData[i]=y[1,2]

instead, to store only the value you want. The same applies to x.
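To try the fix end to end, here is a self-contained sketch of the corrected loop. The file layout is an assumption made for the demonstration: three small CSV files are written to a temporary directory, with the postal code on row 2 and the income on row 4, so skip = 3 is used instead of the question's skip = 1584. The x[1, 1] index assumes the postal code sits in the first column. colClasses = "character" is added so that the values stay plain strings regardless of R version.

```r
# Write three hypothetical CSV files to a temporary directory.
# Assumed layout: row 2 holds the postal code, row 4 holds a label and the income.
demo_dir <- file.path(tempdir(), "fsa_demo")
dir.create(demo_dir, showWarnings = FALSE)
codes   <- c("K0A", "K0B", "K0C")
incomes <- c("10000", "20000", "30000")
for (i in seq_along(codes)) {
  writeLines(
    c("Geography", codes[i], "Population,1234", paste0("Median income,", incomes[i])),
    file.path(demo_dir, paste0("file", i, ".csv"))
  )
}

files <- list.files(demo_dir, full.names = TRUE)

# Pre-allocate atomic character vectors.
postalCodeData   <- character(length(files))
medianIncomeData <- character(length(files))

for (i in seq_along(files)) {
  x <- read.csv(files[i], skip = 1, nrows = 1, header = FALSE, colClasses = "character")
  y <- read.csv(files[i], skip = 3, nrows = 1, header = FALSE, colClasses = "character")
  postalCodeData[i]   <- x[1, 1]  # single value, not a one-column data frame
  medianIncomeData[i] <- y[1, 2]  # single value from row 1, column 2
}

Results <- data.frame(postalCodeData, medianIncomeData, stringsAsFactors = FALSE)
names(Results) <- c("FSA", "Median Income")
print(dim(Results))  # 3 2
```

Using full.names = TRUE in list.files() avoids the need for setwd(), and seq_along(files) avoids iterating over 1:0 when the directory is empty.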

Unexpected results when combining two vectors into a data frame
