Question

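I have a .csv file that I need to read into R. The first row contains the names (e.g. BFI1, BFI2, CAQ2), and the second row contains the questions, which I also want to access in R (e.g. "I enjoy going to parties"). Each row after the first two corresponds to one participant.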

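I would like to be able to access both the codes and the text in R (e.g. use grep to pull all the questions from one survey, and look up the item text when needed). I need the numeric responses to be read as numbers.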

BFI1, BFI2, CAQ1, CAQ2
Likes to read, Enjoys Parties, Is Nervous, Loves Books
3, 7, 1, 4
4, 5, 3, 3

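I want to read this in so that I can access either the names (row 1) or the text (perhaps as labels). I looked at the Hmisc package, but its label functions seem limited.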

Is there a way to read in this .csv file and have access to both values?

Answer 1

I'm not sure whether having the labels as a separate vector is acceptable, but here is one idea. Assume your file is named x.txt:

## set up an argument list for scan() - just to avoid repetition
scanArgs <- list(
    file = "x.txt", what = "", nlines = 1, sep = ",", strip.white = TRUE
)

## read the data with no header and add the first line as names
df <- setNames(
    read.table("x.txt", skip = 2, sep = ","), 
    do.call(scan, scanArgs)
)
#   BFI1 BFI2 CAQ1 CAQ2
# 1    3    7    1    4
# 2    4    5    3    3

## make the label vector
labels <- setNames(do.call(scan, c(scanArgs, skip = 1)), names(df))
#            BFI1             BFI2             CAQ1             CAQ2 
# "Likes to read" "Enjoys Parties"     "Is Nervous"    "Loves Books"

So the elements of labels correspond to the columns of df, and the columns are numeric.

Note that x.txt was created with:

txt <- 'BFI1, BFI2, CAQ1, CAQ2
Likes to read, Enjoys Parties, Is Nervous, Loves Books
3,7,1,4
4,5,3,3'
writeLines(txt, "x.txt")
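
To connect this back to the question's grep use case, here is a minimal sketch of how the df and labels objects from this answer might be queried together. The temporary file path and the name bfi_items are assumptions made for illustration; the reading code is the same approach as above, with quiet = TRUE added to scan() only to suppress its progress message.

```r
# Recreate the example file from this answer in a temporary location
path <- tempfile(fileext = ".txt")
writeLines(c(
  "BFI1, BFI2, CAQ1, CAQ2",
  "Likes to read, Enjoys Parties, Is Nervous, Loves Books",
  "3,7,1,4",
  "4,5,3,3"
), path)

# Same approach as above: names from line 1, labels from line 2, data from line 3 on
scanArgs <- list(file = path, what = "", nlines = 1, sep = ",",
                 strip.white = TRUE, quiet = TRUE)
df <- setNames(read.table(path, skip = 2, sep = ","), do.call(scan, scanArgs))
labels <- setNames(do.call(scan, c(scanArgs, skip = 1)), names(df))

# All items belonging to the BFI survey, selected by name prefix
bfi_items <- grep("^BFI", names(df), value = TRUE)
bfi_items            # the variable names
labels[bfi_items]    # the matching question text
df[bfi_items]        # the numeric responses for those items

# Search the question text itself, e.g. items that mention books
labels[grep("book", labels, ignore.case = TRUE)]
```

Because labels carries the same names as df, any selection made on the column names can be reused directly to look up the item text.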

Answer 2

You can use the nrows and skip arguments of read.csv:

nameFile <- "data.csv"

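# read the first two lines as plain character vectors
# (header = FALSE, otherwise line 1 is consumed as a header)
vectorNames <- unlist(read.csv(nameFile, nrows = 1, header = FALSE,
                               strip.white = TRUE, stringsAsFactors = FALSE),
                      use.names = FALSE)
vectorDescription <- unlist(read.csv(nameFile, nrows = 1, skip = 1, header = FALSE,
                                     strip.white = TRUE, stringsAsFactors = FALSE),
                            use.names = FALSE)

# read the data; header = FALSE so the first participant row is kept as data
dfIn <- read.csv(nameFile, skip = 2, header = FALSE)
names(dfIn) <- vectorNames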
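
One thing worth checking with this approach: read.csv defaults to header = TRUE, so whatever line it starts on is treated as column names rather than data. The sketch below, which writes the question's example to a hypothetical temporary file, shows the difference in row counts; the object names with_header and no_header are only for illustration.

```r
# Hypothetical temporary file with the same layout as in the question
path <- tempfile(fileext = ".csv")
writeLines(c(
  "BFI1, BFI2, CAQ1, CAQ2",
  "Likes to read, Enjoys Parties, Is Nervous, Loves Books",
  "3, 7, 1, 4",
  "4, 5, 3, 3"
), path)

# Default header = TRUE: the first participant row is consumed as column names
with_header <- read.csv(path, skip = 2)
nrow(with_header)   # one row fewer than the number of participants

# header = FALSE keeps every participant row as data
no_header <- read.csv(path, skip = 2, header = FALSE)
nrow(no_header)

# Names taken from line 1 as a plain character vector
varNames <- unlist(read.csv(path, nrows = 1, header = FALSE,
                            strip.white = TRUE, stringsAsFactors = FALSE),
                   use.names = FALSE)
names(no_header) <- varNames
str(no_header)
```

Comparing nrow() against the known number of participants is a quick way to confirm that no data row was silently used as a header.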

Answer 3

@Richard Scriven I used your code and followed it up with the Hmisc package:

library(Hmisc)
y <- data.frame(temp = rep(NA, nrow(df)))  # placeholder column, dropped below
for (i in seq_along(labels)) {
  x <- df[, i]
  label(x) <- labels[i]                    # attach the question text to the column
  y[names(df)[i]] <- x
}
y$temp <- NULL
y
#   BFI1 BFI2 CAQ1 CAQ2
# 1    3    7    1    4
# 2    4    5    3    3
label(y)
#            BFI1             BFI2             CAQ1             CAQ2 
# "Likes to read" "Enjoys Parties"     "Is Nervous"    "Loves Books"
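
If depending on Hmisc is not desirable, a similar effect can be sketched in base R by storing each label as an ordinary attribute on its column. This is only an illustration, not what Hmisc does internally: the df and labels objects are small stand-ins for the ones built in Answer 1, and get_labels is a hypothetical helper, not a function from any package.

```r
# Small stand-ins for the df and labels objects built in Answer 1
df <- data.frame(BFI1 = c(3, 4), BFI2 = c(7, 5), CAQ1 = c(1, 3), CAQ2 = c(4, 3))
labels <- c(BFI1 = "Likes to read", BFI2 = "Enjoys Parties",
            CAQ1 = "Is Nervous", CAQ2 = "Loves Books")

# Attach each label to its column as a plain "label" attribute
for (nm in names(df)) {
  attr(df[[nm]], "label") <- labels[[nm]]
}

# Hypothetical helper: collect the labels back out as a named character vector
get_labels <- function(d) {
  vapply(d, function(col) attr(col, "label"), character(1))
}

get_labels(df)                                  # labels for every column
get_labels(df[grep("^CAQ", names(df))])         # labels for one survey only
```

Selecting whole columns keeps the attribute, but some operations that rebuild a column (for example certain row subsets or type conversions) may drop it, so a separate labels vector can be the more robust choice.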

Answer 4

Building on Michelle Usuelli's answer and Rich Scriven's correction, you could write this function:

read_csv_with_labels <- function(fileName)
{
 library(Hmisc)

 # read the first two lines
 varNames <- read.csv(fileName, nrows = 1, stringsAsFactors = FALSE, header = FALSE, strip.white = TRUE)
 varLabels <- read.csv(fileName, nrows = 1, stringsAsFactors = FALSE, header = TRUE, strip.white = TRUE)

 # read the data (header = FALSE so the first participant row is kept as data)
 df <- read.csv(fileName, skip = 2, header = FALSE)

 # assign variable names and labels to the dataframe
 names(df) <- unlist(varNames, use.names = FALSE)
 label(df) <- varLabels 

 return(df)
}

I think this should be part of the basic functionality of read.csv and read_csv.

Reading a .csv file with names and labels into R
