Question

"Warning message: Unknown or uninitialised column: df" after converting several columns from "character" to "numeric"

I need to load a csv file (exported from Qualtrics) into R.

filename <- "/Users/Study1.csv"
library(readr)
df <- read_csv(filename)

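The first row contains the variable names, but the second and third rows hold blocks of text that are useless to R, so I need to delete those two rows. However, because of those strings, R has already read columns 18 through the last one as character, and I have to convert those columns to numeric by hand (which I need for further analysis).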

# The 2nd and 3rd rows of the csv file are useless (they are strings)
df <- df[3:nrow(df), ]
# cols 18 to the end are supposed to be numeric, but the 2nd and 3rd rows are string, so R thinks that these columns contain strings
df[ ,18:ncol(df)] <- lapply(df[ ,18:ncol(df)], as.numeric)

After running the code above, the following messages appear:

Warning message:
Unknown or uninitialised column: 'df'. 
Parsed with column specification:
cols(
  .default = col_character()
)
See spec(...) for full column specifications.
NAs introduced by coercion
NAs introduced by coercion

The NAs are fine, but the warning message is annoying. Is there a better way to convert my columns to numeric? Thanks, everyone!

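EDITED: Thank you all for the comments. I tried the approach of skipping the 2nd and 3rd rows, but something strange happened. Because some cells contain multiple lines separated by blank lines, R misreads the file. I have blurred the original text in the picture. This happens whether or not I tick "First Row as Names". Can you suggest a workaround? Thanks again, everyone.

Update 2018-05-30: I have solved this problem. See my answer below or visit How to import Qualtrics data (in csv format) into R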

Answer 1

You can specify the column types in readr::read_csv:

df <- readr::read_csv(file_name, col_types = "c")

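From ?readr::read_csv:

Alternatively, you can use a compact string representation where each character represents one column: c = character, i = integer, n = number, d = double, l = logical, D = date, T = date time, t = time, ? = guess, or _/- to skip the column.

Working example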

df <- readr::read_csv("  ,    ,      
                         ,    ,      
                      idx, key, value
                         ,    ,  
                        1, foo,   196
                        2, bar,   691",
                      skip = 2,
                      col_names = TRUE,
                      col_types = "ncd")

df <- dplyr::slice(df, 2:n())

df
# # A tibble: 2 x 3
#   idx   key value
# <dbl> <chr> <dbl>
# 1   1   foo   196
# 2   2   bar   691

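This assumes that the number of rows between the header and the data is always the same. If that can change, you will need a different strategy.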
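
If the number of junk rows is not fixed, one possible strategy is to select data rows by content rather than by position. The following is only a sketch under stated assumptions: the sample data is made up, and the rule "the first column of a real data row is a plain integer" is a hypothetical criterion that would need to be adapted to the actual export. It reads everything as character, filters the rows, and then lets readr::type_convert re-guess the column types.

```r
library(readr)

# Made-up sample: names in row 1, two descriptive rows, then data.
# I() marks the string as literal data rather than a file path.
csv_text <- I("idx,key,value\nIndex,Key label,Value label\n{id1},{id2},{id3}\n1,foo,196\n2,bar,691\n")

# Read every column as character so the junk rows cannot affect type guessing
raw <- read_csv(csv_text, col_types = cols(.default = col_character()))

# Assumed criterion: real data rows have a plain integer in the first column
is_data <- grepl("^[0-9]+$", raw$idx)

# Drop the junk rows, then re-guess the type of each character column
df <- type_convert(raw[is_data, ])
```

Because the conversion runs only after the junk rows are gone, no "NAs introduced by coercion" warning is expected for them.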

Answer 2

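Thank you all for your comments and suggestions. I followed @alistaire's advice about using skip.

As for the newlines inside Qualtrics cells, I found that when exporting the data you can click "More options" and then select "Remove line breaks".

Following the suggestion in Skip specific rows using read.csv in R, I used the following code to solve my problem.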

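# Read only the first row to get the variable names
headers <- read.csv(filename, header = FALSE, nrows = 1, as.is = TRUE)
# Skip the name row and the two descriptive rows, then read the data
df <- read.csv(filename, skip = 3, header = FALSE)
colnames(df) <- unlist(headers, use.names = FALSE)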
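
To see the effect end to end, here is a small self-contained sketch of the same two-pass read. The file contents and column names are invented for illustration, and it assumes no cell contains an embedded line break (skip counts physical lines, not records).

```r
# Write a small Qualtrics-like file: names in row 1, two descriptive rows, then data
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id,q1,q2",
  "Response ID,Question 1,Question 2",
  "meta1,meta2,meta3",
  "1,4,5",
  "2,3,1"
), tmp)

# First pass: only the name row; second pass: only the data rows
headers <- read.csv(tmp, header = FALSE, nrows = 1, as.is = TRUE)
df <- read.csv(tmp, skip = 3, header = FALSE)
colnames(df) <- unlist(headers, use.names = FALSE)

str(df)
```

Since the descriptive rows are never read together with the data, read.csv should infer numeric columns on its own, so no manual as.numeric conversion is needed.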
