Trouble reading a messy CSV file in R

Date: 2020-04-27 04:20:58

Tags: r data-cleaning

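I have been trying to read a CSV into R. The CSV is delimited in an odd way: all the values of a row sit in a single column, separated by commas (the original post showed this in a screenshot). The first row holds the column names, followed by the values. When I try read_csv("filename") and then run View(), the tibble shows nothing but a bunch of NA values. How should I handle this?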

Here is the data for reference:

, Calories, Fat (g), Carb. (g), Fiber (g), Protein (g)
Chonga Bagel,300,5,50,3,12
8-Grain Roll,380,6,70,7,10
Almond Croissant,410,22,45,3,10
Apple Fritter,460,23,56,2,7
Banana Nut Bread,420,22,52,2,6
Blueberry Muffin with Yogurt and Honey,380,16,53,1,6
Blueberry Scone,420,17,61,2,5
Butter Croissant,240,12,28,1,5
Butterfly Cookie,350,22,38,0,2
Cheese Danish,320,16,36,1,8
Chewy Chocolate Cookie,170,5,30,2,2
Chocolate Chip Cookie,310,15,42,2,4
Chocolate Chunk Muffin,440,21,60,2,7
Chocolate Croissant,330,18,38,1,6
Chocolate Hazelnut Croissant,390,22,43,2,7
Chocolate Marble Loaf Cake,490,24,64,2,6
Cinnamon Morning Bun,390,15,56,2,8
Cinnamon Raisin Bagel,270,1,58,3,9
Classic Coffee Cake,390,16,57,1,5
Cookie Butter Bar,360,23,36,0,2

1 Answer:

Answer 0 (score: 2)

Read the data with the following code:

df <- read.csv("starbucks-menu-nutrition-food.csv", skipNul = TRUE)

head(df, 2)

        ÿþ Calories Fat..g. Carb...g. Fiber..g. Protein..g.
1 Chonga Bagel      300       5        50         3          12
2 8-Grain Roll      380       6        70         7          10
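
The `ÿþ` in the first column header above looks like the bytes 0xFF 0xFE, which is the UTF-16 little-endian byte order mark (BOM) displayed as Latin-1. That would suggest the file is UTF-16 encoded, which would also explain why skipNul helps and why read_csv returned NA values. This is an inference from the output, not something the question confirms. Under that assumption, a sketch of an alternative is to declare the encoding with fileEncoding instead of skipping the NUL bytes. The temporary file below is a stand-in built only to make the example runnable.

```r
# Assumption: the real file is UTF-16LE with a byte order mark (BOM).
# Build a small stand-in file with that encoding so the sketch is runnable.
path <- tempfile(fileext = ".csv")
lines <- c(
  ", Calories, Fat (g), Carb. (g), Fiber (g), Protein (g)",
  "Chonga Bagel,300,5,50,3,12",
  "8-Grain Roll,380,6,70,7,10"
)
txt <- paste0(paste(lines, collapse = "\n"), "\n")
body <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), body), path)

# Inspect the first two bytes of the file: ff fe is the UTF-16LE BOM.
bom <- readBin(path, what = "raw", n = 2)
print(bom)

# Declare the encoding so R decodes the text instead of skipping NUL bytes.
df <- read.csv(path, fileEncoding = "UTF-16LE")
head(df, 2)
```

On the real file, running the readBin() check first shows whether the assumption holds. If the first two bytes are not ff fe, the skipNul approach above remains the safer choice.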

You could then rename the columns, for example

colnames(df) <- c("Food", "Calories", "Fat", "Carb", "Fiber", "Protein")

to make further processing of the data easier.
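
If typing the names by hand is undesirable, they can also be derived from the header. The following is a minimal sketch, assuming the data is read with check.names = FALSE so that the original header text is preserved. The regular expression and the name `Food` for the unnamed first column are illustrative choices, and the inline text is just the first rows from the question.

```r
# Inline copy of the first rows from the question, so the sketch is runnable.
csv_text <- ", Calories, Fat (g), Carb. (g), Fiber (g), Protein (g)
Chonga Bagel,300,5,50,3,12
8-Grain Roll,380,6,70,7,10"

# check.names = FALSE keeps headers such as "Fat (g)" instead of "Fat..g."
df <- read.csv(text = csv_text, check.names = FALSE)

# Drop a trailing unit such as " (g)" or ". (g)" from each header.
clean <- sub("\\.? *\\(g\\)$", "", names(df))

# The first header field is empty in the file, so name it explicitly.
clean[1] <- "Food"
names(df) <- clean

print(names(df))
```

Note that this drops the unit (grams) from the column names, just as the hand-written rename does, so it is worth recording the units elsewhere if they matter later.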