I have a data frame in which a single column looks like this:
df = read.table(file="sprint.m.df.txt", sep="\t", quote="", header=TRUE)
X.Rank...Time...Wind...Name...Country...Birthdate...City...Date.
1 1 9.58 0.9 "Usain Bolt" "JAM" "21.08.86" "Berlin" "16.08.2009"
2 2 9.63 1.5 "Usain Bolt" "JAM" "21.08.86" "London" "05.08.2012"
3 3 9.69 0 "Usain Bolt" "JAM" "21.08.86" "Beijing" "16.08.2008"
4 3 9.69 2 "Tyson Gay" "USA" "09.08.82" "Shanghai" "20.09.2009"
5 3 9.69 -0.1 "Yohan Blake" "JAM" "26.12.89" "Lausanne" "23.08.2012"
6 6 9.71 0.9 "Tyson Gay" "USA" "09.08.82" "Berlin" "16.08.2009"
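I have been trying string splitting and other approaches to break the column into several columns, but nothing has worked.
How can I split the data frame so that I end up with one like this?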
X.rank | Time | wind | name | country | birthdate| city | date
1 9.58 0.9 Usain Bolt JAM 21.08.86 Berlin 16.08.2009
Answer 0 (score: 2)
You can use the tibble package to build the data frame by hand with tribble():
library(tibble)
df <- tribble(
~X, ~RankTime, ~Wind, ~Name, ~Country, ~Birthdate, ~City, ~Date,
1, 9.58, 0.9, "Usain Bolt", "JAM", "21.08.86", "Berlin", "16.08.2009",
2, 9.63, 1.5, "Usain Bolt", "JAM", "21.08.86", "London", "05.08.2012",
3, 9.69, 0, "Usain Bolt", "JAM", "21.08.86", "Beijing", "16.08.2008",
3, 9.69, 2, "Tyson Gay", "USA", "09.08.82", "Shanghai", "20.09.2009",
3, 9.69, -0.1, "Yohan Blake", "JAM", "26.12.89", "Lausanne", "23.08.2012",
6, 9.71, 0.9, "Tyson Gay", "USA", "09.08.82", "Berlin", "16.08.2009")
df
# output
# A tibble: 6 x 8
X RankTime Wind Name Country Birthdate City Date
<dbl> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr>
1 1 9.58 0.9 Usain Bolt JAM 21.08.86 Berlin 16.08.2009
2 2 9.63 1.5 Usain Bolt JAM 21.08.86 London 05.08.2012
3 3 9.69 0 Usain Bolt JAM 21.08.86 Beijing 16.08.2008
4 3 9.69 2 Tyson Gay USA 09.08.82 Shanghai 20.09.2009
5 3 9.69 -0.1 Yohan Blake JAM 26.12.89 Lausanne 23.08.2012
6 6 9.71 0.9 Tyson Gay USA 09.08.82 Berlin 16.08.2009
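Answer 1 (score: 0)
The spaces inside the quotes make this column hard to split by hand, but the file itself is easy to read in. See my comment above and use read.table(file="sprint.m.df.txt", sep=" "), or, if you really must work with your existing df, try read_delim or scan.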
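# Join the rows into one newline-separated string so readr treats it as literal data, not as file paths
df8 <- readr::read_delim(paste(df[,1], collapse="\n"), delim=" ", col_names=FALSE)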
# OR
df8 <- data.frame(matrix(scan(text=df[,1], what=" "), ncol=8, byrow=TRUE))
colnames(df8) <- c("rank", "Time", "wind", "name", "country", "birthdate", "city", "date")
df8
rank Time wind name country birthdate city date
1 1 9.58 0.9 Usain Bolt JAM 21.08.86 Berlin 16.08.2009
2 2 9.63 1.5 Usain Bolt JAM 21.08.86 London 05.08.2012
3 3 9.69 0 Usain Bolt JAM 21.08.86 Beijing 16.08.2008
4 3 9.69 2 Tyson Gay USA 09.08.82 Shanghai 20.09.2009
5 3 9.69 -0.1 Yohan Blake JAM 26.12.89 Lausanne 23.08.2012
6 6 9.71 0.9 Tyson Gay USA 09.08.82 Berlin 16.08.2009
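For readers who want to try this without the original file, here is a minimal base R sketch of both routes: re-reading the file with a space separator, and parsing the already-loaded strings in memory. It assumes each row is stored as one string with double-quoted text fields, as in the question. The names `raw_rows`, `tmp`, `from_file`, and `from_text` are made up for illustration, only three of the rows are used, and the temporary file has no header line, unlike the real one.

```r
# Hypothetical stand-in for df[, 1]: one unparsed row per element,
# with literal double quotes around the text fields.
raw_rows <- c(
  '1 9.58 0.9 "Usain Bolt" "JAM" "21.08.86" "Berlin" "16.08.2009"',
  '2 9.63 1.5 "Usain Bolt" "JAM" "21.08.86" "London" "05.08.2012"',
  '3 9.69 -0.1 "Yohan Blake" "JAM" "26.12.89" "Lausanne" "23.08.2012"'
)
cols <- c("rank", "Time", "wind", "name", "country", "birthdate", "city", "date")

# Route 1: read from a file with a space separator.
# A temporary file stands in for sprint.m.df.txt here.
tmp <- tempfile(fileext = ".txt")
writeLines(raw_rows, tmp)
from_file <- read.table(tmp, sep = " ", col.names = cols, stringsAsFactors = FALSE)

# Route 2: parse the strings directly via the text argument.
# The default quote setting keeps "Usain Bolt" together as one field.
from_text <- read.table(text = raw_rows, col.names = cols, stringsAsFactors = FALSE)

# Numeric columns are converted automatically; the dotted dates stay character.
str(from_text)
```

Passing `text = as.character(df[, 1])` in place of `raw_rows` should apply the second route to the existing data frame. Unlike the scan() version, read.table() converts the numeric columns for you, so `Time` and `wind` do not need a separate conversion step.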