Converting data to long format

Time: 2018-05-09 07:40:10

Tags: r dplyr tidyr lubridate

I have a data frame (df) as shown:


head(df)
Year     X1     X2     X3     X4     X5     X6     X7     X8     X9....X36
1 1970     NA     NA     NA     NA     NA     NA     NA     NA     NA.....
2 1971 123.47 110.19 125.49 121.12 109.23  78.92 111.75  90.70  91.95.....
3 1972 142.20 131.95 173.17 222.52 220.85 175.16 180.09 165.64 164.35.....
4 1973 192.60 174.36 207.86 182.91 170.26 128.39 164.50 157.06 151.11.....
5 1974 214.89 200.21 221.03 188.61 175.43 137.63 156.84 142.45 155.58.....
6 1975 141.88 132.59 154.14 139.14 139.78  81.49 105.59 101.58 113.15.....

str(df)
'data.frame':   48 obs. of  37 variables:
 $ Year: num  1970 1971 1972 1973 1974 ...
 $ X1  : num  NA 123 142 193 215 ...
 $ X2  : num  NA 110 132 174 200 ...
 $ X3  : num  NA 125 173 208 221 ...
 $ X4  : num  NA 121 223 183 189 ...
 $ X5  : num  NA 109 221 170 175 ...
 $ X6  : num  NA 78.9 175.2 128.4 137.6 ...
 $ X7  : num  NA 112 180 164 157 ...
 $ X8  : num  NA 90.7 165.6 157.1 142.4 ...
 $ X9  : num  NA 92 164 151 156 ...
 $ X10 : num  NA 81.8 137 136.7 137.5 ...
 ..
 $ X36 : num  NA ..................

The data above are 10-day values (3 observations per month), giving 36 observations per year. The first 3 observations (X1, X2, X3) correspond to January, the next 3 (X4, X5, X6) correspond to February, and the remaining months follow the same pattern. So my question is how to transform these data so that they look like this:

Year Month Value
1971 Jan   123.47
1971 Jan   110.19
1971 Jan   125.49
1971 Feb   121.12
1971 Feb   109.23
1971 Feb   78.92
..................
1971 Dec   150
1972 Jan   180

My attempts so far have not worked.

Any help would be much appreciated.

2 answers:

Answer 0 (score: 1)

You're almost there.

xy <- data.frame(year = 1970:1974, matrix(runif(5*6), ncol = 6))

months <- c("Jan", "Feb")
colnames(xy)[-1] <- paste(rep(months, each = 3), rep(1:3, times = length(months)), sep = ".")

library(tidyr)
out <- gather(xy, key = "month", value = "value", -year)

out$month <- gsub("\\.\\d{1}$", "", out$month)

head(out)

  year month     value
1 1970   Jan 0.9749443
2 1971   Jan 0.3167903
3 1972   Jan 0.5024181
4 1973   Jan 0.5217141
5 1974   Jan 0.1422871
6 1970   Jan 0.2429328

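In this example, I created unique column names (Jan.1, Jan.2, Jan.3, Feb.1, ...) and then used gsub to strip the trailing ".1", ".2", ".3" identifier, leaving only the month. Once you fill in all twelve months in the months variable, you should be able to use this code on your data. It assumes that there are three replicates per month. Of course, this assumption can be relaxed.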
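For the full 36-column case, one alternative to renaming the columns is to derive the month from the column index. The sketch below is not part of the original answer. It assumes the columns are named X1 to X36 in calendar order with exactly three columns per month, uses a randomly generated stand-in for df, and swaps gather for tidyr's pivot_longer (available in tidyr 1.0.0 and later).

```r
library(tidyr)

# Hypothetical stand-in for the question's df: Year plus columns X1..X36.
set.seed(1)
df <- data.frame(Year = 1970:1972, matrix(runif(3 * 36), ncol = 36))

# Reshape to long format: one row per Year/column pair, ordered year by year.
long <- pivot_longer(df, cols = -Year, names_to = "col", values_to = "Value")

# Column index 1..36 -> month index 1..12 (assumes three columns per month).
idx <- as.integer(sub("^X", "", long$col))
long$Month <- month.abb[ceiling(idx / 3)]

# Keep only the columns asked for in the question.
long <- long[, c("Year", "Month", "Value")]
head(long)
```

If the number of observations per month were different, only the divisor in ceiling(idx / 3) would need to change.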

Answer 1 (score: 0)

Using the data from @Roman Lustrik:

xy = data.frame(year = 1970:1974, matrix(runif(5*6), ncol = 6))
df = as.data.frame(t(subset(xy, select = -c(year)))) #transposing and subsetting 
d1 = data.frame(Value = unlist(df, use.names = FALSE)) # adding one column below another 



cbind(year = rep(xy$year, each = 6), month = rep(c("Jan","Feb"),each = 3),Value = d1)

# req =  cbind(year = rep(1971:1975, each = 36), month = rep(month.abb,each = 3),)
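The commented-out line above hints at how the same base R approach might extend to all 36 columns. A minimal sketch of that idea follows. It is an assumption about what the answer intended, run on a randomly generated 36-column stand-in (not the asker's data) whose columns are assumed to be in calendar order, three per month.

```r
# Hypothetical stand-in: year plus 36 value columns (X1..X36).
set.seed(1)
xy <- data.frame(year = 1970:1972, matrix(runif(3 * 36), ncol = 36))

# Flatten row-wise: all 36 values of the first year, then the second, and so on.
vals <- as.vector(t(as.matrix(xy[, -1])))

req <- data.frame(
  year  = rep(xy$year, each = 36),
  # Each month label three times, repeated once per year.
  month = rep(rep(month.abb, each = 3), times = nrow(xy)),
  Value = vals
)
head(req)
```

Building the month vector at full length makes the alignment with the values explicit instead of relying on recycling.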