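How do I reshape wide continuous data into long categorical data?

Time: 2017-11-24 14:11:23

Tags: r dataframe reshape

My data is in the following wide format, with one row per SUBJECT_ID. Each row holds the total number of observations of variables X and Y, followed by various metadata columns such as SUBJECT_BIRTHYEAR and SUBJECT_HOMETOWN.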

variableX    variableY    SUBJECT_ID     SUBJECT_BIRTHYEAR     SUBJECT_HOMETOWN
2            1            A              1950                  Townsville
1            2            B              1951                  Villestown

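I would like to convert this into the following long format, in which every observation of variable X or Y for each SUBJECT_ID gets its own row: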

VARIABLE     SUBJECT_ID     SUBJECT_BIRTHYEAR     SUBJECT_HOMETOWN
X            A              1950                  Townsville
X            A              1950                  Townsville
Y            A              1950                  Townsville
X            B              1951                  Villestown
Y            B              1951                  Villestown
Y            B              1951                  Villestown

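Specifically, my question is how to turn a continuous variable that holds n observations into n rows of categorical data.

3 answers:

Answer 0 (score: 1)

Try the following:

Data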


Solution


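Answer 1 (score: 1)

The question asks to reverse a call to dcast() that reshaped the data from long to wide format using length() as the aggregation function.

This can be done by calling melt() plus a few additional transformations: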

library(data.table)
# reshape wide back to long format
long <- melt(setDT(wide), measure.vars = c("variableX", "variableY"))[
  # undo munging of variable names
  , variable := stringr::str_replace(variable, "^variable", "")][]
# undo effect of aggregation by length()
result <- long[long[, rep(.I, value)]][
  # beautify result
  order(SUBJECT_ID), !"value"]
result
   SUBJECT_ID SUBJECT_BIRTHYEAR SUBJECT_HOMETOWN variable
1:          A              1950       Townsville        X
2:          A              1950       Townsville        X
3:          A              1950       Townsville        Y
4:          B              1951       Villestown        X
5:          B              1951       Villestown        Y
6:          B              1951       Villestown        Y

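.I is a special symbol that holds the row positions, i.e., the row indices. rep(.I, value) therefore repeats each row index as many times as the count stored in value, and subsetting long with these indices yields one row per observation.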
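To see the row-repetition step in isolation, here is a minimal sketch on a toy table. The table `toy` and its columns `id` and `n` are made up for illustration and are not part of the question's data.

library(data.table)

# Hypothetical toy data: `n` says how many copies of each row are wanted
toy <- data.table(id = c("a", "b", "c"), n = c(2L, 0L, 3L))

# .I is the vector of row positions (1, 2, 3); rep() repeats each
# position n times, so rows with n = 0 are dropped entirely
idx <- toy[, rep(.I, n)]
idx
# [1] 1 1 3 3 3

# Subsetting by the repeated positions expands the table
expanded <- toy[idx]
expanded$id
# [1] "a" "a" "c" "c" "c"

Note that a count of 0 removes the row, which is the behaviour one would expect when a subject has no observations of a variable.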

To show that this is indeed the inverse operation, result can be reshaped once more to reproduce wide:

dcast(result, ... ~ paste0("variable", variable), length, value.var = "variable")
   SUBJECT_ID SUBJECT_BIRTHYEAR SUBJECT_HOMETOWN variableX variableY
1:          A              1950       Townsville         2         1
2:          B              1951       Villestown         1         2

Data

library(data.table)
wide <- fread("variableX    variableY    SUBJECT_ID     SUBJECT_BIRTHYEAR     SUBJECT_HOMETOWN
2            1            A              1950                  Townsville
1            2            B              1951                  Villestown")

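Answer 2 (score: 0)

Here is an option using base R:

# t(df1[1:2]) lists the counts subject by subject (X, Y, X, Y, ...);
# row(t(df1[1:2])) gives the matching variable index for each count,
# which also works when the number of subjects differs from 2
res <- cbind(VARIABLE = rep(substr(names(df1)[1:2], 9, 9)[row(t(df1[1:2]))], t(df1[1:2])), 
        # repeat each subject's metadata once per observation
        df1[rep(seq_len(nrow(df1)), rowSums(df1[1:2])), -(1:2)])
row.names(res) <- NULL
res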
#   VARIABLE SUBJECT_ID SUBJECT_BIRTHYEAR SUBJECT_HOMETOWN
#1        X          A              1950       Townsville
#2        X          A              1950       Townsville
#3        Y          A              1950       Townsville
#4        X          B              1951       Villestown
#5        Y          B              1951       Villestown
#6        Y          B              1951       Villestown
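
The same base R idea can be unrolled into named steps, which may be easier to follow. This is only a sketch: `df1` is rebuilt here from the table in the question, and the helper names `counts`, `times`, `labels` and `subject_row` are chosen for illustration.

# Rebuild the example data from the question (assumed column types)
df1 <- data.frame(
  variableX = c(2, 1),
  variableY = c(1, 2),
  SUBJECT_ID = c("A", "B"),
  SUBJECT_BIRTHYEAR = c(1950, 1951),
  SUBJECT_HOMETOWN = c("Townsville", "Villestown"),
  stringsAsFactors = FALSE
)

# Matrix of observation counts, one row per subject
counts <- as.matrix(df1[c("variableX", "variableY")])

# Counts read subject by subject: t() turns R's column-major order
# into row-major order of `counts`
times <- as.vector(t(counts))

# Variable label and subject row belonging to each entry of `times`
labels <- rep(sub("^variable", "", colnames(counts)), times = nrow(counts))
subject_row <- rep(seq_len(nrow(counts)), each = ncol(counts))

# Repeat every label and every metadata row once per observation
res <- data.frame(
  VARIABLE = rep(labels, times),
  df1[rep(subject_row, times), setdiff(names(df1), colnames(counts))],
  stringsAsFactors = FALSE
)
rownames(res) <- NULL
res

Selecting the count columns by name rather than by position keeps the sketch independent of the column order in df1.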