
Question

I am trying to build a new dataset that takes two columns and creates a new table based on a calculation on a third column.


I need to create a new table from it:

Cust    T       S1      S2      S3      S4
1009    150     1007    1006    1001    1000
1010    50      1007    1006    1001    1000
1011    50      1007    1006    1001    1000
1013    10000   1007    1006    1001    1000
1931    60      1008    1007    1006    1005
1141    1000    1014    1013    1007    1006

I can't seem to figure this out, and I'm not even sure it is possible.

Answer 1

Here is one approach that combines several functions, including reshape2::melt and dplyr::select. It may not be the best way, but it should do what you want:

tidyr::spread

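You will need backticks, because R does not handle numeric column names well.

Let me know if I misunderstood something or if anything doesn't make sense.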
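
The code for this answer did not survive on this page, so the following is only a minimal sketch of how reshape2::melt, dplyr::select and tidyr::spread might be combined. It is not the answerer's original code. The multipliers in `rates` (0.1 for S1 and S2, 0.05 for S3, 0.025 for S4) are an assumption taken from the other answers, and the names `rates`, `long`, `wide`, `pos`, `code` and `amount` are made up for illustration.

```r
library(reshape2)
library(dplyr)
library(tidyr)

df <- read.table(text = "
Cust    T       S1      S2      S3      S4
1009    150     1007    1006    1001    1000
1010    50      1007    1006    1001    1000
1011    50      1007    1006    1001    1000
1013    10000   1007    1006    1001    1000
1931    60      1008    1007    1006    1005
1141    1000    1014    1013    1007    1006
", header = TRUE)

# Assumed multiplier for each S column (taken from the other answers)
rates <- c(S1 = 0.1, S2 = 0.1, S3 = 0.05, S4 = 0.025)

# Long format: one row per (Cust, S column) pair
long <- melt(df, id.vars = c("Cust", "T"),
             variable.name = "pos", value.name = "code")

# Multiply T by the rate that belongs to the S column the code came from
long$amount <- long$T * unname(rates[as.character(long$pos)])

# Wide format: one column per distinct code, NA where a customer lacks it
wide <- long %>%
  select(Cust, code, amount) %>%
  spread(code, amount)
```

Since the new column names are numbers, they have to be wrapped in backticks when accessed by name, for example wide$`1007`.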

Answer 2

library(dplyr)
library(tidyr)
library(data.table)
df %>% gather(key=k,value = val, -c('Cust','T')) %>%
       mutate(val_upd=ifelse(k=='S1'|k=='S2','T*.1',ifelse(k=='S3','T*.05','T*.025'))) %>% 
       #Change 'T*.1' to T*.1 to get the actual value
       select(-T,-k) %>% dcast(Cust~val,value.var='val_upd')


  Cust   1000  1001   1005   1006  1007 1008 1013 1014
1 1009 T*.025 T*.05   <NA>   T*.1  T*.1 <NA> <NA> <NA>
2 1010 T*.025 T*.05   <NA>   T*.1  T*.1 <NA> <NA> <NA>
3 1011 T*.025 T*.05   <NA>   T*.1  T*.1 <NA> <NA> <NA>
4 1013 T*.025 T*.05   <NA>   T*.1  T*.1 <NA> <NA> <NA>
5 1141   <NA>  <NA>   <NA> T*.025 T*.05 <NA> T*.1 T*.1
6 1931   <NA>  <NA> T*.025  T*.05  T*.1 T*.1 <NA> <NA>

Data

df <- read.table(text = "
Cust    T       S1      S2      S3      S4
1009    150     1007    1006    1001    1000
1010    50      1007    1006    1001    1000
1011    50      1007    1006    1001    1000
1013    10000   1007    1006    1001    1000
1931    60      1008    1007    1006    1005
1141    1000    1014    1013    1007    1006
", header=TRUE)

Answer 3

A tidyverse solution:

library(tidyverse)
df %>% gather(key, value, -c(Cust, T)) %>%  # long format; keep Cust and T as id columns
       select(-key) %>%
       spread(value, T) %>%
       map2_dfc(c(1, .025, .05, rep(.1, 6)), ~ .x * .y)

#    Cust `1000` `1001` `1005` `1006` `1007` `1008` `1013` `1014`
#   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
# 1  1009   3.75    7.5     NA     15     15     NA     NA     NA
# 2  1010   1.25    2.5     NA      5      5     NA     NA     NA
# 3  1011   1.25    2.5     NA      5      5     NA     NA     NA
# 4  1013 250     500       NA   1000   1000     NA     NA     NA
# 5  1141  NA      NA       NA    100    100     NA    100    100
# 6  1931  NA      NA        6      6      6      6     NA     NA
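
Note that the code above multiplies by output column, so for customers 1141 and 1931 the numbers differ from the rule in Answer 2, where the multiplier depends on which of S1 to S4 the code came from. In case the position-based rule is the intended one, here is a hedged sketch using pivot_longer() and pivot_wider(), which supersede gather() and spread() in tidyr 1.0 and later (names_sort needs tidyr 1.1.0 or later). The `rates` vector and the names `pos`, `code`, `amount` and `result` are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

df <- read.table(text = "
Cust    T       S1      S2      S3      S4
1009    150     1007    1006    1001    1000
1010    50      1007    1006    1001    1000
1011    50      1007    1006    1001    1000
1013    10000   1007    1006    1001    1000
1931    60      1008    1007    1006    1005
1141    1000    1014    1013    1007    1006
", header = TRUE)

# Assumed multiplier per source column, following the rule in Answer 2
rates <- c(S1 = 0.1, S2 = 0.1, S3 = 0.05, S4 = 0.025)

result <- df %>%
  # One row per (Cust, S column); pos holds "S1".."S4", code holds the value
  pivot_longer(S1:S4, names_to = "pos", values_to = "code") %>%
  # Look up the rate by position and apply it to T
  mutate(amount = T * unname(rates[pos])) %>%
  select(Cust, code, amount) %>%
  # One column per distinct code, in sorted order
  pivot_wider(names_from = code, values_from = amount, names_sort = TRUE)
```

With this rule, customer 1141 (T = 1000) gets 25 in column 1006 and 50 in column 1007, and customer 1931 (T = 60) gets 1.5 in column 1005.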

How to create a new table based on X and Y of a data frame

3 Answers:
