Say I have a data frame df:
df <- data.frame(cell = c("c1", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"),
                 layer = c("L1", "L2", "L1", "L2", "L3", "L3", "L4", "L4", "L3"))
> df
  cell layer
1   c1    L1
2   c1    L2
3   c2    L1
4   c3    L2
5   c4    L3
6   c5    L3
7   c6    L4
8   c7    L4
9   c8    L3
How would I create a frequency matrix like this one:
> table(df$cell, df$layer)
     L1 L2 L3 L4
  c1  1  1  0  0
  c2  1  0  0  0
  c3  0  1  0  0
  c4  0  0  1  0
  c5  0  0  1  0
  c6  0  0  0  1
  c7  0  0  0  1
  c8  0  0  1  0
but in a for loop? I tried something like:
> for(layer in unique(df$layer)){
+   df[paste(layer)] <- ifelse(df$layer == layer, 1, 0)
+ }
> df
  cell layer L1 L2 L3 L4
1   c1    L1  1  0  0  0
2   c1    L2  0  1  0  0
3   c2    L1  1  0  0  0
4   c3    L2  0  1  0  0
5   c4    L3  0  0  1  0
6   c5    L3  0  0  1  0
7   c6    L4  0  0  0  1
8   c7    L4  0  0  0  1
9   c8    L3  0  0  1  0
but that one-hot encodes each row and adds the new columns back to the original data frame.
I have been looking at the source code of base:::table, but I could not pick out the parts I am interested in. Is there a way to "push" values into an empty matrix? Something like:
newMat <- Matrix(0, nrow = length(unique(df$cell)), ncol = length(unique(df$layer)))
for (i in 1:length(unique(df$cell))) {
  for (j in 1:length(unique(df$layer))) {
    newMat[i,j] <- ....
  }
}
I just don't know how to complete it. Thanks! The expected output is the frequency matrix shown above, in matrix form.
Answer 0 (score: 0)
Here is one way to do it with a loop:
df <- data.frame(cell = c("c1", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"),
                 layer = c("L1", "L2", "L1", "L2", "L3", "L3", "L4", "L4", "L3"))
cells = unique(df$cell)
layers = unique(df$layer)
res = matrix(0L,
             nrow = length(cells),
             ncol = length(layers),
             dimnames = list(cells, layers))
for (i in seq_along(cells)) {
  cols = match(df[df$cell == cells[i], 'layer'], layers)
  res[i, cols] = 1L
}
res
## L1 L2 L3 L4
## c1 1 1 0 0
## c2 1 0 0 0
## c3 0 1 0 0
## c4 0 0 1 0
## c5 0 0 1 0
## c6 0 0 0 1
## c7 0 0 0 1
## c8 0 0 1 0
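There are two main points. First, since we need the unique cells and layers several times, assigning them to variables is more performant than calling unique more than once. Second, the match(df[df$cell == cells[i], ...]) call determines which layers are present for that cell, so that we can assign 1 to them.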
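For comparison, the double-loop skeleton from the question could be completed along the following lines. This is only a sketch: the name counts is introduced here for illustration, and a plain base matrix() is assumed instead of the Matrix() call in the question. Unlike the loop above, it stores the number of matching rows rather than a 0/1 flag, which only differs when a cell/layer pair occurs more than once.

```r
df <- data.frame(cell  = c("c1", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"),
                 layer = c("L1", "L2", "L1", "L2", "L3", "L3", "L4", "L4", "L3"))

cells  <- unique(df$cell)
layers <- unique(df$layer)

# Start from an all-zero integer matrix with one row per cell and one column per layer.
# "counts" is a hypothetical name used only for this sketch.
counts <- matrix(0L,
                 nrow = length(cells),
                 ncol = length(layers),
                 dimnames = list(cells, layers))

for (i in seq_along(cells)) {
  for (j in seq_along(layers)) {
    # Number of rows of df that have this particular cell/layer combination
    counts[i, j] <- sum(df$cell == cells[i] & df$layer == layers[j])
  }
}

counts
```

Each entry is computed by scanning the whole data frame, so this does work proportional to the number of cells times layers times rows. That is fine for small inputs but slow for large ones.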
Note that I would not actually do it this way at all. I would recommend using table() or some kind of reshaping mechanism instead.
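As a minimal sketch of the table() route, assuming a plain integer matrix is wanted rather than a table object, the result can be converted as shown here. The names tab, mat and present are introduced only for this illustration.

```r
df <- data.frame(cell  = c("c1", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"),
                 layer = c("L1", "L2", "L1", "L2", "L3", "L3", "L4", "L4", "L3"))

# Cross-tabulate cell against layer; the result has class "table"
tab <- table(df$cell, df$layer)

# Rebuild it as an ordinary integer matrix, keeping the row and column names
mat <- matrix(as.integer(tab), nrow = nrow(tab), dimnames = dimnames(tab))

# If only presence/absence matters, collapse counts above 1 down to 1
present <- (mat > 0L) * 1L

mat
```

Note that table() orders rows and columns by sorted factor level, whereas the loop version keeps the order of first appearance from unique(). For this data the two orders coincide.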