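Using the commands below, I built a sample data frame from named series, and then created a second frame containing every possible pair of column names.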
dataset <- data.frame(randwalk(10), randwalk(10), randwalk(10), randwalk(10), randwalk(10))
colnames(dataset) <- c("one", "two", "three", "four", "five")
datasetpairs <- data.frame(t(combn(colnames(dataset), 2)))
colnames(datasetpairs) <- c("numerator", "denominator")
They look like this:
head(dataset)
        one       two    three     four     five
1 1.0000000 1.0000000 1.000000 1.000000 1.000000
2 1.0055678 0.9866026 1.004089 1.007859 1.004886
3 1.0137884 0.9794308 1.013057 1.011453 1.003129
4 1.0043928 0.9838919 1.026479 1.025951 1.005845
5 0.9942291 0.9839125 1.026769 1.030824 1.007177
6 0.9993814 0.9618307 1.035784 1.037156 1.026317
head(datasetpairs)
  numerator denominator
1       one         two
2       one       three
3       one        four
4       one        five
5       two       three
6       two        four
What I want to do is add a few columns to "datasetpairs" that store the mean, maximum, and minimum of the ratio for each column pair. I can get a single number by feeding in the values from one row at a time, so I could write a for loop, but I tried to do it in a vectorized style instead and that gave me an error.
Also, what I really want is to compute the ratio of the two columns only once and keep a few summary values from the analysis without storing the ratio itself, because my real dataset is too large to precompute every possible ratio combination. What is an elegant way to do this without resorting to a loop? Thanks to anyone who can help!
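Answer 0: (score: 1)
Here is a solution that uses data.table (because it performs many grouped operations quickly) together with a custom function that does the analysis. This keeps your code readable, and each ratio is computed only once before moving on.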
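library(data.table)

# Create reproducible example data: 10 rows, 5 columns of uniform random numbers
set.seed(123)
dataset <- data.frame(matrix(runif(50), ncol = 5))
colnames(dataset) <- c("one", "two", "three", "four", "five")

# Custom function: compute the ratio of two vectors once and return its summary
process_data <- function(v1, v2) {
  ratio <- v1 / v2
  list(mean = mean(ratio), min = min(ratio), max = max(ratio))
}

# One row per unordered pair of column names
datasetpairs <- data.table(t(combn(colnames(dataset), 2)))
setnames(datasetpairs, c("numerator", "denominator"))

# Run the analysis: each group is a single pair, so numerator and denominator
# are length-one strings that select the matching columns of dataset
datasetpairs[, process_data(dataset[[numerator]], dataset[[denominator]]),
             by = .(numerator, denominator)]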
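
If data.table is not available, the same idea can be sketched in base R. This is only a minimal sketch under stated assumptions, not the answer's method: the names summarise_pair, pairs, and pair_summary are hypothetical and introduced here for illustration. Note that apply still iterates over the pairs internally, but no explicit for loop is written, and each ratio is discarded as soon as it has been summarised.

```r
# Same example data as in the answer above
set.seed(123)
dataset <- data.frame(matrix(runif(50), ncol = 5))
colnames(dataset) <- c("one", "two", "three", "four", "five")

# Hypothetical helper: summarise the ratio of one pair of columns.
# `pair` is a length-two character vector: c(numerator, denominator).
summarise_pair <- function(pair, data) {
  ratio <- data[[pair[1]]] / data[[pair[2]]]  # computed once, never stored
  c(mean = mean(ratio), min = min(ratio), max = max(ratio))
}

# combn() returns a 2 x 10 character matrix, one column per pair
pairs <- combn(colnames(dataset), 2)

# apply() over columns gives a 3 x 10 matrix; transpose to one row per pair
stats <- t(apply(pairs, 2, summarise_pair, data = dataset))

pair_summary <- data.frame(
  numerator   = pairs[1, ],
  denominator = pairs[2, ],
  stats
)
print(pair_summary)
```

Only three numbers per pair are kept, so memory use grows with the number of pairs rather than with the number of pairs times the number of rows.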