Is there a better way to do this in R with dplyr, without having to type a new formula for each variable?
code dagala_price_1 dagala_price_2 dagala_price_3 dagala_price_4 dagala_price_5 dagala_unit_nb_1 dagala_unit_nb_2 dagala_unit_nb_3 dagala_unit_nb_4 dagala_unit_nb_5
MI-NAL-KA 50 15000 NA NA NA 100 1 NA NA NA
M-KK-KZ 10000 20000 NA NA NA 20 2 NA NA NA
M-KK-NK 10000 NA NA NA NA 5 NA NA NA NA
MI-NA-BA 12000 15000 NA NA NA 2 1 NA NA NA
MI-BD-BT 12000 15000 NA NA NA 3 1 NA NA NA
MI-MI-ND 12000 80000 NA NA NA 8 1 NA NA NA
MI-NAL-LT 13000 15000 NA 18000 NA 1 3 NA 1 NA
M-BY-BGY 13000 15000 NA NA NA 4 1 NA NA NA
MI-NA-NY 13000 NA NA NA NA 2 NA NA NA NA
MI-KAN-BL 18000 35000 15000 NA NA 1 1 6 NA NA
MI-KIGO-KR 20000 15000 15000 NA NA 10 8 4 NA NA
MI-KAN-KY 20000 16000 NA NA NA 2 6 NA NA NA
MI-NAL-BB 20000 35000 250000 NA NA 1 1 1 NA NA
MI-KAM-AL 30000 14000 13000 NA NA 1 10 2 NA NA
df <- df %>% mutate(
  dagala_total_1 = dagala_price_1 * dagala_unit_nb_1,
  dagala_total_2 = dagala_price_2 * dagala_unit_nb_2,
  dagala_total_3 = dagala_price_3 * dagala_unit_nb_3,
  dagala_total_total = dagala_total_1 + dagala_total_2 + dagala_total_3)
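Answer 0 (score: 1)
Given your data, you could arrange it in long format ("tidy" in tidyverse terms), which will make your code simpler.
I assume you have 5 groups (1 to 5) of dagala units and prices, so I added a new grouping variable to the data.frame to make it tidy, that is, to put it in "long" format.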
library(tidyr)
library(dplyr)
library(data.table)
df <- data.table::fread(
"code dagala_price_1 dagala_price_2 dagala_price_3 dagala_price_4 dagala_price_5 dagala_unit_nb_1 dagala_unit_nb_2 dagala_unit_nb_3 dagala_unit_nb_4 dagala_unit_nb_5
MI-NAL-KA 50 15000 NA NA NA 100 1 NA NA NA
M-KK-KZ 10000 20000 NA NA NA 20 2 NA NA NA
M-KK-NK 10000 NA NA NA NA 5 NA NA NA NA
MI-NA-BA 12000 15000 NA NA NA 2 1 NA NA NA
MI-BD-BT 12000 15000 NA NA NA 3 1 NA NA NA
MI-MI-ND 12000 80000 NA NA NA 8 1 NA NA NA
MI-NAL-LT 13000 15000 NA 18000 NA 1 3 NA 1 NA
M-BY-BGY 13000 15000 NA NA NA 4 1 NA NA NA
MI-NA-NY 13000 NA NA NA NA 2 NA NA NA NA
MI-KAN-BL 18000 35000 15000 NA NA 1 1 6 NA NA
MI-KIGO-KR 20000 15000 15000 NA NA 10 8 4 NA NA
MI-KAN-KY 20000 16000 NA NA NA 2 6 NA NA NA
MI-NAL-BB 20000 35000 250000 NA NA 1 1 1 NA NA
MI-KAM-AL 30000 14000 13000 NA NA 1 10 2 NA NA"
)
df.price <- df %>%
  select(code, matches("price_")) %>%
  # gather price by group
  gather(key = groups, value = dagala_price, matches("price_")) %>%
  # extract last number as group
  mutate(groups = gsub(".*(\\d)$", "\\1", groups))
df.unit <- df %>%
  select(code, matches("unit_nb")) %>%
  # gather units by group
  gather(key = groups, value = dagala_unit, matches("unit_")) %>%
  # extract last number as group
  mutate(groups = gsub(".*(\\d)$", "\\1", groups))
df.tidy <- left_join(df.price,df.unit)
#> Joining, by = c("code", "groups")
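As a side note, with tidyr 1.0.0 or later the two gather() calls and the join can probably be collapsed into a single pivot_longer() call. This is only a sketch on a two-row subset of the data above, not what was run for this answer. The name df.long and the regular expression are my own choices, and the unit column keeps its original name dagala_unit_nb rather than dagala_unit.

```r
library(tidyr)

# Two rows and two groups copied from the question's data, to keep the sketch small
df <- data.frame(
  code = c("MI-NAL-KA", "M-KK-NK"),
  dagala_price_1 = c(50, 10000),
  dagala_price_2 = c(15000, NA),
  dagala_unit_nb_1 = c(100, 5),
  dagala_unit_nb_2 = c(1, NA),
  stringsAsFactors = FALSE
)

# ".value" keeps the first capture group (dagala_price / dagala_unit_nb) as
# column names; the trailing digit becomes the groups column
df.long <- pivot_longer(
  df,
  cols = -code,
  names_to = c(".value", "groups"),
  names_pattern = "(.*)_(\\d)$"
)
```

This gives one row per code and group, with the price and the unit count side by side, which is the same shape as the joined result.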
df.tidy is now in "long" (tidy) form, which is easier to manipulate with tidyverse syntax:
# Tidy data.frame
df.tidy
# A tibble: 70 x 4
code groups dagala_price dagala_unit
<chr> <chr> <int> <int>
1 MI-NAL-KA 1 50 100
2 M-KK-KZ 1 10000 20
3 M-KK-NK 1 10000 5
4 MI-NA-BA 1 12000 2
5 MI-BD-BT 1 12000 3
6 MI-MI-ND 1 12000 8
7 MI-NAL-LT 1 13000 1
8 M-BY-BGY 1 13000 4
9 MI-NA-NY 1 13000 2
10 MI-KAN-BL 1 18000 1
# ... with 60 more rows
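From the long format, the totals asked about in the question can be computed without writing one formula per group. The following is a minimal sketch on a few rows in the same layout as df.tidy. The names dagala_total and df.total are hypothetical, and na.rm = TRUE is an assumption on my part: it treats missing groups as contributing nothing, whereas the sum in the question would return NA as soon as one group is missing.

```r
library(dplyr)

# A few rows in the same long layout as df.tidy (values copied from the data above)
df.tidy <- data.frame(
  code = c("MI-NAL-KA", "MI-NAL-KA", "M-KK-NK", "M-KK-NK"),
  groups = c("1", "2", "1", "2"),
  dagala_price = c(50, 15000, 10000, NA),
  dagala_unit = c(100, 1, 5, NA),
  stringsAsFactors = FALSE
)

df.total <- df.tidy %>%
  # one multiplication covers every group
  mutate(dagala_total = dagala_price * dagala_unit) %>%
  group_by(code) %>%
  # assumption: ignore groups with missing price or unit count
  summarise(dagala_total_total = sum(dagala_total, na.rm = TRUE))
```

On these rows, MI-NAL-KA gets 50 * 100 + 15000 * 1 = 20000 and M-KK-NK gets 10000 * 5 = 50000. Drop na.rm = TRUE if a missing group should make the whole total NA.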
Created on 2018-07-28 by the reprex package (v0.2.0).