This question came up because I wanted to write a function for my own convenience:
as.numeric_psql <- function(x) {
  return(as.numeric(as.integer(x)))
}
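to convert boolean values in a remote postgres table to numeric. The intermediate conversion to integer is needed because:
There is no direct cast defined between numeric and boolean. You can use integer as an intermediate value. (https://stackoverflow.com/a/19290671/2109289)
Of course, this function works as expected locally: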
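The two-step cast described in the quote can be inspected without a live database. The following is only a minimal sketch, assuming the dbplyr package is installed; `simulate_postgres()` stands in for a real connection and `flag` is a hypothetical column name. The exact SQL text may differ between dbplyr versions.

```r
library(dbplyr)

# Simulated PostgreSQL backend: no real connection is opened.
con <- simulate_postgres()

# `flag` is a hypothetical column name used only for illustration.
# Both as.integer() and as.numeric() have built-in SQL translations,
# so the nested call becomes nested casts.
cast_sql <- translate_sql(as.numeric(as.integer(flag)), con = con)
print(cast_sql)
```

Because both functions are known to the translator, the nested call is rendered as SQL casts instead of being sent to the database by name.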
copy_to(con_psql, cars, 'tmp_cars')
tmp_cars_sdf <-
  tbl(con_psql, 'tmp_cars')
tmp_cars_sdf %>%
  mutate(low_dist = dist < 5) %>%
  mutate(low_dist = as.numeric(as.integer(low_dist)))
# # Source: lazy query [?? x 3]
# # Database: postgres 9.5.3
#   speed  dist low_dist
#   <dbl> <dbl>    <dbl>
# 1     4     2        1
# 2     4    10        0
# 3     7     4        1
# 4     7    22        0
# 5     8    16        0
cars %>%
  mutate(low_dist = dist < 5) %>%
  mutate(low_dist = as.numeric_psql(low_dist)) %>%
  head(5)
#   speed dist low_dist
# 1     4    2        1
# 2     4   10        0
# 3     7    4        1
# 4     7   22        0
# 5     8   16        0
However, it does not work on the remote data frame: as.numeric_psql is not in the list of SQL translations, so it is passed verbatim to the query:
> tmp_cars_sdf %>%
+ mutate(low_dist = dist < 5) %>%
+ mutate(low_dist = as.numeric_psql(low_dist))
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: syntax error at or near "as"
LINE 1: SELECT "speed", "dist", as.numeric_psql("low_dist") AS "low_...
^
)
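The verbatim pass-through behind this error can be reproduced without a database. This is a rough sketch, assuming dbplyr is installed; `simulate_postgres()` replaces the real connection, and the exact quoting in the output may vary by version.

```r
library(dbplyr)

# Simulated PostgreSQL backend: no real connection is opened.
con <- simulate_postgres()

# as.numeric_psql has no SQL translation, so the translator does not look
# inside its R definition; the call is emitted under its own name.
passthrough_sql <- translate_sql(as.numeric_psql(low_dist), con = con)
print(passthrough_sql)
```

The translator works on the unevaluated expression, which is why the body of the R function never reaches the generated SQL.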
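My question is whether there is a simple way (i.e. without defining a custom SQL translation) to make dplyr understand that the function as.numeric_psql is composed of functions that already have SQL translations, and to use those translations instead.
Answer 0 (score: 1)
One way to avoid the error is to write the function so that it operates on the data frame, rather than being called inside mutate. For example: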
copy_to(con_psql, cars, 'tmp_cars')
tmp_cars_sdf <- tbl(con_psql, 'tmp_cars')
# Takes the table and a column, so the casts are written directly inside mutate()
as.numeric_psql <- function(data, x) {
  return(data %>% mutate({{x}} := as.numeric(as.integer({{x}}))))
}
tmp_cars_sdf %>%
  mutate(low_dist = dist < 5) %>%
  as.numeric_psql(low_dist)
#> # Source: lazy query [?? x 3]
#> # Database: sqlite 3.30.1 [:memory:]
#>    speed  dist low_dist
#>    <dbl> <dbl>    <dbl>
#>  1     4     2        1
#>  2     4    10        0
#>  3     7     4        1
#>  4     7    22        0
#>  5     8    16        0
#>  6     9    10        0
#>  7    10    18        0
#>  8    10    26        0
#>  9    10    34        0
#> 10    11    17        0
#> # … with more rows
Note that in your example, the database version of low_dist is already encoded as an integer when it is created, rather than as a logical as in a standard R data frame.
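The data-frame-level wrapper from this answer can also be checked without a live connection. The sketch below assumes dplyr and dbplyr are installed; `cars_lazy` is a hypothetical stand-in for the remote table, built with `lazy_frame()` on a simulated PostgreSQL backend, and the rendered SQL may differ between versions.

```r
library(dplyr)
library(dbplyr)

# Wrapper from the answer: it receives the table, so as.numeric() and
# as.integer() appear literally inside mutate() and can be translated.
as.numeric_psql <- function(data, x) {
  data %>% mutate({{ x }} := as.numeric(as.integer({{ x }})))
}

# Hypothetical stand-in for the remote table; no query is ever executed.
cars_lazy <- lazy_frame(speed = 4, dist = 2, con = simulate_postgres())

query <- cars_lazy %>%
  mutate(low_dist = dist < 5) %>%
  as.numeric_psql(low_dist) %>%
  sql_render()

print(query)
```

The generated query should contain casts rather than a call to as.numeric_psql, since the wrapper's name never appears inside the expression that gets translated.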