Extracting specific columns from a list of data frames using a function and lapply

Time: 2016-10-20 00:58:25

Tags: r list function lapply

I have a list of data frames named StatesList (it is a list of states), and I am trying to pull two columns out of each data frame, sum them, and return the sums. This is what I have so far:

StatesList <- list(Alabam, Alask, Arizon, Arkansa, Californi, Colorado, Connecticu, Delawar,
                   District_ColUmbi, Florid, Georgi, Hawai, Idah, Illinoi, Indian, Iow, Kansa,
                   Kentuck, Louisian, Main, Marylan, Massachusett, Michiga, Minnesot, Mississipp,
                   Missour, Montan, Nebrask, Nevad, New_Hamp, New_Jer, New_Mex, New_York,
                   North_Carol, North_Dak, Ohi, Oklahom, Orego, Pennsylvani, Rhode_Isl,
                   South_Carol, South_Dak, Tennesse, Texa, Uta, Vermon, Virgini, Washingto,
                   West_Vir, Wisconsi, Wyomin)

my_function <- function(x) {
  c <- sum(x + $Clinton_Weighted)
  t <- sum(x + $Trump_Weighted)
  ans <- list(Clinton = c, Trump = t)
  return(print(ans))
}

lapply(StatesList, my_function(x))

I know that x + $Clinton_Weighted does not work, but I am not sure what would. How do I extract that specific column inside the function? And is trying to combine the name of each list element with x and the desired column a bad idea?

1 answer:

Answer 0: (score: 0)

Here is a simple way to do this using a combination of lapply and apply:

# Create sample data
cols = list(Clinton = 1:10, Trump = 10:1, SomeoneElse = 21:30)

Alabama = data.frame(cols)
Alaska = data.frame(cols)
Arison = data.frame(cols)
Arkansa = data.frame(cols)
Californi = data.frame(cols)

df_list = list(Alabama, Alaska, Arison, Arkansa, Californi)

The list of data frames looks like this:

df_list
[[1]]
   Clinton Trump SomeoneElse
1        1    10          21
2        2     9          22
3        3     8          23
4        4     7          24
5        5     6          25
6        6     5          26
7        7     4          27
8        8     3          28
9        9     2          29
10      10     1          30

[[2]]
   Clinton Trump SomeoneElse
1        1    10          21
2        2     9          22
3        3     8          23
4        4     7          24
5        5     6          25
6        6     5          26
7        7     4          27
8        8     3          28
9        9     2          29
10      10     1          30

[[3]]
   Clinton Trump SomeoneElse
1        1    10          21
2        2     9          22
3        3     8          23
4        4     7          24
5        5     6          25
6        6     5          26
7        7     4          27
8        8     3          28
9        9     2          29
10      10     1          30

[[4]]
   Clinton Trump SomeoneElse
1        1    10          21
2        2     9          22
3        3     8          23
4        4     7          24
5        5     6          25
6        6     5          26
7        7     4          27
8        8     3          28
9        9     2          29
10      10     1          30

[[5]]
   Clinton Trump SomeoneElse
1        1    10          21
2        2     9          22
3        3     8          23
4        4     7          24
5        5     6          25
6        6     5          26
7        7     4          27
8        8     3          28
9        9     2          29
10      10     1          30

Now sum the chosen columns of each data frame, applying the same summing function to every element of the list:

# Choose the columns to extract the sum of
cols = c("Clinton", "Trump")

lapply(df_list, function(x) apply(x[cols], 2, sum))

Below is the returned list:

[[1]]
Clinton   Trump 
     55      55 

[[2]]
Clinton   Trump 
     55      55 

[[3]]
Clinton   Trump 
     55      55 

[[4]]
Clinton   Trump 
     55      55 

[[5]]
Clinton   Trump 
     55      55
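
To address the function from the question more directly, here is a minimal sketch. It assumes each data frame has columns named Clinton_Weighted and Trump_Weighted, as in the question. The two small data frames and their values are made up for illustration, and naming the list elements is my own addition so that the state names stay attached to the results. Inside the function, x is a single data frame, so x$Clinton_Weighted extracts the column. lapply also expects the function itself (my_function), not a call such as my_function(x).

```r
# Hypothetical sample data: two small state data frames with the
# column names used in the question (the values are made up).
Alabama <- data.frame(Clinton_Weighted = c(1, 2, 3), Trump_Weighted = c(4, 5, 6))
Alaska  <- data.frame(Clinton_Weighted = c(10, 20), Trump_Weighted = c(30, 40))

# Naming the list elements keeps the state names attached to the results.
StatesList <- list(Alabama = Alabama, Alaska = Alaska)

# Inside the function, x is one data frame, so x$column extracts a column.
my_function <- function(x) {
  list(Clinton = sum(x$Clinton_Weighted), Trump = sum(x$Trump_Weighted))
}

# Pass the function itself (my_function), not a call such as my_function(x).
totals <- lapply(StatesList, my_function)

# Variant: colSums() on the selected columns, collected into a matrix
# with one row per state and one column per candidate.
cols <- c("Clinton_Weighted", "Trump_Weighted")
totals_matrix <- t(sapply(StatesList, function(x) colSums(x[cols])))
print(totals_matrix)
```

With a named list, a single result can be read as totals$Alabama$Clinton or totals_matrix["Alabama", "Clinton_Weighted"]. The colSums variant computes the same sums as apply(x[cols], 2, sum) above and may be easier to read.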