Find rows in an R data frame where the column values follow a sequence

Asked: 2016-06-12 04:43:08

Tags: r dataframe

I have a data frame like the one below, which is the output of a classifier.

col1, class
 123, 2
 122, 5
 124, 7
 125, 9
 126, 15
 127, 2
 128, 19
 129, 5
 130, 7
 179, 9
 180, 3

I want to find the rows that match a given pattern, for example all rows whose class values form the sequence 5, 7, 9.

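The solution I came up with is to paste shifted copies of the class column next to it, each shifted up by one more row, and then compare the columns: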

 col1, class, class1, class2
 123, 2,5,7
 122, 5,7,9
 124, 7,9,15
 125, 9,15,2
 126, 15,2,19
 127, 2,19,5
 128, 19,5,7
 129, 5,7,9
 130, 7,9,3
 179, 9,3,NA
 180, 3,NA,NA

This only works when all of my patterns have the same number of fields, but some patterns can have as many as 5 to 7 fields.

2 Answers:

Answer 0 (score: 2)

We can use shift from data.table, paste the elements together, and find the positions where the result is "579":

n <- 3
library(data.table)
setDT(df1)[, which(do.call(paste0, shift(class, seq(n)-1, type = "lead"))=="579")]
#[1] 2 8
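
One caveat with pasting: without a separator, different sequences of multi-digit classes can collapse to the same string (5, 7, 15 and 5, 71, 5 both give "5715"). The pattern 5, 7, 9 is not affected, but longer patterns over classes such as 15 or 19 could be. Below is a minimal sketch of the same idea with a separator, written in base R so it runs without data.table. The names class_vec, leads, keys and hits are illustrative and not part of the original answer.

```r
# Without a separator, two different sequences give the same key
ambiguous <- paste0(5, 7, 15) == paste0(5, 71, 5)   # both are "5715"

class_vec <- c(2L, 5L, 7L, 9L, 15L, 2L, 19L, 5L, 7L, 9L, 3L)
pattern   <- c(5L, 7L, 9L)
n         <- length(pattern)

# Lead-shifted copies of class_vec: element i of copy k holds class_vec[i + k];
# indices past the end yield NA
leads <- lapply(seq_len(n) - 1L, function(k) class_vec[seq_along(class_vec) + k])

# Join the copies row-wise with "-" so that values cannot run together
keys <- do.call(paste, c(leads, sep = "-"))
hits <- which(keys == paste(pattern, collapse = "-"))
hits
#[1] 2 8
```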

Or, instead of paste, we can use Map together with Reduce:

setDT(df1)[, which(Reduce(`&`, Map(`==`, shift(class, seq(n)-1, type = "lead"), c(5, 7, 9))))]
#[1] 2 8
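
Since the question mentions patterns of 5 to 7 fields, the element-wise comparison can be wrapped in a function that derives the number of shifts from the pattern itself. This is only a sketch in base R: find_pattern is a hypothetical helper name, and the shifting is done by plain indexing rather than data.table's shift.

```r
# Hypothetical helper (not from the original answers): return the starting
# position of every run in `x` that equals `pattern`, for any pattern length.
find_pattern <- function(x, pattern) {
  n <- length(pattern)
  if (n == 0L || n > length(x)) return(integer(0))
  # Shift x by 0, 1, ..., n - 1 positions; indices past the end yield NA
  shifted <- lapply(seq_len(n) - 1L, function(k) x[seq_along(x) + k])
  # Compare each shifted copy with its pattern element, then AND the results
  matches <- Reduce(`&`, Map(`==`, shifted, pattern))
  which(matches)  # which() ignores NA, so incomplete windows at the end drop out
}

class_vec <- c(2L, 5L, 7L, 9L, 15L, 2L, 19L, 5L, 7L, 9L, 3L)
find_pattern(class_vec, c(5, 7, 9))
#[1] 2 8
find_pattern(class_vec, c(2, 19, 5, 7, 9))
#[1] 6
```

Because the values are compared directly and never pasted into strings, multi-digit classes cannot be confused with one another.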

Answer 1 (score: 2)

A longer base R alternative, similar in principle to @akrun's answer:

which(do.call(paste0, cbind(df1, with(df1, class[seq_along(class)+1]),
                             with(df1, class[seq_along(class)+2]))[-1]) == "579")
#[1] 2 8
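
The indices 2 and 8 are only the starting positions of each match. If the goal is every row that takes part in a match, the starts can be expanded into full row ranges. This is a sketch under that assumption; starts is hard-coded from the output above, and rows and matched are illustrative names.

```r
df1 <- data.frame(
  col1  = c(123L, 122L, 124L, 125L, 126L, 127L, 128L, 129L, 130L, 179L, 180L),
  class = c(2L, 5L, 7L, 9L, 15L, 2L, 19L, 5L, 7L, 9L, 3L)
)

starts <- c(2L, 8L)  # starting positions reported by the answers above
n      <- 3L         # length of the pattern 5, 7, 9

# Each start s covers rows s, s + 1, ..., s + n - 1
rows    <- sort(unique(as.vector(outer(starts, seq_len(n) - 1L, `+`))))
matched <- df1[rows, ]
matched$col1
#[1] 122 124 125 129 130 179
```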

Data:

df1 <- structure(list(col1 = c(123L, 122L, 124L, 125L, 126L, 127L, 128L, 
                               129L, 130L, 179L, 180L), class = c(2L, 5L,
                               7L, 9L, 15L, 2L, 19L, 5L, 7L, 9L, 3L)),
                              .Names = c("col1", "class"), class = "data.frame", 
                               row.names = c(NA, -11L))