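I want to accomplish something very similar to what is being done in this question. I have a large data.table (or data.frame) in which one column holds a base timestamp (BST). I need to determine the number of days for each unique ID, and each ID may span tens of thousands of rows. All the lubridate tutorials I have found start with very simple start/end examples... (this is a great intro but not the answer I'm looking for).
I basically need to walk through my BST column and determine the start and end date for each ID.
Here is the sample data: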
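import pandas as pd

class SafeLoc(object):
    # Wraps a DataFrame so that indexing can be intercepted.
    def __init__(self, df):
        self._df = df
    ...

class SafeDataFrame(pd.DataFrame):
    # pandas exposes loc as a property, so the override must be one too.
    @property
    def loc(self):
        return SafeLoc(self)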
Desired result:
Then, how can I do the same while keeping all of the original rows... a la dplyr::mutate()?
Second desired result:
Answer 0: (score: 3)
You can try converting BST to a date/time with lubridate::ymd_hms, then grouping by myID and taking the minimum of BST as startDates and the maximum of BST as endDates:
library(data.table)
library(lubridate)
dt1[,.(startDates= min(ymd_hms(BST)), endDates = max(ymd_hms(BST))), by=myID]
# myID startDates endDates
#1: 1 2017-06-01 00:00:01 2017-06-03 00:00:02
#2: 2 2017-06-01 00:00:01 2017-06-05 00:00:02
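The question also asks for a day count per ID and for a variant that keeps every original row. A minimal sketch of both follows, under stated assumptions: the original sample data was not preserved, so the rows of dt1 below are hypothetical and only chosen to be consistent with the output above, and nDays and summary_dt are illustrative names. "Number of days" is read here as elapsed time between the first and last timestamp; if whole calendar days are wanted instead, differencing as.Date() values would likely be a better fit.

```r
library(data.table)
library(lubridate)

# Hypothetical sample data (assumption: the original rows are unknown).
dt1 <- data.table(
  myID = c(1, 1, 1, 2, 2),
  BST  = c("2017-06-01 00:00:01", "2017-06-02 00:00:01", "2017-06-03 00:00:02",
           "2017-06-01 00:00:01", "2017-06-05 00:00:02")
)

# Parse the timestamps once instead of calling ymd_hms() twice per group.
dt1[, BST := ymd_hms(BST)]

# One row per ID: start, end, and elapsed days between them.
summary_dt <- dt1[, .(startDates = min(BST), endDates = max(BST)), by = myID]
summary_dt[, nDays := as.numeric(difftime(endDates, startDates, units = "days"))]

# Keep every original row, similar to dplyr::mutate():
# `:=` combined with `by` writes the group-level values onto each row in place.
dt1[, `:=`(startDates = min(BST), endDates = max(BST)), by = myID]
dt1[, nDays := as.numeric(difftime(endDates, startDates, units = "days"))]
```

Parsing BST a single time up front should matter on large tables, since string-to-date conversion tends to dominate the cost; the grouped min/max on the parsed column is cheap by comparison.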
import java.util.ArrayList;
import java.util.List;
import org.json.JSONArray;
import com.google.common.reflect.TypeToken;
import com.google.gson.Gson;
public class Deserializer {
    public static void main(String[] args) {
        JSONArray jsonArray = new JSONArray(
                "[[{\"dMetaData\": {\"docName\": \"string\",\"docType\": \"pdf\"},\"dCont\": {\"data\": \"abc\"}},{\"dMetaData\": {\"docName\": \"string\",\"docType\": \"pdf\"},\"dCont\": {\"data\": \"def\"}},{\"dMetaData\": {\"docName\": \"string\",\"docType\": \"pdf\"},\"dCont\": {\"data\": \"ghk\"}}]]");
        // The outer array wraps a single inner array of documents.
        JSONArray docsArray = jsonArray.getJSONArray(0);
        // TypeToken preserves the generic element type for Gson.
        List<CreateDoc> docsList = new Gson().fromJson(docsArray.toString(),
                new TypeToken<ArrayList<CreateDoc>>() {}.getType());
        docsList.forEach(System.out::println);
    }

    public static class CreateDoc {
        DocMetData dMetaData;
        DocContent dCont;

        @Override
        public String toString() {
            return this.dMetaData.toString() + " " + this.dCont.toString();
        }
    }

    public static class DocMetData {
        String docName;
        String docType;

        @Override
        public String toString() {
            return "name: " + this.docName + " type: " + this.docType;
        }
    }

    public static class DocContent {
        String data;

        @Override
        public String toString() {
            return "data: " + this.data;
        }
    }
}