The data looks like this:
{20|1|04||02|02|02|02|02|02|02|02|02|02|01|||01|10088499405447|MR|Anantha|sadvichar|Lochanvala||Mr|Anand|Sharma|Upadhyay|Mr Anand Sharma Upadhyay|1|MR|Rajeev|vxvxase|Shah|MR Rajeev vxvxase Shah|vvxvx|Nanditha|vxxvasew|Pandit|vvxvx Nanditha vxxvasew Pandit|M|01|IN|S-02|02-12-1962|||||||||01|02|AF||AF||1|Li|lin|line|city|Hassan||IN|573201|||2|0|gdfgd|ssfsdf|sdfsdf|dsfsdfsdfsdf|Hassan||IN|573201||3|0|Li|lin|line|city||IN|573201||||||||||||01-06-2016|dfgdfgdfg|01-06-2016|01|dgfdgdfg|gdfgdfg|dfgdgdgdf|dgdgdg|FI Registration472|IN0356|1|2|||3||||||| 30|1|F|qwert||01|02||||| 40|1|1|||mkm|Dileep|kmk|mkmk|||||||||||||||||||||||||MKNPO2356K||||||||||||||||||||||01-06-2016|kmk|01-06-2016|01|mk|km|km|mkm|DOTEX|IN106|||||| 40|1|1|||mkm|Dileep|kmk|mkmk|||||||||||||||||||||||||MKNPO2356K||||||||||||||||||||||01-06-2016|kmk|01-06-2016|01|mk|km|km|mkm|DOTEX|IN106|||||| 70|1|10088499405447_02_02062016170054.jpg|02||||||| 70|1|10088499405447_09_02062016170054.tiff|09||||||| 70|1|10088499405447_08_02062016170054.tif|08||||||| }
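I want to split the data section by section.
That is:
The data contains two records.
Within each record, the sections start with the codes 20, 30, 40, and 70.
I want to read these records from a text file and save them to a database,
one customer at a time.
The first customer record looks like this: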
{20|2|04||02|02|02|02|02|02|02|02|02|02|1|||01|10095197636617|hg|fhgfhg||ytrytr||Mr|Anand|Sharma|Upadhyay|Mr Anand Sharma Upadhyay|1|MR|Rajeev||Shah|MR Rajeev Shah|hjgjh|Nanditha||Pandit|hjgjh Nanditha Pandit|M|01|IN|S-02|02-12-1962|||||||||02|01|AU|1235|IN|Patna|1|Ranga Mandira|||Ratnagiri|361||AF||||N|0|NSE|Mahamandal||Ratnagirishhhhhhh|Mumbai|MH|IN|400051||3|0|Madhava nagara|manehala||Ratnagiri||AF|||||||||||||02-05-2016|gjhgjhghj|03-05-2016|01|gjhgg|tyrytryt|uytfyutfytu|ytuytuyt|FI Registration472|IN0356|3||||6||||||| 30|1|F|123456||01|02||||| 30|1|E|123456789012|||02|||||| 70|2|10095197636617_02_31052016161747.jpg|02||||||| 70|2|10095197636617_09_31052016161747.tiff|09||||||| 70|2|10095197636617_08_31052016161747.tiff|08|||||||}
The second customer record looks like this:
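However, the text file contains many records, and I want to split each of them section by section.
Answer 0 (score: 2)
Try this:
Step 1: Read your file line by line.
Step 2: Clean the data.
Step 3: Store the data in the database.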
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;

// Open the file; try-with-resources closes the stream even if reading fails
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream("textfile.txt")))) {
    String strLine;
    // Read the file line by line
    while ((strLine = br.readLine()) != null) {
        // Split the line on the pipe symbol. String.split takes a regex, so the
        // pipe must be escaped; the -1 limit keeps trailing empty fields.
        String[] valueSplit = strLine.split("\\|", -1);
        // Clean your data further if necessary
        // Store the data in the database here
    }
}
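To get from split lines to per-customer records, the sections can be grouped as they are read. The following is only a sketch under two assumptions that the question does not confirm: each section (20, 30, 40, 70) sits on its own line in the file, and a line whose first field is 20 starts a new customer. The class name SectionGrouper and the method group are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups section lines into one map per customer.
class SectionGrouper {
    // Each returned map goes from section code ("20", "30", ...) to the
    // list of field arrays found for that section within one customer.
    static List<Map<String, List<String[]>>> group(List<String> lines) {
        List<Map<String, List<String[]>>> customers = new ArrayList<>();
        Map<String, List<String[]>> current = null;
        for (String raw : lines) {
            String line = raw.trim();
            // Drop the braces that wrap a record in the sample data
            if (line.startsWith("{")) line = line.substring(1);
            if (line.endsWith("}")) line = line.substring(0, line.length() - 1);
            line = line.trim();
            if (line.isEmpty()) continue;
            // Escape the pipe (split takes a regex); -1 keeps trailing empty fields
            String[] fields = line.split("\\|", -1);
            String code = fields[0].trim();
            // Assumption: section 20 marks the start of a new customer
            if (code.equals("20") || current == null) {
                current = new LinkedHashMap<>();
                customers.add(current);
            }
            current.computeIfAbsent(code, k -> new ArrayList<>()).add(fields);
        }
        return customers;
    }
}
```

Each map in the returned list can then be written to the database in one transaction, so a customer's 20, 30, 40, and 70 sections are stored together. If the file instead keeps a whole record on a single line, the line would first need to be cut into sections before this grouping applies.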