Question

The data looks like this:

{20|1|04||02|02|02|02|02|02|02|02|02|02|01|||01|10088499405447|MR|Anantha|sadvichar|Lochanvala||Mr|Anand|Sharma|Upadhyay|Mr Anand Sharma Upadhyay|1|MR|Rajeev|vxvxase|Shah|MR Rajeev vxvxase Shah|vvxvx|Nanditha|vxxvasew|Pandit|vvxvx Nanditha vxxvasew Pandit|M|01|IN|S-02|02-12-1962|||||||||01|02|AF||AF||1|Li|lin|line|city|Hassan||IN|573201|||2|0|gdfgd|ssfsdf|sdfsdf|dsfsdfsdfsdf|Hassan||IN|573201||3|0|Li|lin|line|city||IN|573201||||||||||||01-06-2016|dfgdfgdfg|01-06-2016|01|dgfdgdfg|gdfgdfg|dfgdgdgdf|dgdgdg|FI Registration472|IN0356|1|2|||3||||||| 30|1|F|qwert||01|02||||| 40|1|1|||mkm|Dileep|kmk|mkmk|||||||||||||||||||||||||MKNPO2356K||||||||||||||||||||||01-06-2016|kmk|01-06-2016|01|mk|km|km|mkm|DOTEX|IN106|||||| 40|1|1|||mkm|Dileep|kmk|mkmk|||||||||||||||||||||||||MKNPO2356K||||||||||||||||||||||01-06-2016|kmk|01-06-2016|01|mk|km|km|mkm|DOTEX|IN106|||||| 70|1|10088499405447_02_02062016170054.jpg|02||||||| 70|1|10088499405447_09_02062016170054.tiff|09||||||| 70|1|10088499405447_08_02062016170054.tif|08||||||| }

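I want to handle the data section-wise, record by record. In this sample there are two records.
Each record is made up of sections numbered 20, 30, 40 and 70. I want to read these records from a text file and save them to the database customer by customer. The first customer's record looks like this: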

{20|2|04||02|02|02|02|02|02|02|02|02|02|1|||01|10095197636617|hg|fhgfhg||ytrytr||Mr|Anand|Sharma|Upadhyay|Mr Anand Sharma Upadhyay|1|MR|Rajeev||Shah|MR Rajeev Shah|hjgjh|Nanditha||Pandit|hjgjh Nanditha Pandit|M|01|IN|S-02|02-12-1962|||||||||02|01|AU|1235|IN|Patna|1|Ranga Mandira|||Ratnagiri|361||AF||||N|0|NSE|Mahamandal||Ratnagirishhhhhhh|Mumbai|MH|IN|400051||3|0|Madhava nagara|manehala||Ratnagiri||AF|||||||||||||02-05-2016|gjhgjhghj|03-05-2016|01|gjhgg|tyrytryt|uytfyutfytu|ytuytuyt|FI Registration472|IN0356|3||||6||||||| 30|1|F|123456||01|02||||| 30|1|E|123456789012|||02|||||| 70|2|10095197636617_02_31052016161747.jpg|02||||||| 70|2|10095197636617_09_31052016161747.tiff|09||||||| 70|2|10095197636617_08_31052016161747.tiff|08|||||||}

The second customer's record has the same layout.

However, the text file contains many such records, and I want to split each of them section by section.

Answer 1

Try this:

Step 1: Read your file line by line.

Step 2: Clean the data.

Step 3: Store the data in the database.

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.InputStreamReader;

    // Open the file
    FileInputStream fstream = new FileInputStream("textfile.txt");
    BufferedReader br = new BufferedReader(new InputStreamReader(fstream));

    String strLine;

    // Read the file line by line
    while ((strLine = br.readLine()) != null) {
        // split() takes a regex, so the pipe symbol must be escaped;
        // the -1 limit keeps trailing empty fields
        String[] value_split = strLine.split("\\|", -1);
        // Clean your data further if necessary
        // Store data to DB here
    }

    // Close the input stream
    br.close();
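
The loop above treats every line alike. To keep each customer's sections together, one option is to open a new group whenever a section-20 line appears. The following is only a minimal sketch under two assumptions that the question does not confirm: each section (20, 30, 40, 70) sits on its own line in the file, and a line whose first field is 20 starts a new customer record. `RecordSplitter` and `groupByCustomer` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RecordSplitter {

    // Groups section lines into customer records.
    // Assumption: a line whose first field is "20" opens a new customer record,
    // and the 30/40/70 lines that follow belong to that customer.
    static List<List<String[]>> groupByCustomer(List<String> lines) {
        List<List<String[]>> customers = new ArrayList<>();
        List<String[]> current = null;
        for (String raw : lines) {
            // Drop the surrounding braces and whitespace seen in the sample data
            String line = raw.trim();
            if (line.startsWith("{")) {
                line = line.substring(1);
            }
            if (line.endsWith("}")) {
                line = line.substring(0, line.length() - 1);
            }
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }

            // Escape the pipe (split takes a regex); -1 keeps trailing empty fields
            String[] fields = line.split("\\|", -1);

            // Start a new customer on a section-20 line (or on the very first line)
            if (fields[0].equals("20") || current == null) {
                current = new ArrayList<>();
                customers.add(current);
            }
            current.add(fields);
        }
        return customers;
    }
}
```

Each inner list then holds one customer's sections in file order, with `fields[0]` giving the section number, so the database step can insert one customer at a time. The lines collected by the read loop can be passed in as a `List<String>`.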
