I have a CSV containing the following data:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class ParseEquation_test {
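/**
* Sums the coefficients of every term in str that matches regex.
* @param str equation with whitespace removed and a leading sign
* @param regex pattern whose group 1 captures a signed coefficient
* @return the summed coefficient as a string ("0.0" if no term matches)
*/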
public static String coeff(String str, String regex) {
Pattern patt = Pattern.compile(regex);
Matcher match = patt.matcher(str);
// missing coefficient default
String coeff = "+0";
double value = 0;
if (match.find()) {
coeff = match.group(1);
}
// always have sign, handle implicit 1
value = Double.parseDouble((coeff.length() == 1) ? coeff + "1"
: coeff);
while (match.find()) {
coeff = match.group(1);
// a bare sign ("+" or "-") means an implicit 1 here as well
value = value + Double.parseDouble((coeff.length() == 1) ? coeff + "1"
: coeff);
}
// String.valueOf(double) never yields a single character, so no padding is needed
return String.valueOf(value);
}
public static String[] quadParse(String arg) {
String str = ("+" + arg).replaceAll("\\s", "");
double a1 = Double.parseDouble(coeff(str, "([+-][0-9]*)([a-z]\\^2)"));
double b1 = Double.parseDouble(coeff(str, "([+-][0-9]*)([a-z](?!\\^))"));
// the lookahead also rejects digits, so "+12x" is not misread as the constant "+1"
double c1 = Double.parseDouble(coeff(str, "([+-][0-9]+)(?![0-9a-z])"));
System.out.println("Values are a: " + a1 + " b: " + b1 + " c: " + c1);
if (a1 == 0) {
if (b1 == 0) {
if (c1 == 0) {
// 0 = 0 holds for every x
String infinite_sol = "There are infinitely many solutions";
return new String[]{infinite_sol};
} else {
// a nonzero constant can never equal 0
String no_sol = "There is no solution";
return new String[]{no_sol};
}
} else {
double sol_order1 = -c1 / b1;
String final_sol_order1 = Double.toString(sol_order1);
return new String[]{final_sol_order1};
}
} else {
double dis = (Math.pow(b1, 2.0)) - (4 * a1 * c1);
double d = Math.sqrt(dis);
double X = 0, Y = 0; //root 1 & root 2, respectively
if (dis > 0.0) {
X = (-b1 + d) / (2.0 * a1);
Y = (-b1 - d) / (2.0 * a1);
String root1 = Double.toString(X);
String root2 = Double.toString(Y);
return new String[]{root1, root2};
} else if (dis == 0.0) {
X = (-b1 + 0.0) / (2.0 * a1);//repeated root
String root2 = Double.toString(X);
return new String[]{root2};
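} else if (dis < 0) {
String no_sol = "There are no real solutions";
return new String[]{no_sol};
}
}
// only reached if the discriminant is NaN; new String[-1] would throw NegativeArraySizeException
return new String[0];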
}
public static void main(String[] args) throws IOException {
// TODO code application logic here
System.out.println("Insert equation: ");
BufferedReader r = new BufferedReader(new InputStreamReader(System.in));
String s;
while ((s = r.readLine()) != null) {
String[] pieces = quadParse(s);
System.out.println(Arrays.toString(pieces));
}
}
}
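I want to rewrite the CSV so that whenever a duplicate is found in column 1, its data is appended as a new column to the first entry.
For example, the desired output would be: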
somename1,value1
somename1,value2
somename1,value3
anothername1,anothervalue1
anothername1,anothervalue2
anothername1,anothervalue3
How can I do this in a shell script?
TIA
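Answer 0 (score: 1)
With Awk, you need more than just removing duplicate lines; you need logic like the following, which builds an array with one element for each unique entry in $1.
The solution creates a hash map that uses the unique values of $1 as array indices and stores, as each element, the matching $2 values joined with the , separator.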
awk 'BEGIN{FS=OFS=","; prev="";}{ if (prev != $1) {unique[$1]=$2;} else {unique[$1]=(unique[$1]","$2)} prev=$1; }END{for (i in unique) print i,unique[i]}' file
anothername1,anothervalue1,anothervalue2,anothervalue3
somename1,value1,value2,value3
A more readable version is:
BEGIN {
# set input and output field separator to ',' and initialize
# variable holding last instance of $1 to empty
FS=OFS=","
prev=""
}
{
# Update the value of $2 directly in the hash array only when new
# unique elements are found in $1
if (prev != $1){
unique[$1]=$2
}
else {
unique[$1]=(unique[$1]","$2)
}
# Update the current $1
prev=$1
}
END {
for (i in unique) {
print i,unique[i]
}
}
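The script above compares each row only with the previous one, so it relies on rows with the same name being adjacent, and for (i in unique) prints the names in an unspecified order. A minimal sketch of a variant that also copes with scattered duplicates and keeps the names in first-seen order, assuming a two-column file; the function name merge_csv and the array names merged and order are illustrative and not part of the answer:

```shell
# merge_csv FILE: hypothetical helper, prints one merged row per name
merge_csv() {
    awk 'BEGIN { FS = OFS = "," }
    {
        if ($1 in merged) {
            # name seen before: append this value to its row
            merged[$1] = merged[$1] OFS $2
        } else {
            # first occurrence: store the value and remember the position
            merged[$1] = $2
            order[++n] = $1
        }
    }
    END {
        # print the names in the order they first appeared
        for (i = 1; i <= n; i++) print order[i], merged[order[i]]
    }' "$1"
}
```

It would be called as merge_csv data1.txt, with the output redirected to a new file if needed. The test ($1 in merged) checks for the key without creating it, which is why no prev variable is required.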
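Answer 1 (score: 1)
FILE=$1
# unique names from column 1
NAMES=$(cut -d',' -f1 "$FILE" | sort -u)
for NAME in $NAMES; do
echo -n "$NAME"
# anchor the match to the whole first field, so somename1 does not also match somename10
VALUES=$(grep "^$NAME," "$FILE" | cut -d',' -f2)
for VAL in $VALUES; do
echo -n ",$VAL"
done
echo ""
done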
Running it with your data produces:
>bash script.sh data1.txt
anothername1,anothervalue1,anothervalue2,anothervalue3
somename1,value1,value2,value3
The name of your data file must be passed as an argument. The output can be written to a new file through redirection.
>bash script.sh data1.txt > data_new.txt
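The script above runs grep once per name, so the file is scanned repeatedly. A rough single-pass sketch is shown here, assuming that output sorted by name is acceptable (the script already sorts with sort -u) and that the installed sort supports the -s stable flag, as GNU and BSD sort do. The function name merge_sorted is hypothetical:

```shell
# merge_sorted FILE: hypothetical helper, merges rows that share column 1
merge_sorted() {
    # -s keeps the original order of the values within each name
    sort -t',' -k1,1 -s "$1" | {
        prev=""
        line=""
        while IFS=',' read -r name value; do
            if [ -n "$line" ] && [ "$name" = "$prev" ]; then
                # same name as the previous row: extend the current line
                line="$line,$value"
            else
                # new name: flush the finished line, then start a new one
                if [ -n "$line" ]; then
                    printf '%s\n' "$line"
                fi
                line="$name,$value"
                prev="$name"
            fi
        done
        # flush the last line, if there was any input
        if [ -n "$line" ]; then
            printf '%s\n' "$line"
        fi
    }
}
```

Unlike cut -d',' -f2, read places everything after the first comma into value, so rows with more than two columns keep their extra fields.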