I have two files of the same size (100x12). Each contains numeric values, both positive and negative, separated by commas.
Sample from file 1:
-14.99,-15.6,8.0 ->
-9.0,34.87,98.98 ->
(and so on)
Sample from file 2:
-15.99,-18.6,8.00 ->
-3.0,34.34,-98.88 ->
(and so on)
I tried:
awk '{getline t<"file1"; print $0-t}' file2
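However, this only subtracts the first column. How can I extend it to subtract file 1 / column 1 from file 2 / column 2?
I am open to using pandas for this. Thanks in advance!
Answer 0: (score: 1)
Off the top of my head, so please check the syntax!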
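import numpy as np

# np.loadtxt parses comma-separated text into a 2-D float array; this assumes
# every field is a plain number with no trailing text.
with open("file1.txt", "r") as f1:
    with open("file2.txt", "r") as f2:
        array1 = np.loadtxt(f1, delimiter=",")
        array2 = np.loadtxt(f2, delimiter=",")
result = array1 - array2  # file1 minus file2; swap the operands for file2 minus file1
print(result)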
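To make the element-wise subtraction concrete, here is a minimal sketch that runs on the two sample rows from the question. It assumes the fields are plain numbers (the trailing `->` markers are left out), and it uses `io.StringIO` as a stand-in for the files on disk. The variable names are only illustrative.

```python
from io import StringIO

import numpy as np

# Sample rows from the question, without the trailing "->" markers (an assumption).
file1 = StringIO("-14.99,-15.6,8.0\n-9.0,34.87,98.98\n")
file2 = StringIO("-15.99,-18.6,8.00\n-3.0,34.34,-98.88\n")

# np.loadtxt accepts a file-like object and returns a 2-D float array.
a1 = np.loadtxt(file1, delimiter=",")
a2 = np.loadtxt(file2, delimiter=",")

# Element-wise difference: file2 minus file1, column by column.
result = a2 - a1
print(np.round(result, 2))
```

Because both arrays have the same shape, the subtraction lines up column 1 with column 1, column 2 with column 2, and so on, so the same code should carry over to the full 100x12 files.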
Answer 1: (score: 1)
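awk (GNU awk is needed for the a[FNR][i] array-of-arrays syntax):
$ awk '
BEGIN { FS = OFS = "," }   # fields are comma-separated on input and output
NR==FNR {                  # first file: store the file1 values in a
    for(i=1;i<=NF;i++)
        a[FNR][i]=$i
    next
}
{                          # second file: subtract the matching file1 value
    for(i=1;i<=NF;i++)
        $i=$i-a[FNR][i]
}1' file1 file2
Output:
-1,-3,0
6,-0.53,-197.86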
Answer 2: (score: 1)
You can try the Unix utilities: paste combined with awk.
paste file1.txt file2.txt |
awk -F"[,\t]" -v OFS="," ' { for(i=1;i<4;i++) { $i=$(i+3)-$i } print $1,$2,$3 } '
With the given input:
$ cat halletwx1.txt
-14.99,-15.6,8.0
-9.0,34.87,98.98
$ cat halletwx2.txt
-15.99,-18.6,8.00
-3.0,34.34,-98.88
$ paste halletwx1.txt halletwx2.txt | awk -F"[,\t]" -v OFS="," ' { for(i=1;i<4;i++) { $i=$(i+3)-$i } print $1,$2,$3 } '
-1,-3,0
6,-0.53,-197.86
$
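The command above hard-codes three columns (`i<4` and `$(i+3)`). For files with another width, such as the 100x12 case in the question, the same pairing of lines can be sketched in plain Python. This is only an illustration: `subtract_rows` is a made-up helper name, and the input is assumed to contain nothing but comma-separated numbers.

```python
def subtract_rows(lines1, lines2, sep=","):
    """Yield file2 minus file1, field by field, for each pair of lines."""
    for line1, line2 in zip(lines1, lines2):
        row1 = [float(x) for x in line1.strip().split(sep)]
        row2 = [float(x) for x in line2.strip().split(sep)]
        # zip pairs column k of file1 with column k of file2, whatever the width.
        yield [b - a for a, b in zip(row1, row2)]


# The two sample files from this answer, as lists of lines.
sample1 = ["-14.99,-15.6,8.0\n", "-9.0,34.87,98.98\n"]
sample2 = ["-15.99,-18.6,8.00\n", "-3.0,34.34,-98.88\n"]

for row in subtract_rows(sample1, sample2):
    # "g" formatting trims floating-point noise, much like awk's default output.
    print(",".join(f"{v:g}" for v in row))
```

With real files, the two lists can be replaced by open file objects, since iterating over a file yields its lines.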
Answer 3: (score: 1)
First, the data:
file1 = """
-14.99,-15.6,8.0 ->
-9.0,34.87,98.98 ->
"""
file2 = """
-15.99,-18.6,8.00 ->
-3.0,34.34,-98.88 ->
"""
from io import StringIO # faking file on disk
The pandas answer:
import pandas as pd
converter = {2: lambda s: float(s.split(' ')[0])}
df1 = pd.read_csv(StringIO(file1), header=None, converters=converter)
df2 = pd.read_csv(StringIO(file2), header=None, converters=converter)
(df1-df2).to_csv('pddiff12.csv', header=False, index=False)
Or roll your own in pure Python.
# cmt 1 -> indent under the with-statement when reading from disk
def read_csv(file_name):
    # with open(file_name, 'rt') as f1:  # uncomment when reading from disk
    f1 = StringIO(file_name)  # comment out when reading from disk
    rows = [r for r in f1.readlines() if r.strip()]  # cmt 1; skips blank lines
    # Parse one comma-separated line into a list of floats.
    crunch = lambda row: [float(r) for r in row.split(',')]
    # Drop the trailing " ->" marker before parsing.
    rows = [crunch(r.split(' ')[0]) for r in rows]
    return rows

data1 = read_csv(file1)
data2 = read_csv(file2)

# Subtract element by element: file1 minus file2.
diff = []
for row1, row2 in zip(data1, data2):
    diff.append([i-j for i, j in zip(row1, row2)])

with open('diff12.csv', 'wt') as d12:
    for row in diff:
        d12.write(', '.join((str(v) for v in row)) + '\n')
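Pandas is certainly the easiest to read and use, although it is an obvious dependency if you are inclined to avoid those. In this case, I don't think I would avoid it.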
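The converter in the pandas version is keyed to column 2, which fits the three-column sample but not a 12-column file. Here is a rough sketch that strips the trailing marker from whichever column is last. It assumes the ` ->` marker, if present, only ever appears in the last column; `load_matrix` is a hypothetical helper name, and the result is computed as file2 minus file1, as the question asks.

```python
from io import StringIO

import pandas as pd

file1 = """
-14.99,-15.6,8.0 ->
-9.0,34.87,98.98 ->
"""
file2 = """
-15.99,-18.6,8.00 ->
-3.0,34.34,-98.88 ->
"""


def load_matrix(source):
    """Read comma-separated numbers, dropping any trailing text in the last column."""
    # Read everything as text first so the marker does not break number parsing.
    df = pd.read_csv(source, header=None, dtype=str)
    last = df.columns[-1]
    # Keep only the first whitespace-separated token, e.g. "8.0 ->" becomes "8.0".
    df[last] = df[last].str.split().str[0]
    return df.astype(float)


# StringIO stands in for a path on disk; pd.read_csv accepts either.
diff = load_matrix(StringIO(file2)) - load_matrix(StringIO(file1))
print(diff.round(2))
```

Subtracting two DataFrames aligns them on row and column labels, so both files need the same shape for every cell to get a value.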