I have a file, 'f1', that looks like this:
ID X Y Z
1 439748.5728 7948406.945 799.391875
1 439767.6229 7948552.995 796.977271
1 439805.7229 7948711.745 819.359365
1 439799.3729 7948851.446 776.425797
2 440764.5749 7948991.146 235.551602
2 440504.2243 7948984.796 326.929119
2 440104.1735 7948984.796 536.893601
2 439742.2228 7949003.846 737.887029
2 438580.1705 7949537.247 196.300929
3 438142.0196 7947340.142 388.997748
3 438599.2205 7947333.792 480.580256
3 439126.2716 7947340.142 669.802869
4 438453.1702 7947594.143 600.856103
4 438294.4199 7947657.643 581.018396
4 438167.4197 7947702.093 515.149846
I want to run a command (let's say print, to keep things simple here) using the x, y, z values for each ID value in file f1:
import numpy as np
f1 = ('file1.txt')
id = np.loadtxt(f1, skiprows=1, usecols=[0])
for i in id:
    x = np.loadtxt(f1, skiprows=1, usecols=[1])
    y = np.loadtxt(f1, skiprows=1, usecols=[2])
    z = np.loadtxt(f1, skiprows=1, usecols=[3])
    print ('The x, y, z lists of id= %g are:' %(i))
    print (x,y,z)
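This code returns the full x, y, and z lists once for every row of f1, but I want it to return the x, y, and z lists for each distinct value of the ID column.
For example, for ID = 3 it should return: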
[438142.0196, 438599.2205, 439126.2716] [7947340.142, 7947333.792, 7947340.142] [388.997748, 480.580256, 669.802869]
Any help is much appreciated!
Answer 0 (score: 0)
Make a container for your results:
d = {}
Iterate over the file and split each line to extract the parts you are interested in:
id_, *xyz = line.strip().split()
Then add them to the dictionary:
try:
    d[id_].append(xyz)
except KeyError:
    d[id_] = []
    d[id_].append(xyz)
Using collections.defaultdict as the container simplifies the code: you don't need to handle a KeyError the first time an id_ is seen.
d = collections.defaultdict(list)
...
d[id_].append(xyz)
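Putting the fragments above together, a minimal self-contained sketch might look like the following. The sample rows are inlined through io.StringIO so the snippet runs without a file; the helper name group_rows_by_id and the conversion of the values to float are assumptions added for illustration, not part of the original answer.

```python
import collections
import io

# A few rows of the question's data, inlined so the sketch runs as-is.
SAMPLE = """ID X Y Z
1 439748.5728 7948406.945 799.391875
1 439767.6229 7948552.995 796.977271
3 438142.0196 7947340.142 388.997748
3 438599.2205 7947333.792 480.580256
3 439126.2716 7947340.142 669.802869
"""


def group_rows_by_id(lines):
    """Return {id: [[x, y, z], ...]} from an iterable of text lines.

    Hypothetical helper; the first line is assumed to be a header.
    """
    d = collections.defaultdict(list)
    rows = iter(lines)
    next(rows, None)  # skip the header line
    for line in rows:
        if not line.strip():
            continue  # ignore blank lines
        id_, *xyz = line.split()
        d[id_].append([float(v) for v in xyz])
    return d


groups = group_rows_by_id(io.StringIO(SAMPLE))
for id_, rows in groups.items():
    print(id_, rows)
```

With a real file, the same helper should accept an open file object in place of the StringIO, since it only needs something that yields lines. Note that the keys are the ID strings as read from the file ("1", "3"), not integers.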
Answer 1 (score: 0)
If you are able to use Pandas, here is a simple solution:
import pandas as pd
fname = "file1.txt"
df = pd.read_csv(fname, sep=" ")  # or substitute with appropriate separator
for i in df.ID.unique():
    print(df.loc[df.ID==i])
ID X Y Z
0 1 439748.5728 7948406.945 799.391875
1 1 439767.6229 7948552.995 796.977271
2 1 439805.7229 7948711.745 819.359365
3 1 439799.3729 7948851.446 776.425797
ID X Y Z
4 2 440764.5749 7948991.146 235.551602
5 2 440504.2243 7948984.796 326.929119
6 2 440104.1735 7948984.796 536.893601
7 2 439742.2228 7949003.846 737.887029
8 2 438580.1705 7949537.247 196.300929
ID X Y Z
9 3 438142.0196 7947340.142 388.997748
10 3 438599.2205 7947333.792 480.580256
11 3 439126.2716 7947340.142 669.802869
ID X Y Z
12 4 438453.1702 7947594.143 600.856103
13 4 438294.4199 7947657.643 581.018396
14 4 438167.4197 7947702.093 515.149846
To get exactly the output you specified in the OP, use:
for i in df.ID.unique():
    print ('The x, y, z lists of id= %g are:' %(i))
    print(df.loc[df.ID==i, ['X','Y','Z']].values)
The x, y, z lists of id= 1 are:
[[ 4.39748573e+05 7.94840695e+06 7.99391875e+02]
[ 4.39767623e+05 7.94855300e+06 7.96977271e+02]
[ 4.39805723e+05 7.94871175e+06 8.19359365e+02]
[ 4.39799373e+05 7.94885145e+06 7.76425797e+02]]
The x, y, z lists of id= 2 are:
[[ 4.40764575e+05 7.94899115e+06 2.35551602e+02]
[ 4.40504224e+05 7.94898480e+06 3.26929119e+02]
[ 4.40104173e+05 7.94898480e+06 5.36893601e+02]
[ 4.39742223e+05 7.94900385e+06 7.37887029e+02]
[ 4.38580171e+05 7.94953725e+06 1.96300929e+02]]
The x, y, z lists of id= 3 are:
[[ 4.38142020e+05 7.94734014e+06 3.88997748e+02]
[ 4.38599220e+05 7.94733379e+06 4.80580256e+02]
[ 4.39126272e+05 7.94734014e+06 6.69802869e+02]]
The x, y, z lists of id= 4 are:
[[ 4.38453170e+05 7.94759414e+06 6.00856103e+02]
[ 4.38294420e+05 7.94765764e+06 5.81018396e+02]
[ 4.38167420e+05 7.94770209e+06 5.15149846e+02]]
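As a possible alternative to filtering with df.ID == i inside a loop, pandas can do the grouping itself with groupby. The sketch below is an illustration under stated assumptions rather than part of the original answer: the data is inlined via io.StringIO, sep=r"\s+" is used so that any run of whitespace separates the columns, and the name xyz_by_id is made up for this example.

```python
import io

import pandas as pd

# A few rows of the question's data, inlined so the sketch runs as-is.
SAMPLE = """ID X Y Z
1 439748.5728 7948406.945 799.391875
1 439767.6229 7948552.995 796.977271
3 438142.0196 7947340.142 388.997748
3 438599.2205 7947333.792 480.580256
3 439126.2716 7947340.142 669.802869
"""

df = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

# groupby yields (id, sub-DataFrame) pairs, one per distinct ID.
xyz_by_id = {
    id_: {col: group[col].tolist() for col in ("X", "Y", "Z")}
    for id_, group in df.groupby("ID")
}

for id_, cols in xyz_by_id.items():
    print("The x, y, z lists of id= %g are:" % id_)
    print(cols["X"], cols["Y"], cols["Z"])
```

This gives one plain Python list per column and per ID, which is closer to the three separate lists requested in the question than the 2-D array returned by .values.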
Answer 2 (score: 0)
How about this:
import numpy as np
mydata = np.genfromtxt(r'path\to\my\text.txt', skip_header=1)  # skip the header line, which is text
finalArr = []  # collects the rows of the final result
for i in range(len(mydata)):
    if mydata[i][0] == 3:  # 3 is the ID (column 1 of the text file); change it to select another ID
        temp = []
        for j in range(1, len(mydata[i])):  # copy the x, y, z values, skipping the ID column
            temp.append(mydata[i][j])
        finalArr.append(temp)
print(finalArr)
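Answer 3 (score: 0)
No try-except, no defaultdict, no pandas. Just build a dictionary of data using a well-kept secret: you can reference a dict value not only via d[k] but also via the .get method of d, d.get(k, default), which lets you specify a default value if the key is not yet present in d.
Our default value must be the empty list, to which we can append the list of values taken from the rest of the line; we can get that list using Python's newer unpacking syntax, as in a, *r = alist.
21:25 $ python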
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> # lines = open('yourdata').readlines()
>>> lines = '''ID X Y Z
... 1 439748.5728 7948406.945 799.391875
... 1 439767.6229 7948552.995 796.977271
... 1 439805.7229 7948711.745 819.359365
... 1 439799.3729 7948851.446 776.425797
... 2 440764.5749 7948991.146 235.551602
... 2 440504.2243 7948984.796 326.929119
... 2 440104.1735 7948984.796 536.893601
... 2 439742.2228 7949003.846 737.887029
... 2 438580.1705 7949537.247 196.300929
... 3 438142.0196 7947340.142 388.997748
... 3 438599.2205 7947333.792 480.580256
... 3 439126.2716 7947340.142 669.802869
... 4 438453.1702 7947594.143 600.856103
... 4 438294.4199 7947657.643 581.018396
... 4 438167.4197 7947702.093 515.149846'''.split('\n')
>>> d = {}
>>> ################## TL ; DR ###############################
>>> for k, *rest in (line.split() for line in lines[1:] if line):
...     d[k] = d.get(k, []) + [[float(f) for f in rest]]
... ################## TL ; DR ###############################
>>> for k in d:
...     print(k)
...     for l in d[k]: print('\t', l)
...
1
[439748.5728, 7948406.945, 799.391875]
[439767.6229, 7948552.995, 796.977271]
[439805.7229, 7948711.745, 819.359365]
[439799.3729, 7948851.446, 776.425797]
2
[440764.5749, 7948991.146, 235.551602]
[440504.2243, 7948984.796, 326.929119]
[440104.1735, 7948984.796, 536.893601]
[439742.2228, 7949003.846, 737.887029]
[438580.1705, 7949537.247, 196.300929]
3
[438142.0196, 7947340.142, 388.997748]
[438599.2205, 7947333.792, 480.580256]
[439126.2716, 7947340.142, 669.802869]
4
[438453.1702, 7947594.143, 600.856103]
[438294.4199, 7947657.643, 581.018396]
[438167.4197, 7947702.093, 515.149846]
>>>
If you need a dictionary of numpy arrays:
>>> import numpy as np
>>> for k in d: d[k] = np.array(d[k])
That's all.
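The question asks for separate x, y, and z lists per ID, whereas the dictionary above stores one [x, y, z] row per input line. One possible way to bridge that gap, sketched here with only the standard library, is to transpose each group with zip(*rows). The names rows_by_id and columns_by_id are illustrative assumptions, and dict.setdefault is swapped in as a variant of d.get that appends in place instead of building a new list for every row.

```python
# A few rows of the question's data, split into lines as in the session above.
lines = """ID X Y Z
1 439748.5728 7948406.945 799.391875
1 439767.6229 7948552.995 796.977271
3 438142.0196 7947340.142 388.997748
3 438599.2205 7947333.792 480.580256
3 439126.2716 7947340.142 669.802869""".split("\n")

rows_by_id = {}
for k, *rest in (line.split() for line in lines[1:] if line):
    # setdefault returns the existing list, or stores and returns a new one.
    rows_by_id.setdefault(k, []).append([float(f) for f in rest])

# Transpose each group: rows of (x, y, z) become three per-column lists.
columns_by_id = {
    k: [list(col) for col in zip(*rows)] for k, rows in rows_by_id.items()
}

x, y, z = columns_by_id["3"]
print(x, y, z)
```

For ID 3 this prints the three lists in the same form as the example output given in the question.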
Answer 4 (score: 0)
The answers here seem overly complicated. Here is a two-liner that uses only numpy.
Just load the whole file and find the unique IDs:
import numpy as np
a = np.loadtxt('file1.txt', skiprows=1)
ids = np.unique(a[:, 0])  # the first column holds the IDs
# ids = array([ 1., 2., 3., 4.])
Then create a list by indexing a at each id:
b = [a[a[:, 0] == i, 1:] for i in ids]
This gives:
[array([[ 4.39748573e+05, 7.94840695e+06, 7.99391875e+02],
[ 4.39767623e+05, 7.94855300e+06, 7.96977271e+02],
[ 4.39805723e+05, 7.94871175e+06, 8.19359365e+02],
[ 4.39799373e+05, 7.94885145e+06, 7.76425797e+02]]),
array([[ 4.40764575e+05, 7.94899115e+06, 2.35551602e+02],
[ 4.40504224e+05, 7.94898480e+06, 3.26929119e+02],
[ 4.40104173e+05, 7.94898480e+06, 5.36893601e+02],
[ 4.39742223e+05, 7.94900385e+06, 7.37887029e+02],
[ 4.38580171e+05, 7.94953725e+06, 1.96300929e+02]]),
array([[ 4.38142020e+05, 7.94734014e+06, 3.88997748e+02],
[ 4.38599220e+05, 7.94733379e+06, 4.80580256e+02],
[ 4.39126272e+05, 7.94734014e+06, 6.69802869e+02]]),
array([[ 4.38453170e+05, 7.94759414e+06, 6.00856103e+02],
[ 4.38294420e+05, 7.94765764e+06, 5.81018396e+02],
[ 4.38167420e+05, 7.94770209e+06, 5.15149846e+02]])]
For example, if you now want the y values for the first ID, just use b[0][:, 1].
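Note that a position in the list b corresponds to an ID only through its order in ids, so b[0] is the first unique ID rather than "ID 0". If lookup by the ID itself is preferred, a dictionary keyed by ID is one option. The sketch below is an assumption-based variant of the approach above, with the data inlined via io.StringIO and the name xyz_by_id invented for the example.

```python
import io

import numpy as np

# A few rows of the question's data, inlined so the sketch runs as-is.
SAMPLE = """ID X Y Z
1 439748.5728 7948406.945 799.391875
1 439767.6229 7948552.995 796.977271
3 438142.0196 7947340.142 388.997748
3 438599.2205 7947333.792 480.580256
3 439126.2716 7947340.142 669.802869
"""

a = np.loadtxt(io.StringIO(SAMPLE), skiprows=1)
ids = np.unique(a[:, 0])  # distinct values of the ID column

# Map each ID to its (n_rows, 3) block of x, y, z values.
xyz_by_id = {int(i): a[a[:, 0] == i, 1:] for i in ids}

# Transposing a block lets the three columns be unpacked directly.
x, y, z = xyz_by_id[3].T
print(x.tolist(), y.tolist(), z.tolist())
```

Here the sample deliberately skips ID 2, to show that the dictionary lookup xyz_by_id[3] still works when the IDs are not consecutive.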