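I asked a similar question yesterday but have since deleted it, because I now realize it was badly formatted (I'm new to Python), which made it hard for anyone to help me. I apologize; I know that's poor form. I hope I've done better here.
Background: I have several output files from a simulation, and I want to import and plot the data from them. Most of the files have their numbers arranged in columns. Importing those numbers with "loadtxt" is easy: they arrive (as far as I can tell) as arrays of floats, which I can then plot.
Problem: I have been struggling for three days with one of the files, because its contents are not arranged in neat columns. It is a mix of text and numbers, so I have to extract the numbers I need before I can plot them (the word "one" links to a short excerpt of the file; the actual file is thousands of lines long). I'll call this the "difficult" file. I can extract and import the numbers, but they arrive as tuples, and I can't convert them to an array of floats, so I can't plot them against the data I import from the other files.
Even after several days of trying, I don't really understand what a tuple is, so I may be making a silly mistake somewhere. In the example below I try one way of converting the tuple to an array of floats. Any advice would be much appreciated. Please feel free to offer any criticism that would help me make this clearer.
My code: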
from scipy import *
import numpy as np
import matplotlib
matplotlib.use(matplotlib.get_backend())
import matplotlib.pyplot as plt
import re
while True:
    try:
        cellfile1="pathToDifficultFile" #I have to use "regex" to extract numbers from a file that contains numbers and text. They arrive as some kind of tuple.
        infile1=open(cellfile1,'r')
        cellfile2="pathToEasyFile" #I can use "loadtxt" to get the data. The data arrive as nice arrays of floats--for example, times: 1, 2, 3, 4,... seconds.
        infile2=open(cellfile2,'r')
        break
    except IOError as e:
        print("Cannot find file..try again.")
skip = int(input('How many steps to skip?')) # Skip the first few time steps (first rows in my output files) because the results are often not correct in my simulations.
cell = loadtxt(cellfile2,skiprows=2+skip)
step = np.array(cell[:,0]) # This is what I want to be the x axis data in my plot; it's just time, like 1, 2, 3, 4 seconds.
# Extract numbers I need from the difficult file
for line in infile1: # Iterate over the lines
    match = re.search('Total= (\d.+)', line) # Returns weird tuple.
    if match: # Did we find a match?
        totalMoment0 = match.group(1) # Yes, process it
totalMoment = np.asarray(totalMoment0) #Here I'm trying to convert the weird imported tuple data from regex directly above to an array of floats so I can plot it versus the time data imported from the other file.
avgtotalMoment =np.cumsum(totalMoment)/(step-skip)
plt.plot(step,totalMoment,'-')
plt.plot(step,avgtotalMoment,'-')
plt.xlabel('Timestep')
plt.ylabel('Imported difficult data')
plt.show()
Output of my code:
How many steps to skip?0
[[ 1.00000000e+00 5.00000000e-01 7.82390980e-01 ..., -9.94476371e+02
-9.93104616e+02 2.86557169e+01]
[ 2.00000000e+00 1.00000000e+00 7.70928719e-01 ..., -9.94464419e+02
-9.93104149e+02 5.06833816e+00]
[ 3.00000000e+00 1.50000000e+00 7.50579191e-01 ..., -9.94443439e+02
-9.93103532e+02 5.15203691e+00]
...,
[ 2.13340000e+04 1.06670000e+04 7.57428741e-01 ..., -9.94623426e+02
-9.93037136e+02 1.91433048e+01]
[ 2.13350000e+04 1.06675000e+04 7.28059027e-01 ..., -9.94593384e+02
-9.93036461e+02 3.76293707e+00]
[ 2.13360000e+04 1.06680000e+04 7.08130301e-01 ..., -9.94572844e+02
-9.93035855e+02 4.03132892e+00]]
Traceback (most recent call last):
File "momentsFromQsMomentsFile.py", line 42, in <module>
plt.plot(step,totalMoment,'-')
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/pyplot.py", line 2987, in plot
ret = ax.plot(*args, **kwargs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/axes.py", line 4137, in plot
for line in self._get_lines(*args, **kwargs):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/axes.py", line 317, in _grab_next_args
for seg in self._plot_args(remaining, kwargs):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/axes.py", line 295, in _plot_args
x, y = self._xy_from_xy(x, y)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/matplotlib/axes.py", line 237, in _xy_from_xy
raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension
Answer 0 (score: 0)
I think the problem here may be your for loop.
for line in infile1: # Iterate over the lines
    match = re.search('Total= (\d.+)', line) # Returns a match object
    if match: # Did we find a match?
        totalMoment0 = match.group(1) # this will be a string, assuming the group has a match.
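Notice that you assign to totalMoment0 every time a match is found. Each time you get a string, and it overwrites the previous one, so after the loop only the last match is left. A related problem is that this last match, for example "1000", is still a string: numpy's asarray will happily turn it into an array, but the result is a zero-dimensional array holding the text '1000', not an array of floats with one entry per time step. That is why matplotlib complains that x and y do not have the same first dimension.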
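A quick way to see the difference is to inspect the shape of what asarray returns. This is only an illustrative sketch, assuming Python 3 and made-up values; the names `single` and `values` are not from the question.

```python
import numpy as np

# A single string becomes a zero-dimensional array of text, not numbers.
single = np.asarray("1000")
print(single.shape)       # an empty shape: there is no first dimension to plot against
print(single.dtype.kind)  # 'U' means unicode text in Python 3

# A list of floats becomes a one-dimensional float array that can be plotted.
values = np.asarray([float("1000"), float("2.5")])
print(values.shape)
print(values.dtype)
```

Checking `.shape` and `.dtype` right after the conversion usually shows immediately whether the data will line up with the x axis.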
What you should do instead is append each value to a list, like this:
output_matches = [] # set up an empty list
for line in infile1: # Iterate over the lines
    match = re.search(r'Total= (\d.+)', line) # Try and get a match (raw string so \d is passed to re unchanged)
    if match: # Did we find a match?
        output_matches.append(float(match.group(1))) # append the match to the list, casting it to a float as you do so.
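Note that if the regular expression here is not quite right, you may get an error when the match is cast to a float. But I'll leave that problem for future you!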
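Putting the pieces together, the approach might be sketched as follows. This is a minimal sketch under assumptions, not the asker's actual file layout: the sample text, the helper name `extract_totals`, and the stricter number pattern (optional sign, decimals, optional exponent) are all hypothetical.

```python
import io
import re
import numpy as np

# Hypothetical pattern: optional sign, digits with optional decimal part, optional exponent.
NUMBER = r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?'
TOTAL_RE = re.compile(r'Total=\s*(' + NUMBER + ')')

def extract_totals(lines):
    """Return a 1-D float array with one value per line containing 'Total= <number>'."""
    values = []
    for line in lines:
        match = TOTAL_RE.search(line)
        if match:
            # group(1) is a string; convert it before storing.
            values.append(float(match.group(1)))
    return np.array(values, dtype=float)

# Made-up stand-in for the difficult file; a real file object works the same way.
sample = io.StringIO(
    "Step 1 Random Stuff 12348 Total= -23.5\n"
    "some header text without numbers\n"
    "Step 2 Random Stuff 12349 Total= 1.25e1\n"
)
totals = extract_totals(sample)
print(totals.shape, totals.dtype)
```

Before plotting, the resulting array still has to have the same length as `step`. If the first `skip` rows were dropped from the easy file, dropping the same number of entries with `totals[skip:]` is one option, assuming the difficult file has exactly one "Total=" line per time step.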
Answer 1 (score: 0)
Here is how to access the value in the tuple and convert the string to a float:
>>> m = re.search(r'Total=\s+([0-9\-\.]+)', " Random Stuff 12348 Total= -23.94409825335")
>>> m.groups()
('-23.94409825335',)
>>> result = float(m.groups()[0])
>>> result
-23.94409825335
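Since the question mentions not being sure what a tuple is: groups() returns one string per capture group, packed in a tuple, which can be indexed like a list but not modified in place. The sketch below is illustrative only; the two-group pattern is an assumption made up to show a tuple with more than one entry.

```python
import re
import numpy as np

line = " Random Stuff 12348 Total= -23.94409825335"

# Two capture groups, so groups() returns a tuple holding two strings.
m = re.search(r'Stuff\s+(\d+)\s+Total=\s+([0-9\-\.]+)', line)
parts = m.groups()
print(parts)

# Index the tuple like a list, then convert the chosen string to a float.
total = float(parts[1])
print(total)

# NumPy can also convert a whole tuple of numeric strings in one step.
as_floats = np.array(parts, dtype=float)
print(as_floats.shape)
```

With a single capture group, `m.group(1)` and `m.groups()[0]` give the same string, so either form can be used inside the loop.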