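I am reading a .dat file from a Samba file server; the file contains wind sensor data. It has a header with some information (lines 1, 3 and 4), a line with the sensor names (line 2), and a body with the sensor readings (144 lines, one every 10 minutes), like this: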
"DataFormat","Anemometric tower","Datalogger","SomeCode","LoggerOS","LoggerFileSystem","AnotherCode","Table1"
"TIMESTAMP","RECORD","Precipit1","Barometer1","Temperature1","Humidity1","Anemometer1","Windvane1","Anemometer2","Windvane2","Battery1"
"TS","RN","","hPa","C. Deg","%RH","m/s","Deg","m/s","Deg","Volts"
"","","Smp","Avg","Avg","Avg","Avg","Avg","Avg","Avg","Avg"
"2019-06-19 00:10:00",1211,"NAN",921.014,19.57733,98.29526,10.76701,137.6863,10.68348,139.7062,11.91,
"2019-06-19 00:20:00",1212,"NAN",920.9402,19.44474,98.67733,9.991986,141.5792,9.892648,143.3559,11.35
"2019-06-19 00:30:00",1213,"NAN",920.6142,19.45635,99.00026,10.80979,148.0094,10.63116,150.0893,11.41
...more 141 lines...
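My goal is to turn the raw text into a table (for example, so that I know every value in column 4 belongs to the Barometer1 sensor).
I managed to write a PHP script that reads the whole file until end of file, appending to a string, then explodes it on the EOL delimiter (giving an array of lines), and finally explodes each line on the ',' (comma) delimiter (giving an array of arrays?):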
$data = '';
while (!feof($stream)) {
    $data .= fread($stream, 8192);
}
$lines = explode(PHP_EOL, $data);
foreach ($lines as $line) {
    $array[] = explode(",", $line);
}
Then I loop over $array[$row][$col], build a list for each type of sensor, and insert each list into the corresponding database table.
But I need to do this in a Python script, so I tried:
data = file_obj.read()
file_obj.close()
lines = data.split('\n')
array = []
for line in lines:
    array[lines.index(line)] = line.split(',')
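Printing data in the terminal returns the full text as a string, and printing lines returns each line (e.g. print(lines[1])), but the array assignment raises an error: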
array[lines.index(line)] = line.split(',')
IndexError: list assignment index out of range
file_obj is fetched from the Samba share using the pysmb library.
Answer 0 (score: 0)
For this it is better to use the Python pandas library to organize the data in a DataFrame.
For example:
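import pandas as pd

# The file is comma-separated and line 2 holds the sensor names,
# so skip the other three header lines (0-based indices 0, 2 and 3).
df = pd.read_csv('yourfile.dat', skiprows=[0, 2, 3], na_values=['NAN'])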
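Since the question gets a file-like object from pysmb rather than a path on disk, it may help to note that read_csv also accepts file-like objects. The following is only a sketch: io.BytesIO stands in for the buffer filled by pysmb, the two data rows are copied from the question's sample, and it assumes the real file is comma-separated with the sensor names on the second line.

```python
import io
import pandas as pd

# Stand-in for the file-like object filled by pysmb (assumption: raw bytes).
sample = b'''"DataFormat","Anemometric tower","Datalogger","SomeCode","LoggerOS","LoggerFileSystem","AnotherCode","Table1"
"TIMESTAMP","RECORD","Precipit1","Barometer1","Temperature1","Humidity1","Anemometer1","Windvane1","Anemometer2","Windvane2","Battery1"
"TS","RN","","hPa","C. Deg","%RH","m/s","Deg","m/s","Deg","Volts"
"","","Smp","Avg","Avg","Avg","Avg","Avg","Avg","Avg","Avg"
"2019-06-19 00:20:00",1212,"NAN",920.9402,19.44474,98.67733,9.991986,141.5792,9.892648,143.3559,11.35
"2019-06-19 00:30:00",1213,"NAN",920.6142,19.45635,99.00026,10.80979,148.0094,10.63116,150.0893,11.41
'''
file_obj = io.BytesIO(sample)

# Rewind in case the buffer was just written to, then parse it.
file_obj.seek(0)
df = pd.read_csv(
    file_obj,
    skiprows=[0, 2, 3],          # keep only the sensor-name line as header
    na_values=["NAN"],           # the logger writes missing values as "NAN"
    parse_dates=["TIMESTAMP"],   # turn the first column into datetimes
)

# Every value of one sensor is now a single column.
print(df["Barometer1"].tolist())
```

Each sensor can then be selected by name, for example df["Anemometer1"], before inserting it into its database table.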
Answer 1 (score: 0)
Another solution is to turn your lines into lists and append them to a DataFrame:
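import pandas as pd

# Each inner list is one row; the name `rows` avoids shadowing the built-in `list`.
rows = [["DataFormat","Anemometric tower","Datalogger","SomeCode","LoggerOS","LoggerFileSystem","AnotherCode","Table1"]]
# Let pandas number the eight columns; a single name such as ['col1'] would not match the row length.
df = pd.DataFrame(rows)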
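Applied to the sample from the question, the idea of turning lines into lists could look roughly like the sketch below. It assumes the text has already been read into a string (the variable text is a stand-in), uses the standard csv module to split the lines so the double quotes are removed, and takes the column names from the second line.

```python
import csv
import io
import pandas as pd

# Stand-in for the decoded file contents (two data rows from the question).
text = '''"DataFormat","Anemometric tower","Datalogger","SomeCode","LoggerOS","LoggerFileSystem","AnotherCode","Table1"
"TIMESTAMP","RECORD","Precipit1","Barometer1","Temperature1","Humidity1","Anemometer1","Windvane1","Anemometer2","Windvane2","Battery1"
"TS","RN","","hPa","C. Deg","%RH","m/s","Deg","m/s","Deg","Volts"
"","","Smp","Avg","Avg","Avg","Avg","Avg","Avg","Avg","Avg"
"2019-06-19 00:20:00",1212,"NAN",920.9402,19.44474,98.67733,9.991986,141.5792,9.892648,143.3559,11.35
"2019-06-19 00:30:00",1213,"NAN",920.6142,19.45635,99.00026,10.80979,148.0094,10.63116,150.0893,11.41
'''

rows = list(csv.reader(io.StringIO(text)))
names = rows[1]      # line 2: sensor names
records = rows[4:]   # everything after the four header lines

df = pd.DataFrame(records, columns=names)

# csv.reader yields strings, so convert the sensor columns to numbers;
# values that cannot be parsed become NaN.
sensor_cols = names[2:]
df[sensor_cols] = df[sensor_cols].apply(pd.to_numeric, errors="coerce")
```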
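Answer 2 (score: 0)
To keep the logic similar between PHP and Python, I managed to get it working with the append function.
However, as people have mentioned, the pandas library can also help.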
data = file_obj.read()
file_obj.close()
lines = data.split('\n')
array = []
for line in lines:
    array.append(line.split(','))
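The original IndexError comes from assigning to an index of an empty list: array = [] has no slot 0 yet, whereas append grows the list by one element. A plain split(',') also keeps the double quotes around each value. As a possible refinement, the sketch below uses the standard csv module and builds one list per sensor, which is what the question is after. It assumes data is already a str (bytes would need decoding first), and the sample text stands in for file_obj.read().

```python
import csv
import io

# Stand-in for the string returned by file_obj.read() (two data rows from the question).
data = '''"DataFormat","Anemometric tower","Datalogger","SomeCode","LoggerOS","LoggerFileSystem","AnotherCode","Table1"
"TIMESTAMP","RECORD","Precipit1","Barometer1","Temperature1","Humidity1","Anemometer1","Windvane1","Anemometer2","Windvane2","Battery1"
"TS","RN","","hPa","C. Deg","%RH","m/s","Deg","m/s","Deg","Volts"
"","","Smp","Avg","Avg","Avg","Avg","Avg","Avg","Avg","Avg"
"2019-06-19 00:20:00",1212,"NAN",920.9402,19.44474,98.67733,9.991986,141.5792,9.892648,143.3559,11.35
"2019-06-19 00:30:00",1213,"NAN",920.6142,19.45635,99.00026,10.80979,148.0094,10.63116,150.0893,11.41
'''

# csv.reader strips the quotes; the filter drops any blank lines.
rows = [row for row in csv.reader(io.StringIO(data)) if row]
names = rows[1]       # line 2: sensor names
readings = rows[4:]   # data lines after the four header lines

# One list per sensor: columns[name] holds every value of that column, as strings.
columns = {name: [row[i] for row in readings] for i, name in enumerate(names)}

print(columns["Barometer1"])
```

The values stay strings here, so they would still need float() (with special handling for "NAN") before being inserted into a numeric database column.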