I have a file in which every line is a string. I need the data in 2D form, so I read the file with the following code:
with open("User1.txt", "r") as file:
    Data = [line.split() for line in file]
The data in my file looks like this:
cpp
sh
xrdb
cpp
sh
xrdb
mkpts
env
csh
csh
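To make the shape of the loaded data concrete, here is a minimal sketch of what the list comprehension above produces. It assumes Python 3 and uses an in-memory `io.StringIO` buffer as a stand-in for `User1.txt`; the variable names `sample` and `data` are illustrative only.

```python
import io

# Stand-in for User1.txt: one command name per line (values taken from the sample above).
sample = "cpp\nsh\nxrdb\nmkpts\nenv\ncsh\n"

with io.StringIO(sample) as file:
    data = [line.split() for line in file]

# Each row is a one-element list of str, so the result is nested ("2D")
# but it holds text, not numbers.
print(data[:3])
print(type(data[0][0]).__name__)
```

So the structure is a list of single-item lists of strings, which matters for the conversion attempts further down.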
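I want to train a hidden Markov model on this data. My data is in 2D form, but whenever I pass it to the following model I get an error saying it must be in float format: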
model = hmm.GaussianHMM(n_components=2)
model.fit(Train)
The error is as follows:
ValueError                                Traceback (most recent call last)
<ipython-input-23-ac384d51f21d> in <module>()
      1 remodel = hmm.GaussianHMM(n_components=2)
----> 2 remodel.fit(Train)

/usr/local/lib/python2.7/dist-packages/hmmlearn/base.pyc in fit(self, X, lengths)
    422 """
    423 X = check_array(X)
--> 424 self._init(X, lengths=lengths)
    425 self._check()
    426

/usr/local/lib/python2.7/dist-packages/hmmlearn/hmm.pyc in _init(self, X, lengths)
    193 kmeans = cluster.KMeans(n_clusters=self.n_components,
    194 random_state=self.random_state)
--> 195 kmeans.fit(X)
    196 self.means_ = kmeans.cluster_centers_
    197 if 'c' in self.init_params or not hasattr(self, "covars_"):

/usr/local/lib/python2.7/dist-packages/sklearn/cluster/k_means_.pyc in fit(self, X, y, sample_weight)
    969 tol=self.tol, random_state=random_state, copy_x=self.copy_x,
    970 n_jobs=self.n_jobs, algorithm=self.algorithm,
--> 971 return_n_iter=True)
    972 return self
    973

/usr/local/lib/python2.7/dist-packages/sklearn/cluster/k_means_.pyc in k_means(X, n_clusters, sample_weight, init, precompute_distances, n_init, max_iter, verbose, tol, random_state, copy_x, n_jobs, algorithm, return_n_iter)
    309 order = "C" if copy_x else None
    310 X = check_array(X, accept_sparse='csr', dtype=[np.float64, np.float32],
--> 311 order=order, copy=copy_x)
    312 # verify that the number of samples given is larger than k
    313 if _num_samples(X) < n_clusters:

/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.pyc in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    525 try:
    526 warnings.simplefilter('error', ComplexWarning)
--> 527 array = np.asarray(array, dtype=dtype, order=order)
    528 except ComplexWarning:
    529 raise ValueError("Complex data not supported\n"

/usr/local/lib/python2.7/dist-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
    490
    491 """
--> 492 return array(a, dtype, copy=False, order=order)
    493
    494

ValueError: could not convert string to float: cpp
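The last frame of the traceback shows the failure happening when NumPy tries to build a float array: a token such as `cpp` has no numeric value. One possible workaround, sketched below, is to map each distinct command name to an integer code before building the array. The helper `encode_tokens` is hypothetical (it is not part of hmmlearn or scikit-learn), and it assumes exactly one token per line, as in the sample data.

```python
def encode_tokens(rows):
    """Map each distinct token to an integer code, in order of first appearance."""
    codes = {}      # token -> integer code
    encoded = []    # one [code] row per input row, so the result stays 2D
    for row in rows:
        token = row[0]  # assumption: exactly one token per line
        if token not in codes:
            codes[token] = len(codes)
        encoded.append([codes[token]])
    return encoded, codes


# Rows in the shape produced by `line.split()` on the sample data.
rows = [["cpp"], ["sh"], ["xrdb"], ["cpp"], ["sh"],
        ["xrdb"], ["mkpts"], ["env"], ["csh"], ["csh"]]

encoded, codes = encode_tokens(rows)
print(encoded)
print(codes)
```

Passing `encoded` to `numpy.array` would give a numeric column of shape (n_samples, 1). Note that the codes are arbitrary labels with no meaningful order, so a Gaussian emission model would treat them as real-valued measurements; a discrete-emission HMM may be a better fit, and which class provides that depends on the installed hmmlearn version.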
I tried converting the data to float with the following code:
1. Data1=map(float,Data)
2. Data=float(Train)
But that gives the following error:
TypeError                                 Traceback (most recent call last)
<ipython-input-35-f990bdf7a675> in <module>()
----> 1 Data1=map(float,Train)
TypeError: float() argument must be a string or a number
Can anyone suggest what is wrong with my code?
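For reference, the `TypeError` from the conversion attempt can be reproduced in isolation. This is a small sketch assuming each element of the data is a one-item list such as `["cpp"]`, which is what `line.split()` returns for these lines; the names `row`, `list_error`, and `string_error` are illustrative.

```python
row = ["cpp"]  # one element of the loaded data, as produced by line.split()

try:
    float(row)      # a list is neither a string nor a number
except TypeError as exc:
    list_error = type(exc).__name__

try:
    float(row[0])   # a string, but not one that spells a number
except ValueError as exc:
    string_error = type(exc).__name__

print(list_error, string_error)
```

In other words, `map(float, ...)` hands `float()` a whole list per row, and even unwrapping the list would still fail because the strings are command names rather than numeric text.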