PCA score / score_samples raises ValueError when n_samples is less than n_features

Time: 2018-12-21 18:14:00

Tags: python numpy scikit-learn pca

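Running PCA on data with fewer samples than features (n_samples < n_features) and then calling score or score_samples raises a ValueError.

The following code raises the error. Here n_samples is 5, and the error can be avoided by setting n_components < 5.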

from sklearn.decomposition import PCA
import numpy as np

pca = PCA()  # n_components left at its default
train = np.random.rand(5,100)  # n_samples=5, n_features=100
pca.fit(train)
pca.score(np.random.rand(5,100))  # raises ValueError
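
The workaround mentioned above (n_components < 5) can be sketched as follows. This is a minimal sketch, not a confirmed fix: the value n_components=3 and the fixed random seed are assumptions chosen for illustration, and behaviour may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)  # fixed seed (assumption) for reproducibility
train = rng.random((5, 100))    # n_samples=5, n_features=100
test = rng.random((5, 100))

# Keep fewer components than samples, so that some variance is left over
# for the noise term of the probabilistic PCA model that score() relies on.
pca = PCA(n_components=3)
pca.fit(train)

score = pca.score(test)  # average log-likelihood of the test samples
print(np.isfinite(score))
```

Any value below 5 satisfies the condition stated in the question; 3 is an arbitrary pick.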

Expected result:
score should work with the default n_components setting when n_samples < n_features.

Error log:

Traceback (most recent call last):
  File "", line 7, in
    pca.score(np.random.rand(5,100))
  File "C:\ProgramData\Anaconda3\envs\python2\lib\site-packages\sklearn\decomposition\pca.py", line 594, in score
    return np.mean(self.score_samples(X))
  File "C:\ProgramData\Anaconda3\envs\python2\lib\site-packages\sklearn\decomposition\pca.py", line 569, in score_samples
    precision = self.get_precision()
  File "C:\ProgramData\Anaconda3\envs\python2\lib\site-packages\sklearn\decomposition\base.py", line 76, in get_precision
    np.dot(linalg.inv(precision), components_))
  File "C:\ProgramData\Anaconda3\envs\python2\lib\site-packages\scipy\linalg\basic.py", line 946, in inv
    a1 = _asarray_validated(a, check_finite=check_finite)
  File "C:\ProgramData\Anaconda3\envs\python2\lib\site-packages\scipy\_lib\_util.py", line 238, in _asarray_validated
    a = toarray(a)
  File "C:\ProgramData\Anaconda3\envs\python2\lib\site-packages\numpy\lib\function_base.py", line 1215, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
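
One possible explanation, offered as an assumption and not confirmed by the traceback: PCA centers the data before decomposing it, so 5 samples span at most 4 directions with nonzero variance, and a fifth component would carry essentially zero variance. The NumPy sketch below only checks that rank argument on random data of the same shape; it does not reproduce scikit-learn's internals.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (assumption) for reproducibility
train = rng.random((5, 100))    # same shape as in the question

# Center each feature, as PCA does before the decomposition.
centered = train - train.mean(axis=0)

# Singular values of the centered data, largest first.
singular_values = np.linalg.svd(centered, compute_uv=False)
rank = np.linalg.matrix_rank(centered)
ratio = singular_values[-1] / singular_values[0]

print(rank)   # numerical rank of the centered data
print(ratio)  # smallest singular value relative to the largest
```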

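1 answer:

Answer 0 (score: 1)

If the .so has debug information, this is trivial: run the application under a debugger, repeat the steps you described, and add some breakpoints. There are many debuggers and debugging options. For example, you can run gdb from the console and have it execute the application: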

>gdb yourAppPath

Or attach to a running process:

>gdb
(gdb) attach runningProcessId

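Once you are inside gdb, you can set breakpoints wherever you like (you should read the gdb documentation).

Another option is to use a GUI front end for gdb (e.g. kgdb); here you can find a list of front ends. In addition, most IDEs available on Linux have good graphical integration with gdb.

Now, if the library (.so) files have no debug symbols, you should recompile them with the compiler's option for generating debug information. With gcc, that is '-g'.

If the libraries are provided by your distribution, most of them have separate packages for debug symbols. Once those packages are installed, the symbols become available and gdb should load them automatically. For example, openSUSE has a glibc package and a glibc-debuginfo package, and the latter contains the symbols for glibc.

Here is a minimal example of using gdb on one of my projects:

First, I invoke gdb with the search paths for all the libraries, because they are not in the system path. If your libraries are "installed", they can be reached through the system path and you do not need to specify LD_LIBRARY_PATH.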

>LD_LIBRARY_PATH=.:~/projects/asdstoolkit2/asdscore/:~/projects/asdstoolkit2/asdscrypto:~/projects/asdstoolkit2/asdsnet:~/projects/asdstoolkit2/codb/codb gdb ../clibrarian/clibrarian 

#output

Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ../clibrarian/clibrarian...
(gdb) 

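Then you can call start, which forces gdb to load all the library symbols and places a breakpoint at the start of the program. (You can also load the symbols without doing this; I just don't remember how right now, so check the documentation.)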

(gdb)start
Temporary breakpoint 1 at 0x407aa9: file ../../librarian/clibrarian/main.cpp, line 16.
Starting program: /home/pablo/projects/build-librarian-Desktop-Debug/clibrarian/clibrarian 
Missing separate debuginfos, use: zypper install glibc-debuginfo-2.32-1.1.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffdbf8) at ../../librarian/clibrarian/main.cpp:16
16          qputenv("QT_FORCE_STDERR_LOGGING", QByteArray("1"));
Missing separate debuginfos, use: zypper install krb5-debuginfo-1.18.2-3.2.x86_64 libQt5Core5-debuginfo-5.15.1-2.1.x86_64 libQt5Gui5-debuginfo-5.15.1-2.1.x86_64 libQt5Network5-debuginfo-5.15.1-2.1.x86_64 libQt5Sql5-debuginfo-5.15.1-2.1.x86_64 libX11-6-debuginfo-1.6.12-1.1.x86_64 libXau6-debuginfo-1.0.9-1.7.x86_64 libbrotlicommon1-debuginfo-1.0.9-1.1.x86_64 libbrotlidec1-debuginfo-1.0.9-1.1.x86_64 libbz2-1-debuginfo-1.0.8-2.20.x86_64 libcom_err2-debuginfo-1.45.6-1.19.x86_64 libcurl4-debuginfo-7.72.0-1.2.x86_64 libdouble-conversion3-debuginfo-3.1.5-3.4.x86_64 libexiv2-27-debuginfo-0.27.3-2.1.x86_64 libexpat1-debuginfo-2.2.10-1.1.x86_64 libffi8-debuginfo-3.3.git30-1.13.x86_64 libfontconfig1-debuginfo-2.13.1-2.8.x86_64 libfreetype6-debuginfo-2.10.2-1.3.x86_64 libgcc_s1-debuginfo-10.2.1+git583-1.2.x86_64 libglib-2_0-0-debuginfo-2.64.6-1.1.x86_64 libglvnd-debuginfo-1.3.2-2.1.x86_64 libgmp10-debuginfo-6.2.0-3.3.x86_64 libgnutls30-debuginfo-3.6.15-1.1.x86_64 libgpg-error0-debuginfo-1.39-1.1.x86_64 libgraphite2-3-debuginfo-1.3.14-1.2.x86_64 libharfbuzz0-debuginfo-2.7.2-1.1.x86_64 libhogweed6-debuginfo-3.6-1.5.x86_64 libicu67-debuginfo-67.1-2.3.x86_64 libidn12-debuginfo-1.36-1.2.x86_64 libidn2-0-debuginfo-2.3.0-3.2.x86_64 libjpeg8-debuginfo-8.2.2-60.2.x86_64 libldap-2_4-2-debuginfo-2.4.53-57.2.x86_64 liblz4-1-debuginfo-1.9.2-2.1.x86_64 liblzma5-debuginfo-5.2.5-1.16.x86_64 libmodman1-debuginfo-2.0.1-18.10.x86_64 libnettle8-debuginfo-3.6-1.5.x86_64 libopenssl1_1-debuginfo-1.1.1g-2.13.x86_64 libp11-kit0-debuginfo-0.23.20-2.1.x86_64 libpcre2-16-0-debuginfo-10.35-1.4.x86_64 libpng16-16-debuginfo-1.6.37-1.7.x86_64 libpodofo0_9_6-debuginfo-0.9.6-4.8.x86_64 libproxy1-debuginfo-0.4.15-9.1.x86_64 libsasl2-3-debuginfo-2.1.27-3.5.x86_64 libssh4-debuginfo-0.9.5-1.1.x86_64 libstdc++6-debuginfo-10.2.1+git583-1.2.x86_64 libsystemd0-debuginfo-246.6-1.1.x86_64 libtag1-debuginfo-1.11.2~git20190725.79bc9ccf-2.3.x86_64 libtasn1-6-debuginfo-4.16.0-1.6.x86_64 libtiff5-debuginfo-4.1.0-2.4.x86_64 
libunistring2-debuginfo-0.9.10-2.8.x86_64 libz1-debuginfo-1.2.11-16.1.x86_64 libzip5-debuginfo-1.7.3-1.2.x86_64 libzstd1-debuginfo-1.4.5-2.4.x86_64

Then you can list the source code of a method (you can press Tab to autocomplete):

(gdb) list AIcon::AIcon
file: "aicon.cpp", line number: 12, symbol: "AIcon::AIcon()"
7       #include <QTextStream>
8       extern QStringList AIcon_google_catnames;
9       extern QList<QList<GoogleIcon>* > AIcon_google_Cats;
10      static QStringList AIcon_texts=QStringList()<<QString();
11      static QStringList AIcon_files=QStringList()<<QString();
12      AIcon::AIcon()
13      {
14          m_itype=Icon_Null;
15          m_icon=0;
16      }
file: "aicon.cpp", line number: 17, symbol: "AIcon::AIcon(GoogleIcon)"
12      AIcon::AIcon()
13      {
14          m_itype=Icon_Null;
15          m_icon=0;
16      }
17      AIcon::AIcon(GoogleIcon i)
18      {
19          m_itype=Icon_Google;
20          m_icon=i;
21      }
file: "aicon.cpp", line number: 22, symbol: "AIcon::AIcon(QString const&, bool)"
17      AIcon::AIcon(GoogleIcon i)
18      {
19          m_itype=Icon_Google;
20          m_icon=i;
21      }
22      AIcon::AIcon(const QString &text,bool ti)
23      {
24          if (ti)
25              setTextIcon(text);
26          else
file: "aicon.cpp", line number: 29, symbol: "AIcon::AIcon(AI18n::CountryCode)"
24          if (ti)
25              setTextIcon(text);
26          else
27              setFileIcon(text);
28      }
29      AIcon::AIcon(AI18n::CountryCode cc)
30      {
31          setFlagIcon(cc);
32      }
33      AIcon::AIcon(AI18n::LanguageCode cc)
file: "aicon.cpp", line number: 33, symbol: "AIcon::AIcon(AI18n::LanguageCode)"
28      }
29      AIcon::AIcon(AI18n::CountryCode cc)
30      {
31          setFlagIcon(cc);
32      }
33      AIcon::AIcon(AI18n::LanguageCode cc)
34      {
35          setLanguageIcon(cc);
36      }
37

Finally, you can set a breakpoint and continue execution:

(gdb) break 35
Breakpoint 2 at 0x7ffff7f04651: file aicon.cpp, line 35.
(gdb) continue
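
The interactive session above can also be scripted. The following is a minimal sketch, assuming gdb's -batch, -ex and --args command-line options and its "set breakpoint pending on" setting; the helper name build_gdb_command and the placeholder library directory are hypothetical and not part of the original answer.

```python
import os


def build_gdb_command(program, breakpoints, lib_dirs=()):
    """Return (argv, env) for a non-interactive gdb run.

    Hypothetical helper: it only assembles the command, it does not run gdb.
    """
    argv = ["gdb", "-batch"]
    # Allow breakpoints in shared libraries that are not loaded yet.
    argv += ["-ex", "set breakpoint pending on"]
    for location in breakpoints:
        # Each -ex passes one gdb command; they are executed in order.
        argv += ["-ex", f"break {location}"]
    # Run until the first stop, print a backtrace, then let gdb exit.
    argv += ["-ex", "run", "-ex", "backtrace", "--args", program]

    env = dict(os.environ)
    if lib_dirs:
        # Same idea as prefixing the gdb call with LD_LIBRARY_PATH=...
        env["LD_LIBRARY_PATH"] = os.pathsep.join(lib_dirs)
    return argv, env


argv, env = build_gdb_command(
    "../clibrarian/clibrarian",       # program path taken from the session above
    ["aicon.cpp:35"],                 # file:line form of the breakpoint
    lib_dirs=[".", "/path/to/libs"],  # placeholder directories (assumption)
)
print(" ".join(argv))
```

On a machine where gdb is installed, the result could be passed to subprocess.run(argv, env=env).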