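Element-wise division of sparse matrices, ignoring 0/0

Time: 2018-05-25 13:18:02

Tags: python numpy scipy sparse-matrix division

I have two sparse matrices E and D, which have nonzero entries in the same positions. I would like to obtain E/D as a sparse matrix, defined only where D is nonzero.

For example, take the following code: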

import numpy as np
import scipy.sparse  # import the submodule explicitly so scipy.sparse is available

E_full = np.matrix([[1.4536000e-02, 0.0000000e+00, 0.0000000e+00, 1.7914321e+00, 2.6854320e-01, 4.1742600e-01, 0.0000000e+00],
                    [9.8659000e-03, 0.0000000e+00, 0.0000000e+00, 1.9106752e+00, 5.7283640e-01, 1.4840370e-01, 0.0000000e+00],
                    [1.3920000e-04, 0.0000000e+00, 0.0000000e+00, 9.4346500e-02, 2.8285900e-02, 4.3967800e-02, 0.0000000e+00],
                    [0.0000000e+00, 4.5182676e+00, 0.0000000e+00, 0.0000000e+00, 7.3000000e-06, 1.5100000e-05, 4.0746900e-02],
                    [0.0000000e+00, 0.0000000e+00, 3.4002088e+00, 4.6826200e-02, 0.0000000e+00, 2.4246900e-02, 3.4529236e+00]])
D_full = np.matrix([[0.36666667, 0.        , 0.        , 0.33333333, 0.2       , 0.1       , 0.        ],
                    [0.23333333, 0.        , 0.        , 0.33333333, 0.4       , 0.03333333, 0.        ],
                    [0.06666667, 0.        , 0.        , 0.33333333, 0.4       , 0.2       , 0.        ],
                    [0.        , 0.63636364, 0.        , 0.        , 0.04545455, 0.03030303, 0.28787879],
                    [0.        , 0.        , 0.33333333, 0.33333333, 0.        , 0.22222222, 0.11111111]])
E = scipy.sparse.dok_matrix(E_full)
D = scipy.sparse.dok_matrix(D_full)

The division E/D then produces a full (dense) matrix.

matrix([[3.96436360e-02,            nan,            nan, 5.37429635e+00, 1.34271600e+00, 4.17426000e+00,            nan],
        [4.22824292e-02,            nan,            nan, 5.73202566e+00, 1.43209100e+00, 4.45211145e+00,            nan],
        [2.08799990e-03,            nan,            nan, 2.83039503e-01, 7.07147500e-02, 2.19839000e-01,            nan],
        [           nan, 7.10013476e+00,            nan,            nan, 1.60599984e-04, 4.98300005e-04, 1.41541862e-01],
        [           nan,            nan, 1.02006265e+01, 1.40478601e-01,            nan, 1.09111051e-01, 3.10763127e+01]])

I also tried a different package.

import sparse
sparse.COO(E) / sparse.COO(D)

This gives me an error.

ValueError: Performing this operation would produce a dense result: <ufunc 'true_divide'>

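So it also tries to create a dense matrix.

I understand that this happens because 0/0 = nan. But I am not interested in those values anyway. How can I avoid computing them?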

3 Answers:

Answer 0: (score: 1)

Update: (inspired by sacul) Create an empty dok_matrix and, using nonzero, fill in only the positions where D is nonzero. (This should also work for sparse matrices other than dok_matrix.)

F = scipy.sparse.dok_matrix(E.shape)
F[D.nonzero()] = E[D.nonzero()] / D[D.nonzero()]

You can also try the update + nonzero approach for dok_matrix.

nonzero_idx = [tuple(l) for l in np.transpose(D.nonzero())]
D.update({k: E[k]/D[k] for k in nonzero_idx})

First, we use nonzero to find the indices in matrix D that are not 0. Then we pass those indices to the update method of D, supplying the dictionary

{k: E[k]/D[k] for k in nonzero_idx}

so that the values in D are updated according to this dictionary.

Explanation:

What D.update({k: E[k]/D[k] for k in nonzero_idx}) does is

for k in {k: E[k]/D[k] for k in nonzero_idx}.keys():
    D[k] = E[k]/D[k]

Note that this changes D in place. If you want to create a new sparse matrix instead of modifying D, copy D to another matrix, e.g. ret.
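If modifying D in place is undesirable, the same idea can be sketched without touching either input: read the values of E and D at the nonzero positions of D and build a new matrix from them. The helper name divide_on_nonzero and the small test matrices are hypothetical, and converting to CSR first is an assumption made here so that the fancy indexing behaves the same for any input format.

```python
import numpy as np
import scipy.sparse


def divide_on_nonzero(E, D):
    """Return E / D as a CSR matrix, evaluated only where D is nonzero."""
    E_csr = scipy.sparse.csr_matrix(E)
    D_csr = scipy.sparse.csr_matrix(D)
    # Index pairs of the entries of D that are actually nonzero.
    rows, cols = D_csr.nonzero()
    # Fancy indexing may return a 1 x n matrix; flatten to plain 1-D arrays.
    num = np.asarray(E_csr[rows, cols]).ravel()
    den = np.asarray(D_csr[rows, cols]).ravel()
    # Assemble the quotient on the sparsity pattern of D only.
    return scipy.sparse.csr_matrix((num / den, (rows, cols)), shape=D_csr.shape)


# Small hypothetical inputs with a shared sparsity pattern.
E_small = scipy.sparse.dok_matrix(np.array([[1.0, 0.0], [0.0, 3.0]]))
D_small = scipy.sparse.dok_matrix(np.array([[2.0, 0.0], [0.0, 4.0]]))
F_small = divide_on_nonzero(E_small, D_small)
print(F_small.toarray())
```

Because no division is ever evaluated where D is zero, no nan values are produced and neither E nor D is changed.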

Answer 1: (score: 0)

I don't think the nan values are a problem; this is the expected result.

If you want to replace nan with 0, you can use np.nan_to_num (doc: https://docs.scipy.org/doc/numpy/reference/generated/numpy.nan_to_num.html).
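As a rough sketch of that suggestion, the quotient can be cleaned with np.nan_to_num and converted back to a sparse matrix. The small matrices below are made up for illustration. Note that this still materializes the full quotient in memory, so it only helps when the matrices are small enough to be stored densely.

```python
import numpy as np
import scipy.sparse

# Small hypothetical inputs with a shared sparsity pattern.
E = scipy.sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 3.0]]))
D = scipy.sparse.csr_matrix(np.array([[2.0, 0.0], [0.0, 4.0]]))

ratio = E / D  # contains nan wherever D is zero
# Depending on the SciPy version the quotient may be dense or sparse,
# so normalize it to a plain ndarray either way.
dense_ratio = ratio.toarray() if scipy.sparse.issparse(ratio) else np.asarray(ratio)

cleaned = np.nan_to_num(dense_ratio)       # nan -> 0.0
result = scipy.sparse.csr_matrix(cleaned)  # zeros are dropped again here
print(result.toarray())
```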

Answer 2: (score: 0)

A simple implementation using multiply:

def sparse_divide_nonzero(a, b):
    inv_b = b.copy()
    inv_b.data = 1 / inv_b.data
    return a.multiply(inv_b)

Used as:

import scipy as sp
import scipy.sparse

N, M = 4, 4
M1 = sp.sparse.random(N, M, 0.5, 'csr')
M2 = sp.sparse.random(N, M, 0.5, 'csr')

M3 = sparse_divide_nonzero(M1, M2)

print(M1, '\n')
#   (0, 1)  0.9360024198546736
#   (1, 1)  0.625073080022902
#   (1, 2)  0.4086612951451881
#   (2, 0)  0.06864456080221182
#   (2, 1)  0.9871542989102963
#   (2, 3)  0.4371900022237898
#   (3, 0)  0.12121502419640318
#   (3, 3)  0.22950388104392383 

print(M2, '\n')
#   (1, 0)  0.9753308317090571
#   (1, 2)  0.29870724277296024
#   (1, 3)  0.21116220574550637
#   (2, 1)  0.5039729514070662
#   (2, 2)  0.4463809800134303
#   (3, 0)  0.36751994181969416
#   (3, 1)  0.6189763803260612
#   (3, 2)  0.3870101687623324 

print(M3, '\n')
#   (1, 2)  1.368099719817645
#   (2, 1)  1.9587446035629748
#   (3, 0)  0.3298189034212229