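Set 10 random non-zero values per row to zero

Time: 2018-07-24 09:43:22

Tags: python pandas

I have a very sparse Pandas DataFrame with roughly 1,000 rows and roughly 10,000 columns. Most rows contain only 20-100 non-zero values. I now want to pick any 10 random non-zero values in each row and set them to 0.

This is my first attempt (very un-pandas-like):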

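import numpy as np

for i in range(df.shape[0]):
    row = df.iloc[i]
    nonZeros = np.where(row != 0)[0]  # column positions of the non-zero entries
    # replace=False so that the same position is never drawn twice
    rand = np.random.choice(nonZeros, min(10, len(nonZeros)), replace=False)
    for j in rand:
        df.iloc[i, j] = 0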

2 answers:

Answer 0: (score: 0)

Something like this?

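import numpy as np

def setrandom(x):
    x = x.copy()
    # never try to zero more entries than the row actually has,
    # otherwise the loop below would not terminate
    counter = min(10, int((x != 0).sum()))
    while counter > 0:
        # upper bound is exclusive, so every position 0..len(x)-1 can be drawn
        randindex = np.random.randint(0, len(x))
        if x.iloc[randindex] != 0:
            x.iloc[randindex] = 0
            counter -= 1
    return x

df = df.apply(setrandom, axis=1)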

This is not an optimal way of doing it, especially since your DataFrame is sparse: most random draws land on a value that is already zero and are wasted.
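To avoid the wasted draws, the sampling can be restricted to the non-zero positions. Below is a minimal sketch of one way to do that for all rows at once with NumPy. It is not from the original answer: the function name `zero_random_nonzeros_vectorized` and its `n` and `seed` parameters are hypothetical, and it assumes the DataFrame is purely numeric with no NaN values and small enough to hold as a dense array.

```python
import numpy as np
import pandas as pd

def zero_random_nonzeros_vectorized(df, n=10, seed=None):
    """Return a copy of df with up to n random non-zero values per row set to 0.

    Hypothetical helper; assumes a numeric DataFrame without NaN values.
    """
    rng = np.random.default_rng(seed)
    values = df.to_numpy(copy=True)

    # Give every cell a random key, then push the zero cells to the end
    # of the ordering so that they are only picked when a row has fewer
    # than n non-zero values.
    keys = rng.random(values.shape)
    keys[values == 0] = np.inf

    # Column positions of the n smallest keys in each row.
    n_eff = min(n, values.shape[1])
    idx = np.argpartition(keys, n_eff - 1, axis=1)[:, :n_eff]

    # Zero the picked cells; re-zeroing an already-zero cell is a no-op.
    rows = np.arange(values.shape[0])[:, None]
    values[rows, idx] = 0

    return pd.DataFrame(values, index=df.index, columns=df.columns)
```

A call such as `df = zero_random_nonzeros_vectorized(df, n=10)` would replace the row-wise `apply`. Because each cell gets an independent random key, the selected positions within a row are a uniform random subset of its non-zero positions.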

Answer 1: (score: 0)

Revised answer

You can use the following code:

You may also try it.

Maybe not the fastest way, but much more pandas-friendly.
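As an illustration of what a pandas-friendly variant could look like, here is a minimal sketch that uses a row-wise `apply` and samples only among the non-zero positions of each row. It is an assumption about the general approach, not the answerer's code: the function name `zero_random_nonzeros` and its `n` and `seed` parameters are hypothetical.

```python
import numpy as np
import pandas as pd

def zero_random_nonzeros(df, n=10, seed=None):
    """Return a copy of df with up to n random non-zero values per row set to 0.

    Hypothetical helper, written as a sketch for numeric DataFrames.
    """
    rng = np.random.default_rng(seed)

    def _zero_row(row):
        out = row.copy()
        # Integer positions of the non-zero entries in this row.
        nonzero_pos = np.flatnonzero(out.to_numpy() != 0)
        # Rows with fewer than n non-zero values lose all of them.
        k = min(n, nonzero_pos.size)
        picked = rng.choice(nonzero_pos, size=k, replace=False)
        out.iloc[picked] = 0
        return out

    return df.apply(_zero_row, axis=1)
```

It would be used as `df = zero_random_nonzeros(df, n=10)`. Sampling with `replace=False` guarantees that exactly `min(n, number of non-zeros)` distinct values are zeroed in every row.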