Given a dataframe that looks like this:
             A   B
2005-09-06   5  -2
2005-09-07  -1   3
2005-09-08   4   5
2005-09-09  -8   2
2005-09-10  -2  -5
2005-09-11  -7   9
2005-09-12   2   8
2005-09-13   6  -5
2005-09-14   6  -5
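Is there a pythonic way to create a 2x2 matrix like this:
    1  0
1   a  b
0   c  d
where:
a = number of obs where the corresponding elements of columns A and B are both positive.
b = number of obs where the corresponding element is positive in column A and negative in column B.
c = number of obs where the corresponding element is negative in column A and positive in column B.
d = number of obs where the corresponding elements of columns A and B are both negative.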
For this example, the output would be:
    1  0
1   2  3
0   3  1
Answer 0 (score: 26)
It is probably easiest to use the pandas function crosstab. Borrowing from Dyno Fu's answer:
import pandas as pd
from io import StringIO
table = """dt A B
2005-09-06 5 -2
2005-09-07 -1 3
2005-09-08 4 5
2005-09-09 -8 2
2005-09-10 -2 -5
2005-09-11 -7 9
2005-09-12 2 8
2005-09-13 6 -5
2005-09-14 6 -5
"""
sio = StringIO(table)
df = pd.read_table(sio, sep=r"\s+", parse_dates=['dt'])
df.set_index("dt", inplace=True)
pd.crosstab(df.A > 0, df.B > 0)
Output:
B      False  True
A
False      1     3
True       3     2
[2 rows x 2 columns]
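Note that crosstab orders the labels False, True, while the question asks for the 1 row and column first. A minimal sketch of one way to get that layout, assuming the sample data is rebuilt inline (the frame construction and the names pos_a, pos_b and matrix are illustrative, not part of the answer): cast the boolean masks to 0/1 and reindex in the order 1, 0.

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the sketch is self-contained.
df = pd.DataFrame(
    {
        "A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
        "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5],
    },
    index=pd.date_range("2005-09-06", periods=9, freq="D"),
)

# Cast the boolean masks to 0/1 so the labels match the question.
pos_a = (df["A"] > 0).astype(int)
pos_b = (df["B"] > 0).astype(int)
tab = pd.crosstab(pos_a, pos_b)

# Put 1 before 0 on both axes; fill_value=0 covers a combination that never occurs.
matrix = tab.reindex(index=[1, 0], columns=[1, 0], fill_value=0)
print(matrix)  # cell values are [[2, 3], [3, 1]], i.e. a, b / c, d
```

The reindex step is only about presentation; the counts are the same as in the crosstab output above.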
If you want to run a Fisher exact test with scipy.stats or similar, the table can be passed in as is:
from scipy.stats import fisher_exact
tab = pd.crosstab(df.A > 0, df.B > 0)
fisher_exact(tab)
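For reference, fisher_exact returns two values, the sample odds ratio and the p-value, so they can be unpacked directly. A small self-contained sketch, assuming the same sample data rebuilt inline (the variable names are illustrative):

```python
import pandas as pd
from scipy.stats import fisher_exact

# Sample data from the question, rebuilt inline so the sketch is self-contained.
df = pd.DataFrame(
    {
        "A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
        "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5],
    }
)

tab = pd.crosstab(df["A"] > 0, df["B"] > 0)

# Pass the plain 2x2 array of counts; the default alternative is two-sided.
odds_ratio, p_value = fisher_exact(tab.to_numpy())

# For the table [[1, 3], [3, 2]] the sample odds ratio is (1 * 2) / (3 * 3).
print("odds ratio:", odds_ratio)
print("p-value:", p_value)
```

With only nine observations the test has very little power, so the p-value here is mostly useful as a check that the table is wired up correctly.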
Answer 1 (score: 16)
Let's call your dataframe data. Try:
a = data['A'] > 0
b = data['B'] > 0
data.groupby([a, b]).count()
Answer 2 (score: 6)
Here is a very useful page on the pandas crosstab function:
http://chrisalbon.com/python/pandas_crosstabs.html
So I think what you should use is:
import pandas as pd
pd.crosstab(data['A']>0, data['B']>0)
Hope that helps!
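If row and column totals are also wanted, crosstab has a margins option. A minimal sketch, assuming the sample data from the question; the string labels "positive" and "non-positive" are my own choice for readability and are not in the original answer:

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the sketch is self-contained.
data = pd.DataFrame(
    {
        "A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
        "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5],
    }
)

# Map the boolean masks to readable labels (hypothetical names).
labels = {True: "positive", False: "non-positive"}
sign_a = (data["A"] > 0).map(labels)
sign_b = (data["B"] > 0).map(labels)

# margins=True appends an "All" row and column holding the totals.
with_totals = pd.crosstab(sign_a, sign_b, margins=True)
print(with_totals)
```

The bottom-right "All" cell is the total number of observations, 9 for this sample.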
Answer 3 (score: 4)
import pandas as pd
from io import StringIO
table = """dt A B
2005-09-06 5 -2
2005-09-07 -1 3
2005-09-08 4 5
2005-09-09 -8 2
2005-09-10 -2 -5
2005-09-11 -7 9
2005-09-12 2 8
2005-09-13 6 -5
2005-09-14 6 -5
"""
sio = StringIO(table)
df = pd.read_table(sio, sep=r"\s+", parse_dates=['dt'])
df.set_index("dt", inplace=True)
a = df['A'] > 0
b = df['B'] > 0
df1 = df.groupby([a,b]).count()
print(df1["A"].unstack())
Output:
B      False  True
A
False      1     3
True       3     2
This is just lanenok's answer with unstack() added to make the result more readable. Credit should go to lanenok.
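A possible variation on the same idea, offered as a sketch rather than as part of either answer: size() counts rows per group directly, so there is no need to pick a column afterwards, and unstack(fill_value=0) keeps a cell at 0 when a sign combination never occurs. The data is rebuilt inline and the names counts and first3 are illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the sketch is self-contained.
df = pd.DataFrame(
    {
        "A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
        "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5],
    }
)

# size() counts the rows in each (A > 0, B > 0) group.
counts = df.groupby([df["A"] > 0, df["B"] > 0]).size().unstack(fill_value=0)
print(counts)

# With only the first three rows, no row has both A and B non-positive;
# fill_value=0 puts 0 in that cell instead of NaN.
first3 = df.iloc[:3]
partial = first3.groupby([first3["A"] > 0, first3["B"] > 0]).size().unstack(fill_value=0)
print(partial)
```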