Consider a simple record array structure:
import numpy as np
ijv_dtype = [
    ('I', 'i'),
    ('J', 'i'),
    ('v', 'd'),
]
ijv = np.array([
    (0, 0, 3.3),
    (0, 1, 1.1),
    (0, 1, 4.4),
    (1, 1, 2.2),
], ijv_dtype)
print(ijv) # [(0, 0, 3.3) (0, 1, 1.1) (0, 1, 4.4) (1, 1, 2.2)]
I want to compute some statistics (sum, min, max, etc.) of v, grouped by the unique combinations of I and J. Thinking in SQL terms, the expected result is:
select i, j, sum(v) as v from ijv group by i, j;
 i | j |  v
---+---+-----
 0 | 0 | 3.3
 0 | 1 | 5.5
 1 | 1 | 2.2
Here is my current attempt in NumPy:
# Get unique groups, index and inverse
u_ij, idx_ij, inv_ij = np.unique(ijv[['I', 'J']], return_index=True, return_inverse=True)
# Assemble aggregate
a_ijv = np.zeros(len(u_ij), ijv_dtype)
a_ijv['I'] = u_ij['I']
a_ijv['J'] = u_ij['J']
a_ijv['v'] = [ijv['v'][inv_ij == i].sum() for i in range(len(u_ij))]
print(a_ijv) # [(0, 0, 3.3) (0, 1, 5.5) (1, 1, 2.2)]
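(The order of the rows is not important.)
The NumPy code above is the best I can come up with, but it is ugly, and I'm not convinced that I have ordered the result correctly (although it seems to work here). I'd like to think there is a better way to do this! I'm using NumPy 1.4.1.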
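Answer 0 (score: 1)
numpy is a bit too low-level for this. I think your solution is fine if you have to use pure numpy, but if you don't mind using something with a higher level of abstraction, try pandas: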
import pandas as pd
df = pd.DataFrame({
'I': (0, 0, 0, 1),
'J': (0, 1, 1, 1),
'v': (3.3, 1.1, 4.4, 2.2)})
print(df)
print(df.groupby(['I', 'J']).sum())
Output:
   I  J    v
0  0  0  3.3
1  0  1  1.1
2  0  1  4.4
3  1  1  2.2
       v
I J
0 0  3.3
  1  5.5
1 1  2.2
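Since the question asks for several statistics (sum, min, max), here is a minimal sketch of how the same groupby could compute them in one pass and return flat rows like the SQL result. It assumes a reasonably recent pandas; the column names v_sum, v_min, and v_max are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'I': (0, 0, 0, 1),
    'J': (0, 1, 1, 1),
    'v': (3.3, 1.1, 4.4, 2.2)})

# Aggregate v per (I, J) group with several statistics at once.
stats = df.groupby(['I', 'J'])['v'].agg(['sum', 'min', 'max'])

# Hypothetical, illustrative column names. reset_index() turns the
# (I, J) index back into ordinary columns, like the SQL result.
stats.columns = ['v_sum', 'v_min', 'v_max']
flat = stats.reset_index()
print(flat)
```

By default groupby sorts the group keys, so the rows come out in (I, J) order.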
Answer 1 (score: 1)
This isn't a big improvement over what you already have, but at least it gets rid of the for loop.
# Starting with your original setup
# Get the unique ij values and the mapping from ungrouped to grouped.
u_ij, inv_ij = np.unique(ijv[['I', 'J']], return_inverse=True)
# Create a float totals array with one slot per group.
# You could do the fancy ijv_dtype thing if you wanted.
totals = np.zeros(len(u_ij))
# Here's the magic bit. You can think of it as
# totals[inv_ij] += ijv["v"]
# except the above doesn't behave as expected sadly:
# repeated indices are only counted once.
# Note: np.add.at requires NumPy 1.8 or later.
np.add.at(totals, inv_ij, ijv["v"])
print(totals)
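The fact that you are using numpy's multi-dtype stuff is a bit of a hint that you should be using pandas. It generally makes for less awkward code when you are trying to keep i, j, and v together.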
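To illustrate why the plain totals[inv_ij] += ... form falls short, and how the same idea might extend to min and max, here is a small sketch. It assumes NumPy 1.8 or later (for ufunc.at). To stay self-contained, it hard-codes the inverse index that np.unique would be expected to produce for the example data; the variable names are illustrative.

```python
import numpy as np

v = np.array([3.3, 1.1, 4.4, 2.2])
# Assumed group index per row for (0,0), (0,1), (0,1), (1,1).
inv = np.array([0, 1, 1, 2])
n_groups = inv.max() + 1

# Buffered fancy-index assignment: a repeated index is written once,
# so only one of the two rows in group 1 is counted.
buffered = np.zeros(n_groups)
buffered[inv] += v

# Unbuffered ufunc.at accumulates every occurrence.
sums = np.zeros(n_groups)
np.add.at(sums, inv, v)

# Same pattern for min and max, starting from +/- infinity.
mins = np.full(n_groups, np.inf)
np.minimum.at(mins, inv, v)
maxs = np.full(n_groups, -np.inf)
np.maximum.at(maxs, inv, v)

print(buffered, sums, mins, maxs)
```

For plain sums, np.bincount(inv, weights=v) is another option worth considering, though it does not generalize to min and max.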