Count the number of columns in a CSV file

Time: 2017-12-21 07:10:45

Tags: python

I have an ordinary CSV file that starts with these two lines:

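In total I have 62458 rows, and each row (representing one house) has 75 columns (a house can have up to 75 types of furniture). I only filled in those numbers to make the example easier to follow; in the real file there is nothing between those commas, which means no house actually has all 75 types of furniture. I want to write a Python program that counts how many times each column value occurs in the CSV file. I have tried several approaches without success.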

My output should look like this:

1. Clubhouse,Fibre Ready,Fitness Corner,Aircon,Security,Maid Room,Gym,Tennis Court,Fridge,Bathtub,Swimming Pool,Utility Room,Squash Court,Playground,Oven,Parking,Washer,Pool View,Wading Pool,BBQ Pits,Balcony,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,
2. Clubhouse,Aircon,Greenery View,Fridge,Jacuzzi,Swimming Pool,Dryer,Steam Room,Playground,Fitness Corner,Washer,Gym,Balcony,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75, 
3. ...
And so on

P.S.: I can't import the file into a database because the file is not readable.

1 answer:

Answer 0 (score: 2)

You can use collections.Counter:

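from collections import Counter
import csv

counter = Counter()
# newline='' is what the csv module documentation recommends for open()
with open('furniture.csv', newline='') as fobj:
    reader = csv.reader(fobj)
    for row in reader:
        # each row is a list of strings; count every field in it
        counter.update(row)

# print each distinct value with the number of times it occurred
for k, v in counter.items():
    print('{}: {} times'.format(k, v))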

Output for the two lines:

Clubhouse: 2 times
Fibre Ready: 1 times
Fitness Corner: 2 times
Aircon: 2 times
...
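
The question mentions that in the real file many fields between the commas are empty, and the sample rows end with a trailing comma, so a plain count would also tally the empty string. A minimal sketch of one way to skip such fields follows; the helper name `count_items` and the in-memory sample data are assumptions made for illustration, not part of the original answer.

```python
import csv
import io
from collections import Counter


def count_items(fobj):
    """Count non-empty fields across all rows of a CSV file object."""
    counter = Counter()
    for row in csv.reader(fobj):
        # Strip surrounding whitespace and drop empty fields, such as the
        # one produced by a trailing comma at the end of a row.
        counter.update(field.strip() for field in row if field.strip())
    return counter


# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "Clubhouse,Aircon,Gym,,,\n"
    "Clubhouse,Fridge,,,,\n"
)
counts = count_items(sample)
print(counts["Clubhouse"])  # 2
print("" in counts)         # False, empty fields were skipped
```

With a real file, the same function could be called on the object returned by `open('furniture.csv', newline='')`.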

You can also access individual items:

>>> counter['Clubhouse']
2
>>> counter['Fibre Ready']
1
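
Two further lookups may be handy here: a Counter returns 0 for a key it has never seen instead of raising KeyError, and `most_common(n)` lists the n most frequent items. A short sketch, using a made-up list of items ("Sauna" is a hypothetical value that does not occur in the data):

```python
from collections import Counter

# Made-up items for illustration only.
counter = Counter(
    ["Clubhouse", "Aircon", "Clubhouse", "Gym", "Aircon", "Clubhouse"]
)

# A key that was never counted yields 0 rather than a KeyError.
print(counter["Sauna"])         # 0

# The two most frequent items, as (item, count) pairs.
print(counter.most_common(2))   # [('Clubhouse', 3), ('Aircon', 2)]
```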

collections.Counter is very useful for this kind of task:

  

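Dict subclass for counting hashable items. Sometimes called a bag or multiset. Elements are stored as dictionary keys and their counts are stored as dictionary values.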
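
To make the quoted description concrete, here is a small sketch with two made-up rows: each Counter behaves like a dictionary of item to count, and adding two Counters merges them the way a multiset union with summed counts would.

```python
from collections import Counter

# Two made-up rows, each counted separately.
first_row = Counter(["Clubhouse", "Gym", "Aircon"])
second_row = Counter(["Clubhouse", "Fridge"])

# Adding Counters sums the counts of matching keys.
total = first_row + second_row

# Items are dictionary keys, counts are dictionary values.
print(total["Clubhouse"])  # 2
print(total["Fridge"])     # 1
```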