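The problem I'm having is that I can't seem to add up the correct information from the csv file I'm supposed to use. I need to 'tally' the total number of entries in each column of the csv file.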
No Column Sum
0 Company 28
1 Booth 28
2 Full-Time 25
3 Full-Time Visa Sponsor 5
4 Part-Time 1
5 Internship 18
6 Freshman 7
7 Sophomore 9
8 Junior 17
9 Senior 24
10 Post-Bacs 17
11 MS 17
12 PhD 6
13 Alumni 15
This is the correct output, but I am getting:
0 Column Sum
1 Company 27
2 Booth 27
3 Full-Time 27
4 Full-Time Visa Sponsor 27
5 Part-Time 27
6 Internship 27
7 Freshman 27
8 Sophomore 27
9 Junior 27
10 Senior 27
11 Post-Bacs 27
12 MS 27
13 PhD 27
I think the problem is in how I use my dictionary. My code is below, with the csv file posted underneath it.
import csv  # Allows you to import or export spreadsheets
filename = "Spring.csv"  # I assigned the file to a variable
f = open(filename)  # I couldn't leave it default due to UTF-8 error from the original
reader = csv.reader(f)  # The reader allows you to pull data from the CSV
# I made a dictionary of the problem stated
company_dict = {0:"Company", 1:"Booth",
                2:"Full-Time", 3:"Full-Time Visa Sponsor",
                4:"Part-Time", 5:"Internship",
                6:"Freshman", 7:"Sophomore",
                8:"Junior", 9:"Senior",
                10:"Post-Bacs", 11:"MS",
                12:"PhD", 13:"Alumni"}
# Loop to organize the company_dict
for lines in company_dict:
    print(repr(lines),company_dict[lines])
# I used a list to help me get the information I wanted from the csv file
keywords = ("AIG","Baylor","CGG","Citi","ExxonMobil","Flow-Cal Inc.",
            "Global SHop Solutions","Harris Count CTS","HCSS",
            "Hitachi Consulting", "HP Inc.","INT Inc.","JPMorgan Chase & Co",
            "Leidos","McKesson","MRE Consulting Ltd.","NetIQ","PROS",
            "San Jacinto College","SAS","Smartbridge","Sogeti USA",
            "Southwest Research Institute","The Reynolds and Reynolds Company",
            "UH Enterprise Systems","U.S. Marine Corps","ValuD Consuting LLC","Wipro")
DataList = []  # I made a blank list
with f as filterf:  # This loop will look for the keywords in the file, and only add those keywords
    output_line_counter = 0  # I needed it to print with rows, so I set it to 0
    for line in filterf:
        if any(keyword in line for keyword in keywords):  # The actual code that looks for keywords in the line in my file
            output_line_counter += 1  # Adds the column (might not be necessary but it works for me)
            DataList.append(line)
CleanerData = sorted(set(DataList))  # I made a new 'cleaner' list so that it would be alphabetical without duplicates
line_counter = 0
for i in CleanerData:  # I had to do another loop to add rows again, it now prints what is required in the question
    line_counter += 1
    print(line_counter, i, end='')
DataList2 = []
data_employer = {'No': ('Column', 'Sum')}
for empdata in range(14):
    sum = 0
    for i in CleanerData:
        if i[empdata] != '':
            sum += 1
    data_employer[empdata] = (company_dict[empdata], sum)
for k in data_employer:
    print(list(data_employer.keys()).index(k), data_employer[k][0], data_employer[k][1])
Here are the contents of my csv file:
ALPHABETICAL ORDER,,,,,,,,,,,,,
,,Positions,,,,Classifications,,,,,,,
Company,Booth,Full-Time,"Full-Time Visa Sponsor",Part-Time,Internship,Freshman,Sophomore,Junior,Senior,Post-Bacs,MS,PhD,Alumni
AIG,10,,,,Yes,,,Jr,,,MS,,
Baylor College of Medicine,19,Yes,Yes,,,,,,,,,,Recent
CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
,...
Flow-Cal Inc.,16,Yes,,,Yes,,,Jr,Sr,,,,All
Global Shop Solutions,18,Yes,,,Yes,,,,Sr,PB,,,All
Harris County CTS,22,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
Hitachi Consulting,13,Yes,,,,,,,Sr,,MS,,
HP Inc.,1,Yes,,,Yes,,,Jr,,,MS,,Recent
INT Inc.,20,Yes,Yes,,Yes,,,Jr,Sr,,MS,PhD,
JPMorgan Chase & Co,3,Yes,,,Yes,,,Jr,Sr,,,,
Leidos,390,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,
McKesson,26,Yes,,,,,,,Sr,,,,
,,,,,,,,,,,,,
MRE Consulting Ltd.,2,Yes,,,,,,,Sr,PB,MS,,All
NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
San Jacinto College ,14,,,,Yes,,Soph,Jr,Sr,PB,MS,,
SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
Sogeti USA,15,Yes,,,,,,,Sr,PB,MS,,
Southwest Research Institute,12,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
The Reynolds and Reynolds Company,23,Yes,Yes,,Yes,Fr,Soph,Jr,Sr,PB,,,All
UH Enterprise Systems,9,Yes,Yes,Yes,Yes,Fr,Soph,Jr,Sr,PB,MS,PhD,All
U.S. Marine Corps,25,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,All
ValuD Consuting LLC,5,Yes,,,,,,,Sr,PB,,,All
Wipro,24,Yes,,,,,,,Sr,PB,,,
BOOTH ORDER,,,,,,,,,,,,,
,Booth,Positions,,,,Classifications,,,,,,,
Company,#,Full-Time,"Full-Time
Visa Sponsor",Part-Time,Internship,Freshman,Sophomore,Junior,Senior,Post-Bacs,MS,PhD,Alumni
HP Inc.,1,Yes,,,Yes,,,Jr,,,MS,,Recent
"MRE Consulting, Ltd.",2,Yes,,,,,,,Sr,PB,MS,,All
JPMorgan Chase & Co,3,Yes,,,Yes,,,Jr,Sr,,,,
SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
ValuD Consuting LLC,5,Yes,,,,,,,Sr,PB,,,All
NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
UH Enterprise Systems,9,Yes,Yes,Yes,Yes,Fr,Soph,Jr,Sr,PB,MS,PhD,All
AIG,10,,,,Yes,,,Jr,,,MS,,
ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
Southwest Research Institute,12,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
Hitachi Consulting,13,Yes,,,,,,,Sr,,MS,,
San Jacinto College ,14,,,,Yes,,Soph,Jr,Sr,PB,MS,,
Sogeti USA,15,Yes,,,,,,,Sr,PB,MS,,
"Flow-Cal, Inc.",16,Yes,,,Yes,,,Jr,Sr,,,,All
CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
Global Shop Solutions,18,Yes,,,Yes,,,,Sr,PB,,,All
Baylor College of Medicine,19,Yes,Yes,,,,,,,,,,Recent
"INT, Inc.",20,Yes,Yes,,Yes,,,Jr,Sr,,MS,PhD,
PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
Harris County CTS,22,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
The Reynolds and Reynolds Company,23,Yes,Yes,,Yes,Fr,Soph,Jr,Sr,PB,,,All
Wipro,24,Yes,,,,,,,Sr,PB,,,
U.S. Marine Corps,25,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,All
McKesson,26,Yes,,,,,,,Sr,,,,
Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
Leidos,30,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,
This is a copy of the csv file pasted from Excel.
ALPHABETICAL ORDER
Positions Classifications
Company Booth Full-Time Full-Time Visa Sponsor Part-Time Internship Freshman Sophomore Junior Senior Post-Bacs MS PhD Alumni
AIG 10 Yes Jr MS
Baylor College of Medicine 19 Yes Yes Recent
CGG 17 Yes Yes MS PhD Recent
Citi 27/28 Yes Yes Jr Sr
ExxonMobil 11 Yes Yes Fr Soph Jr Sr PB
...
Flow-Cal Inc. 16 Yes Yes Jr Sr All
Global Shop Solutions 18 Yes Yes Sr PB All
Harris County CTS 22 Yes Yes Jr Sr PB MS PhD All
HCSS 29 Yes Yes Fr Soph Jr Sr PB MS Recent
Hitachi Consulting 13 Yes Sr MS
HP Inc. 1 Yes Yes Jr MS Recent
INT Inc. 20 Yes Yes Yes Jr Sr MS PhD
JPMorgan Chase & Co 3 Yes Yes Jr Sr
Leidos 390 Yes Yes Fr Soph Jr Sr PB MS
McKesson 26 Yes Sr
MRE Consulting Ltd. 2 Yes Sr PB MS All
NetIQ 7 Yes Soph Jr Sr PB
PROS 21 Yes Sr MS PhD All
San Jacinto College 14 Yes Soph Jr Sr PB MS
SAS 4 Yes Yes Fr Soph Jr Sr PB MS Recent
Smartbridge 8 Yes Sr PB MS
Sogeti USA 15 Yes Sr PB MS
Southwest Research Institute 12 Yes Yes Jr Sr PB MS PhD All
The Reynolds and Reynolds Company 23 Yes Yes Yes Fr Soph Jr Sr PB All
UH Enterprise Systems 9 Yes Yes Yes Yes Fr Soph Jr Sr PB MS PhD All
U.S. Marine Corps 25 Yes Yes Fr Soph Jr Sr PB MS All
ValuD Consuting LLC 5 Yes Sr PB All
Wipro 24 Yes Sr PB
BOOTH ORDER
Booth Positions Classifications
Company # Full-Time "Full-Time
Visa Sponsor" Part-Time Internship Freshman Sophomore Junior Senior Post-Bacs MS PhD Alumni
HP Inc. 1 Yes Yes Jr MS Recent
MRE Consulting, Ltd. 2 Yes Sr PB MS All
JPMorgan Chase & Co 3 Yes Yes Jr Sr
SAS 4 Yes Yes Fr Soph Jr Sr PB MS Recent
ValuD Consuting LLC 5 Yes Sr PB All
NetIQ 7 Yes Soph Jr Sr PB
Smartbridge 8 Yes Sr PB MS
UH Enterprise Systems 9 Yes Yes Yes Yes Fr Soph Jr Sr PB MS PhD All
AIG 10 Yes Jr MS
ExxonMobil 11 Yes Yes Fr Soph Jr Sr PB
Southwest Research Institute 12 Yes Yes Jr Sr PB MS PhD All
Hitachi Consulting 13 Yes Sr MS
San Jacinto College 14 Yes Soph Jr Sr PB MS
Sogeti USA 15 Yes Sr PB MS
Flow-Cal, Inc. 16 Yes Yes Jr Sr All
CGG 17 Yes Yes MS PhD Recent
Global Shop Solutions 18 Yes Yes Sr PB All
Baylor College of Medicine 19 Yes Yes Recent
INT, Inc. 20 Yes Yes Yes Jr Sr MS PhD
PROS 21 Yes Sr MS PhD All
Harris County CTS 22 Yes Yes Jr Sr PB MS PhD All
The Reynolds and Reynolds Company 23 Yes Yes Yes Fr Soph Jr Sr PB All
Wipro 24 Yes Sr PB
U.S. Marine Corps 25 Yes Yes Fr Soph Jr Sr PB MS All
McKesson 26 Yes Sr
Citi 27/28 Yes Yes Jr Sr
HCSS 29 Yes Yes Fr Soph Jr Sr PB MS Recent
Leidos 30 Yes Yes Fr Soph Jr Sr PB MS
Any comments or suggestions? I CAN'T use pandas, as much as I would like to :)
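Answer 0 (score: 0)
The code has a small bug in how it computes the sum: each item of CleanerData is a raw line of text, so i[empdata] picks out a single character of that line (which is never empty) rather than a column. Splitting the line on commas first gives the columns. Here is the corrected version: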
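DataList2 = []
data_employer = {'No': ('Column', 'Sum')}
for empdata in range(14):
    sum = 0
    for i in CleanerData:
        # Split the raw line into columns; strip the trailing newline first
        # so that an empty last column (Alumni) is not counted as filled
        k = i.rstrip('\n').split(',')
        if k[empdata] != '':
            sum += 1
    data_employer[empdata] = (company_dict[empdata], sum)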
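Splitting on ',' by hand will miscount rows whose company name itself contains a comma, such as "MRE Consulting, Ltd." in the booth-order half of the file. A possible alternative, sketched here under assumptions, is to let the standard-library csv module do the parsing. SAMPLE is a made-up three-row extract in the same layout as the question's file, and count_filled is a hypothetical helper, not something from the question.

```python
import csv
import io

# Made-up sample (an assumption for illustration, not the real Spring.csv).
SAMPLE = '''Company,Booth,Full-Time,Alumni
AIG,10,,
"MRE Consulting, Ltd.",2,Yes,All
CGG,17,Yes,Recent
'''

def count_filled(rows, n_columns):
    """Return, for each column index, how many rows have a non-empty cell."""
    counts = [0] * n_columns
    for row in rows:
        for col in range(n_columns):
            # Rows shorter than the header are treated as empty in that column.
            if col < len(row) and row[col].strip():
                counts[col] += 1
    return counts

# Why the original loop counted every row: indexing a str yields one character.
line = 'AIG,10,,\n'
print(repr(line[2]))                  # a single character of the raw line
print(line.rstrip('\n').split(','))   # the actual columns

# csv.reader handles quoted fields, so the embedded comma stays in one cell.
reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)
rows = list(reader)

counts = count_filled(rows, len(header))
print('No', 'Column', 'Sum')
for number, (name, total) in enumerate(zip(header, counts)):
    print(number, name, total)
```

Printing the 'No Column Sum' header separately and numbering the rows with enumerate also avoids storing the header as a dictionary entry, which is what shifts the row numbers by one in the output shown in the question.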