Python code to calculate average age, highest salary, etc. from a .txt file

Time: 2019-07-18 23:31:47

Tags: python function dictionary

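First of all, sorry for my English (it is not my native language). I am new to Python (and to programming in general). I have a .txt file with 5 columns (the column headers are not in the file): "name, age, occupation, salary, years in the company"

I need to write a function that prints a dictionary of the following form: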

{
    'average_age': $avg,
    'best_paid_job': $best_paid,
    'best_paid_employee': $best_paid_employee,
    'no_employees': $no_employees,
    'top_3_jobs': [$job1,$job2, $job3],
    'seniors': $no_seniors,
    'middle': $no_middle,
    'juniors': $no_juniors
}

where

$avg: float, the average age of the employees.
$best_paid: int, the highest salary.
$best_paid_employee: string, the name of the employee with the highest salary.
$no_employees: int, the number of employees.
[$job1, $job2, $job3]: list of strings, the 3 most common jobs.
$no_seniors: the number of senior employees in the company. An employee is a senior if they have more than 5 years of seniority ('years_in_the_company' is greater than 5).
$no_middle: the number of middle employees in the company. An employee is middle if they have 3 to 5 years of seniority.
$no_juniors: the number of junior employees in the company. An employee is a junior if they have less than 3 years of seniority.

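The catch is that I cannot use any predefined Python modules (e.g. csv, numpy, ...); I need basic code. Could you help with parts of the code? I have been trying my best for the past week.

Here is my file (data.txt):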

Helen   20  Network Designer    5449    9
Jasmine 40  Software Architect  2536    1
Phoebe  28  Software Engineer   2627    7
Aysha   34  Software Developer  6441    3
Madeleine   26  Systems Engineer    5948    6
Christina   27  Python Developer    8366    5
Melissa 29  Data Scientist  6262    4
Marie   44  Researcher  6936    6
Tamara  40  System Administrator    9727    1
Freya   43  Software Engineer in Test   5686    10
Charles 43  System Administrator    3114    8
John    24  Software Engineer in Test   7035    4
Joe 30  Network Designer    2916    4
Elmer   37  Software Architect  4641    10
Tobias  38  Systems Engineer    5757    7
Samuel  43  Python Developer    4092    7
Casey   43  Systems Engineer    5318    1
Otis    45  Software Architect  3356    2
Frank   37  Python Developer    8111    1
Hugo    37  Software Architect  4632    5
Justin  35  Python Developer    2260    7
Jessie  39  System Administrator    4162    6

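2 Answers:

Answer 0 (score: 0)

I have written a parser for this kind of data...

I split each line on double spaces. We also need a small workaround for the rows where only a single space separates the name from the age...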
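
As an alternative to splitting on double spaces, the record can be split on any run of whitespace and the job title rebuilt from the middle tokens. This is only a sketch: `parse_line` is a hypothetical helper, and it assumes that every name is a single word and that each row ends with exactly two numeric fields (salary, then years).

```python
def parse_line(line):
    """Split one record into (name, age, occupation, salary, years)."""
    # split() with no argument splits on any run of whitespace
    # (spaces or tabs) and ignores leading/trailing whitespace.
    tokens = line.split()
    if len(tokens) < 5:
        raise ValueError("expected at least 5 fields: {!r}".format(line))
    name = tokens[0]            # assumption: the name is a single word
    age = int(tokens[1])
    # Everything between the age and the last two numbers is the job title.
    occupation = " ".join(tokens[2:-2])
    salary = int(tokens[-2])
    years = int(tokens[-1])
    return name, age, occupation, salary, years


print(parse_line("Freya   43  Software Engineer in Test   5686    10"))
print(parse_line("Joe 30  Network Designer    2916    4"))
```

Because the fields are counted from both ends of the row, the single-space rows need no special case.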

Hexdump

file name: data.txt
mime type: 

0000-0010:  48 65 6c 65-6e 20 20 20-32 30 20 20-4e 65 74 77  Helen... 20..Netw
0000-0020:  6f 72 6b 20-44 65 73 69-67 6e 65 72-20 20 20 20  ork.Desi gner....
0000-0030:  35 34 34 39-20 20 20 20-39 0d 0a 4a-61 73 6d 69  5449.... 9..Jasmi
0000-0040:  6e 65 20 34-30 20 20 53-6f 66 74 77-61 72 65 20  ne.40..S oftware.
0000-0050:  41 72 63 68-69 74 65 63-74 20 20 32-35 33 36 20  Architec t..2536.
0000-0060:  20 20 20 31-0d 0a 50 68-6f 65 62 65-20 20 32 38  ...1..Ph oebe..28
0000-0070:  20 20 53 6f-66 74 77 61-72 65 20 45-6e 67 69 6e  ..Softwa re.Engin
0000-0080:  65 65 72 20-20 20 32 36-32 37 20 20-20 20 37 0d  eer...26 27....7.
0000-0090:  0a 41 79 73-68 61 20 20-20 33 34 20-20 53 6f 66  .Aysha.. .34..Sof
0000-00a0:  74 77 61 72-65 20 44 65-76 65 6c 6f-70 65 72 20  tware.De veloper.
0000-00b0:  20 36 34 34-31 20 20 20-20 33 0d 0a-4d 61 64 65  .6441... .3..Made
0000-00c0:  6c 65 69 6e-65 20 20 20-32 36 20 20-53 79 73 74  leine... 26..Syst
0000-00d0:  65 6d 73 20-45 6e 67 69-6e 65 65 72-20 20 20 20  ems.Engi neer....
0000-00e0:  35 39 34 38-20 20 20 20-36 0d 0a 43-68 72 69 73  5948.... 6..Chris
0000-00f0:  74 69 6e 61-20 20 20 32-37 20 20 50-79 74 68 6f  tina...2 7..Pytho
0000-0100:  6e 20 44 65-76 65 6c 6f-70 65 72 20-20 20 20 38  n.Develo per....8
0000-0110:  33 36 36 20-20 20 20 35-0d 0a 4d 65-6c 69 73 73  366....5 ..Meliss
0000-0120:  61 20 32 39-20 20 44 61-74 61 20 53-63 69 65 6e  a.29..Da ta.Scien
0000-0130:  74 69 73 74-20 20 36 32-36 32 20 20-20 20 34 0d  tist..62 62....4.
0000-0140:  0a 4d 61 72-69 65 20 20-20 34 34 20-20 52 65 73  .Marie.. .44..Res
0000-0150:  65 61 72 63-68 65 72 20-20 36 39 33-36 20 20 20  earcher. .6936...
0000-0160:  20 36 0d 0a-54 61 6d 61-72 61 20 20-34 30 20 20  .6..Tama ra..40..
0000-0170:  53 79 73 74-65 6d 20 41-64 6d 69 6e-69 73 74 72  System.A dministr
0000-0180:  61 74 6f 72-20 20 20 20-39 37 32 37-20 20 20 20  ator.... 9727....
0000-0190:  31 0d 0a 46-72 65 79 61-20 20 20 34-33 20 20 53  1..Freya ...43..S
0000-01a0:  6f 66 74 77-61 72 65 20-45 6e 67 69-6e 65 65 72  oftware. Engineer
0000-01b0:  20 69 6e 20-54 65 73 74-20 20 20 35-36 38 36 20  .in.Test ...5686.
0000-01c0:  20 20 20 31-30 0d 0a 43-68 61 72 6c-65 73 20 34  ...10..C harles.4
0000-01d0:  33 20 20 53-79 73 74 65-6d 20 41 64-6d 69 6e 69  3..Syste m.Admini
0000-01e0:  73 74 72 61-74 6f 72 20-20 20 20 33-31 31 34 20  strator. ...3114.
0000-01f0:  20 20 20 38-0d 0a 4a 6f-68 6e 20 20-20 20 32 34  ...8..Jo hn....24
0000-0200:  20 20 53 6f-66 74 77 61-72 65 20 45-6e 67 69 6e  ..Softwa re.Engin
0000-0210:  65 65 72 20-69 6e 20 54-65 73 74 20-20 20 37 30  eer.in.T est...70
0000-0220:  33 35 20 20-20 20 34 0d-0a 4a 6f 65-20 33 30 20  35....4. .Joe.30.
0000-0230:  20 4e 65 74-77 6f 72 6b-20 44 65 73-69 67 6e 65  .Network .Designe
0000-0240:  72 20 20 20-20 32 39 31-36 20 20 20-20 34 0d 0a  r....291 6....4..
0000-0250:  45 6c 6d 65-72 20 20 20-33 37 20 20-53 6f 66 74  Elmer... 37..Soft
0000-0260:  77 61 72 65-20 41 72 63-68 69 74 65-63 74 20 20  ware.Arc hitect..
0000-0270:  34 36 34 31-20 20 20 20-31 30 0d 0a-54 6f 62 69  4641.... 10..Tobi
0000-0280:  61 73 20 20-33 38 20 20-53 79 73 74-65 6d 73 20  as..38.. Systems.
0000-0290:  45 6e 67 69-6e 65 65 72-20 20 20 20-35 37 35 37  Engineer ....5757
0000-02a0:  20 20 20 20-37 0d 0a 53-61 6d 75 65-6c 20 20 34  ....7..S amuel..4
0000-02b0:  33 20 20 50-79 74 68 6f-6e 20 44 65-76 65 6c 6f  3..Pytho n.Develo
0000-02c0:  70 65 72 20-20 20 20 34-30 39 32 20-20 20 20 37  per....4 092....7
0000-02d0:  0d 0a 43 61-73 65 79 20-20 20 34 33-20 20 53 79  ..Casey. ..43..Sy
0000-02e0:  73 74 65 6d-73 20 45 6e-67 69 6e 65-65 72 20 20  stems.En gineer..
0000-02f0:  20 20 35 33-31 38 20 20-20 20 31 0d-0a 4f 74 69  ..5318.. ..1..Oti
0000-0300:  73 20 20 20-20 34 35 20-20 53 6f 66-74 77 61 72  s....45. .Softwar
0000-0310:  65 20 41 72-63 68 69 74-65 63 74 20-20 33 33 35  e.Archit ect..335
0000-0320:  36 20 20 20-20 32 0d 0a-46 72 61 6e-6b 20 20 20  6....2.. Frank...
0000-0330:  33 37 20 20-50 79 74 68-6f 6e 20 44-65 76 65 6c  37..Pyth on.Devel
0000-0340:  6f 70 65 72-20 20 20 20-38 31 31 31-20 20 20 20  oper.... 8111....
0000-0350:  31 0d 0a 48-75 67 6f 20-20 20 20 33-37 20 20 53  1..Hugo. ...37..S
0000-0360:  6f 66 74 77-61 72 65 20-41 72 63 68-69 74 65 63  oftware. Architec
0000-0370:  74 20 20 34-36 33 32 20-20 20 20 35-0d 0a 4a 75  t..4632. ...5..Ju
0000-0380:  73 74 69 6e-20 20 33 35-20 20 50 79-74 68 6f 6e  stin..35 ..Python
0000-0390:  20 44 65 76-65 6c 6f 70-65 72 20 20-20 20 32 32  .Develop er....22
0000-03a0:  36 30 20 20-20 20 37 0d-0a 4a 65 73-73 69 65 20  60....7. .Jessie.
0000-03b0:  20 33 39 20-20 53 79 73-74 65 6d 20-41 64 6d 69  .39..Sys tem.Admi
0000-03c0:  6e 69 73 74-72 61 74 6f-72 20 20 20-20 34 31 36  nistrato r....416
0000-03c6:  32 20 20 20-20 36                                2....6
import operator

# Person class
class Person:
    def __init__(self, name, age, occupation, salary, years_in_the_company):
        self.name = name
        self.age = age
        self.occupation = occupation
        self.salary = salary
        self.years_in_the_company = years_in_the_company

# https://stackoverflow.com/a/53261987/5802172
def check_num(s):
    """Return True if s can be converted to an int."""
    try:
        int(s)
        return True
    except ValueError:
        return False

Persons = list()

# read the whole file; the with-block closes it afterwards
with open("data.txt") as f:
    data = f.read()

# split into lines (handles both \n and \r\n line endings)
data = data.splitlines()

# parse each line
for line in data:
    # skip blank lines (e.g. a trailing newline at the end of the file)
    if not line.strip():
        continue

    # split on double spaces and drop the empty strings left by longer runs
    tmp = list(filter(None, line.split("  ")))

    # quick & dirty fix for rows where only one space separates name and age
    # (e.g. "Melissa 29" and "Charles 43")
    if " " in tmp[0]:
        xFix = tmp[0].split(" ")
        tmp[0] = xFix[0]
        tmp.insert(1, xFix[1])

    # remove leftover whitespace around every field
    tmp = [field.strip() for field in tmp]

    Persons.append(Person(tmp[0], tmp[1], tmp[2], tmp[3], tmp[4]))


avg = 0
salary = list()
top_3_jobs = {}
highest_employee = list((0, ""))
seniors_middle_junior = { "senior": 0, "middle": 0, "junior": 0 }

for per in Persons:
    print("Name: {}, Age: {}, Occupation: {}, Salary: {}, Years: {}".format(per.name, per.age, per.occupation, per.salary, per.years_in_the_company))
    # add to avg
    avg = avg + int(per.age)

    # add salary to salary (as int, so max() compares numbers, not strings)
    salary.append(int(per.salary))

    # get highest paid employee
    if int(per.salary) > int(highest_employee[0]):
        highest_employee[0] = per.salary
        highest_employee[1] = per.name

    # count occupations
    if per.occupation in top_3_jobs:
        top_3_jobs[per.occupation] = top_3_jobs[per.occupation] + 1
    else:
        top_3_jobs[per.occupation] = 1

    # count senior (> 5 years), middle (3 to 5 years), junior (< 3 years)
    years = int(per.years_in_the_company)
    if years > 5:
        seniors_middle_junior["senior"] = seniors_middle_junior["senior"] + 1
    elif years >= 3:
        seniors_middle_junior["middle"] = seniors_middle_junior["middle"] + 1
    else:
        seniors_middle_junior["junior"] = seniors_middle_junior["junior"] + 1

# sort occupations by count, most common first
sorted_top_3_jobs = sorted(top_3_jobs.items(), key=operator.itemgetter(1), reverse=True)

print("Avg: {}\nBest paid: {}\nBest paid employee: {}\nEmployees: {}\nTop 3 Jobs: {}\nSeniors: {}\nMiddles: {}\nJuniors: {}".format(
    avg / len(Persons),
    max(salary),
    highest_employee[1],
    len(Persons),
    # keep only the job names of the three most common occupations
    [job for job, count in sorted_top_3_jobs[:3]],
    seniors_middle_junior["senior"],
    seniors_middle_junior["middle"],
    seniors_middle_junior["junior"]
))
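
The question asks for a function that produces a dictionary, not a formatted printout, so the same counting can be wrapped up roughly as follows. This is a sketch without imports, not a definitive solution: `summarize` is a hypothetical name, the rows are assumed to be already parsed into `(name, age, occupation, salary, years)` tuples with integer fields, the list is assumed to be non-empty, and ties between equally common jobs are broken alphabetically (the question does not say how to break them).

```python
def summarize(rows):
    """rows: non-empty list of (name, age, occupation, salary, years) tuples."""
    total_age = 0
    best_salary = None
    best_name = None
    job_counts = {}
    seniors = middle = juniors = 0

    for name, age, occupation, salary, years in rows:
        total_age += age
        # Track the highest salary and who earns it.
        if best_salary is None or salary > best_salary:
            best_salary = salary
            best_name = name
        job_counts[occupation] = job_counts.get(occupation, 0) + 1
        # Seniority bands: > 5 senior, 3 to 5 middle, < 3 junior.
        if years > 5:
            seniors += 1
        elif years >= 3:
            middle += 1
        else:
            juniors += 1

    # Most common jobs first; ties are broken alphabetically (assumption)
    # so that the result is deterministic.
    ranked = sorted(job_counts, key=lambda job: (-job_counts[job], job))

    return {
        "average_age": total_age / len(rows),
        "best_paid_job": best_salary,
        "best_paid_employee": best_name,
        "no_employees": len(rows),
        "top_3_jobs": ranked[:3],
        "seniors": seniors,
        "middle": middle,
        "juniors": juniors,
    }


sample = [
    ("Helen", 20, "Network Designer", 5449, 9),
    ("Jasmine", 40, "Software Architect", 2536, 1),
    ("Christina", 27, "Python Developer", 8366, 5),
    ("Joe", 30, "Network Designer", 2916, 4),
]
print(summarize(sample))
```

Keeping the parsing separate from the summary makes it easy to check the numbers on a few hand-written rows before running the whole file.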

Answer 1 (score: 0)

import json

person = {"people": []}

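# "outfile" is the file name this answer uses for the data file
with open("outfile", 'r') as thefile:
    # read all lines up front; each line holds one person's record
    lines = thefile.readlines()

for line in lines:
    # split the record on the column separator
    per_person = line.split("    ")
    name = per_person[0]
    age = per_person[1]
    job = per_person[2]
    not_sure_head = per_person[3]
    not_sure_head2 = per_person[4]

    # one single-key dictionary per person: {name: {field: value, ...}}
    person_dict = {name: {"age": age,
                          "job": job,
                          "bleh": not_sure_head,
                          "bleh2": not_sure_head2}}

    person["people"].append(person_dict)

# pretty-print the collected records as JSON
json_data = json.dumps(person, indent=4)

print(json_data)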

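You can read the file line by line and split each line on tabs. Each line holds one person's data. Store it in whatever variables you need; here I used a dictionary.

Output: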

{
    "people": [
        {
            "Helen": {
                "age": "20",
                "job": "Network Designer",
                "bleh": "5449",
                "bleh2": "9\n"
            }
        },
        {
            "Jasmine": {
                "age": "40",
                "job": "Software Architect",
                "bleh": "2536",
                "bleh2": "1\n"
            }
        },
        {
            "Phoebe": {
                "age": "28",
                "job": "Software",
                "bleh": "Engineer",
                "bleh2": "2627"
            }
        },
        {
            "Aysha": {
                "age": "34",
                "job": "Software Developer",
                "bleh": "6441",
                "bleh2": "3\n"
            }
        },
        {
            "Madeleine": {
                "age": "26",
                "job": "Systems Engineer",
                "bleh": "5948",
                "bleh2": "6"
            }
        }
    ]
}

You will need to make some adjustments to navigate the dictionary easily.
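
One possible adjustment, sketched below, is to flatten the nested structure into a plain list of records with numeric fields. `flatten_people` is a hypothetical helper; reading the placeholder keys "bleh" and "bleh2" as salary and years in the company is an assumption based on the column order given in the question.

```python
def flatten_people(person):
    """Turn {"people": [{name: {...}}, ...]} into a flat list of records."""
    records = []
    for entry in person["people"]:
        # Each entry is a one-key dict: {name: {field: value, ...}}
        for name, fields in entry.items():
            records.append({
                "name": name,
                "age": int(fields["age"]),
                "job": fields["job"],
                # "bleh" and "bleh2" are the answer's placeholder keys;
                # treating them as salary and years is an assumption.
                "salary": int(fields["bleh"]),
                # int() ignores surrounding whitespace such as a trailing "\n"
                "years": int(fields["bleh2"]),
            })
    return records


person = {"people": [
    {"Helen": {"age": "20", "job": "Network Designer", "bleh": "5449", "bleh2": "9\n"}},
    {"Jasmine": {"age": "40", "job": "Software Architect", "bleh": "2536", "bleh2": "1\n"}},
]}
for record in flatten_people(person):
    print(record["name"], record["salary"], record["years"])
```

A row that was split incorrectly, such as the "Phoebe" entry in the output above where "Engineer" ends up in "bleh", makes `int()` raise a `ValueError`, so bad rows are noticed instead of being counted silently.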