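So, I'm working on a project in which I have to sort a large 34 MB text file full of song data. Each line of the text file has a year, a unique number, an artist, and a song. What I can't figure out is how to efficiently sort the data into other text files. I want to sort by artist name and song name. Sadly, this is all I have: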
#Opening the file to read here
with open('tracks_per_year.txt', 'r',encoding='utf8') as in_file:
    #Creating 'lists' to put information from array into
    years=[]
    uics=[]
    artists=[]
    songs=[]
    #Filling up the 'lists'
    for line in in_file:
        year,uic,artist,song=line.split("<SEP>")
        years.append(year)
        uics.append(uic)
        artists.append(artist)
        songs.append(song)
        print(year)
        print(uic)
        print(artist)
        print(song)
#Sorting:
with open('artistsort.txt', 'w',encoding='utf8') as artist:
    for x in range(1,515576):
        if artists[x]==artists[x-1]:
            artist.write (years[x])
            artist.write(" ")
            artist.write(uics[x])
            artist.write(" ")
            artist.write(artists[x])
            artist.write(" ")
            artist.write(songs[x])
            artist.write("\n")
with open('Onehitwonders.txt','w',encoding='utf8') as ohw:
    for x in range(1,515576):
        if artists[x]!= artists[x-1]:
            ohw.write (years[x])
            ohw.write(" ")
            ohw.write(uics[x])
            ohw.write(" ")
            ohw.write(artists[x])
            ohw.write(" ")
            ohw.write(songs[x])
            ohw.write("\n")
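Please keep in mind that I'm a beginner, so try to explain things in simple terms. If you have any other ideas, I'd love to hear them too. Thanks!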
Answer 0 (score: 0)
You can load the data into a dictionary-based structure, keyed first by artist and then by song:
data = {artist_name: {song_name: {'year': year, 'uid': uid},
                      ... },
        ...}
Then, when writing the output, use sorted to get them in alphabetical order:
for artist in sorted(data):
    for song in sorted(data[artist]):
        # use data[artist][song] to access details
        details = data[artist][song]  # dict with 'year' and 'uid'
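A minimal sketch of how that structure might be filled and walked, assuming each line has the form year<SEP>uid<SEP>artist<SEP>song as in the question. The function names build_index and sorted_rows are made up for this illustration, and the sample lines are invented data.

```python
from collections import defaultdict


def build_index(lines):
    """Group tracks as data[artist][song] = {'year': ..., 'uid': ...}."""
    data = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # maxsplit=3 keeps a song title intact even if it contains "<SEP>"
        year, uid, artist, song = line.split("<SEP>", 3)
        data[artist][song] = {"year": year, "uid": uid}
    return data


def sorted_rows(data):
    """Yield (year, uid, artist, song) ordered by artist, then song."""
    for artist in sorted(data):
        for song in sorted(data[artist]):
            details = data[artist][song]
            yield details["year"], details["uid"], artist, song


# Invented sample lines in the question's format (hypothetical data)
sample = [
    "1981<SEP>uid1<SEP>Zed<SEP>Alpha\n",
    "1934<SEP>uid2<SEP>Abe<SEP>Zulu\n",
    "2004<SEP>uid3<SEP>Abe<SEP>Beta\n",
]
rows = list(sorted_rows(build_index(sample)))
for row in rows:
    print(" ".join(row))
```

One caveat with this layout: if the same artist and song title appear on more than one line, the later line overwrites the earlier one. If duplicates matter, storing a list of details per song would be one way around that.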
Answer 1 (score: 0)
Try something like this:
from operator import attrgetter

class Song:
    def __init__(self, year, uic, artist, song):
        self.year = year
        self.uic = uic
        self.artist = artist
        self.song = song

songs = []
with open('tracks_per_year.txt', 'r', encoding='utf8') as in_file:
    for line in in_file:
        # Strip the trailing newline so it does not end up in the song title
        year, uic, artist, song = line.rstrip("\n").split("<SEP>")
        songs.append(Song(year, uic, artist, song))

with open('artistsort.txt', 'w', encoding='utf8') as out_file:
    # Sort by artist first, then by song title
    for song in sorted(songs, key=attrgetter('artist', 'song')):
        out_file.write(song.year)
        out_file.write(" ")
        out_file.write(song.uic)
        out_file.write(" ")
        out_file.write(song.artist)
        out_file.write(" ")
        out_file.write(song.song)
        out_file.write("\n")
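The question's code also tries to fill an Onehitwonders.txt file by comparing neighbouring list entries. Assuming a "one-hit wonder" means an artist with exactly one track in the file (an interpretation; the question does not define it), one possible sketch counts tracks per artist first and then splits the sorted records. The function name split_one_hit_wonders and the sample tuples are invented for illustration.

```python
from collections import Counter


def split_one_hit_wonders(records):
    """Split (year, uic, artist, song) tuples by how many tracks the artist has.

    Returns (repeat_artists, one_hit_wonders), each sorted by artist, then song.
    """
    # Count how many tracks each artist has in the whole data set
    counts = Counter(artist for _, _, artist, _ in records)
    # Sort once by artist name, then song title
    ordered = sorted(records, key=lambda r: (r[2], r[3]))
    repeats = [r for r in ordered if counts[r[2]] > 1]
    singles = [r for r in ordered if counts[r[2]] == 1]
    return repeats, singles


# Hypothetical sample data
records = [
    ("1981", "uic1", "Zed", "Alpha"),
    ("1934", "uic2", "Abe", "Zulu"),
    ("2004", "uic3", "Abe", "Beta"),
]
repeats, singles = split_one_hit_wonders(records)
for year, uic, artist, song in singles:
    print(year, uic, artist, song)
```

Counting first avoids relying on the input already being grouped by artist, which the neighbour comparison in the question would need.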
Answer 2 (score: 0)
It's hard to beat the simplicity of pandas. To read your file:
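import pandas as pd
# The file has no header row, so name the columns explicitly. A multi-character
# separator is treated as a regular expression and is handled by the Python engine.
data = pd.read_csv('tracks_per_year.txt', sep='<SEP>', engine='python',
                   header=None, names=['year', 'uic', 'artist', 'song'],
                   encoding='utf8')
data
#   year   uic   artist   song
#0  1981  uic1  artist1  song1
#1  1934  uic2  artist2  song2
#2  2004  uic3  artist3  song3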
Then sorting by a particular column and writing the result to a new file is just:
data.sort_values(by='year').to_csv('year_sort.txt')
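Since the question asks for sorting by artist and then song, here is a small sketch of that with pandas, using an in-memory sample instead of the real file. The sample rows and the variable names sample, by_artist, and out are invented; the column names are assumed from the question's description.

```python
import io

import pandas as pd

# Invented sample standing in for tracks_per_year.txt
sample = io.StringIO(
    "1981<SEP>uic1<SEP>Zed<SEP>Alpha\n"
    "1934<SEP>uic2<SEP>Abe<SEP>Zulu\n"
    "2004<SEP>uic3<SEP>Abe<SEP>Beta\n"
)

data = pd.read_csv(
    sample,
    sep="<SEP>",        # multi-character separators are parsed as a regex
    engine="python",    # the engine that supports regex separators
    header=None,        # the data has no header row
    names=["year", "uic", "artist", "song"],
)

# Passing a list sorts by artist first and breaks ties by song title
by_artist = data.sort_values(by=["artist", "song"])

# Write without the numeric row index; a file path works here as well
out = io.StringIO()
by_artist.to_csv(out, index=False)
print(out.getvalue())
```

Note that the default string sort is case-sensitive, so names starting with uppercase letters come before lowercase ones.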