Pandas: rows of dictionaries to columns

Time: 2018-07-29 16:55:55

Tags: json pandas dictionary

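I have a JSON file, and when I read it into pandas the output is a single column in which every row is a dictionary. What I want is to split these dictionary rows so that the keys become the column names.

This is the first line of sample data from the JSON file: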

"{ \"time_iso8601\": \"2018-05-11T03:33:26+00:00\", \"request\": \"GET /atom/ HTTP/1.1\", \"request_uri\": \"/atom/\", \"scheme\": \"https\", \"host\": \"grochmal.org\", \"server_protocol\": \"HTTP/1.1\", \"status\": 301, \"request_method\": \"GET\", \"request_length\": 272, \"request_time\": 0.000, \"request_id\": \"43db8100f133bd2b011dbf649e304a0d\", \"remote_addr\": \"207.46.13.183\", \"remote_port\": 1217, \"cookie_userid\": \"\", \"cookie_sessionid\": \"\", \"cookie_csrftoken\": \"\", \"content_length\": \"\", \"content_type\": \"\", \"ssl_cipher\": \"ECDHE-RSA-AES128-GCM-SHA256\", \"ssl_curves\": \"0x001d:prime256v1:secp384r1\", \"ssl_protocol\": \"TLSv1.2\", \"ssl_session_reused\": \".\", \"http2\": \"\", \"http_user_agent\": \"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)\", \"http_x_forwarded_for\": \"\", \"http_referer\": \"\", \"http_x_csrf_token\": \"\", \"geoip_country_name\": \"United States\", \"geoip_country_code3\": \"USA\", \"geoip_city\": \"\", \"geoip_region_name\": \"\", \"geoip_latitude\": \"13.7500\", \"geoip_longitude\": \"100.4667\", \"gzip_ratio\": \"\", \"upstream_cache_status\": \"\", \"upstream_status\": \"\", \"upstream_response_time\": \"\", \"bytes_sent\": 486 }\n",

This is my attempt:

import pandas as pd

df_accesslog = pd.read_json('data.json')
train = pd.concat([pd.DataFrame(x) for x in df_accesslog[0]]).\
reset_index(level=1, drop=True).reset_index()

The code above is taken directly from "Convert a column containing a list of dictionaries to multiple columns in pandas dataframe".

Here is the screenshot of the sample output of the above lines of code

1 answer:

Answer 0: (score: 1)

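Each element of the file is a string rather than a dictionary. You can load the file with json.load, convert every string in the resulting list to a dictionary with ast.literal_eval, and then pass the list of dictionaries to json_normalize:

import json, ast
from pandas.io.json import json_normalize

with open('data.json', encoding="utf8") as data_file:
    data = json.load(data_file)

train = json_normalize([ast.literal_eval(x) for x in data])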
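As a hedged variation on the same idea, the sketch below decodes each inner string with json.loads instead of ast.literal_eval. This is an assumption about what may suit the data better: ast.literal_eval parses Python literals, so it would reject JSON values such as true, false, or null if they ever appeared in the log. The sketch also uses the top-level pd.json_normalize, which is available in pandas 1.0 and later. The two-record sample is hypothetical and only mimics the shape of data.json (a JSON array of JSON-encoded strings).

```python
import json
import pandas as pd

# Hypothetical sample mimicking the file layout: a JSON array whose
# elements are strings that each contain a JSON object plus a newline.
raw = json.dumps([
    json.dumps({"request_uri": "/atom/", "status": 301, "bytes_sent": 486}) + "\n",
    json.dumps({"request_uri": "/", "status": 200, "bytes_sent": 1024}) + "\n",
])

outer = json.loads(raw)                    # list of JSON-encoded strings
records = [json.loads(s) for s in outer]   # decode each string into a dict
train = pd.json_normalize(records)         # one column per dictionary key

print(train.columns.tolist())
```

For flat records like these, pd.DataFrame(records) should produce the same columns; json_normalize mainly helps when the dictionaries contain nested objects.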
