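I'm new to Python and am working on a project that uses Google Speech-to-Text. I finally figured out how to import the Google STT results (JSON) and how to format the data as CSV. But...
Google gives you alternative words, some good and some bad. The attached code only reads the first alternative and then stops, so I only ever get one alternative.
I would like to import the other alternatives too and show them in their own columns: Main, alt1, alt2. Sometimes their timestamps are the same as Main's, sometimes they are different.
Suggestions are appreciated. I feel like I'm slowly getting the hang of this.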
{
  "@type": "type.googleapis.com/google.cloud.speech.v1p1beta1.LongRunningRecognizeResponse",
  "timestamp": "2018-12-28 14:13:18",
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.9319887,
          "words": [
            {
              "confidence": 0.9572171,
              "endTime": "2s",
              "startTime": "1s",
              "word": "Bla1a"
            },
            {
              "confidence": 0.9572171,
              "endTime": "3s",
              "startTime": "2s",
              "word": "Bla1b"
            }
          ]
        }
      ],
      "languageCode": "th-th"
    },
    {
      "alternatives": [
        {
          "confidence": 0.95174015,
          "words": [
            {
              "confidence": 0.9572171,
              "endTime": "2s",
              "startTime": "1s",
              "word": "Bla2a"
            },
            {
              "confidence": 0.9572171,
              "endTime": "3s",
              "startTime": "2s",
              "word": "Bla2b"
            }
          ]
        }
      ],
      "languageCode": "th-th"
    },
    {
      "alternatives": [
        {
          "confidence": 0.95298487,
          "words": [
            {
              "confidence": 0.9572171,
              "endTime": "2s",
              "startTime": "1s",
              "word": "bla3b"
            },
            {
              "confidence": 0.9572171,
              "endTime": "3s",
              "startTime": "2s",
              "word": "Bla3b"
            }
          ]
        }
      ],
      "languageCode": "th-th"
    },
    {
      "alternatives": [
        {
          "confidence": 0.8774771,
          "words": [
            {
              "confidence": 0.7337543,
              "endTime": "3s",
              "startTime": "2s",
              "word": "Bla4a"
            },
            {
              "confidence": 0.9363319,
              "endTime": "4s",
              "startTime": "3s",
              "word": "bla4b"
            }
          ]
        }
      ],
      "languageCode": "th-th"
    },
    {
      "alternatives": [
        {
          "confidence": 0.9491383,
          "words": [
            {
              "confidence": 0.8349256,
              "endTime": "4s",
              "startTime": "3s",
              "word": "Bla5a"
            },
            {
              "confidence": 0.9572171,
              "endTime": "5s",
              "speakerTag": 1,
              "startTime": "4s",
              "word": "Bla5b"
            }
          ]
        }
      ],
      "languageCode": "th-th"
    }
  ]
}
#!/usr/bin/python
# note = can only show one alternatives list
import json
import pandas as pd
from pandas import ExcelWriter
import numpy as np

with open('Thai_Unicode(bk).json') as f:  # this ensures opening and closing file
    a = json.loads(f.read())

# only the first alternative of the first result is read here
data = a["results"][0]["alternatives"][0]["words"]
df = pd.DataFrame(data)
#print(df)
df.to_excel('pandas4.xls')
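
For reference, here is a minimal sketch of the layout I am after, using only the standard library. It loops over every entry in "results" and every entry in "alternatives" instead of indexing `[0]`, and puts each alternative in its own group of columns (Main, alt1, alt2, ...), each with its own start and end time. The function names `column_label`, `flatten_alternatives` and `write_csv` are made up for illustration, the second alternative in the sample data is hypothetical (the response above has only one alternative per result), and lining up alternatives by word position is an assumption that may not hold when alternatives have different word counts.

```python
import csv
import json


def column_label(alt_index):
    """Name the column group: alternative 0 is 'Main', the rest 'alt1', 'alt2', ..."""
    return "Main" if alt_index == 0 else f"alt{alt_index}"


def flatten_alternatives(response):
    """Return one dict per (result, word position), with columns per alternative."""
    rows = []
    for r_idx, result in enumerate(response.get("results", [])):
        alternatives = result.get("alternatives", [])
        # The longest word list decides how many rows this result needs.
        n_rows = max((len(alt.get("words", [])) for alt in alternatives), default=0)
        for pos in range(n_rows):
            row = {"result": r_idx, "position": pos}
            for a_idx, alt in enumerate(alternatives):
                words = alt.get("words", [])
                if pos < len(words):  # shorter alternatives leave their cells empty
                    name = column_label(a_idx)
                    row[name] = words[pos].get("word")
                    row[f"{name}_startTime"] = words[pos].get("startTime")
                    row[f"{name}_endTime"] = words[pos].get("endTime")
            rows.append(row)
    return rows


def write_csv(rows, path):
    """Write the rows to CSV; columns missing from a row are left blank."""
    fieldnames = []
    for row in rows:  # collect column names in first-seen order
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical response with two alternatives for a single result.
sample = json.loads("""
{"results": [{"alternatives": [
  {"words": [{"word": "Bla1a", "startTime": "1s", "endTime": "2s"},
             {"word": "Bla1b", "startTime": "2s", "endTime": "3s"}]},
  {"words": [{"word": "Bla1x", "startTime": "1s", "endTime": "3s"}]}
]}]}
""")
rows = flatten_alternatives(sample)
```

With the real file, the same functions would presumably be called as `rows = flatten_alternatives(json.load(f))` inside the `with open(...)` block, followed by `write_csv(rows, 'alternatives.csv')` (the output name is a placeholder). Extra columns only show up for results that actually contain more than one alternative.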