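I am trying to read an HDF5 file with h5py. It contains several datasets, most of which I can read without trouble, but one dataset, <HDF5 dataset "word_key": shape (1,), type "|V15060">, holds a single tuple-like record whose elements I access as dataset[0][0] and dataset[0][1].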

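Problem: I am only interested in dataset[0][1], which contains a list of words. When I read it, I get at most 7 characters per word, never more. For example, if the data is ['elephant', 'umbrella'], my code reads ['elephan', 'umbrell']. My code is below. Is there any way to change the size of the chunk that is read?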

Code

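import h5py
import numpy as np

with h5py.File('C:/dataset.h5', 'r') as hdf:
    data = hdf.get('word_key')
    print(data)
    # data[0] is the single compound record; element 1 holds the word list
    dataset = np.array(data)[0][1]
    word_dic = {}
    for val, word in enumerate(dataset):
        decoded = word.decode('UTF-8')
        if len(decoded) == 7:  # words of exactly 7 characters (possibly truncated)
            print(decoded)
        word_dic[val] = decoded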

In h5py, what is "|V15060"?
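For context, a type string such as "|V15060" is how NumPy abbreviates a void (structured) dtype with an item size of 15060 bytes, which is what h5py reports for a compound record. The sketch below does not touch the real file. It builds a small in-memory record with NumPy alone to show how a fixed-width byte-string field silently truncates longer words. The field names ("index", "words") and the 7-byte width are assumptions chosen to mirror the symptom, not the actual layout of word_key.

```python
import numpy as np

# Hypothetical layout: one int32 plus two fixed-width 7-byte strings.
# The real "word_key" layout is unknown; this only mirrors the symptom.
record_dtype = np.dtype([("index", "<i4"), ("words", "S7", (2,))])

# A structured dtype is summarised as "|V<itemsize>": 4 + 2 * 7 = 18 bytes.
dtype_string = record_dtype.str

arr = np.zeros(1, dtype=record_dtype)
# Assigning 8-byte words into a 7-byte field truncates them without warning.
arr["words"][0] = [b"elephant", b"umbrella"]

words = [w.decode("utf-8") for w in arr["words"][0]]

# Width in bytes of each string stored in the "words" field.
field_width = record_dtype.fields["words"][0].base.itemsize

print(dtype_string, words, field_width)
```

If the real dataset behaves the same way, printing data.dtype on the h5py dataset would list each field with its width. A 7-byte string field there would suggest the words were already cut when the file was written, in which case no read-side setting could restore them.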


0 answers: