Encoding problem when reading a DICOM repository

Time: 2019-04-30 09:06:58

Tags: python-3.x dicom pydicom

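I am preprocessing a repository of DICOM images in order to feed them to a convolutional neural network, but when I try to read the repository, it raises the following error:

LookupError: unknown encoding: ISO 2022 IR 100

Here is the code I use: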

import os

listoflists = []
for x in range(1, 10):
    data_path = "/home/lorenzo_f/CT COLONOGRAPHY/1.3.6.1.4.1.9328.50.4.000%d" % x
    output_path = "/home/lorenzo_f/output/"
    subfolders = [f.path for f in os.scandir(data_path) if f.is_dir()]
    subfolder = [f.path for f in os.scandir(subfolders[0]) if f.is_dir()]
    # Fresh list per patient; the name avoids shadowing the built-in `list`.
    scans = []
    scans.append(load_scan(subfolder[0]))
    scans.append(load_scan(subfolder[1]))
    listoflists.append(scans)

with the function load_scan:

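# Loop over the image files and store everything into a list.
import os

import numpy as np
import pydicom as dicom  # assumption: the question does not show how `dicom` was imported


def load_scan(path):
    slices = [dicom.read_file(os.path.join(path, s)) for s in os.listdir(path)]
    #slices[0].SpecificCharacterSet = 'latin_1'
    # Order the slices by their position in the series.
    slices.sort(key=lambda x: int(x.InstanceNumber))
    try:
        slice_thickness = np.abs(slices[0].ImagePositionPatient[2] - slices[1].ImagePositionPatient[2])
    except Exception:
        # Fall back to SliceLocation when ImagePositionPatient cannot be used.
        slice_thickness = np.abs(slices[0].SliceLocation - slices[1].SliceLocation)

    for s in slices:
        s.SliceThickness = slice_thickness

    return slices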

I cannot find the tag you are referring to. Here is the code I used and the first item of the output:

data_path = "/home/lorenzo_f/CT COLONOGRAPHY/1.3.6.1.4.1.9328.50.4.0010"
output_path ="/home/lorenzo_f/output/"
subfolders = [f.path for f in os.scandir(data_path) if f.is_dir() ]    
subfolder = [f.path for f in os.scandir(subfolders[0]) if f.is_dir() ]
ds=load_scan(subfolder[0])
ds
 (0008, 0008) Image Type                          CS: ['ORIGINAL', 'SECONDARY', 'AXIAL']
 (0008, 0016) SOP Class UID                       UI: CT Image Storage
 (0008, 0018) SOP Instance UID                    UI: 1.3.6.1.4.1.9328.50.4.9867
 (0008, 0020) Study Date                          DA: '20000101'
 (0008, 0021) Series Date                         DA: '20000101'
 (0008, 0022) Acquisition Date                    DA: '20000101'
 (0008, 0023) Content Date                        DA: '20000101'
 (0008, 0030) Study Time                          TM: '091936'
 (0008, 0032) Acquisition Time                    TM: '092131'
 (0008, 0033) Content Time                        TM: '101416'
 (0008, 0050) Accession Number                    SH: ''
 (0008, 0060) Modality                            CS: 'CT'
 (0008, 0070) Manufacturer                        LO: 'GE MEDICAL SYSTEMS'
 (0008, 0080) Institution Name                    LO: ''
 (0008, 0081) Institution Address                 ST: ''
 (0008, 0090) Referring Physician's Name          PN: 'xDONEx'
 (0008, 1030) Study Description                   LO: 'CT COLONOGRAP C'
 (0008, 103e) Series Description                  LO: 'CT COLONOGRAPHY'
 (0008, 1048) Physician(s) of Record              PN: ' '
 (0008, 1090) Manufacturer's Model Name           LO: 'LightSpeed16'
 (0008, 1140)  Referenced Image Sequence   0 item(s) ---- 
 (0008, 2112)  Source Image Sequence   0 item(s) ---- 
 (0010, 0010) Patient's Name                      PN: '1.3.6.1.4.1.9328.50.4.0010'
 (0010, 0020) Patient ID                          LO: '1.3.6.1.4.1.9328.50.4.0010'
 (0010, 0030) Patient's Birth Date                DA: ''
 (0010, 0040) Patient's Sex                       CS: 'M'
 (0010, 1000) Other Patient IDs                   LO: ''
 (0010, 1010) Patient's Age                       AS: '068Y'
 (0010, 21b0) Additional Patient History          LT: 'COLON SCREENING'
 (0010, 21c0) Pregnancy Status                    US: []
 (0012, 0010) Clinical Trial Sponsor Name         LO: ''
 (0012, 0020) Clinical Trial Protocol ID          LO: ''
 (0012, 0021) Clinical Trial Protocol Name        LO: ''
 (0012, 0030) Clinical Trial Site ID              LO: ''
 (0012, 0031) Clinical Trial Site Name            LO: ''
 (0012, 0040) Clinical Trial Subject ID           LO: ''
 (0012, 0042) Clinical Trial Subject Reading ID   LO: ''
 (0013, 0010) Private Creator                     LO: 'CTP'
 (0013, 1010) Private tag data                    UN: b'CT COLONOGRAPHY\x00'
 (0013, 1013) Private tag data                    UN: b'70093008'
 (0018, 0015) Body Part Examined                  CS: 'COLON'
 (0018, 0022) Scan Options                        CS: 'HELICAL MODE'
 (0018, 0050) Slice Thickness                     DS: '0.7999999999999989'
 (0018, 0060) KVP                                 DS: '120'
 (0018, 0090) Data Collection Diameter            DS: '500.000000'
 (0018, 1020) Software Version(s)                 LO: 'LightSpeedverrel'
 (0018, 1030) Protocol Name                       LO: '6.10 CT  COLONOGRAPHY'
 (0018, 1100) Reconstruction Diameter             DS: '330.000000'
 (0018, 1110) Distance Source to Detector         DS: '949.075012'
 (0018, 1111) Distance Source to Patient          DS: '541.000000'
 (0018, 1120) Gantry/Detector Tilt                DS: '0.000000'
 (0018, 1130) Table Height                        DS: '167.199997'
 (0018, 1140) Rotation Direction                  CS: 'CW'
 (0018, 1150) Exposure Time                       IS: '526'
 (0018, 1151) X-Ray Tube Current                  IS: '140'
 (0018, 1152) Exposure                            IS: '2286'
 (0018, 1160) Filter Type                         SH: 'BODY FILTER'
 (0018, 1170) Generator Power                     IS: '16800'
 (0018, 1190) Focal Spot(s)                       DS: '0.700000'
 (0018, 1200) Date of Last Calibration            DA: ''
 (0018, 1201) Time of Last Calibration            TM: ''
 (0018, 1210) Convolution Kernel                  SH: 'STANDARD'
 (0018, 5100) Patient Position                    CS: 'FFS'
 (0020, 000d) Study Instance UID                  UI: 1.3.6.1.4.1.9328.50.4.9864
 (0020, 000e) Series Instance UID                 UI: 1.3.6.1.4.1.9328.50.4.9865
 (0020, 0010) Study ID                            SH: '1'
 (0020, 0011) Series Number                       IS: '102'
 (0020, 0012) Acquisition Number                  IS: '1'
 (0020, 0013) Instance Number                     IS: '1'
 (0020, 0032) Image Position (Patient)            DS: ['-165.000000', '-165.000000', '-8.335000']
 (0020, 0037) Image Orientation (Patient)         DS: ['1.000000', '0.000000', '0.000000', '0.000000', '1.000000', '0.000000']
 (0020, 0052) Frame of Reference UID              UI: 1.3.6.1.4.1.9328.50.4.9866
 (0020, 1040) Position Reference Indicator        LO: 'XY'
 (0020, 1041) Slice Location                      DS: '-8.335000'
 (0028, 0002) Samples per Pixel                   US: 1
 (0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
 (0028, 0010) Rows                                US: 512
 (0028, 0011) Columns                             US: 512
 (0028, 0030) Pixel Spacing                       DS: ['0.644531', '0.644531']
 (0028, 0100) Bits Allocated                      US: 16
 (0028, 0101) Bits Stored                         US: 16
 (0028, 0102) High Bit                            US: 15
 (0028, 0103) Pixel Representation                US: 1
 (0028, 0120) Pixel Padding Value                 SS: -2000
 (0028, 1050) Window Center                       DS: '40'
 (0028, 1051) Window Width                        DS: '400'
 (0028, 1052) Rescale Intercept                   DS: '-1024'
 (0028, 1053) Rescale Slope                       DS: '1'
 (0040, a124) UID                                 UI: ''
 (0088, 0140) Storage Media File-set UID          UI: ''
 (3006, 0024) Referenced Frame of Reference UID   UI: ''
 (3006, 00c2) Related Frame of Reference UID      UI: ''
 (7fe0, 0010) Pixel Data                          OW: Array of 524288 bytes,

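1 answer:

Answer 0 (score: 0)

I don't think your problem is related to the transfer syntax. The error message indicates that the value of Specific Character Set (0008,0005) is

ISO 2022 IR 100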
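The message itself comes from Python's codec registry: the DICOM defined term is not a name that Python recognises as an encoding. The following is a minimal sketch using only the standard library that reproduces the message. The helper name `python_codec_exists` is made up for illustration, and the pairing of ISO-IR 100 with Python's `latin_1` codec is background knowledge rather than something stated in the question.

```python
import codecs


def python_codec_exists(name):
    """Return True if Python's codec registry knows this encoding name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# The DICOM defined term is not a Python codec name.
print(python_codec_exists("ISO 2022 IR 100"))
# latin_1 is Python's codec for ISO-IR 100 (Latin alphabet No. 1).
print(python_codec_exists("latin_1"))

# Decoding with the DICOM term as the codec name fails before any byte is read.
try:
    b"xDONEx".decode("ISO 2022 IR 100")
except LookupError as exc:
    message = str(exc)
    print(message)
```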

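The ISO 2022 terms are only allowed when so-called code extension techniques are used. That is, a single attribute value can contain characters taken from different character sets, and special byte sequences (defined in ISO 2022) are used to switch between them.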
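To make the idea of switching character sets inside one value more concrete, here is a small illustration with Python's built-in `iso2022_jp` codec. This codec is only a stand-in chosen because it ships with the standard library and uses ISO 2022 escape sequences; it is not the character set of the files in the question, and the sample string is invented.

```python
ESC = b"\x1b"  # every ISO 2022 escape sequence starts with this byte

# Latin letters followed by two kanji, so two character sets are needed.
text = "ABC日本"
encoded = text.encode("iso2022_jp")

# The Latin part is plain ASCII; an escape sequence then switches to the
# kanji set, and a second one switches back to ASCII at the end.
print(encoded)
print("escape sequences:", encoded.count(ESC))

# A decoder that understands the escape sequences recovers the text.
decoded = encoded.decode("iso2022_jp")
print(decoded == text)
```

A reader that ignores the escape sequences would misinterpret the bytes between them, which is why such values need a decoder that tracks the currently active character set.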

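For reference, see PS3.3, C.12.1.1.2.

Code extension techniques are relatively difficult to handle and are therefore rarely used. In fact, this is the first time I have (possibly) come across such an object. That makes it interesting for me as well, so would you mind sharing the manufacturer and device that created this image?

How can you solve this? Good question. I do not know of any (Python) toolkit that can handle this string encoding; maybe dcm4che can.

If you only want to extract the pixel data, you could try changing the value of (0008,0005) to "ISO_IR 100". This may cause problems when reading metadata such as the patient name or the study description. However, none of the attributes related to pixel data encoding are affected by the character encoding, so it should work.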
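A minimal sketch of that workaround follows. It assumes the dataset object exposes the value as a writable `SpecificCharacterSet` attribute, as the commented-out line in the question's `load_scan` suggests. The names `CODE_EXTENSION_TO_PLAIN`, `plain_charset` and `override_charset` are hypothetical, and a `SimpleNamespace` stands in for a real dataset so the example runs without any DICOM files.

```python
from types import SimpleNamespace

# Hypothetical mapping: only the term from the error message is covered.
CODE_EXTENSION_TO_PLAIN = {"ISO 2022 IR 100": "ISO_IR 100"}


def plain_charset(value):
    """Map a code-extension term to its plain counterpart, if known."""
    if isinstance(value, str):
        return CODE_EXTENSION_TO_PLAIN.get(value, value)
    # Multi-valued Specific Character Set: map each entry separately.
    return [CODE_EXTENSION_TO_PLAIN.get(v, v) for v in value]


def override_charset(ds):
    """Rewrite ds.SpecificCharacterSet in place; ds is any object with that attribute."""
    current = getattr(ds, "SpecificCharacterSet", None)
    if current is not None:
        ds.SpecificCharacterSet = plain_charset(current)
    return ds


# Stand-in for one slice; a real dataset would come from the DICOM reader.
fake_slice = SimpleNamespace(SpecificCharacterSet="ISO 2022 IR 100", InstanceNumber="1")
override_charset(fake_slice)
print(fake_slice.SpecificCharacterSet)
```

In the question's `load_scan`, such a call would go on every element of `slices` right after reading and before any text attribute is accessed. Whether the library raises the error before that point may depend on the installed version, which has not been verified here.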