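How to check whether a file is an image

Date: 2018-09-12 08:54:55

Tags: java scala

I have a function that checks a file's type by its signature (its leading "magic" bytes), but it does not work for GIF files.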

def checkPhotoType(file: File): Option[String] = {
  val param = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))
  if (param.readInt() == 0xFFd8FFe0 | param.readInt() == 0xFFd8FFe1 )
    Some("jpg/jpeg")
  if(param.readInt() == 0x474946383961L)
    Some("gif")
  else None
}

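3 answers:

Answer 0: (score: 2)

There are several problems with this code:

  1. Every test reads a new Int from the stream
  2. You are comparing a 4-byte Int against a 6-byte value
  3. On a little-endian processor the byte order will be wrong

Here is an example of how this code could be structured.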

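import java.io.{BufferedInputStream, DataInputStream, File, FileInputStream}

def checkPhotoType(file: File): Option[String] = {
  val param = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))
  try {
    // readUnsignedByte returns an Int in 0..255, so the hex literals below match as written.
    // It throws EOFException if the file is shorter than 6 bytes.
    val bytes = (1 to 6).map(_ => param.readUnsignedByte()).toList

    bytes match {
      // JPEG: FF D8 FF, then a marker byte that varies (for example DB, E0 or E1)
      case List(0xFF, 0xD8, 0xFF, _, _, _) =>
        Some("jpg/jpeg")
      case List(0x47, 0x49, 0x46, 0x38, 0x37, 0x61) =>
        Some("GIF87a")
      case List(0x47, 0x49, 0x46, 0x38, 0x39, 0x61) =>
        Some("GIF89a")
      case _ =>
        None
    }
  } finally param.close() // always release the file handle
}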
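
For callers on the Java side, the same byte-by-byte comparison can be sketched as below. This is only an illustration under stated assumptions: the class name `ImageSignature` and its methods `detect` and `startsWith` are hypothetical, and `InputStream.readNBytes(int)` requires Java 11 or later. Unlike reading six bytes unconditionally, this version returns an empty result for files shorter than a signature instead of throwing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of any library.
final class ImageSignature {

    /** Classifies a header by its leading signature bytes. */
    public static Optional<String> detect(byte[] header) {
        if (startsWith(header, 0xFF, 0xD8, 0xFF)) {
            return Optional.of("jpg/jpeg");
        }
        if (startsWith(header, 0x47, 0x49, 0x46, 0x38, 0x37, 0x61)) {
            return Optional.of("GIF87a");
        }
        if (startsWith(header, 0x47, 0x49, 0x46, 0x38, 0x39, 0x61)) {
            return Optional.of("GIF89a");
        }
        return Optional.empty();
    }

    /** Reads at most six bytes from the file and classifies them. */
    public static Optional<String> detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            // readNBytes returns fewer than six bytes if the file ends early
            return detect(in.readNBytes(6));
        }
    }

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) { // mask so the byte compares as 0..255
                return false;
            }
        }
        return true;
    }
}
```

Because the class only uses the JDK, it can also be called directly from Scala, for example `ImageSignature.detect(file.toPath)`.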

Answer 1: (score: 0)

if (param.readInt() == 0xFFd8FFe0 | param.readInt() == 0xFFd8FFe1 )
  Some("jpg/jpeg")
if(param.readInt() == 0x474946383961L)
  Some("gif")

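You keep reading another int from param on every comparison. Put it in a variable:

val magic = param.readInt()
if (magic == 0xFFd8FFe0 || magic == 0xFFd8FFe1)
  Some("jpg/jpeg")
else if (magic == 0x474946383961L)
  Some("gif")
else None

However, as DawoodIbnKareem points out, magic == 0x474946383961L will never be true, because the constant is a long value outside the int range. You therefore need to read more data to match against it.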
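
One way to "read more data" is to read six bytes and pack them into a long before comparing. The sketch below is an illustration only: the class `GifMagic` and the method `readSixBytes` are hypothetical names. It relies on the documented behaviour of `DataInputStream.readInt` and `readUnsignedShort`, which read the high byte first.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper, not part of any library.
final class GifMagic {
    /** ASCII "GIF89a" as a 48-bit value. */
    public static final long GIF89A = 0x474946383961L;

    /** Reads six bytes, high byte first, into the low 48 bits of a long. */
    public static long readSixBytes(DataInputStream in) throws IOException {
        long high = in.readInt() & 0xFFFFFFFFL; // first four bytes, treated as unsigned
        long low = in.readUnsignedShort();      // next two bytes
        return (high << 16) | low;
    }
}
```

With this, a comparison such as `GifMagic.readSixBytes(param) == GifMagic.GIF89A` can actually succeed, since both sides are long values. The JPEG check would then need to look at the top bytes of the same value rather than call `readInt` again.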

Answer 2: (score: 0)

In addition to the first answer, you can also convert a hex string to an integer as follows:

Or, for larger numbers:

String hex = "aa";
int value = Integer.parseInt(hex, 16);
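
For signatures that do not fit in a signed int, `Integer.parseInt` throws a `NumberFormatException`, so a different parser is needed. The sketch below shows two options from the JDK; the class `HexMagic` and its method names are hypothetical, and `Integer.parseUnsignedInt` assumes Java 8 or later.

```java
// Hypothetical helper, not part of any library.
final class HexMagic {

    /** Parses up to 8 hex digits into an int with the same bit pattern as the 0x... literal. */
    public static int toInt(String hex) {
        return Integer.parseUnsignedInt(hex, 16);
    }

    /** Parses longer signatures; any value of up to 15 hex digits fits in a long. */
    public static long toLong(String hex) {
        return Long.parseLong(hex, 16);
    }
}
```

For example, `HexMagic.toInt("FFD8FFE0")` yields the same int as the literal `0xFFD8FFE0`, and `HexMagic.toLong("474946383961")` yields `0x474946383961L`.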