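I am working with GET requests made using the Apache HTTP client (v4, the latest version, not the older v3).
How do I get the MIME type of the response?
In the older v3 of the Apache HTTP client, the MIME type was obtained with the following code: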
String mimeType = response.getMimeType();
How do I get the MIME type with v4 of the Apache HTTP client?
Answer 0 (score: 30)
To get the content type from the response, you can use the ContentType class.
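HttpEntity entity = response.getEntity();
// Initialize to null so the variable is definitely assigned when there is no entity.
ContentType contentType = null;
if (entity != null) {
    // ContentType.get returns null if the entity carries no Content-Type.
    contentType = ContentType.get(entity);
}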
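With this class you can easily extract the MIME type (check contentType for null first, since the entity may be missing or carry no Content-Type):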
String mimeType = contentType.getMimeType();
or the charset:
Charset charset = contentType.getCharset();
Answer 1 (score: 17)
The "Content-Type" HTTP header should give you the MIME type information:
Header contentType = response.getFirstHeader("Content-Type");
or
Header contentType = response.getEntity().getContentType();
You can then extract the MIME type itself, since the content type may also include the encoding:
String mimeType = contentType.getValue().split(";")[0].trim();
Of course, don't forget a null check before reading the header value (in case the server did not send a Content-Type header).
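As a minimal sketch of the manual approach with the null check folded in, the helper below works on the raw header value only, so it assumes nothing about the HTTP client in use. The names ContentTypeValues and mimeTypeOf are hypothetical, chosen for illustration; they are not part of Apache HttpClient.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper; not part of Apache HttpClient. */
final class ContentTypeValues {
  private ContentTypeValues() {}

  /**
   * Extracts "type/subtype" from a raw Content-Type header value,
   * dropping parameters such as charset. Returns empty when the
   * header is missing or has no media type before the parameters.
   */
  static Optional<String> mimeTypeOf(final String headerValue) {
    if (headerValue == null) {
      return Optional.empty();
    }

    final int semicolon = headerValue.indexOf(';');
    final String mimeType =
      (semicolon == -1 ? headerValue : headerValue.substring(0, semicolon))
        .trim()
        .toLowerCase(Locale.ROOT);

    return mimeType.isEmpty() ? Optional.empty() : Optional.of(mimeType);
  }
}
```

With Apache HttpClient 4, the argument would be something like header == null ? null : header.getValue(), where header is the result of response.getFirstHeader("Content-Type").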
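Answer 2 (score: 0)
Note that Apache's ContentType treats SVG as application/svg+xml rather than the IANA-defined image/svg+xml, which appears to be an incorrect classification.
Although this answer does not address the question directly, it offers an alternative to Apache's HTTP client: Java's own HTTP client (java.net.http, available since Java 11). The sample code also uses a HEAD request rather than a GET request, which is a lighter operation; uses a MediaType enum instead of strings to add strong typing to media types; and defines more media types than ContentType does, especially for images. Without further ado, here are some Java source files that may help.
The base enum encodes IANA media types. If you add more official encodings, please update this answer so that everyone benefits. Note that R Markdown, R XML, and YAML are not officially defined, so you may wish to remove them.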
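import java.io.File;

import static org.apache.commons.io.FilenameUtils.getExtension;
// Also required, using your own package name (omitted in this answer):
// static imports of MediaType.TypeName.* and MediaTypeExtensions.getMediaType.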
public enum MediaType {
APP_JAVA_OBJECT(
APPLICATION, "x-java-serialized-object"
),
FONT_OTF( "otf" ),
FONT_TTF( "ttf" ),
IMAGE_APNG( "apng" ),
IMAGE_ACES( "aces" ),
IMAGE_AVCI( "avci" ),
IMAGE_AVCS( "avcs" ),
IMAGE_BMP( "bmp" ),
IMAGE_CGM( "cgm" ),
IMAGE_DICOM_RLE( "dicom_rle" ),
IMAGE_EMF( "emf" ),
IMAGE_EXAMPLE( "example" ),
IMAGE_FITS( "fits" ),
IMAGE_G3FAX( "g3fax" ),
IMAGE_GIF( "gif" ),
IMAGE_HEIC( "heic" ),
IMAGE_HEIF( "heif" ),
IMAGE_HEJ2K( "hej2k" ),
IMAGE_HSJ2( "hsj2" ),
IMAGE_X_ICON( "x-icon" ),
IMAGE_JLS( "jls" ),
IMAGE_JP2( "jp2" ),
IMAGE_JPEG( "jpeg" ),
IMAGE_JPH( "jph" ),
IMAGE_JPHC( "jphc" ),
IMAGE_JPM( "jpm" ),
IMAGE_JPX( "jpx" ),
IMAGE_JXR( "jxr" ),
IMAGE_JXRA( "jxrA" ),
IMAGE_JXRS( "jxrS" ),
IMAGE_JXS( "jxs" ),
IMAGE_JXSC( "jxsc" ),
IMAGE_JXSI( "jxsi" ),
IMAGE_JXSS( "jxss" ),
IMAGE_KTX( "ktx" ),
IMAGE_KTX2( "ktx2" ),
IMAGE_NAPLPS( "naplps" ),
IMAGE_PNG( "png" ),
IMAGE_SVG_XML( "svg+xml" ),
IMAGE_T38( "t38" ),
IMAGE_TIFF( "tiff" ),
IMAGE_WEBP( "webp" ),
IMAGE_WMF( "wmf" ),
TEXT_HTML( TEXT, "html" ),
TEXT_MARKDOWN( TEXT, "markdown" ),
TEXT_PLAIN( TEXT, "plain" ),
TEXT_R_MARKDOWN( TEXT, "R+markdown" ),
TEXT_R_XML( TEXT, "R+xml" ),
TEXT_YAML( TEXT, "yaml" ),
UNDEFINED( TypeName.UNDEFINED, "undefined" );
/**
* The IANA-defined types.
*/
public enum TypeName {
APPLICATION,
IMAGE,
TEXT,
UNDEFINED
}
/**
* The fully qualified IANA-defined media type.
*/
private final String mMediaType;
/**
* The IANA-defined type name.
*/
private final TypeName mTypeName;
/**
* The IANA-defined subtype name.
*/
private final String mSubtype;
/**
* Constructs an instance using the default type name of "image".
*
* @param subtype The image subtype name.
*/
MediaType( final String subtype ) {
this( IMAGE, subtype );
}
/**
* Constructs an instance using an IANA-defined type and subtype pair.
*
* @param typeName The media type's type name.
* @param subtype The media type's subtype name.
*/
MediaType( final TypeName typeName, final String subtype ) {
mTypeName = typeName;
mSubtype = subtype;
mMediaType = typeName.toString().toLowerCase() + '/' + subtype;
}
/**
* Returns the {@link MediaType} associated with the given file.
*
* @param file Has a file name that may contain an extension associated with
* a known {@link MediaType}.
* @return {@link MediaType#UNDEFINED} if the extension has not been
* assigned, otherwise the {@link MediaType} associated with this
* {@link File}'s file name extension.
*/
public static MediaType valueFrom( final File file ) {
return valueFrom( file.getName() );
}
/**
* Returns the {@link MediaType} associated with the given file name.
*
* @param filename The file name that may contain an extension associated
* with a known {@link MediaType}.
* @return {@link MediaType#UNDEFINED} if the extension has not been
* assigned, otherwise the {@link MediaType} associated with this
* URL's file name extension.
*/
public static MediaType valueFrom( final String filename ) {
return getMediaType( getExtension( filename ) );
}
/**
* Returns the {@link MediaType} for the given type and subtype names.
*
* @param type The IANA-defined type name.
* @param subtype The IANA-defined subtype name.
* @return {@link MediaType#UNDEFINED} if there is no {@link MediaType} that
* matches the given type and subtype names.
*/
public static MediaType valueFrom(
final String type, final String subtype ) {
for( final var mediaType : MediaType.values() ) {
if( mediaType.equals( type, subtype ) ) {
return mediaType;
}
}
return UNDEFINED;
}
/**
* Answers whether the given type and subtype names equal this enumerated
* value. This performs a case-insensitive comparison.
*
* @param type The type name to compare against this {@link MediaType}.
* @param subtype The subtype name to compare against this {@link MediaType}.
* @return {@code true} when the type and subtype name match.
*/
public boolean equals( final String type, final String subtype ) {
return mTypeName.name().equalsIgnoreCase( type ) &&
mSubtype.equalsIgnoreCase( subtype );
}
/**
* Answers whether the given {@link TypeName} matches this type name.
*
* @param typeName The {@link TypeName} to compare against the internal value.
* @return {@code true} if the given value is the same IANA-defined type name.
*/
public boolean isType( final TypeName typeName ) {
return mTypeName == typeName;
}
/**
* Returns the IANA-defined type and sub-type.
*
* @return The unique media type identifier.
*/
@Override
public String toString() {
return mMediaType;
}
/**
* Used by {@link MediaTypeExtensions} to initialize associations where the
* subtype name and the file name extension have a 1:1 mapping.
*
* @return The IANA subtype value.
*/
String getSubtype() {
return mSubtype;
}
}
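Different file name extensions map to different media types. A mapping from an extension to a MediaType does not necessarily mean that the content matches the expected media type. Applications must take care to read the file header to determine the actual media type.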
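import java.util.Set;

import static java.util.Set.of;
// Also required, using your own package name (omitted in this answer):
// a static import of the MediaType constants (MediaType.*).
enum MediaTypeExtensions {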
MEDIA_FONT_OTF( FONT_OTF ),
MEDIA_FONT_TTF( FONT_TTF ),
MEDIA_IMAGE_APNG( IMAGE_APNG ),
MEDIA_IMAGE_BMP( IMAGE_BMP ),
MEDIA_IMAGE_GIF( IMAGE_GIF ),
MEDIA_IMAGE_ICO( IMAGE_X_ICON, of( "ico", "cur" ) ),
MEDIA_IMAGE_JPEG( IMAGE_JPEG, of( "jpg", "jpeg", "jfif", "pjpeg", "pjp" ) ),
MEDIA_IMAGE_PNG( IMAGE_PNG ),
MEDIA_IMAGE_SVG( IMAGE_SVG_XML, of( "svg" ) ),
MEDIA_IMAGE_TIFF( IMAGE_TIFF, of( "tif", "tiff" ) ),
MEDIA_IMAGE_WEBP( IMAGE_WEBP ),
MEDIA_TEXT_MARKDOWN( TEXT_MARKDOWN, of(
"md", "markdown", "mdown", "mdtxt", "mdtext", "mdwn", "mkd", "mkdown",
"mkdn" ) ),
MEDIA_TEXT_PLAIN( TEXT_PLAIN, of( "asc", "ascii", "txt", "text", "utxt" ) ),
MEDIA_TEXT_R_MARKDOWN( TEXT_R_MARKDOWN, of( "Rmd" ) ),
MEDIA_TEXT_R_XML( TEXT_R_XML, of( "Rxml" ) ),
MEDIA_TEXT_YAML( TEXT_YAML, of( "yaml", "yml" ) );
private final MediaType mMediaType;
private final Set<String> mExtensions;
MediaTypeExtensions( final MediaType mediaType ) {
this( mediaType, of( mediaType.getSubtype() ) );
}
MediaTypeExtensions(
final MediaType mediaType, final Set<String> extensions ) {
assert mediaType != null;
assert extensions != null;
assert !extensions.isEmpty();
mMediaType = mediaType;
mExtensions = extensions;
}
static MediaType getMediaType( final String extension ) {
final var sanitized = sanitize( extension );
for( final var mediaType : MediaTypeExtensions.values() ) {
if( mediaType.isType( sanitized ) ) {
return mediaType.getMediaType();
}
}
return UNDEFINED;
}
private boolean isType( final String sanitized ) {
for( final var extension : mExtensions ) {
if( extension.equalsIgnoreCase( sanitized ) ) {
return true;
}
}
return false;
}
private static String sanitize( final String extension ) {
return extension == null ? "" : extension.toLowerCase();
}
private MediaType getMediaType() {
return mMediaType;
}
}
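The caution about reading the file header can be illustrated with a rough sketch that checks the leading "magic number" bytes of a file instead of trusting its extension. This is an assumption-laden illustration, not part of the answer's code: the class name MagicNumbers and method sniff are hypothetical, only three well-known image signatures are covered, and the result is a plain media type string rather than a MediaType value.

```java
import java.util.Arrays;
import java.util.Optional;

/** Hypothetical helper that inspects leading bytes instead of the extension. */
final class MagicNumbers {
  // Well-known file signatures for three common image formats.
  private static final byte[] PNG =
    {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
  private static final byte[] GIF = {'G', 'I', 'F', '8'};
  private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

  private MagicNumbers() {}

  /**
   * Guesses a media type from the first bytes of a file.
   * Returns empty when no known signature matches.
   */
  static Optional<String> sniff(final byte[] head) {
    if (startsWith(head, PNG)) {
      return Optional.of("image/png");
    }
    if (startsWith(head, GIF)) {
      return Optional.of("image/gif");
    }
    if (startsWith(head, JPEG)) {
      return Optional.of("image/jpeg");
    }
    return Optional.empty();
  }

  /** Answers whether data begins with every byte of prefix. */
  private static boolean startsWith(final byte[] data, final byte[] prefix) {
    return data != null
      && data.length >= prefix.length
      && Arrays.equals(data, 0, prefix.length, prefix, 0, prefix.length);
  }
}
```

The input could come from reading the first eight bytes of a file, for example with InputStream.readNBytes(8) on Java 11 or later; a mismatch between the sniffed type and the extension-based MediaType is a hint that the extension is misleading.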
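Finally, we can write a small parser that converts the Content-Type header into a MediaType value. Note that the HttpClient API itself appeared to perform a case-sensitive comparison on header names, so methods such as firstValue or allValues could not be used, because we do not know whether the server will return "Content-Type" or "content-type". Strictly speaking that would be a bug, since RFC 2616 states that message header names are case-insensitive. (The Javadoc for java.net.http.HttpHeaders describes name lookups as ignoring case, so this workaround may be unnecessary on current JDKs; verify against the version you use.)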
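import java.net.MalformedURLException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;

import static java.net.http.HttpClient.Redirect.NORMAL;
import static java.net.http.HttpRequest.BodyPublishers.noBody;
import static java.net.http.HttpResponse.BodyHandlers.discarding;
import static java.time.Duration.ofSeconds;
// Also required, using your own package name (omitted in this answer):
// a static import of MediaType.UNDEFINED.
public class HttpMediaType {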
private final static HttpClient HTTP_CLIENT = HttpClient
.newBuilder()
.connectTimeout( ofSeconds( 5 ) )
.followRedirects( NORMAL )
.build();
/**
* Performs an HTTP HEAD request to determine the media type based on the
* Content-Type header returned from the server.
*
* @param uri Determine the media type for this resource.
* @return The data type for the resource or {@link MediaType#UNDEFINED} if
* unmapped.
* @throws MalformedURLException The {@link URI} could not be converted to
* a {@link URL}.
*/
public static MediaType valueFrom( final URI uri )
throws MalformedURLException {
final var mediaType = new MediaType[]{UNDEFINED};
try {
final var request = HttpRequest
.newBuilder( uri )
.method( "HEAD", noBody() )
.build();
final var response = HTTP_CLIENT.send( request, discarding() );
final var headers = response.headers();
final var map = headers.map();
map.forEach( ( key, values ) -> {
if( "Content-Type".equalsIgnoreCase( key ) ) {
var header = values.get( 0 );
// Trim off the character encoding.
var i = header.indexOf( ';' );
header = header.substring( 0, i == -1 ? header.length() : i );
// Split the type and subtype.
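i = header.indexOf( '/' );
// Without a slash there is no subtype to look up; leave the value UNDEFINED
// rather than relying on substring throwing an exception.
if( i != -1 ) {
final var type = header.substring( 0, i ).trim();
final var subtype = header.substring( i + 1 ).trim();
mediaType[ 0 ] = MediaType.valueFrom( type, subtype );
}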
}
} );
} catch( final Exception ex ) {
// TODO: Inform the user?
}
return mediaType[ 0 ];
}
}
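For comparison, here is a shorter sketch that relies on the case-insensitive lookup described in the java.net.http.HttpHeaders Javadoc, replacing the loop over headers.map(). It is offered under that assumption and is worth confirming on your JDK. The class name HeaderLookup and method mediaTypeOf are hypothetical, and the headers are built by hand with HttpHeaders.of to stand in for response.headers() so that the example runs without a network call.

```java
import java.net.http.HttpHeaders;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper built on java.net.http.HttpHeaders. */
final class HeaderLookup {
  private HeaderLookup() {}

  /** Returns "type/subtype" from the Content-Type header, if present. */
  static Optional<String> mediaTypeOf(final HttpHeaders headers) {
    return headers
      .firstValue("content-type")
      // Keep only the part before the first parameter, e.g. "; charset=...".
      .map(value -> value.split(";", 2)[0].trim())
      .filter(value -> !value.isEmpty());
  }

  public static void main(final String[] args) {
    // Hand-built headers standing in for response.headers().
    final HttpHeaders headers = HttpHeaders.of(
      Map.of("Content-Type", List.of("image/svg+xml; charset=UTF-8")),
      (name, value) -> true);

    System.out.println(mediaTypeOf(headers).orElse("undefined"));
  }
}
```

The resulting string can then be split on '/' and passed to MediaType.valueFrom( type, subtype ) as in the class above.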