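We have deployed DSpace 6.3 to Google Kubernetes Engine (GKE), and the deployment had been running well. However, when we upgraded GKE from v1.12.7-gke.24 to 1.14.10-gke.50, the container suddenly began to fail. The k8s version change is the only difference between the working k8s nodes and the failing ones. A locally built Docker container works fine. Our other DSpace modules are deployed in separate containers (e.g. solr) that work fine; only the jspui module fails.
DSpace branch "dspace-6_x", tag "dspace-6.3"
Docker image: tomcat:8-alpine
Deployed via a GitLab CI/CD pipeline
The failure is caused by the Spring loader falling over during the eager loading of the various DSpace factory service singleton beans. This results in a 404 error when loading the website, because the web app fails to initialise.
Error message in /usr/local/tomcat/log/localhost.YYYY-MM-dd.log: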
28-Oct-2020 23:47:18.668 SEVERE [localhost-startStop-1] org.apache.catalina.core.StandardContext.listenerStart
Exception sending context initialized event to listener instance of class
[org.dspace.servicemanager.servlet.DSpaceKernelServletContextListener]
java.lang.RuntimeException: Failure during filter init: Failed to startup the DSpace Service
Manager: failure starting up spring service manager: Error creating bean with name
'org.dspace.app.sherpa.submit.SHERPASubmitService' defined in URL
[jar:file:/dspace/webapps/jspui/WEB-INF/lib/dspace-api-6.3.jar!/spring/spring-dspace-addon-sherpa-services.xml]:
Cannot resolve reference to bean 'org.dspace.app.sherpa.submit.SHERPASubmitConfigurationService' while setting
bean property 'configuration'; nested exception is org.springframework.beans.factory.BeanCreationException:
Error creating bean with name 'org.dspace.app.sherpa.submit.SHERPASubmitConfigurationService' defined in
file [/dspace/config/spring/api/sherpa.xml]: Cannot create inner bean
'org.dspace.app.sherpa.submit.MetadataValueISSNExtractor#1b511285' of type
[org.dspace.app.sherpa.submit.MetadataValueISSNExtractor] while setting bean property
'issnItemExtractors' with key [0]; nested exception is
org.springframework.beans.factory.BeanCreationException: Error creating bean with name
'org.dspace.app.sherpa.submit.MetadataValueISSNExtractor#1b511285': Injection of autowired dependencies
failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire
field: public org.dspace.content.service.ItemService
org.dspace.app.sherpa.submit.MetadataValueISSNExtractor.itemService; nested exception is
org.springframework.beans.factory.BeanCreationException: Error creating bean with name
'org.dspace.content.ItemServiceImpl#0': Injection of autowired dependencies failed; nested exception is
org.springframework.beans.factory.BeanCreationException: Could not autowire field: protected
org.dspace.handle.service.HandleService org.dspace.content.DSpaceObjectServiceImpl.handleService; ...
The "failure starting up spring service manager" error message is thrown at:
org.dspace.servicemanager.DSpaceServiceManager (\dspace-services\src\main\java\org\dspace\servicemanager\DSpaceServiceManager.java, line 215)
This is in the catch clause for the statement on line 212, which calls:
org.dspace.servicemanager.spring.SpringServiceManager.startup() (\dspace-services\src\main\java\org\dspace\servicemanager\spring\SpringServiceManager.java, line 177)
which uses the Spring framework to eagerly load the factory beans.
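The logged message is a long chain of nested exceptions that is cut off with "..." before the innermost cause. A minimal sketch of walking such a chain to its root cause is shown here. It is not DSpace or Spring code; the class name RootCauseFinder and the sample messages in main are hypothetical, and only the standard Throwable.getCause() call is used.

```java
import java.util.ArrayList;
import java.util.List;

public class RootCauseFinder {

    /** Returns the exceptions from outermost to innermost, stopping if a cycle is detected. */
    public static List<Throwable> causeChain(Throwable t) {
        List<Throwable> chain = new ArrayList<>();
        while (t != null && !chain.contains(t)) {
            chain.add(t);
            t = t.getCause();
        }
        return chain;
    }

    /** Returns the innermost exception in the chain, or null if t is null. */
    public static Throwable rootCause(Throwable t) {
        List<Throwable> chain = causeChain(t);
        return chain.isEmpty() ? null : chain.get(chain.size() - 1);
    }

    public static void main(String[] args) {
        // Hypothetical three-level chain standing in for the nested bean creation errors.
        Throwable inner = new IllegalStateException("innermost failure");
        Throwable middle = new RuntimeException("Error creating bean", inner);
        Throwable outer = new RuntimeException("failure starting up spring service manager", middle);

        for (Throwable t : causeChain(outer)) {
            System.out.println(t.getClass().getName() + ": " + t.getMessage());
        }
        System.out.println("Root cause: " + rootCause(outer).getMessage());
    }
}
```

Logging only the last element of the chain keeps the relevant line from being buried in the wrapped messages.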
Our first thought was that the new k8s version might need more memory, so we increased Tomcat's memory from 1.5GB to 4GB. This did not fix the problem.
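When changing the heap size, it can help to confirm what the JVM inside the pod actually sees, since the heap set through CATALINA_OPTS has to fit under the container's limits.memory. The following is only a diagnostic sketch using standard Runtime calls; the class name JvmResourceReport is hypothetical and it says nothing about the Spring failure itself.

```java
public class JvmResourceReport {

    /** Maximum heap the JVM will attempt to use, in MiB. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Number of processors the JVM believes are available to it. */
    public static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** One-line summary suitable for a startup log. */
    public static String report() {
        return "maxHeapMiB=" + maxHeapMiB() + " processors=" + processors();
    }

    public static void main(String[] args) {
        // Run inside the container with the same options that Tomcat receives.
        System.out.println(report());
    }
}
```

Comparing the output on a working node and a failing node would show whether the upgrade changed the resources visible to the JVM.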
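We have gone through the release notes for the intermediate GKE versions between the two releases, but nothing there helped.
We tried other Tomcat Docker images, to no avail, so we don't think it is related to the operating system.
Remote debugging cannot attach to Tomcat quickly enough to catch the exception. We tried the Google Cloud Debugger for Java, but Alpine Linux is missing some of the required libraries. In any case, I don't think we would find anything more useful than the logged error message.
If anyone has any ideas, we would be very grateful.
Our production k8s configuration YAML file: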
ingress:
  hosts:
    - our.url.uts.edu.au
database:
  secret: our_password
  name: our_db_name
  host: "our.db.instance.url"
  port: "5432"
dspace:
  env:
    - name: DSPACE_HOSTNAME
      value: our.url.uts.edu.au
    - name: SOLR_PORT
      value: "8080"
    # Include colon if port is specified
    - name: DSPACE_PORT
      value: ""
    - name: MAX_DB_CONNECTIONS
      value: "50"
    - name: "MAX_IDLE_DB_CONNECTIONS"
      value: "30"
    - name: INITIAL_DB_CONNECTIONS
      value: "20"
    - name: S3_ASSETSTORE_SUBFOLDER
      value: "our_folder"
    - name: S3_CONNECTION_TTL
      value: "120000"
    - name: S3_MAX_CONNECTIONS
      value: "50"
    - name: REST_EVENT_WEBHOOK_URL
      value: http://our.rest.service.url/dspace/v2/webhook
    - name: UTSLIB_FRAMEWORK_DSPACE_TOKEN
      value: OUR_TOKEN
    - name: CATALINA_OPTS
      value: "-Xms1512m -Xmx1512m"
  resources:
    requests:
      memory: "1640Mi"
      cpu: 100m
    limits:
      memory: "1896Mi"
      cpu: "450m"
solr:
  pvc:
    accessModes:
      - ReadWriteOnce
    annotations: {}
    size: 35Gi
  env:
    - name: CATALINA_OPTS
      value: "-Xms3904m -Xmx3904m -XX:+UseG1GC"
  resources:
    requests:
      memory: "4032Mi"
      cpu: 50m
    limits:
      memory: "4096Mi"
      cpu: "800m"
cron:
  env:
    - name: SOLR_PORT
      value: "8080"
    - name: MAX_DB_CONNECTIONS
      value: "3"
    - name: MAX_IDLE_DB_CONNECTIONS
      value: "1"
    - name: INITIAL_DB_CONNECTIONS
      value: "0"
    - name: S3_ASSETSTORE_SUBFOLDER
      value: "our_folder"
    - name: S3_CONNECTION_TTL
      value: "120000"
    - name: S3_MAX_CONNECTIONS
      value: "50"
    - name: JAVA_OPTS
      value: "-Xms32m -Xmx384m"
    - name: REST_EVENT_WEBHOOK_URL
      value: http://our.rest.service.url/dspace/v2/webhook
    - name: UTSLIB_FRAMEWORK_DSPACE_TOKEN
      value: OUR_TOKEN
Our Dockerfile is split into build and runtime stages. Dockerfile.build:
FROM maven:3-jdk-8
# Modules that should be excluded from dependency resolution
ARG EXCLUDE_MODULES=!dspace-rdf,!dspace-sword,!dspace-xmlui,!dspace-xmlui-mirage2
ENV DSPACE_VERSION=6.3 \
DSPACE_SHA1=e60db8dee2726933fcc7b7949c16757a510a79c5
ENV ANT_VERSION=1.10.8
ENV ANT_HOME=/opt/ant-$ANT_VERSION
ENV PATH=$ANT_HOME/bin:$PATH \
ANT_SHA1=20658b765bed8a7c3d18daa71a108e15d1937da2
WORKDIR /dspace-src
# Download DSpace source and install Ant
RUN curl -fSL "https://github.com/DSpace/DSpace/releases/download/dspace-${DSPACE_VERSION}/dspace-${DSPACE_VERSION}-src-release.tar.gz" -o dspace.tar.gz && \
echo "${DSPACE_SHA1} *dspace.tar.gz" | sha1sum -c - && \
tar -xz -f dspace.tar.gz --strip-components=1 && \
rm -f dspace.tar.gz && \
curl -fSL "https://archive.apache.org/dist/ant/binaries/apache-ant-${ANT_VERSION}-bin.tar.gz" -o ant.tar.gz && \
echo "${ANT_SHA1} *ant.tar.gz" | sha1sum -c - && \
mkdir ${ANT_HOME} && \
tar -xz -f ant.tar.gz -C ${ANT_HOME} --strip-components=1 && \
rm -rf ant.tar.gz
# Copy in custom artifacts
COPY ./src/artifacts/ ./artifacts
# Copy in pom.xml files
COPY ./src/dspace/pom.xml ./dspace/
COPY ./src/dspace/modules/pom.xml ./dspace/modules/
COPY ./src/dspace/modules/jspui/pom.xml ./dspace/modules/jspui/
COPY ./src/dspace/modules/utslib-copyright/pom.xml ./dspace/modules/utslib-copyright/
COPY ./src/dspace/modules/utslib-taglib/pom.xml ./dspace/modules/utslib-taglib/
# Install custom artifacts and prime the Maven repository
RUN mvn clean install --batch-mode --fail-never -f ./artifacts/JRis-master && \
mvn install -P ${EXCLUDE_MODULES} --batch-mode --fail-never -T 5
Dockerfile.runtime:
ARG BUILD_IMAGE=our.git.url/dspace/build:latest
FROM ${BUILD_IMAGE} as build
# Copy in our source changes
COPY ./src/dspace ./dspace
# We don't use these modules, but they'll be built anyway if not excluded
ARG EXCLUDE_MODULES=!dspace-rdf,!dspace-xmlui,!dspace-sword
# Unzip the MaxMind GeoLite database (IP location stuff for Solr).
# (MaxMind changed their privacy policy so you now have to login to download,
# which makes it fail for the standard DSpace installation)
# Build dspace with our source changes and move it to the installation directory
# Build only our customisations (skip building the specified modules)
# Could multithread the Maven build, but there are dependency resolution problems
RUN tar -zxf ./dspace/config/GeoLite2-City_20191224.tar.gz --strip-components=1 -C ./dspace/config && \
rm ./dspace/config/GeoLite2-City_20191224.tar.gz && \
mvn package --batch-mode -P ${EXCLUDE_MODULES} -f ./dspace/pom.xml && \
cd ./dspace/target/dspace-installer && \
ant copy_webapps install_code
FROM tomcat:8-alpine
#FROM tomcat:8-jre8
ARG DSPACE_INSTALL_DIR=/dspace
ENV DSPACE_HOME=${DSPACE_INSTALL_DIR}
# Copy built source into this image
COPY --from=build ${DSPACE_INSTALL_DIR} ${DSPACE_INSTALL_DIR}
# Copy in our config overrides
# (These are not used in compilation, but are applied at runtime)
COPY ./src/local.cfg ${DSPACE_INSTALL_DIR}/config/
# Symlink all webapps and create temp upload directory
RUN ln -s ${DSPACE_INSTALL_DIR}/webapps/* ./webapps/
Answer 0 (score: 0)
After enabling the most verbose logging level in DSpace and Tomcat, more information on the source of the Spring error became available.
The problem was in one of our custom factory classes, which the more verbose error log identified.
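The property in question is writable, with a valid getter and setter, both of type long. I removed the code that sets the property and just left it at its default value. The deployment then worked.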
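One way to double-check that a bean property really is writable is to ask the standard JavaBeans introspector, which Spring also relies on when it sets bean properties. The sketch below is an assumption-laden illustration: ExampleFactory and its timeout property are hypothetical stand-ins, not our actual factory class.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

public class PropertyCheck {

    /** Hypothetical stand-in for a custom factory bean. */
    public static class ExampleFactory {
        private long timeout;

        public long getTimeout() { return timeout; }

        public void setTimeout(long timeout) { this.timeout = timeout; }

        // Getter without a setter: a read-only property.
        public String getName() { return "example"; }
    }

    /** Returns the descriptor for the named property, or null if the bean has no such property. */
    public static PropertyDescriptor find(Class<?> type, String name) {
        try {
            BeanInfo info = Introspector.getBeanInfo(type);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getName().equals(name)) {
                    return pd;
                }
            }
            return null;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + type.getName(), e);
        }
    }

    /** True if the property exists and the introspector found a setter for it. */
    public static boolean isWritable(Class<?> type, String name) {
        PropertyDescriptor pd = find(type, name);
        return pd != null && pd.getWriteMethod() != null;
    }

    public static void main(String[] args) {
        System.out.println("timeout writable: " + isWritable(ExampleFactory.class, "timeout"));
        System.out.println("name writable: " + isWritable(ExampleFactory.class, "name"));
    }
}
```

Running the same check against the real class inside the failing pod would show whether the introspector sees the setter there too.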
How a simple k8s version upgrade could cause this error is beyond us. Exactly the same code runs fine in pods on the previous GKE version.