What besides Spring's @Transactional annotation can cause a massive 15-minute slowdown?
Edit: information added at the end.
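I have a web app that occasionally becomes extremely slow. Parts of the site are driven by dynamic JSON. Most of the time the site works well, and I want to emphasize that fact: the site is not always slow. The site is always up, but during these episodes it is not much use. Sometimes the site seems to fall asleep and does not render properly because the JSON requests never return. If I open additional tabs or windows to the site, requests for static resources get fast responses, but any request that needs data from the database hangs. If I leave the browser open and go to lunch, the site is back to normal when I return. Once the site "wakes up", it keeps working at normal speed for the rest of the day.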
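To find the cause of this odd slowness, I started using the Dropwizard / Codahale Metrics library.
I have added metrics to most of the site's back end. I recently observed the slow behavior again and tried to pin down the cause.
I believe Spring's @Transactional annotation occasionally adds an extra 931 seconds to a method call. I realize that, if this is true, the slowdown is probably due less to a bug or bad coding in Spring than to a misconfiguration on my part. The numbers look so far off (15 minutes?) that I need Stack Overflow's help.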
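4 questions:
If @Transactional is taking all that time, do you have suggestions for how to fix it?
I captured these metrics this morning, but the slowdown itself happened a few days ago. The "max" values are what I am specifically concerned about.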
"example.web.controller.CategoryGroupController.listJsonGroups":{"count":76,
"max":931812.646,"mean":79.981,"min":79.981,"p50":79.981,"p75":79.981,"p95":79.981,
"p98":79.981,"p99":79.981,"p999":79.981,"stddev":0.0,"m15_rate":4.121759882047415E-4,
"m1_rate":5.778884146644216E-9,"m5_rate":1.7016008277662782E-4,"mean_rate":1.4308967477371374E-4,
"duration_units":"milliseconds","rate_units":"calls/second"},
"example.web.service.CategoryGroupServiceImpl.findByCategoryGroupId":{"count":76,
"max":931812.091,"mean":79.86399999999999,"min":79.86399999999999,"p50":79.86399999999999,
"p75":79.86399999999999,"p95":79.86399999999999,"p98":79.86399999999999,"p99":79.86399999999999,
"p999":79.86399999999999,"stddev":0.0,"m15_rate":4.121759882047415E-4,
"m1_rate":5.778884146644216E-9,"m5_rate":1.7016008277662782E-4,"mean_rate":1.4499753290885513E-4,
"duration_units":"milliseconds","rate_units":"calls/second"},
"example.web.service.CategoryGroupServiceImpl.getDao":{"count":0,
"max":0.0,"mean":0.0,"min":0.0,"p50":0.0,"p75":0.0,"p95":0.0,"p98":0.0,"p99":0.0,"p999":0.0,
"stddev":0.0,"m15_rate":0.0,"m1_rate":0.0,"m5_rate":0.0,"mean_rate":0.0,
"duration_units":"milliseconds","rate_units":"calls/second"},
"example.web.dao.CategoryGroupDao.findByCategoryGroupId":{"count":76,
"max":10.404,"mean":6.916000069494177,"min":6.9159999999999995,"p50":6.9159999999999995,
"p75":6.9159999999999995,"p95":6.9159999999999995,"p98":6.9159999999999995,
"p99":6.9159999999999995,"p999":6.9159999999999995,"stddev":7.267429203455992E-5,
"m15_rate":4.8689221587514396E-6,"m1_rate":7.154074307612488E-40,"m5_rate":1.1717927914268928E-10,
"mean_rate":1.4546355576766556E-4,"duration_units":"milliseconds","rate_units":"calls/second"}
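The controller layer calls the service layer, which in turn calls the DAO layer.
According to the metrics above, the controller layer and the service layer are both slow, but the DAO layer is fast.
The trouble is that the service layer does not really do anything, so it makes no sense for it to be slow unless the @Transactional annotation is slowing it down.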
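To put the "max" values in perspective, here is a small sketch of the arithmetic. The class and method names are hypothetical; the only inputs are the figures copied from the metrics dump. Note that the individual "max" readings are not guaranteed to come from the same request, so the differences are only a rough indication of where the time went.

```java
import java.time.Duration;

public class MaxLatencySketch {

    // "max" values in milliseconds, copied from the metrics dump above.
    public static final double CONTROLLER_MAX_MS = 931812.646;
    public static final double SERVICE_MAX_MS = 931812.091;
    public static final double DAO_MAX_MS = 10.404;

    /** Formats a millisecond reading as whole minutes and seconds (fractions dropped). */
    public static String asMinutesAndSeconds(double millis) {
        Duration d = Duration.ofMillis((long) millis);
        long minutes = d.toMinutes();
        long seconds = d.getSeconds() % 60;
        return minutes + " min " + seconds + " s";
    }

    /** Time attributed to the outer layer but not to the inner one, in milliseconds. */
    public static double overheadMs(double outerMs, double innerMs) {
        return outerMs - innerMs;
    }

    public static void main(String[] args) {
        System.out.println("service max = " + asMinutesAndSeconds(SERVICE_MAX_MS));
        System.out.printf("controller minus service = %.3f ms%n",
                overheadMs(CONTROLLER_MAX_MS, SERVICE_MAX_MS));
        System.out.printf("service minus DAO = %.3f ms%n",
                overheadMs(SERVICE_MAX_MS, DAO_MAX_MS));
    }
}
```

Read this way, nearly all of the roughly 15.5 minutes sits between entering the service method and entering the DAO method, which is where the transactional proxy does its work.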
@M. Deinum explained that the count for the getDao() method is 0 because Spring AOP proxies only intercept external calls.
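The effect can be reproduced without Spring. The following is a minimal sketch using a plain JDK dynamic proxy as a stand-in for Spring's AOP proxy; the interface, class, and method names are hypothetical and only mirror the shape of the service above. The proxy counts every call it sees, while the internal call to getDao() goes through `this` and never reaches it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class SelfInvocationSketch {

    /** Hypothetical stand-in for the service interface. */
    public interface GroupService {
        String findByCategoryGroupId(String id);
        String getDao();
    }

    /** Hypothetical stand-in for the service implementation. */
    public static class GroupServiceImpl implements GroupService {
        @Override
        public String findByCategoryGroupId(String id) {
            // This is this.getDao(): a self-invocation that bypasses the proxy.
            return getDao() + ":" + id;
        }

        @Override
        public String getDao() {
            return "dao";
        }
    }

    /** Calls the service once through a counting proxy and returns the per-method call counts. */
    public static Map<String, Integer> countCalls() {
        Map<String, Integer> counts = new HashMap<>();
        GroupService target = new GroupServiceImpl();
        InvocationHandler handler = (proxy, method, args) -> {
            counts.merge(method.getName(), 1, Integer::sum);
            return method.invoke(target, args);
        };
        GroupService proxied = (GroupService) Proxy.newProxyInstance(
                GroupService.class.getClassLoader(),
                new Class<?>[] {GroupService.class},
                handler);
        proxied.findByCategoryGroupId("42");
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countCalls());
    }
}
```

Only findByCategoryGroupId shows up in the counts; getDao is absent, which matches the zero count in the metrics.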
Controller layer:
@Controller
public class CategoryGroupController extends GenericController<CategoryGroup> {

    public static final String CATEGORY = "Awesome Groups";

    @Value("${spring.datasource.office}")
    private String dsOfficeId = null;

    public CategoryGroupController() {
        super(CategoryGroup.class);
    }

    @Autowired
    private CategoryGroupService _service;

    @Override
    protected CategoryGroupService getService() {
        return _service;
    }

    @Timed
    @RequestMapping(value = "/json/groups/")
    public @ResponseBody FoundGroups listJsonGroups() {
        List<CategoryGroup> found = listJsonGroups(dsOfficeId, CATEGORY);
        found = applyOverrides(found);
        List<FoundGroup> jsonGroups = buildGroup(found);
        return new FoundGroups(jsonGroups);
    }

    public List<CategoryGroup> listJsonGroups(String dbOfficeId, String categoryId) {
        CategoryGroupService service = getService();
        CategoryGroupId template = new CategoryGroupId(dbOfficeId, categoryId, null);
        List<CategoryGroup> found = service.findByCategoryGroupId(template);
        return found;
    }

    // other methods...
}
Service layer:
@Service
public class CategoryGroupServiceImpl extends GenericService<CategoryGroup> implements CategoryGroupService {

    @Autowired
    private ICategoryGroupDao _dao;

    public CategoryGroupServiceImpl() {
        super(CategoryGroup.class);
    }

    @Override
    @Timed
    protected ICategoryGroupDao getDao() {
        return _dao;
    }

    @Override
    @Timed
    @Transactional(readOnly = true)
    public List<CategoryGroup> findByCategoryGroupId(CategoryGroupId idCategoryGroup) {
        return getDao().findByCategoryGroupId(idCategoryGroup);
    }

    // other methods ...
}
DAO layer:
@Repository
public class CategoryGroupDao extends GenericDao<CategoryGroup> implements ICategoryGroupDao {

    public CategoryGroupDao() {
        super(CategoryGroup.class);
    }

    @Timed
    @Override
    public List<CategoryGroup> findByCategoryGroupId(CategoryGroupId idCategoryGroup) {
        List<CategoryGroup> retval = null;
        SessionFactory sessionFactory = getSessionFactory();
        Session currentSession = sessionFactory.getCurrentSession();
        Criteria crit = currentSession.createCriteria(CategoryGroup.class).setReadOnly(true);
        if (idCategoryGroup != null) {
            addEqRestriction(idCategoryGroup.getDbOfficeId(), "id.dbOfficeId", crit);
            addEqRestriction(idCategoryGroup.getCategoryId(), "id.categoryId", crit);
            addEqRestriction(idCategoryGroup.getGroupId(), "id.groupId", crit);
        }
        retval = crit.list();
        return retval;
    }

    // other methods...
}
Spring config:
<bean id="dataSource" class="org.apache.commons.dbcp2.BasicDataSource">
    <property name="driverClassName" value="${jdbc.driverClassName}" />
    <property name="url" value="${jdbc.databaseurl}" />
    <property name="username" value="${jdbc.username}" />
    <property name="password" value="${jdbc.password}" />
    <property name="accessToUnderlyingConnectionAllowed" value="true" />
</bean>

<tx:annotation-driven />

<bean id="transactionManager" class="org.springframework.orm.hibernate4.HibernateTransactionManager">
    <property name="sessionFactory" ref="sessionFactory" />
</bean>

<bean id="sessionFactory" class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="packagesToScan" value="example.db.entity" />
    <property name="hibernateProperties">
        <props>
            <prop key="hibernate.dialect">${jdbc.dialect}</prop>
            <prop key="hibernate.show_sql">false</prop>
            <prop key="hibernate.generate_statistics">false</prop>
        </props>
    </property>
</bean>
JDBC:
jdbc.driverClassName=oracle.jdbc.OracleDriver
jdbc.dialect=org.hibernate.dialect.Oracle10gDialect
jdbc.username=auser
jdbc.databaseurl=jdbc:oracle:thin:@192.168.0.23:1521:MYDB01
jdbc.password=apassword
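New details:
Following M. Deinum's advice, I changed my connection pool to use HikariCP. I increased the maximum pool size to 30 connections.
I also created the following InstrumentedTransactionManager class to add timers to the TransactionManager used by Spring's @Transactional: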
@Component
public class InstrumentedTransactionManager extends org.springframework.orm.hibernate4.HibernateTransactionManager {

    private static final Logger logger = Logger.getLogger(InstrumentedTransactionManager.class.getName());

    MetricRegistry metricRegistry;
    SessionFactory sessionFactory;
    private Map<String, Timer> timers = new HashMap<>();

    public InstrumentedTransactionManager() {
    }

    public InstrumentedTransactionManager(SessionFactory sessionFactory) {
        super(sessionFactory);
        this.sessionFactory = sessionFactory;
    }

    @Autowired
    public void setSessionFactory(SessionFactory sessionFactory) {
        super.setSessionFactory(sessionFactory);
        this.sessionFactory = sessionFactory;
    }

    @Autowired
    public void setMetricRegistry(MetricRegistry metricRegistry) {
        this.metricRegistry = metricRegistry;
        updateTimers();
    }

    private static String[] methodNames = {"doBegin", "doCommit", "doResume", "doRollback", "getDataSource", "doGetTransaction", "doSuspend"};

    private void updateTimers() {
        timers.clear();
        if (metricRegistry != null) {
            for (String methodName : methodNames) {
                Timer aTimer = metricRegistry.timer(metricRegistry.name(InstrumentedTransactionManager.class.getName(), methodName));
                timers.put(methodName, aTimer);
            }
        }
    }

    public Timer.Context getTimerContext(String name) {
        Timer.Context context = null;
        Timer aTimer = timers.get(name);
        if (aTimer != null) {
            context = aTimer.time();
        }
        return context;
    }

    @Override
    protected void doBegin(Object transaction, TransactionDefinition definition) {
        Timer.Context context = getTimerContext("doBegin");
        try {
            super.doBegin(transaction, definition);
        } finally {
            if (context != null) {
                context.stop();
            }
        }
    }

    @Override
    protected void doCommit(DefaultTransactionStatus status) {
        Timer.Context context = getTimerContext("doCommit");
        try {
            super.doCommit(status);
        } finally {
            if (context != null) {
                context.stop();
            }
        }
    }

    @Override
    protected void doResume(Object transaction, Object suspendedResources) {
        Timer.Context context = getTimerContext("doResume");
        try {
            super.doResume(transaction, suspendedResources);
        } finally {
            if (context != null) {
                context.stop();
            }
        }
    }

    @Override
    protected void doRollback(DefaultTransactionStatus status) {
        Timer.Context context = getTimerContext("doRollback");
        try {
            super.doRollback(status);
        } finally {
            if (context != null) {
                context.stop();
            }
        }
    }

    @Override
    public DataSource getDataSource() {
        Timer.Context context = getTimerContext("getDataSource");
        try {
            return super.getDataSource();
        } finally {
            if (context != null) {
                context.stop();
            }
        }
    }

    @Override
    protected Object doSuspend(Object transaction) {
        Timer.Context context = getTimerContext("doSuspend");
        try {
            return super.doSuspend(transaction);
        } finally {
            if (context != null) {
                context.stop();
            }
        }
    }

    @Override
    protected Object doGetTransaction() {
        Timer.Context context = getTimerContext("doGetTransaction");
        try {
            return super.doGetTransaction();
        } finally {
            if (context != null) {
                context.stop();
            }
        }
    }
}
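Every override above repeats the same start/try/finally/stop shape. As a sketch of that pattern in isolation, the helper below wraps any call and records its count and elapsed time even when the call throws. It uses only the JDK, so the maps of LongAdder counters are an assumed stand-in for the Metrics Timer, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

public class TimedCallSketch {

    // Per-name call counts and accumulated elapsed time (stand-ins for a Metrics Timer).
    private final Map<String, LongAdder> callCounts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    /** Runs body and records one call plus its elapsed time under name, even if body throws. */
    public <T> T timed(String name, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            callCounts.computeIfAbsent(name, k -> new LongAdder()).increment();
            totalNanos.computeIfAbsent(name, k -> new LongAdder()).add(elapsed);
        }
    }

    /** Number of recorded calls for name, or 0 if none. */
    public long count(String name) {
        LongAdder adder = callCounts.get(name);
        return adder == null ? 0L : adder.sum();
    }

    /** Total recorded nanoseconds for name, or 0 if none. */
    public long totalNanos(String name) {
        LongAdder adder = totalNanos.get(name);
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        TimedCallSketch timers = new TimedCallSketch();
        String result = timers.timed("doBegin", () -> "begun");
        System.out.println(result + ", calls=" + timers.count("doBegin")
                + ", nanos=" + timers.totalNanos("doBegin"));
    }
}
```

Because the recording happens in the finally block, a call that fails with an exception is still counted, which matters when the slow path is also the failing path.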
Finally, I added metrics to the Executor used by Spring's @Async annotation, to try to identify thread pool problems.
<task:annotation-driven executor="myTaskExecutor" />

<bean id="myTaskExecutor" class="org.springframework.scheduling.concurrent.ConcurrentTaskExecutor">
    <constructor-arg type="java.util.concurrent.Executor" ref="meteredExecutor" />
</bean>

<bean id="meteredExecutor" class="com.codahale.metrics.InstrumentedExecutorService">
    <constructor-arg type="java.util.concurrent.ExecutorService" ref="es" />
    <constructor-arg type="com.codahale.metrics.MetricRegistry" ref="metricRegistry" />
</bean>

<bean id="es" class="java.util.concurrent.ThreadPoolExecutor">
    <constructor-arg type="int" value="0" />
    <constructor-arg type="int" value="7" />
    <constructor-arg type="long" value="60" />
    <constructor-arg type="java.util.concurrent.TimeUnit" value="SECONDS" />
    <constructor-arg type="java.util.concurrent.BlockingQueue" ref="bq" />
    <constructor-arg type="java.util.concurrent.ThreadFactory" ref="instrumentedThreadFactory" />
</bean>

<bean id="bq" class="java.util.concurrent.LinkedBlockingQueue" />

<bean id="instrumentedThreadFactory" class="com.codahale.metrics.InstrumentedThreadFactory">
    <constructor-arg type="java.util.concurrent.ThreadFactory" ref="threadFactory" />
    <constructor-arg type="com.codahale.metrics.MetricRegistry" ref="metricRegistry" />
</bean>

<bean id="threadFactory" class="java.util.concurrent.Executors" factory-method="defaultThreadFactory" />
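For reference, here is a sketch of the same executor settings in plain Java, without the Metrics wrappers (the class and method names are hypothetical). It also probes one property worth being aware of: according to the ThreadPoolExecutor documentation, an unbounded queue means threads beyond corePoolSize are never created, so with a core size of 0 and a LinkedBlockingQueue the maximum of 7 may never be reached.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExecutorSizingSketch {

    /** Blocks until the latch opens, restoring the interrupt flag if interrupted. */
    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Submits the given number of blocking tasks to an executor built with the
     * same arguments as the "es" bean (core 0, max 7, 60 s keep-alive, unbounded
     * queue) and reports the largest number of threads the pool ever held.
     */
    public static int largestPoolSize(int tasks) {
        ThreadPoolExecutor es = new ThreadPoolExecutor(
                0, 7, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(),
                Executors.defaultThreadFactory());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                es.execute(() -> awaitQuietly(release));
            }
            return es.getLargestPoolSize();
        } finally {
            release.countDown(); // let the queued tasks finish
            es.shutdown();       // worker threads exit once the queue drains
        }
    }

    public static void main(String[] args) {
        System.out.println("largest pool size with 5 blocked tasks: " + largestPoolSize(5));
    }
}
```

If @Async work is expected to run in parallel, a non-zero core size or a bounded queue may be closer to the intent; that is a judgment about the application rather than something this sketch can decide.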
Since making the changes above I have not observed the slowness in production, but absence of evidence is not evidence of absence...