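I'm refactoring some very inefficient code, but I'm still seeing huge load on both our MySQL and Java servers. We have an endpoint that lets users upload a CSV file of contacts, each with a first name, last name, phone number, and email address. The phone number and email address must be unique per location. Phone numbers, however, are stored in a separate table from contacts, because a contact can have more than one. The CSV allows only one phone number per contact, but users can update a contact manually to add more. Our users may upload files with as many as 50,000 records.
Here is my relevant SQL structure: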
Contact Table
+----+-----------+----------+------------------+------------+
| id | firstName | lastName | email            | locationId |
+----+-----------+----------+------------------+------------+
| 1  | John      | Doe      | jdoe@noemail.com | 1          |
+----+-----------+----------+------------------+------------+
Contact Phone Table
+----+-----------+--------------+---------+
| id | contactId | number       | primary |
+----+-----------+--------------+---------+
| 1  | 1         | +15555555555 | 1       |
| 2  | 1         | +11231231234 | 0       |
+----+-----------+--------------+---------+
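There are unique constraints on email & locationId in the Contact table and on contactId & number in the Contact Phone table.
The original programmer simply wrote a loop in Java that iterates over the CSV, queries for the phone number and the email (two separate queries per row), and inserts one row at a time if there is no match. It's horrible and just kills our server.
This is my latest attempt:
Stored procedure: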
DELIMITER $$
CREATE PROCEDURE save_bulk_contact(IN last_name VARCHAR(128), IN first_name VARCHAR(128), IN email VARCHAR(320), IN location_id BIGINT, IN phone_number VARCHAR(15))
BEGIN
    DECLARE insert_id BIGINT;
    -- Insert the contact row (five columns, five values).
    INSERT INTO contact
        (`lastName`, `firstName`, `primaryEmail`, `locationId`, `firstActiveDate`)
    VALUES (last_name, first_name, email, location_id, UNIX_TIMESTAMP() * 1000);
    -- Reuse the generated contact id for the phone row.
    SET insert_id = LAST_INSERT_ID();
    INSERT INTO contact_phone
        (`contactId`, `number`, `type`, `primary`)
    VALUES (insert_id, phone_number, 'CELL', 1);
END$$
DELIMITER ;
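Then, in Java, I query all existing contacts for the location along with their phone numbers, loop through them, remove the duplicates from the upload, and insert the rest using a batch update.
Service layer: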
private ContactUploadJSON uploadContacts(ContactUploadJSON contactUploadJSON) throws HandledDataAccessException {
    List<ContactUploadData> returnList = new ArrayList<>();
    if (contactUploadJSON.getContacts() != null) {
        List<Contact> existingContacts = contactRepository.getContactsByLocationId(contactUploadJSON.getLocationId());
        List<ContactUploadData> uploadedContacts = contactUploadJSON.getContacts();
        Iterator<ContactUploadData> uploadedContactsIterator = uploadedContacts.iterator();
        while (uploadedContactsIterator.hasNext()) {
            ContactUploadData current = uploadedContactsIterator.next();
            boolean anyMatch = existingContacts.stream().anyMatch(existingContact -> {
                try {
                    boolean contactFound = contactEqualsContactUploadData(existingContact, current);
                    if (contactFound) {
                        contactUploadJSON.incrementExisted();
                        current.setError("Duplicate Contact: " + StringUtils.joinWith(" ", existingContact.getFirstName(), existingContact.getLastName()));
                        returnList.add(current);
                    }
                    return contactFound;
                } catch (PhoneParsingException | PhoneNotValidException e) {
                    contactUploadJSON.incrementFailed();
                    current.setError("Failed with error: " + e.getMessage());
                    returnList.add(current);
                    return true;
                }
            });
            if (anyMatch) {
                uploadedContactsIterator.remove();
            }
        }
        contactUploadJSON.setCreated(uploadedContacts.size());
        if (!uploadedContacts.isEmpty()) {
            contactRepository.insertBulkContacts(uploadedContacts, contactUploadJSON.getLocationId());
        }
    }
    contactUploadJSON.setContacts(returnList);
    return contactUploadJSON;
}
private static boolean contactEqualsContactUploadData(Contact contact, ContactUploadData contactUploadData) throws PhoneParsingException, PhoneNotValidException {
    if (contact == null || contactUploadData == null) {
        return false;
    }
    String normalizedPhone = PhoneUtils.validatePhoneNumber(contactUploadData.getMobilePhone());
    List<ContactPhone> contactPhones = contact.getPhoneNumbers();
    if (contactPhones != null && contactPhones.stream().anyMatch(contactPhone -> StringUtils.equals(contactPhone.getNumber(), normalizedPhone))) {
        return true;
    }
    return (StringUtils.isNotBlank(contactUploadData.getEmail()) &&
            StringUtils.equals(contact.getPrimaryEmail(), contactUploadData.getEmail())) ||
            (contact.getPrimaryPhoneNumber() != null &&
            StringUtils.equals(contact.getPrimaryPhoneNumber().getNumber(), normalizedPhone));
}
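The duplicate check above scans the full list of existing contacts for every uploaded row, so a 50,000-row file checked against a large location does a lot of comparisons. As a point of comparison, here is a minimal sketch of the same email-or-phone check done with hash-set lookups. `UploadRow` and `DuplicateFilter` are hypothetical stand-ins, not classes from the code above, and the sketch assumes phone numbers have already been normalized. Unlike the service code, it also flags rows that repeat within the same file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for ContactUploadData: only the fields the check needs.
final class UploadRow {
    final String email;
    final String phone; // assumed to be normalized already

    UploadRow(String email, String phone) {
        this.email = email;
        this.phone = phone;
    }
}

// Hypothetical helper: remembers every email and phone number seen for one location.
final class DuplicateFilter {
    private final Set<String> knownEmails = new HashSet<>();
    private final Set<String> knownPhones = new HashSet<>();

    // Seed the sets with the values already stored for the location.
    DuplicateFilter(Iterable<String> existingEmails, Iterable<String> existingPhones) {
        for (String email : existingEmails) {
            if (hasText(email)) knownEmails.add(email);
        }
        for (String phone : existingPhones) {
            if (hasText(phone)) knownPhones.add(phone);
        }
    }

    private static boolean hasText(String s) {
        return s != null && !s.trim().isEmpty();
    }

    // Returns the rows that are safe to insert; rejected rows are appended to duplicates.
    List<UploadRow> keepNew(List<UploadRow> rows, List<UploadRow> duplicates) {
        List<UploadRow> fresh = new ArrayList<>();
        for (UploadRow row : rows) {
            boolean emailTaken = hasText(row.email) && knownEmails.contains(row.email);
            boolean phoneTaken = hasText(row.phone) && knownPhones.contains(row.phone);
            if (emailTaken || phoneTaken) {
                duplicates.add(row);
                continue;
            }
            // Register the accepted row so later rows in the same file are checked against it.
            if (hasText(row.email)) knownEmails.add(row.email);
            if (hasText(row.phone)) knownPhones.add(row.phone);
            fresh.add(row);
        }
        return fresh;
    }

    public static void main(String[] args) {
        DuplicateFilter filter = new DuplicateFilter(
                Arrays.asList("jdoe@noemail.com"),
                Arrays.asList("+15555555555", "+11231231234"));
        List<UploadRow> duplicates = new ArrayList<>();
        List<UploadRow> fresh = filter.keepNew(Arrays.asList(
                new UploadRow("jdoe@noemail.com", "+19990000001"),   // email already stored
                new UploadRow("new@noemail.com", "+19990000002")),   // nothing in common
                duplicates);
        System.out.println(fresh.size() + " new, " + duplicates.size() + " duplicate");
    }
}
```

Each lookup is a constant-time set membership test on average, so the work grows with the number of uploaded rows plus the number of existing values rather than with their product.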
Repository code:
public void insertBulkContacts(List<ContactUploadData> contacts, long locationId) throws HandledDataAccessException {
    String sql = "CALL save_bulk_contact(:last_name, :first_name, :email, :location_id, :phone_number)";
    try {
        List<Map<String, Object>> contactsList = new ArrayList<>();
        contacts.forEach(contact -> {
            Map<String, Object> contactMap = new HashMap<>();
            contactMap.put("last_name", contact.getLastName());
            contactMap.put("first_name", contact.getFirstName());
            contactMap.put("email", contact.getEmail());
            contactMap.put("location_id", locationId);
            contactMap.put("phone_number", contact.getMobilePhone());
            contactsList.add(contactMap);
        });
        Map<String, Object>[] paramList = contactsList.toArray(new Map[0]);
        namedJdbcTemplate.batchUpdate(sql, paramList);
    } catch (DataAccessException e) {
        log.severe("Failed to insert contacts:\n" + ExceptionUtils.getStackTrace(e));
        throw new HandledDataAccessException("Failed to insert contacts");
    }
}
The returned ContactUploadJSON contains the list of contacts, the locationId, and counts of contacts created, already existing, and failed.
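To make those counters concrete, here is a minimal sketch. `UploadCounters` is a hypothetical class, not the real `ContactUploadJSON`; only the method names `incrementExisted`, `incrementFailed`, and `setCreated` are taken from the service code above, and everything else is assumed for illustration.

```java
// Hypothetical sketch of the counters carried by the upload response.
final class UploadCounters {
    private final long locationId;
    private int created;
    private int existed;
    private int failed;

    UploadCounters(long locationId) {
        this.locationId = locationId;
    }

    long getLocationId() { return locationId; }

    // Called once for each uploaded row that matches an existing contact.
    void incrementExisted() { existed++; }

    // Called once for each uploaded row whose phone number could not be parsed or validated.
    void incrementFailed() { failed++; }

    // Set to the number of rows left in the upload list after filtering.
    void setCreated(int created) { this.created = created; }

    int getCreated() { return created; }
    int getExisted() { return existed; }
    int getFailed() { return failed; }

    // Sum of the three buckets.
    int total() { return created + existed + failed; }
}
```

In the service layer each uploaded row is either kept for insertion or removed as existing or failed, so `total()` should equal the number of rows in the CSV.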
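This solution works, but I'm wondering whether there is a better way. In the future we will need a mechanism for updating contacts, not just inserting new ones, so I have to plan accordingly. Is it possible to do all of this in MySQL? Would that be more efficient? I think the one-to-many relationship with composite unique constraints makes it more difficult.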