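I am trying to read rows from an Apache Phoenix table that holds a file name along with two columns that record when I start and finish processing the file. I am seeing inconsistent behavior when I execute the UPSERT that sets the start and finish timestamps. Sometimes the UPSERT works as expected; other times it does not appear to commit, yet it raises no error that I can see.
I tried to track the problem down by wrapping most of the calls in try/catch blocks, but that did not help.
I tried both Phoenix UPSERT and UPSERT SELECT, on the theory that some file names contain odd characters and there might be a mismatch between the filename field and what is finally sent to the database, but that made no difference either.
I have tried several ways of turning autocommit on and off. Separately, I set autocommit to true/false in the JDBC connection string, set it explicitly on the JDBC Connection object when creating it, and finally set it as part of the Phoenix Config object. In every case where autocommit was true, I did not commit the UPSERT explicitly; when autocommit was off, I called commit explicitly.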
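For reference, the explicit-commit variant can be sketched roughly as follows. This is a minimal illustration rather than my exact code: the class name `MarkStarted`, the helper `upsertStartedSql`, and the `jdbcUrl` parameter are hypothetical, and it assumes the Phoenix JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class MarkStarted {
    // Hypothetical helper: builds the UPSERT SELECT that copies the primary-key
    // columns of the matching row and stamps STARTED with now().
    static String upsertStartedSql() {
        String keys = "MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS";
        return "UPSERT INTO DEFAULT.FILE_INDEX(" + keys + ",STARTED) "
             + "SELECT " + keys + ",now() FROM DEFAULT.FILE_INDEX WHERE FILENAME = ?";
    }

    // Runs the UPSERT with autocommit off and commits explicitly.
    // jdbcUrl is an assumption supplied by the caller (a Phoenix JDBC URL).
    static int markStarted(String jdbcUrl, String filename) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(upsertStartedSql())) {
                ps.setString(1, filename);
                int rows = ps.executeUpdate(); // row count reported by the driver
                conn.commit();                 // explicit commit, since autocommit is off
                return rows;
            }
        }
    }
}
```

With autocommit set to true, the same sketch applies without the `conn.commit()` call. In both variants the reported row count is 1 even when the change does not show up in the table.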
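The only thing I have observed that helps is that the UPSERT seems to work if I wait a long time. During testing, however, I then manually reset the started/finished columns to NULL and retried to confirm the fix, and it worked only once; every later attempt appeared to fail.
With all that said, I can reproduce the problem consistently by doing the following:
The first UPSERT works (it sets the timestamp to now()), the second UPSERT works (it sets the timestamp field to NULL), and the third UPSERT (it sets the timestamp to the current timestamp, which is necessarily different from the first one) throws no error at all but is not reflected in the database table.
Here is the DDL for the table I am using: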
CREATE TABLE DEFAULT.FILE_INDEX(
MTIME TIMESTAMP NOT NULL,
FILENAME VARCHAR NOT NULL,
TYPE VARCHAR NOT NULL,
SUBTYPE VARCHAR NOT NULL,
SENSOR VARCHAR NOT NULL,
SIZE BIGINT NOT NULL,
OWNER VARCHAR NOT NULL,
GROUP_OWNER VARCHAR NOT NULL,
PERMISSIONS VARCHAR NOT NULL,
STARTED TIMESTAMP,
PROCESSED TIMESTAMP,
EVENT_COUNT BIGINT
CONSTRAINT PK PRIMARY KEY(MTIME ROW_TIMESTAMP,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS)) COMPRESSION='SNAPPY',DEFAULT_COLUMN_FAMILY='F';
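Since I can reproduce the problem in a SQL terminal (phoenix-sqlline), and to rule out potential red herrings, below is a phoenix-sqlline snippet that shows the problem. Note that the UPSERT that appears to have no effect still reports 1 row affected, which tells me the UPSERT executed successfully: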
0: jdbc:phoenix:master-1.> UPSERT INTO DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED) SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,now() AS STARTED FROM DEFAULT.FILE_INDEX WHERE FILENAME='hdfs://filename.log';
1 row affected (5.041 seconds)
0: jdbc:phoenix:master-1.> select * from default.file_index where started is not null;
+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+
| MTIME | FILENAME | TYPE | SUBTYPE | SENSOR | SIZE | OWNER | GROUP_OWNER | PERM |
+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+
| 2018-11-01 00:00:00.000 | hdfs://filename.log | BRO | DNS | something | 224500 | somebody | hdfs | rw-r |
+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+
1 row selected (4.046 seconds)
0: jdbc:phoenix:master-1.> UPSERT INTO DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED) SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,NULL AS STARTED FROM DEFAULT.FILE_INDEX WHERE FILENAME='hdfs://filename.log';
1 row affected (4.541 seconds)
0: jdbc:phoenix:master-1.> select * from default.file_index where started is not null;
+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+
| MTIME | FILENAME | TYPE | SUBTYPE | SENSOR | SIZE | OWNER | GROUP_OWNER | PERMISSIONS | STARTED | PROCESSED | EVENT_COUNT |
+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+
+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+
No rows selected (4.782 seconds)
0: jdbc:phoenix:master-1.> UPSERT INTO DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED) SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,now() AS STARTED FROM DEFAULT.FILE_INDEX WHERE FILENAME='hdfs://filename.log';
1 row affected (5.254 seconds)
0: jdbc:phoenix:master-1.> select * from default.file_index where started is not null;
+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+
| MTIME | FILENAME | TYPE | SUBTYPE | SENSOR | SIZE | OWNER | GROUP_OWNER | PERMISSIONS | STARTED | PROCESSED | EVENT_COUNT |
+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+
+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+
No rows selected (4.389 seconds)