MySQL: count rows where two columns match and are not NULL, in the same table

Time: 2015-09-09 07:13:05

Tags: mysql sql join count null

I have a table with the following data:

mysql> describe Post;
+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| user_id     | int(11)      | NO   | MUL | NULL    |                |
| post_date   | datetime     | NO   |     | NULL    |                |
| in_reply_to | int(11)      | YES  |     | NULL    |                |
| text        | varchar(160) | NO   |     | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+

mysql> select id as "Row ID", user_id as "User ID", post_date as "Post Date", IF(in_reply_to is NULL, "None", in_reply_to) as "In Reply To Post ID:", CONCAT(LEFT(text,40),"...") as "Post Text" from Post;
+--------+---------+---------------------+----------------------+---------------------------------------------+
| Row ID | User ID | Post Date           | In Reply To Post ID: | Post Text                                   |
+--------+---------+---------------------+----------------------+---------------------------------------------+
|      1 |       1 | 2015-08-14 20:38:00 | None                 | This is the original test post that I pu... |
|      2 |       2 | 2015-08-14 20:39:00 | None                 | This is the second post that I put into ... |
|      3 |       5 | 2015-08-14 22:00:00 | 1                    | Hahaha, that post was hilarious. I canno... |
|      4 |       4 | 2015-08-14 23:00:00 | 1                    | Today I saw a cat jump off the roof, ont... |
|      5 |       4 | 2015-08-14 23:00:00 | None                 | Today I saw a cat jump off the roof, ont... |
|     27 |       1 | 2015-09-08 05:53:40 | 2                    | This is a mad reply ay...                   |
|     28 |       1 | 2015-09-08 11:24:05 | None                 | Yolo Swag...                                |
+--------+---------+---------------------+----------------------+---------------------------------------------+
7 rows in set (0.05 sec)

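Each column is labeled in case you are unsure what it represents. The two columns that matter for this question are id and in_reply_to.

in_reply_to is a nullable FK integer that references id in the same table. If in_reply_to is NULL, the post is an original post; if it holds an integer value, the post is a reply, and that value is the ID of the post being replied to.

In the example data above, there are 4 original posts (1, 2, 5, 28) and 3 replies (3, 4, 27): 3 is a reply to 1, 4 is also a reply to 1, and 27 is a reply to 2. I would like to run a SQL query that produces output like this: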

intended output

The Num Replies column is the COUNT of rows in the same table whose in_reply_to equals that row's id; if a post has no replies, it shows 0 (that is, no row contains that post's ID in its in_reply_to column).

Thanks.

Solution (based on Anders' answer):

mysql> SELECT Post.id, Post.user_id, Post.post_date, Post.in_reply_to, CONCAT(LEFT(Post.text,40)), IF(counts.count IS NULL, 0, counts.count) AS 'Num of Replies' FROM Post LEFT JOIN (SELECT in_reply_to AS id, COUNT(*) AS count FROM Post WHERE in_reply_to IS NOT NULL GROUP BY in_reply_to) AS counts ON Post.id = counts.id;
+----+---------+---------------------+-------------+------------------------------------------+----------------+
| id | user_id | post_date           | in_reply_to | CONCAT(LEFT(Post.text,40))               | Num of Replies |
+----+---------+---------------------+-------------+------------------------------------------+----------------+
|  1 |       1 | 2015-08-14 20:38:00 |        NULL | This is the original test post that I pu |              2 |
|  2 |       2 | 2015-08-14 20:39:00 |        NULL | This is the second post that I put into  |              1 |
|  3 |       5 | 2015-08-14 22:00:00 |           1 | Hahaha, that post was hilarious. I canno |              0 |
|  4 |       4 | 2015-08-14 23:00:00 |           1 | Today I saw a cat jump off the roof, ont |              0 |
|  5 |       4 | 2015-08-14 23:00:00 |        NULL | Today I saw a cat jump off the roof, ont |              0 |
| 27 |       1 | 2015-09-08 05:53:40 |           2 | This is a mad reply ay                   |              0 |
| 28 |       1 | 2015-09-08 11:24:05 |        NULL | Random Text                              |              0 |
+----+---------+---------------------+-------------+------------------------------------------+----------------+
7 rows in set (0.00 sec)

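2 answers:

Answer 0 (score: 1)

You need to join two queries on the same table. The first simply selects all posts; the second counts the number of replies for each post. It is a left join because you want to include posts that have no replies (the second query returns no row for them). IF is used to convert the resulting NULL values to 0.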

SELECT
  post.id,
  -- Other fields...,
  IF(counts.count IS NULL, 0, counts.count) AS count
FROM post
LEFT JOIN 
  (SELECT
     in_reply_to AS id,
     COUNT(*) AS count
   FROM post
   WHERE in_reply_to IS NOT NULL
   GROUP BY in_reply_to) AS counts
 ON post.id = counts.id

Disclaimer: I have not tested this.
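Since the answer is untested, here is a minimal sketch that runs the same left-join-against-a-grouped-subquery idea on the question's sample rows. It makes two assumptions that are not in the original: an in-memory SQLite database (via Python's sqlite3 module) stands in for MySQL, and COALESCE replaces IF(... IS NULL, 0, ...), since COALESCE is available in both engines. Only the columns needed for the count are created.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Post ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " in_reply_to INTEGER NULL)"
)

# (id, user_id, in_reply_to) copied from the question's sample data.
sample_rows = [
    (1, 1, None), (2, 2, None), (3, 5, 1), (4, 4, 1),
    (5, 4, None), (27, 1, 2), (28, 1, None),
]
conn.executemany("INSERT INTO Post VALUES (?, ?, ?)", sample_rows)

# The derived table holds one row per post that has replies; the LEFT JOIN
# keeps every post, and COALESCE turns the missing counts into 0.
query = """
SELECT Post.id, COALESCE(counts.num, 0) AS num_replies
FROM Post
LEFT JOIN (
    SELECT in_reply_to AS id, COUNT(*) AS num
    FROM Post
    WHERE in_reply_to IS NOT NULL
    GROUP BY in_reply_to
) AS counts ON Post.id = counts.id
ORDER BY Post.id
"""
reply_counts = conn.execute(query).fetchall()
for post_id, num_replies in reply_counts:
    print(post_id, num_replies)
```

On the sample data this reports 2 replies for post 1, 1 reply for post 2, and 0 for every other post, which matches the counts described in the question.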

Answer 1 (score: 0)

You can do the join the traditional way, or compute the count directly in a new column with a subquery.

Example:

select a.id, 
(select count(*) from (select 1 as id union all select 1 union all select 2)b where b.id=a.id) as count_of_replies

 from 
    (select 1 as id union all select 1  union all select 2)a

Note that the two subquery "tables" are the same table.
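Applied to the Post table from the question, the subquery-in-a-column approach might look like the sketch below. As an assumption for the sake of a runnable example, it uses an in-memory SQLite database through Python's sqlite3 module rather than MySQL, and the aliases p and r are illustrative names for the two references to the same table.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Post ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " in_reply_to INTEGER NULL)"
)

# (id, user_id, in_reply_to) copied from the question's sample data.
conn.executemany(
    "INSERT INTO Post VALUES (?, ?, ?)",
    [(1, 1, None), (2, 2, None), (3, 5, 1), (4, 4, 1),
     (5, 4, None), (27, 1, 2), (28, 1, None)],
)

# For each post p, the correlated subquery counts the rows r of the same
# table that reply to it. COUNT(*) over zero rows is 0, so no NULL handling
# is needed here.
query = """
SELECT p.id,
       (SELECT COUNT(*) FROM Post AS r WHERE r.in_reply_to = p.id) AS num_replies
FROM Post AS p
ORDER BY p.id
"""
subquery_counts = conn.execute(query).fetchall()
for post_id, num_replies in subquery_counts:
    print(post_id, num_replies)
```

Compared with the left join, this form needs no IF or COALESCE, at the cost of evaluating the subquery once per post.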