I have a database table, commits, with the following columns:
id | author_name | author_email | author_date (timestamp) | total_lines
Sample content of the commits table:
1 | abc | abc@xyz.com | 2013-03-24 15:32:49 | 1234
2 | abc | abc@xyz.com | 2013-03-27 15:32:49 | 534
3 | abc | abc@xyz.com | 2014-05-24 15:32:49 | 2344
4 | abc | abc@xyz.com | 2014-05-28 15:32:49 | 7623
I would like to get a result like this:
id | name | week | commits
1 | abc | 1 | 2
2 | abc | 2 | 0
I searched online for similar solutions but could not find anything useful.
I tried a query of my own, but it did not give the correct result.
Answer 0 (score: 53)
If your data spans multiple years, you should take the year into account as well. One way is:
SELECT date_part('year', author_date::date) as year,
date_part('week', author_date::date) AS weekly,
COUNT(author_email)
FROM commits
GROUP BY year, weekly
ORDER BY year, weekly;
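One caveat worth checking for the query above: in PostgreSQL, date_part('week', ...) returns the ISO 8601 week number, while date_part('year', ...) returns the calendar year, so a date near New Year can pair a late ISO week with the following calendar year. A minimal sketch, using an illustrative date that does not come from the question, compares 'year' with 'isoyear':

```sql
-- 2016-01-01 is used purely as an illustration; it falls in ISO week 53 of 2015.
SELECT d,
       date_part('year', d)    AS calendar_year,  -- 2016
       date_part('isoyear', d) AS iso_year,       -- 2015
       date_part('week', d)    AS iso_week        -- 53
FROM (VALUES (DATE '2016-01-01')) AS t(d);
```

If that boundary matters for your data, grouping by date_part('isoyear', ...) instead of date_part('year', ...) should keep each ISO week under a single year.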
A more natural approach is to use date_trunc():
SELECT date_trunc('week', author_date::date) AS weekly,
COUNT(author_email)
FROM commits
GROUP BY weekly
ORDER BY weekly;
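The desired output in the question also lists the author, so the same idea can be extended with a second grouping column. The sketch below is self-contained: it inlines the four sample rows from the question as a CTE named commits (which shadows the real table for the demo), and the week_start and commit_count aliases are names chosen here for illustration.

```sql
-- Inline copy of the question's sample rows, so the query runs without the real table.
WITH commits (id, author_name, author_email, author_date, total_lines) AS (
    VALUES
        (1, 'abc', 'abc@xyz.com', TIMESTAMP '2013-03-24 15:32:49', 1234),
        (2, 'abc', 'abc@xyz.com', TIMESTAMP '2013-03-27 15:32:49', 534),
        (3, 'abc', 'abc@xyz.com', TIMESTAMP '2014-05-24 15:32:49', 2344),
        (4, 'abc', 'abc@xyz.com', TIMESTAMP '2014-05-28 15:32:49', 7623)
)
SELECT author_name,
       date_trunc('week', author_date)::date AS week_start,  -- Monday of that week
       COUNT(*) AS commit_count
FROM commits
GROUP BY author_name, week_start
ORDER BY author_name, week_start;
```

date_trunc('week', ...) truncates to the Monday that starts the ISO week, so each of the four sample rows lands in its own week here. Weeks with no commits do not appear at all with this approach.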
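Answer 1 (score: 8)
This question was asked a long time ago.
Anyway, in case anyone comes across it:
If you want to list all the intermediate weeks as well, including those with no commits/records, you can generate every week by passing a start_date and an end_date to the generate_series() function: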
SELECT t1.year_week week,
       t2.commit_count
FROM   (SELECT week,
               To_char(week, 'IYYY-IW') year_week
        FROM   generate_series('2020-02-01 06:06:51.25+00'::DATE,
                               '2020-04-05 12:12:33.25+00'::DATE,
                               '1 week'::interval) AS week) t1
       LEFT OUTER JOIN (SELECT To_char(author_date, 'IYYY-IW') year_week,
                               COUNT(author_email) commit_count
                        FROM   commits
                        GROUP  BY year_week) t2
                    ON t1.year_week = t2.year_week;
The output will be:
week | commit_count
----------+-------------
2020-05 | 2
2020-06 | NULL
2020-07 | 1
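The question asked for empty weeks to show 0 rather than NULL. One way to sketch that, assuming the Monday-based week boundaries of date_trunc('week', ...) are acceptable, is to join the generated weeks directly to the commits and let COUNT() over the nullable side of the join produce the zeros. The inline rows, the date range, and the week_of alias below are made up for the demo.

```sql
-- Hypothetical demo rows; replace the CTE with the real commits table.
WITH commits (author_email, author_date) AS (
    VALUES
        ('abc@xyz.com', TIMESTAMP '2020-02-03 10:00:00'),
        ('abc@xyz.com', TIMESTAMP '2020-02-05 11:00:00'),
        ('abc@xyz.com', TIMESTAMP '2020-02-18 09:30:00')
)
SELECT w.week_start::date    AS week_of,
       COUNT(c.author_email) AS commit_count  -- counts only matched rows, so empty weeks give 0
FROM generate_series(date_trunc('week', TIMESTAMP '2020-02-01'),
                     TIMESTAMP '2020-02-23',
                     INTERVAL '1 week') AS w(week_start)
LEFT JOIN commits c
       ON date_trunc('week', c.author_date) = w.week_start
GROUP BY w.week_start
ORDER BY w.week_start;
```

Starting the series at date_trunc('week', ...) of the start date keeps every generated value aligned to a Monday, which is what makes the equality join on the truncated commit date work.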