Question

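Looking for some advice here.

I know this can be done with AWStats or something similar, but that seems like overkill for what I want to do.

I have a directory in my webroot that contains thousands of XML files. They are all loaded through calls to a single swf file, using GET requests in the URL.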

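For example:

https://www.example.com/myswf.swf?url=https://www.example.com/xml/1234567.xml

The URLs are built dynamically, and there are thousands of them. They all point to the same swf file, but each pulls in a different XML file from the XML directory.

What I want to do is log, to a text file, the number of times each XML file is requested.

Since I know the target directory, is there a bash script or something similar that I can run to monitor the XML directory and log each hit with a timestamp?

Any suggestions?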

Answer 1

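A simpler, more direct approach:

sort requests.txt | uniq -c

I am assuming that all of your request URLs are in a file named requests.txt. The sort is needed because uniq -c only counts adjacent duplicate lines.

For more nicely formatted output:

awk -F/ '{print $8}' requests.txt | sort | uniq -c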
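The question also asks for each hit to be logged with a timestamp. One possible sketch is to pull that from Apache's own access log rather than watching the directory. This assumes the common or combined log format (timestamp in field 4, request path in field 7) and that the XML files are served from /xml/; the helper name xml_hits and the sample log lines are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the common/combined log layout, where field 4 is
# "[day/month/year:time" and field 7 is the requested path.
# xml_hits is a hypothetical helper name, not part of Apache.
xml_hits() {
    awk '{
        path = $7
        sub(/\?.*/, "", path)                   # drop any query string
        if (path ~ /^\/xml\/.*\.xml$/) {        # keep requests under /xml/ only
            n = split(path, parts, "/")
            print parts[n] " | " substr($4, 2)  # file name | timestamp
        }
    }' "$@"
}

# Demo on two made-up log lines; only the first one is an XML hit.
printf '%s\n' \
    '203.0.113.5 - - [03/Oct/2016:12:14:00 +0000] "GET /xml/1234567.xml HTTP/1.1" 200 512' \
    '203.0.113.5 - - [03/Oct/2016:12:14:01 +0000] "GET /myswf.swf HTTP/1.1" 200 2048' |
    xml_hits
```

On a real server this might be run as xml_hits /var/log/apache2/access.log >> xml_hits.log, where the log path is an assumption that depends on the distribution and the virtual host configuration.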

Answer 2

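This is an ugly way to do it, because it processes the text with a loop instead of elegant awk arrays, but it should work (slowly). It definitely needs optimizing.

I am assuming that all of your request URLs are in a file named requests.txt.

# Put all the unique file names in an index file
awk -F/ '{print $8}' requests.txt | sort -u > index

# Look through the file to count the number of occurrences of each item.
# -F matches the name as a fixed string, so the dot in ".xml" is not a wildcard.
while IFS= read -r i
do
    printf '%s | ' "$i"
    grep -c -F -w -- "$i" requests.txt
done < index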
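For comparison, the awk-array version that this answer alludes to might look roughly like the sketch below. It reads the file once and keeps the counts in an associative array. The helper name count_with_awk and the sample URLs are hypothetical, and it keeps the same assumption that the file name is field 8 when splitting on "/".

```shell
#!/usr/bin/env bash
# count_with_awk is a hypothetical helper name. It tallies field 8
# (the XML file name when splitting each URL on "/") in an awk
# associative array, then prints "name | count" sorted by name.
count_with_awk() {
    awk -F/ '$8 != "" { hits[$8]++ }
        END { for (name in hits) print name " | " hits[name] }' "$@" | sort
}

# Demo on made-up URLs in the same shape as the one in the question.
printf '%s\n' \
    'https://www.example.com/myswf.swf?url=https://www.example.com/xml/1234567.xml' \
    'https://www.example.com/myswf.swf?url=https://www.example.com/xml/7654321.xml' \
    'https://www.example.com/myswf.swf?url=https://www.example.com/xml/1234567.xml' |
    count_with_awk
```

With a file on disk, count_with_awk requests.txt would produce output in the same "name | count" shape as the loop, without a separate grep pass per file name.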

Log the number of times a file is requested on Apache2
