我整天都在敲打我的脑袋,仍然无法提出一些有效的答案。
我想在我的网站上通过cookie跟踪我的所有访问者,并使用mysql行中包含的唯一用户密钥保存用户访问的当前URL(或路径)。
经过一段时间和一些独特的访问者,我想创建一个用户通过我的网站跟踪的路径树。 (像谷歌分析这样的访问者流量报告)。
但我无法以某种方式弄清楚如何进行遍历这些行的查询并创建URL的“树”(具有计数或百分比)
如果有人可以帮助我,那将非常感激。
- 编辑 我已经有一个mysql数据库和表格
CREATE TABLE `journies` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`site_id` int(11) NOT NULL,
`profile_id` int(11) NOT NULL,
`url` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `profiles` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`site_id` int(11) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
- 更新2 哦,这可以更好地解释我想做什么。
我想用这样的网址创建一个树:
/- domain.com/news (100%)
/- domain.com/about (33%) --/
domain.com (100%) --/
\- domain.com/contact (33%)
\- domain.com/news (33%) --\
\- domain.com/news/id/1 (100%)
当然,这可以通过多个查询来完成(尽管如果可以用一个查询完成它会很棒;)
答案 0 :(得分:1)
好的,这是一个很糟糕的事情,但我想我终于有了一些适合你的东西。
前提是从数据库中提取URL和计数,按URL分组和排序。将这些结果存储到主数组中。然后,找到网址的基本域,我称之为主干。循环访问主阵列,找到与主干具有相同基域的任何项目,并找到它们的百分比。
另外需要注意的是,这些是特定级别的。因此cnn.com/employees
(2个级别)与cnn.com/news
处于同一级别。
以下是我用于此的SQL数据:
INSERT INTO `journies` (`id`, `site_id`, `profile_id`, `url`, `created_at`, `updated_at`) VALUES
(1, 1, 1, 'domain.com', '2014-02-19 15:34:54', '0000-00-00 00:00:00'),
(2, 1, 1, 'domain.com/about', '2014-02-19 15:35:57', '0000-00-00 00:00:00'),
(3, 1, 1, 'domain.com/contact', '2014-02-19 15:36:12', '0000-00-00 00:00:00'),
(4, 1, 1, 'domain.com/news', '2014-02-19 15:36:29', '0000-00-00 00:00:00'),
(5, 1, 1, 'domain.com/news/id/1', '2014-02-19 15:39:26', '0000-00-00 00:00:00'),
(6, 1, 1, 'domain.com/contact', '2014-02-19 15:50:26', '0000-00-00 00:00:00'),
(7, 1, 1, 'cnn.com/news/id/1', '2014-02-19 16:00:02', '0000-00-00 00:00:00'),
(8, 1, 1, 'cnn.com/news', '2014-02-19 16:00:15', '0000-00-00 00:00:00'),
(9, 1, 1, 'cnn.com', '2014-02-19 16:00:25', '0000-00-00 00:00:00'),
(10, 1, 1, 'cnn.com', '2014-02-19 16:46:16', '0000-00-00 00:00:00'),
(11, 1, 1, 'cnn.com', '2014-02-19 16:46:16', '0000-00-00 00:00:00'),
(12, 1, 1, 'domain.com/news/id/1', '2014-02-20 08:47:23', '0000-00-00 00:00:00'),
(13, 1, 1, 'domain.com/news/id/1', '2014-02-20 08:47:23', '0000-00-00 00:00:00'),
(14, 1, 1, 'domain.com/news/id/2', '2014-02-20 08:53:29', '0000-00-00 00:00:00'),
(15, 1, 1, 'domain.com/prices', '2014-02-20 12:40:44', '0000-00-00 00:00:00'),
(16, 1, 1, 'domain.com/prices', '2014-02-20 12:40:44', '0000-00-00 00:00:00'),
(17, 1, 1, 'cnn.com/employees/friekot', '2014-02-20 15:23:34', '0000-00-00 00:00:00'),
(18, 1, 1, 'cnn.com/employees', '2014-02-20 15:23:34', '0000-00-00 00:00:00');
以下是我提出的代码:
<?php
$link = mysqli_connect("localhost", "user", "pass", "database");
// SET THE DEFAULTS
$trunk_array = array();
$master_array = array();
// PULL OUT THE DATA FROM THE DATABASE
$q_get_tracking_info = "SELECT *, COUNT(url) AS url_count FROM journies WHERE site_id = 1 AND profile_id = 1 GROUP BY url ORDER BY url;";
$r_get_tracking_info = mysqli_query($link, $q_get_tracking_info) or trigger_error("Cannot Get Tracking Info: (".mysqli_error().")", E_USER_ERROR);
while ($row_get_tracking_info = mysqli_fetch_array($r_get_tracking_info)) {
$url = $row_get_tracking_info['url'];
$url_count = $row_get_tracking_info['url_count'];
// EXPLODE THE DOMAIN PARTS
$domain_parts = explode('/', $url);
// FIND THE TOTAL COUNTS FOR EACH LEVEL OF ARRAY
// - SO THAT WE CAN DIVIDE BY IT LATER TO GET THE PERCENTAGE
$count = count($domain_parts);
if (!isset($level_totals[$domain_parts[0]])) {
$level_totals[$domain_parts[0]] = array();
}
if (isset($level_totals[$domain_parts[0]][$count])) {
$level_totals[$domain_parts[0]][$count] += $url_count;
}
else {
$level_totals[$domain_parts[0]][$count] = $url_count;
}
// BUILD A TRUNK ARRAY SO WE CAN DEFINE SECTIONS
if ($url == $domain_parts[0]) {
$trunk_array[] = array($url, $url_count);
}
// BUILD A MASTER ARRAY OF THE ITEMS AS WE WILL LAY THEM OUT
$master_array[$url] = $url_count;
}
// FIND THE TOTAL TRUNK COUNT SO WE CAN DIVIDE BY IT LATER
$total_trunk_count = 0;
foreach ($trunk_array AS $trunk_array_key => $trunk_array_val) {
foreach($trunk_array_val AS $trunk_count_val) {
$total_trunk_count += $trunk_count_val[0];
}
}
// LOOP THROUGH THE TRUNK ITEMS AND PULL OUT ANY MATCHES FOR THAT TRUNK
foreach ($trunk_array AS $trunk_item_key => $trunk_item_val) {
$trunk_item = $trunk_item_val[0];
$trunk_count = $trunk_item_val[1];
// FIND THE PERCENTAGE THIS TRUNK WAS ACCESSED
$trunk_percent = round(($master_array[$trunk_item] / $total_trunk_count) * 100);
// PRINT THE TRUNK OUT
print "<BR><BR>".$trunk_item.' - ('.$trunk_percent.'%)';
// LOOP THROUGH THE MASTER ARRAY AND GET THE RESULTS FOR ANY PATHS UNDER THE TRUNK
foreach ($master_array AS $master_array_key => $master_array_val) {
// PERFORM A MATCH FOR DOMAINS BELONGING TO THIS PARTICULAR TRUNK
if (preg_match('/^'.$trunk_item.'/', $master_array_key)) {
// SET A DEFAULT DELIMITER PAD
$delimiter_pad = '';
// EXPLODE EACH PATH INTO PARTS AND COUNT HOW MANY PARTS WE HAVE
$domain_parts_2 = explode('/', $master_array_key);
$count = count($domain_parts_2);
// SET THE DELIMITER FOR HOW FAR DOWN ON THE TREE WE ARE
// EACH INDENT WILL HAVE 8 SPACES
for ($i = 2; $i <= $count; $i++) {
$delimiter_pad .= ' ';
}
// SINCE WE ALREADY PRINTED OUT THE TRUNK, WE WILL ONLY SHOW ITEMS THAT ARE NOT THE TRUNK
if ($master_array_key != $trunk_item) {
// FIND THE PERCENTAGE OF THE ITEM, GIVEN THEIR LEVEL IN THE TREE
$path_percentage = round(($master_array[$master_array_key] / $level_totals[$trunk_item][$count]) * 100);
// PRINT OUT THE PATH AND PERCENTAGE
print "<BR>".$delimiter_pad."|- ".$master_array_key.' - ('.$path_percentage.'%)';
}
}
}
}
最后,所有这些都输出了这个:
cnn.com - (75%)
|- cnn.com/employees - (50%)
|- cnn.com/employees/friekot - (100%)
|- cnn.com/news - (50%)
|- cnn.com/news/id/1 - (100%)
domain.com - (25%)
|- domain.com/about - (17%)
|- domain.com/contact - (33%)
|- domain.com/news - (17%)
|- domain.com/news/id/1 - (75%)
|- domain.com/news/id/2 - (25%)
|- domain.com/prices - (33%)
可能有一种更简单的方法可以做到这一点,但这是我想到的方法。我希望这适合你!
答案 1 :(得分:0)
好的,所以你可以编写一个这样的查询来查找特定的用户和网站,并按他们访问过的顺序吐出他们访问过的每个url
。
$sql = "SELECT url FROM journies WHERE site_id = 1 AND profile_id = 1 ORDER BY created_at";
$result = mysqli_query($connection, $sql);
while ($row = mysqli_fetch_array($result)) {
$page_array[] = $row['url'];
}
$page_list = implode("\n =>", $page_array);
这将输出:
domain.com
=> domain.com/about