Drupal: Merging taxonomy terms with a large number of duplicates

Date: 2016-10-09 11:58:17

Tags: drupal drupal-7 drupal-taxonomy

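I have a database that is used for research purposes. Unfortunately, during this research an algorithm was allowed to run for too long, and it inadvertently created duplicate taxonomy terms instead of reusing the TID of the first instance of each term.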

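To correct this, we tried the "term_merge" and "taxonomy_manager" modules. "term_merge" provides an interface for removing duplicates, and it can limit the number of terms loaded at one time to avoid exhausting the memory limit of the database server. In my use case, however, I cannot even load the configuration screen at /admin/structure/taxonomy/[My-Vocabulary]/merge, let alone the duplicates interface at /admin/structure/taxonomy/[My-Vocabulary]/merge/duplicates, because both exhaust the memory limit even though it is set to 1024M.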

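To work around this, I wrote a custom module that calls the term_merge function from the term_merge module. Since only one node bundle in this project uses the taxonomy vocabulary in question, I could safely write my own logic to merge the duplicate terms without using the functionality term_merge provides, but I would prefer to use it because it was designed for this purpose and, in theory, allows for a safer process.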

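My module provides a page callback along with the logic to obtain a list of TIDs that refer to duplicate taxonomy terms. Here is the code containing the call to term_merge: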

//Use first element, with lowest TID value, as the 'trunk'
// which all other terms will be merged into

$trunk = $tids[0];

//Remove first element from branch array, to ensure the trunk 
//is not being merged into itself

array_shift($tids);

//Set the merge settings array, similarly to the default values 
//which are given in _term_merge_batch_process of term_merge.batch.inc

$merge_settings = array(
  'term_branch_keep' => FALSE,
  'merge_fields' => array(),
  'keep_only_unique' => TRUE,
  'redirect' => -1,
  'synonyms' => array(),
);

term_merge($tids, $trunk, $merge_settings);

This does not merge any terms, and it produces no errors or notices in Watchdog or in the web server logs.

I also tried calling term_merge once for each individual duplicate TID to be merged, rather than passing the whole array of TIDs at once.
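For illustration, the trunk/branch split described above could be derived for many term names at once with a small helper like the one below. This is only a sketch: the function name `example_group_duplicate_tids` and the row format (arrays with `tid` and `name` keys, for instance from a query on the vocabulary's terms) are assumptions and not part of term_merge, and duplicates are assumed to be terms whose names match exactly.

```php
<?php
/**
 * Groups term rows by name and splits each group into trunk and branches.
 *
 * Hypothetical helper, not part of the term_merge module. $rows is assumed
 * to be a list of arrays with 'tid' and 'name' keys.
 */
function example_group_duplicate_tids(array $rows) {
  // Collect every TID that shares the same (exactly matching) name.
  $by_name = array();
  foreach ($rows as $row) {
    $by_name[$row['name']][] = (int) $row['tid'];
  }

  $groups = array();
  foreach ($by_name as $name => $tids) {
    if (count($tids) < 2) {
      // Only one term has this name, so there is nothing to merge.
      continue;
    }
    // The lowest TID becomes the trunk; the rest are the branches.
    sort($tids, SORT_NUMERIC);
    $trunk = array_shift($tids);
    $groups[$name] = array('trunk' => $trunk, 'branches' => $tids);
  }
  return $groups;
}
```

Each resulting group then holds one trunk TID and the array of branch TIDs that would be passed to term_merge for that name.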

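I would appreciate any input on how best to use the term_merge function programmatically, or on another way to remove many duplicate terms from a large database in which some terms have thousands of duplicates.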

For reference, here is the comment documenting the parameters of term_merge, found in term_merge.module of the term_merge module:

/**
 * Merge terms one into another using batch API.
 *
 * @param array $term_branch
 *   A single term tid or an array of term tids to be merged, aka term branches
 * @param int $term_trunk
 *   The tid of the term to merge term branches into, aka term trunk
 * @param array $merge_settings
 *   Array of settings that control how merging should happen. Currently
 *   supported settings are:
 *     - term_branch_keep: (bool) Whether the term branches should not be
 *       deleted, also known as "merge only occurrences" option
 *     - merge_fields: (array) Array of field names whose values should be
 *       merged into the values of corresponding fields of term trunk (until
 *       each field's cardinality limit is reached)
 *     - keep_only_unique: (bool) Whether after merging within one field only
 *       unique taxonomy term references should be kept in other entities. If
 *       before merging your entity had 2 values in its taxonomy term reference
 *       field and one was pointing to term branch while another was pointing to
 *       term trunk, after merging you will end up having your entity
 *       referencing to the same term trunk twice. If you pass TRUE in this
 *       parameter, only a single reference will be stored in your entity after
 *       merging
 *     - redirect: (int) HTTP code for redirect from $term_branch to
 *       $term_trunk, 0 stands for the default redirect defined in Redirect
 *       module. Use constant TERM_MERGE_NO_REDIRECT to denote not creating any
 *       HTTP redirect. Note: this parameter requires Redirect module enabled,
 *       otherwise it will be disregarded
 *     - synonyms: (array) Array of field names of trunk term into which branch
 *       terms should be added as synonyms (until each field's cardinality limit
 *       is reached). Note: this parameter requires Synonyms module enabled,
 *       otherwise it will be disregarded
 *     - step: (int) How many term branches to merge per script run in batch. If
 *       you are hitting time or memory limits, decrease this parameter
 */
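As a sketch of how the documented settings fit together, the helper below builds a `$merge_settings` array that also sets `step`, the key the comment recommends lowering when time or memory limits are hit. The function name `example_term_merge_settings` is hypothetical, and the `-1` fallback for the no-redirect value is taken from the question's code rather than confirmed against the module, so treat it as an assumption.

```php
<?php
/**
 * Builds a $merge_settings array for term_merge().
 *
 * Hypothetical helper. The keys mirror the doc comment quoted above; the -1
 * fallback for "no redirect" is an assumption copied from the question.
 */
function example_term_merge_settings($step = 10) {
  // Prefer the module's constant when it is loaded, else fall back to -1.
  $no_redirect = defined('TERM_MERGE_NO_REDIRECT')
    ? constant('TERM_MERGE_NO_REDIRECT')
    : -1;

  return array(
    'term_branch_keep' => FALSE,
    'merge_fields' => array(),
    'keep_only_unique' => TRUE,
    'redirect' => $no_redirect,
    'synonyms' => array(),
    // Number of branch terms merged per batch run; never below 1.
    'step' => max(1, (int) $step),
  );
}
```

A smaller `step` means more batch requests but less work per request, which is the trade-off the comment describes.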

1 Answer:

Answer 0 (score: 0)

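It appears that because term_merge was written with the intent of being used from a form submission handler, my custom module was using it in a way in which batch_process never got called.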

Explicitly calling the following solved the problem:

batch_process()

No arguments need to be passed to the function.
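Put together, the fix might look like the sketch below. The function name `example_merge_duplicates` and its arguments are hypothetical; `term_merge()` comes from the Term Merge module and `batch_process()` from Drupal 7 core, so both are only available inside a bootstrapped Drupal site. The comments reflect the behaviour reported in this thread, not a reading of the module's source.

```php
<?php
/**
 * Sketch: merge duplicate terms and start the batch explicitly.
 *
 * Hypothetical function. $tids is a list of TIDs that all refer to the same
 * duplicated term; $merge_settings is the settings array for term_merge().
 */
function example_merge_duplicates(array $tids, array $merge_settings) {
  if (count($tids) < 2) {
    // Fewer than two terms means there is nothing to merge.
    return;
  }

  // Lowest TID is the trunk; the remaining TIDs are the branches.
  sort($tids, SORT_NUMERIC);
  $trunk = array_shift($tids);

  // As reported above, outside a form submission this call alone did not
  // result in any terms being merged.
  term_merge($tids, $trunk, $merge_settings);

  // Start the batch explicitly. In Drupal this typically redirects to the
  // batch progress page, so code placed after this call may not run.
  batch_process();
}
```

All of `batch_process()`'s parameters are optional in Drupal 7; its first parameter is a redirect path to use when the batch finishes, which may be worth passing if the page callback should not be revisited afterwards.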