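I have a database that is used for research purposes. Unfortunately, during this research an algorithm was allowed to run for too long, and it inadvertently created duplicate taxonomy terms instead of reusing the TID of the first instance of each term.
To correct this, we tried the "term_merge" and "taxonomy_manager" modules. "term_merge" provides an interface for removing duplicates, and it can limit the number of terms loaded at a time to avoid exhausting the database server's memory limit. In my case, however, I cannot even load the configuration screen at /admin/structure/taxonomy/[My-Vocabulary]/merge, let alone the duplicates interface at /admin/structure/taxonomy/[My-Vocabulary]/merge/duplicates, because both exhaust the memory limit even though it is set to 1024M.
To work around this, I wrote a custom module that calls the term_merge function from the term_merge module. Because only one node bundle in this project uses the taxonomy vocabulary in question, I could safely write my own logic to merge the duplicate terms without the functionality that term_merge provides. I would still prefer to use it, though, since it was designed for this purpose and should, in theory, make the process safer.
My module provides a page callback along with the logic that collects the list of TIDs referencing duplicate taxonomy terms. Here is the code containing the call to the term_merge function: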
// Use the first element, which has the lowest TID value, as the 'trunk'
// into which all other terms will be merged.
$trunk = $tids[0];
// Remove the first element from the branch array so that the trunk
// is not merged into itself.
array_shift($tids);
// Set the merge settings array, similarly to the default values
// given in _term_merge_batch_process of term_merge.batch.inc.
$merge_settings = array(
  'term_branch_keep' => FALSE,
  'merge_fields' => array(),
  'keep_only_unique' => TRUE,
  'redirect' => -1,
  'synonyms' => array(),
);
term_merge($tids, $trunk, $merge_settings);
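This does not result in any terms being merged, and it produces no errors or notices in Watchdog or in the web server logs.
I have also tried calling term_merge once for each individual duplicate TID to be merged, rather than passing the whole array of TIDs at once.
I would appreciate any input on how best to use the term_merge function programmatically, or on other ways to remove many duplicate terms from a large database in which some terms have thousands of duplicates.
For reference, here is the comment documenting the parameters of term_merge, found in term_merge.module of the term_merge module: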
/**
* Merge terms one into another using batch API.
*
* @param array $term_branch
* A single term tid or an array of term tids to be merged, aka term branches
* @param int $term_trunk
* The tid of the term to merge term branches into, aka term trunk
* @param array $merge_settings
* Array of settings that control how merging should happen. Currently
* supported settings are:
* - term_branch_keep: (bool) Whether the term branches should not be
* deleted, also known as "merge only occurrences" option
* - merge_fields: (array) Array of field names whose values should be
* merged into the values of corresponding fields of term trunk (until
* each field's cardinality limit is reached)
* - keep_only_unique: (bool) Whether after merging within one field only
* unique taxonomy term references should be kept in other entities. If
* before merging your entity had 2 values in its taxonomy term reference
* field and one was pointing to term branch while another was pointing to
* term trunk, after merging you will end up having your entity
* referencing to the same term trunk twice. If you pass TRUE in this
* parameter, only a single reference will be stored in your entity after
* merging
* - redirect: (int) HTTP code for redirect from $term_branch to
* $term_trunk, 0 stands for the default redirect defined in Redirect
* module. Use constant TERM_MERGE_NO_REDIRECT to denote not creating any
* HTTP redirect. Note: this parameter requires Redirect module enabled,
* otherwise it will be disregarded
* - synonyms: (array) Array of field names of trunk term into which branch
* terms should be added as synonyms (until each field's cardinality limit
* is reached). Note: this parameter requires Synonyms module enabled,
* otherwise it will be disregarded
* - step: (int) How many term branches to merge per script run in batch. If
* you are hitting time or memory limits, decrease this parameter
*/
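Answer 0 (score: 0)
It seems that term_merge was developed with the intention of being called from a function that handles a form submission, so my custom module was using it in a way that never triggered batch_process. The batch was queued but never run.
Explicitly calling the following resolved the issue: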
batch_process();
No arguments need to be passed to the function.
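Putting the question's code and this fix together, a minimal sketch might look like the following. It assumes a Drupal 7 site with the term_merge module enabled. The function name `mymodule_merge_duplicate_terms` and the `step` value of 20 are hypothetical choices for illustration, not part of term_merge; the `step` key itself comes from the docblock quoted in the question.

```php
<?php

/**
 * Hypothetical helper for a custom Drupal 7 module: merges duplicate terms.
 *
 * Assumes the term_merge module is enabled, so that term_merge() exists.
 *
 * @param array $tids
 *   TIDs that all refer to the same logical term (duplicates of one another).
 */
function mymodule_merge_duplicate_terms(array $tids) {
  // With fewer than two TIDs there is nothing to merge.
  if (count($tids) < 2) {
    return;
  }

  // Sort so that the lowest TID (the first instance created) is the trunk.
  sort($tids, SORT_NUMERIC);
  // array_shift() removes the trunk from the branch list and returns it,
  // so the trunk is never merged into itself.
  $trunk = array_shift($tids);

  $merge_settings = array(
    'term_branch_keep' => FALSE,
    'merge_fields' => array(),
    'keep_only_unique' => TRUE,
    // -1 as in the question; the docblock names TERM_MERGE_NO_REDIRECT
    // as the constant for "no redirect".
    'redirect' => -1,
    'synonyms' => array(),
    // Assumed value: a small step to stay within time and memory limits.
    'step' => 20,
  );

  // Queues the merge as a batch; on its own this does not run anything.
  term_merge($tids, $trunk, $merge_settings);

  // Outside a form submit handler the batch has to be started explicitly.
  batch_process();
}
```

Drupal's `batch_process()` also accepts an optional redirect path as its first argument, which may be worth setting when the helper is called from a page callback, so that the browser lands somewhere other than the callback itself once the batch finishes.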