What is the best indexes.auto configuration for production?

Time: 2017-12-22 11:45:55

Tags: neo4j neo4j-ogm

The question is based on the configurations explained in http://neo4j.com/docs/ogm-manual/current/reference/#reference:indexing.

While the configuration described there seems like a good choice, accidentally missing it when deploying to prod could bring the system down.

So, is there a way to apply the indexes without programmatic effort?

2 Answers:

Answer 0 (score: 0):

As that doc states, you can simply set indexes.auto=validate in ogm.properties to avoid doing anything programmatically.
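For reference, a minimal sketch of what that ogm.properties entry could look like; the connection URI and credentials below are placeholders, not values from the question:

    # Placeholder connection string -- replace with your own instance and credentials.
    URI=bolt://neo4j:secret@localhost:7687

    # Only check that the indexes/constraints declared in the domain already exist;
    # fail fast at startup instead of modifying the database schema.
    indexes.auto=validate

With validate, the application never touches the schema itself, so the indexes still have to be created by some other means (see the second answer).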

Answer 1 (score: 0):

For a simple system where the schema in the Neo4j database is identical to the schema declared in the domain model, you can use assert.
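To illustrate, a hedged sketch of the kind of domain class whose metadata assert works from; the Person entity and its properties are made up for this example:

    import org.neo4j.ogm.annotation.GeneratedValue;
    import org.neo4j.ogm.annotation.Id;
    import org.neo4j.ogm.annotation.Index;
    import org.neo4j.ogm.annotation.NodeEntity;

    // Hypothetical entity. With indexes.auto=assert, OGM drops the existing
    // indexes/constraints at startup and recreates only what is declared in the domain.
    @NodeEntity
    public class Person {

        @Id
        @GeneratedValue
        private Long id;

        // Declared as unique, so assert creates a uniqueness constraint on :Person(email).
        @Index(unique = true)
        private String email;

        // Plain index on :Person(name).
        @Index
        private String name;
    }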

But that is often not the case. The new update option in OGM 3.1 will be more useful (it does not drop existing indexes/constraints), but it is still not enough for more complex deployments (e.g. indexes that are not defined in the domain, multiple applications accessing the same database, ...).
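As a sketch of how such a mode could be selected programmatically instead of through ogm.properties (assuming OGM 3.1+ and that the configuration builder accepts the mode as a string; the URI and package name are placeholders):

    import org.neo4j.ogm.config.Configuration;
    import org.neo4j.ogm.session.SessionFactory;

    public class OgmBootstrap {
        public static SessionFactory build() {
            // "update" creates missing indexes/constraints but leaves existing ones in place.
            Configuration configuration = new Configuration.Builder()
                    .uri("bolt://neo4j:secret@localhost:7687")   // placeholder
                    .autoIndex("update")
                    .build();
            // Placeholder domain package -- point it at your own entities.
            return new SessionFactory(configuration, "com.example.domain");
        }
    }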

You should manage the indexes either by hand or with a tool (e.g. http://www.liquigraph.org/), and use the validate option in the OGM application to verify that all expected indexes are in place.
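For example, a hedged sketch of the manual route, applying the schema through an OGM session as a separate deployment step (the label and property are assumptions for illustration; the Cypher syntax is for Neo4j 3.x, and a tool such as Liquigraph would wrap similar statements in versioned changesets):

    import java.util.Collections;
    import org.neo4j.ogm.session.Session;
    import org.neo4j.ogm.session.SessionFactory;

    public class SchemaMigration {

        // Run once as a deployment step, outside normal application startup.
        public static void applySchema(SessionFactory sessionFactory) {
            Session session = sessionFactory.openSession();
            // Hypothetical schema: a uniqueness constraint and a plain index.
            session.query("CREATE CONSTRAINT ON (p:Person) ASSERT p.email IS UNIQUE",
                    Collections.emptyMap());
            session.query("CREATE INDEX ON :Person(name)", Collections.emptyMap());
        }
    }

The application itself then starts with indexes.auto=validate, as in the first answer, so a missing index is caught immediately rather than discovered under load.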

When doing so, you should check that:

  • schema updates do not impact the existing workload (e.g. schedule them for low-load periods)
  • existing data matches the new schema (e.g. the property backing a new constraint exists on all nodes; see the sketch after this list)
  • indexes/constraints that are no longer needed are dropped
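For the second point, a hedged sketch of a pre-flight check (label and property are again hypothetical): count the nodes that would violate a planned uniqueness constraint before creating it.

    import java.util.Collections;
    import org.neo4j.ogm.session.Session;

    public class SchemaPreflight {

        // Number of :Person nodes missing the property a new unique constraint would
        // require; anything above zero means the data has to be fixed first.
        public static long nodesMissingEmail(Session session) {
            Long missing = session.queryForObject(Long.class,
                    "MATCH (p:Person) WHERE p.email IS NULL RETURN count(p)",
                    Collections.emptyMap());
            return missing == null ? 0L : missing;
        }
    }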