Count how many folders each folder contains in a complex folder structure?

Date: 2017-06-13 06:47:13

Tags: r tree hierarchical-data directory

Consider the following tree:

library(data.tree)

acme <- Node$new("Acme Inc.")
    accounting <- acme$AddChild("Accounting")
        software <- accounting$AddChild("New Software")
        standards <- accounting$AddChild("New Accounting Standards")
    research <- acme$AddChild("Research")
        newProductLine <- research$AddChild("New Product Line")
        newLabs <- research$AddChild("New Labs")
    it <- acme$AddChild("IT")
        outsource <- it$AddChild("Outsource")
        agile <- it$AddChild("Go agile")
        goToR <- it$AddChild("Switch to R")

I then compute the averageBranchingFactor:

averageBranchingFactor(acme)

This yields 2.5.

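However, for various reasons I would like to be able to get all of the branching factors, not just the average. For example, I need this in order to statistically compare two file structures and test for significant differences between their average branching factors.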

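According to the data.tree documentation, the AverageBranchingFactor() function does the following: "calculate the average number of branches each non-leaf has." I therefore first tried the following:

acme.df <- ToDataFrameTree(acme, "averageBranchingFactor")
mean(acme.df$averageBranchingFactor[acme.df$averageBranchingFactor>0])

This yields 2.375, which led me to try a simpler version:

mean(acme.df$averageBranchingFactor)

This yields 0.8636364.

How do I obtain all of the individual branching factors whose average is 2.5?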

Ideally, I would like to create a data.frame that lists each folder, with a variable giving each folder's branching factor. For example, I have this very simple folder structure:

top_level_folder
    sub_folder_1
    sub_folder_2
        sub_folder_3

Answering this question would involve creating output that looks like this:

Folders             Subfolders (BranchingFactor)
top_level_folder    2
sub_folder_1        0
sub_folder_2        1
sub_folder_3        0

The first column can be generated simply by calling list.dirs("/Users/username/Downloads/top_level/"), but I don't know how to generate the second column. Note that the second column is non-recursive, meaning that folders inside subfolders are not counted (i.e. top_level_folder contains only 2 subfolders, even though sub_folder_2 contains another folder, sub_folder_3).

If you want to see whether your solution scales, download the Rails codebase and try it on Rails' more complex file structure.

5 Answers:

Answer 0 (score: 1)

You can simply loop over the folder structure and count the number of folders at each level (without recursion):

dir.create("top_level_folder/sub_folder_2/sub_folder_3", recursive = TRUE)
dir.create("top_level_folder/sub_folder_1")


dirs <- list.dirs()
branching_factor <- vector(length = length(dirs))
for (i in 1:length(dirs)) {
    branching_factor[i] <- length(list.dirs(path = dirs[i], 
                                            full.names = FALSE, recursive = FALSE))
}

result <- data.frame(Folders = basename(dirs), BranchingFactor = branching_factor)
result[-1,]

You can also use a shorter, more idiomatic, vectorized version of this code:

dirs <- list.dirs()
branching_factor <- sapply(dirs, function(x) length(list.dirs(x, FALSE, FALSE)))
result2 <- data.frame(Folders = basename(dirs), BranchingFactor = branching_factor, 
                      row.names = NULL)[-1,]

The result looks like this:

> head(result2[rev(order(result2[,2])),])
          Folders BranchingFactor
208      fixtures              24
122      fixtures              23
42       fixtures              18
440      core_ext              17
340 active_record              17
562         rails              16
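
As a reproducible variant of the approach above, here is a minimal base R sketch that builds the question's small example tree in a temporary directory and counts direct subfolders with vapply(). The temporary root and the helper name count_subdirs are assumptions made for illustration; they do not appear in the original answer.

```r
# Build the small example tree from the question in a temporary directory
# (the temporary location is an assumption, chosen for reproducibility).
root <- file.path(tempfile("branching_"), "top_level_folder")
dir.create(file.path(root, "sub_folder_2", "sub_folder_3"), recursive = TRUE)
dir.create(file.path(root, "sub_folder_1"))

# Every folder in the tree, including the root itself, as full paths.
dirs <- list.dirs(root, full.names = TRUE, recursive = TRUE)

# Hypothetical helper: non-recursive count of the direct subfolders of d.
count_subdirs <- function(d) {
  length(list.dirs(d, full.names = FALSE, recursive = FALSE))
}

# vapply() checks that every count is a single integer.
branching <- data.frame(
  Folders = basename(dirs),
  BranchingFactor = vapply(dirs, count_subdirs, integer(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(branching)

# Average branching factor over non-leaf folders only.
avg_non_leaf <- mean(branching$BranchingFactor[branching$BranchingFactor > 0])
print(avg_non_leaf)
```

Keeping the root row (rather than dropping it with [-1,]) matches the table requested in the question, where top_level_folder is listed with its own count.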

Answer 1 (score: 1)

This is just a correction to @Gilles's solution.

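Hope this helps.

Answer 2 (score: 0)

I recursively get a list of all folders, then build a table of folder/subfolder pairs; from these I can count the number of subfolders per folder.

That misses empty folders, so I reworked it with a left join against the initial folder list and then filled the NAs with zero.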

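library(magrittr)  # provides %>%

path <- getwd()
# all folders below the working directory, as a one-column data frame
all_folders <- path %>% list.dirs(full.names=TRUE,recursive=TRUE) %>%
  data.frame(stringsAsFactors=FALSE) %>% setNames("Folders")
# (parent name, folder name) pairs taken from the last two path components
all_sub_folders <- all_folders$Folders %>%
  strsplit("/") %>%
  lapply(function(x){c(x[length(x)-1],x[length(x)])}) %>%
  do.call(rbind,.) %>%
  as.data.frame(stringsAsFactors=FALSE) %>%
  setNames(c("ParentFolders","Folders"))
# count how often each name occurs as a parent
# (folders are keyed by name only, so identically named folders are pooled)
output <- all_sub_folders$ParentFolders %>% table %>% as.data.frame(stringsAsFactors=FALSE) %>% setNames(c("Folders","SubFolders"))
# left join so that folders without subfolders are kept, then fill NA with 0
output <- merge(all_sub_folders,output,all.x = TRUE)[,c("Folders","SubFolders")]
output$SubFolders[is.na(output$SubFolders)] <- 0
output <- output[match(all_sub_folders$Folders,output$Folders),]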

head(output)
#      Folders SubFolders
# 2160   Rhome        126
# 17   acepack          5
# 856     help          1
# 992     html          9
# 1486    libs        124
# 1130    i386          0
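
The same parent/child counting idea can also be sketched in base R using full paths, which avoids pooling folders that merely share a name. This is an illustrative variant under stated assumptions, not the answer's own code: the temporary example tree and the variable names are hypothetical, and tabulating against a factor with fixed levels stands in for the left join and NA fill.

```r
# Small example tree in a temporary directory (the location is an assumption).
root <- file.path(tempfile("pairs_"), "top_level_folder")
dir.create(file.path(root, "sub_folder_2", "sub_folder_3"), recursive = TRUE)
dir.create(file.path(root, "sub_folder_1"))
# Normalise separators and symlinks so that dirname() results match exactly.
root <- normalizePath(root, winslash = "/")

# All folders as full paths; the root is excluded from the children because
# its parent lies outside the tree.
folders <- list.dirs(root, full.names = TRUE, recursive = TRUE)
children <- setdiff(folders, root)
parents <- dirname(children)

# Tabulating against the full folder list gives childless folders a count
# of 0 instead of dropping them.
sub_counts <- table(factor(parents, levels = folders))
output <- data.frame(
  Folders = basename(folders),
  SubFolders = as.integer(sub_counts),
  stringsAsFactors = FALSE
)
print(output)
```

Because the counts are keyed on the full path, two folders called help in different packages are counted separately.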

Answer 3 (score: 0)

You can adapt my answer to your other question, using list.dirs with recursive = FALSE in place of list.files:

library(purrr)

files <- .libPaths()[1] %>%    # omit for current directory or supply alternate path
    list.dirs() %>% 
    map_df(~list(path = .x, 
                 dirs = length(list.dirs(.x, recursive = FALSE))))

files
#> # A tibble: 4,457 x 2
#>                                                                           path  dirs
#>                                                                          <chr> <int>
#>  1              /Library/Frameworks/R.framework/Versions/3.4/Resources/library   314
#>  2        /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind     4
#>  3   /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/help     0
#>  4   /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/html     0
#>  5   /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/Meta     0
#>  6      /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/R     0
#>  7      /Library/Frameworks/R.framework/Versions/3.4/Resources/library/acepack     5
#>  8 /Library/Frameworks/R.framework/Versions/3.4/Resources/library/acepack/help     0
#>  9 /Library/Frameworks/R.framework/Versions/3.4/Resources/library/acepack/html     0
#> 10 /Library/Frameworks/R.framework/Versions/3.4/Resources/library/acepack/libs     1
#> # ... with 4,447 more rows

mean(files$dirs[files$dirs != 0])
#> [1] 2.952949

Or in base R:

files <- do.call(rbind, lapply(list.dirs(.libPaths()[1]), function(path){
    data.frame(path = path, 
               dirs = length(list.dirs(path, recursive = FALSE)), 
               stringsAsFactors = FALSE)
}))

head(files)
#>                                                                        path dirs
#> 1            /Library/Frameworks/R.framework/Versions/3.4/Resources/library  314
#> 2      /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind    4
#> 3 /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/help    0
#> 4 /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/html    0
#> 5 /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/Meta    0
#> 6    /Library/Frameworks/R.framework/Versions/3.4/Resources/library/abind/R    0

mean(files$dirs[files$dirs != 0])
#> [1] 2.952949

Answer 4 (score: 0)

averageBranchingFactor excludes the leaves. Side note: you can use data(acme) to get acme directly.

library(data.tree)
data(acme)
acme$averageBranchingFactor
acme$count
print(acme, abf = "averageBranchingFactor", "count")

This displays the following:

                          levelName abf count
1  Acme Inc.                        2.5     3
2   ¦--Accounting                   2.0     2
3   ¦   ¦--New Software             0.0     0
4   ¦   °--New Accounting Standards 0.0     0
5   ¦--Research                     2.0     2
6   ¦   ¦--New Product Line         0.0     0
7   ¦   °--New Labs                 0.0     0
8   °--IT                           3.0     3
9       ¦--Outsource                0.0     0
10      ¦--Go agile                 0.0     0
11      °--Switch to R              0.0     0

The implementation of averageBranchingFactor holds no secrets, so you can adapt it to your own needs. Just type averageBranchingFactor (without parentheses) into the console:

averageBranchingFactor

function (node) 
{
    t <- Traverse(node, filterFun = isNotLeaf)
    if (length(t) == 0) return(0)
    cnt <- Get(t, "count")
    if (!is.numeric(cnt)) browser()
    return(mean(cnt))
}

In short, we traverse the tree (excluding the leaves) and get the count value of each node. Finally, we compute the mean.
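
To make the arithmetic concrete without relying on data.tree, the traversal described above can be mimicked on a plain nested list. This is only a sketch of the logic, not the package's implementation; acme_list and child_counts are hypothetical names introduced for illustration.

```r
# The acme tree from the question as a plain nested list: each node is a
# named list of its children, and a leaf is an empty list.
acme_list <- list(
  "Accounting" = list("New Software" = list(), "New Accounting Standards" = list()),
  "Research"   = list("New Product Line" = list(), "New Labs" = list()),
  "IT"         = list("Outsource" = list(), "Go agile" = list(), "Switch to R" = list())
)

# Hypothetical helper: pre-order walk that returns, for the node and all of
# its descendants, the number of direct children as a named integer vector.
child_counts <- function(node, name) {
  below <- lapply(names(node), function(child) child_counts(node[[child]], child))
  c(setNames(length(node), name), unlist(below))
}

counts <- child_counts(acme_list, "Acme Inc.")
print(counts)

# Individual branching factors of the non-leaf nodes, and their mean.
non_leaf <- counts[counts > 0]
print(non_leaf)
print(mean(non_leaf))
```

The non-leaf counts are the individual values behind the average of 2.5; by contrast, averaging the per-node averageBranchingFactor column mixes subtree averages, which is why the question's attempts gave 2.375 and 0.8636364.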

Hope that helps.