我有一个连接到Sharepoint Online的脚本,为所有文件和文件夹建立索引,以系统的方式下载它们,并以文件,文件夹,大小,路径等名称生成.csv。< / p>
由于种种原因,我最终陷入了我拥有所有数据但元数据已损坏的情况(上述.csv文件)。
不幸的是,重新运行整个脚本并不是一个真正的选择,因为这大约需要90个小时。
我一直在尝试分解代码以删除“下载文件”功能,并仅保留生成.csv的部分,但到目前为止没有运气。
我发现似乎由它负责的功能(WriteLog),但我正努力将其与其他功能区分开来。
P.S。该代码不是我的,我是从一个我无法访问的开发人员那里继承的(不幸的是)
请在下面找到代码:
param(
[Parameter(Mandatory = $true)]
[string]$srcUrl,
[Parameter(Mandatory = $true)]
[string]$username,
[Parameter(Mandatory = $false,HelpMessage = "From Date: (dd/mm/yyyy)")]
[string]$fromDate,
[Parameter(Mandatory = $false,HelpMessage = "To Date: (dd/mm/yyyy)")]
[string]$toDate,
[Parameter(Mandatory = $true)]
[string]$folderPath,
[Parameter(Mandatory = $true)]
[string]$csvPath
) #end param
cls
#Load SharePoint CSOM Assemblies
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.SharePoint.Client.dll"
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.SharePoint.Client.Runtime.dll"
$global:OutFilePath = -join ($csvPath,"\Documents.csv")
$global:OutFilePathError = -join ($csvPath,"\ErrorLog_GetDocuments.csv")
$header = "Title,Type,Parent,Name,Path,FileSize(bytes),Created,Created by,Modified,Modified by,Matterspace title,Matterspace url"
$srcLibrary = "Documents"
$securePassword = Read-Host -Prompt "Enter your password: " -AsSecureString
$credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials ($username,$securePassword)
$sUrl = [System.Uri]$srcUrl
$domainUrl = -join ("https://",$sUrl.Host)
function WriteLog
{
param(
[Parameter(Mandatory = $true)] $title,$type,$folderName,$name,$path,$fileSize,$created,$createdby,$modifed,$modifiedby,$matterspacetitle,$materspaceUrl
)
$nowTime = Get-Date -Format "dd-MMM-yy,HH:mm:ss"
$folderName = $folderName.Replace(",","|") ### sometime folder / file name has comma so replace it with something
$name = $name.Replace(",","|")
#$path = $path.Replace(",","|")
$title=[System.String]::Concat("""""""$title""""""")
$type=[System.String]::Concat("""""""$type""""""")
$folderName=[System.String]::Concat("""""""$folderName""""""")
$name=[System.String]::Concat("""""""$name""""""")
$path=[System.String]::Concat("""""""$path""""""")
$fileSize=[System.String]::Concat("""""""$fileSize""""""")
$created=[System.String]::Concat("""""""$created""""""")
$createdby=[System.String]::Concat("""""""$createdby""""""")
$modified=[System.String]::Concat("""""""$modified""""""")
$modifiedby=[System.String]::Concat("""""""$modifiedby""""""")
$matterspacetitle=[System.String]::Concat("""""""$matterspacetitle""""""")
$materspaceUrl=[System.String]::Concat("""""""$materspaceUrl""""""")
$lineContent = "$("$title"),$($type),$($folderName),$($name),$($path),$($fileSize),$($created),$($createdby),$($modified),$($modifiedby),$($matterspacetitle),$($materspaceUrl)"
Add-Content -Path $global:OutFilePath -Value "$lineContent"
}
#Function to get all files of a folder
Function Get-FilesFromFolder([Microsoft.SharePoint.Client.Folder]$Folder,$SubWeb,$MTitle)
{
Write-host -f Yellow "Processing Folder:"$Folder.ServerRelativeUrl
$folderItem = $Folder.ListItemAllFields
#$srcContext.Load($f)
$Ctx.Load($folderItem)
$Ctx.ExecuteQuery()
#Get All Files of the Folder
$Ctx.load($Folder.files)
$Ctx.ExecuteQuery()
$authorEmail = $folderItem["Author"].Title
$editorEmail = $folderItem["Editor"].Title
$filepath = $folderItem["FileDirRef"]
if([string]::IsNullOrEmpty($filepath))
{
$filepath=$Folder.ServerRelativeUrl
}
$created = $folderItem["Created"]
$modified = $folderItem["Modified"]
$title = $folderItem["Title"]
if ([string]::IsNullOrEmpty($title))
{
$title = "Not Specified"
}
#$fileSize = $fItem["File_x0020_Size"]
$fileName = $Folder.Name
#list all files in Folder
write-host $Folder.Name
$splitString=$Folder.ServerRelativeUrl -split('/')
$dirUrl="";
write-host $splitString.Length
$parentUrl=""
For($i=3; $i -le $splitString.Length;$i++)
{
if($splitString[$i] -notcontains('.'))
{
Write-Host $i
Write-Host $splitString[$i]
$dirUrl=-join($dirUrl,"\",$splitString[$i])
$parentUrl=-join($parentUrl,"\",$splitString[$i+1])
}
}
$dirPath = -join ($folderPath,$dirUrl)
WriteLog $title "Folder" $parentUrl.TrimEnd('\') $fileName $filepath 0 $created $authorEmail $modified $editorEmail $MTitle $SubWeb
write-host $dirPath
if (-not (Test-Path -Path $dirPath))
{
New-Item -ItemType directory -Path $dirPath
}
ForEach ($File in $Folder.files)
{
try{
$remarkDetail = ""
$replacedUser = ""
$fItem = $File.ListItemAllFields
#$srcContext.Load($f)
$Ctx.Load($fItem)
$Ctx.ExecuteQuery()
$authorEmail = $fItem["Author"].Email
$editorEmail = $fItem["Editor"].Email
$filepath = $fItem["FileDirRef"]
$fileSizeBytes = $fItem["File_x0020_Size"];
$fileSize = ($fileSizeBytes) / 1MB
$fileName = $fItem["FileLeafRef"]
$title = $fItem["Title"]
$filecreated = $fitem["Created"]
$fileModified = $fitem["Modified"]
$FileUrl = $fItem["FileRef"]
$Fname=$File.Name
if ([string]::IsNullOrEmpty($title))
{
$title = "Not Specified"
}
#$title,$type, $folderName,$name,$path,$fileSize,$created,$createdby,$modifed,$modifiedby,$matterspacetitle,$materspaceUrl
$dateToCompare = Get-Date (Get-Date -Date $fileModified -Format 'dd/MM/yyyy')
#Get the File Name or do something
if (($dateToCompare -ge $startDate -and $dateToCompare -le $endDate) -or ($startDate -eq $null -and $endDate -eq $null))
{
$downloadUrl = -join ($dirPath,$File.Name)
$fromfile = -join ($domainUrl,$FileUrl)
Write-Host "Downloading the file from " $fromfile -ForegroundColor Cyan
try{
$webclient = New-Object System.Net.WebClient
$webclient.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials ($username,$securePassword)
$webclient.Headers.Add("X-FORMS_BASED_AUTH_ACCEPTED","f")
$webclient.DownloadFile($fromfile,$downloadUrl)
$webclient.Dispose()
}
catch{
$ErrorMessage=$_.Exception.Message
$ErrorMessage = $ErrorMessage -replace "`t|`n|`r",""
$ErrorMessage = $ErrorMessage -replace " ;|; ",";"
$lineContent = "$($Fname),$($fromfile ),$($ErrorMessage)"
Add-Content -Path $global:OutFilePathError -Value "$lineContent"
Write-Host "Skipping the file and recalling the function" -ForegroundColor Blue
}
WriteLog $title "File" $Folder.Name $fileName $FileUrl $fileSize $created $authorEmail $modified $editorEmail $MTitle $SubWeb
Write-host -f Magenta $File.Name
}
else
{
Write-Host "Skipping the matterspace :" $title " as the matterspace was not in the date range" -ForegroundColor Blue
}
}
catch{
$ErrorMessage=$_.Exception.Message
$ErrorMessage = $ErrorMessage -replace "`t|`n|`r",""
$ErrorMessage = $ErrorMessage -replace " ;|; ",";"
$lineContent = "$($Fname),$($fromfile ),$($ErrorMessage)"
Add-Content -Path $global:OutFilePathError -Value "$lineContent"
}
}
#Recursively Call the function to get files of all folders
$Ctx.load($Folder.Folders)
$Ctx.ExecuteQuery()
#Exclude "Forms" system folder and iterate through each folder
ForEach($SubFolder in $Folder.Folders | Where {$_.Name -ne "Forms"})
{
Get-FilesFromFolder -Folder $SubFolder -SubWeb $SubWeb -Mtitle $MTitle
}
}
Function Get-SPODocLibraryFiles()
{
param
(
[Parameter(Mandatory=$true)] [string] $SiteURL,
[Parameter(Mandatory=$true)] [string] $LibraryName
)
#Setup the context
$Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
$Ctx.Credentials = $credentials
$srcWeb = $Ctx.Web
$childWebs = $srcWeb.Webs
$Ctx.Load($childWebs)
$Ctx.ExecuteQuery()
foreach ($childweb in $childWebs)
{
try
{
#Get the Library and Its Root Folder
$Library=$childweb.Lists.GetByTitle($LibraryName)
$Ctx.Load($Library)
$Ctx.Load($Library.RootFolder)
$Ctx.ExecuteQuery()
#Call the function to get Files of the Root Folder
if($childweb.Url.ToLower() -notlike "*ehcontactus*" -and $childweb.Url.ToLower() -notlike "*ehfaqapp*" -and $childweb.Url.ToLower() -notlike "*ehquicksearch*" -and $childweb.Url.ToLower() -notlike "*ehsiteapps*" -and $childweb.Url.ToLower() -notlike "*ehsitelist*" -and $childweb.Url.ToLower() -notlike "*ehwelcomeapp*" -and $childweb.Url.ToLower() -notlike "*ehimageviewer*")
{
Get-FilesFromFolder -Folder $Library.RootFolder -SubWeb $childweb.Url -MTitle $childweb.Title
}
}
catch{
write-host "Skipping the matterpsace as the library does not exists" -ForegroundColor Blue
}
}
}
#Config Parameters
#$SiteURL= "https://impigerspuat.sharepoint.com/sites/ELeave/Eleave1/adminuat@impigerspuat.onmicrosoft.com"
$LibraryName="Documents"
#$securePassword = Read-Host -Prompt "Enter your password: " -AsSecureString
#Call the function to Get All Files from a document library
if (-not ([string]::IsNullOrEmpty($fromDate)))
{
$startDate = Get-Date (Get-Date -Date $fromDate -Format 'dd/MM/yyyy')
}
else
{
$startDate = $null;
}
if (-not ([string]::IsNullOrEmpty($toDate)))
{
$endDate = Get-Date (Get-Date -Date $toDate -Format 'dd/MM/yyyy')
}
else
{
$endDate = $null
}
Get-SPODocLibraryFiles -SiteURL $srcUrl -LibraryName $LibraryName
答案 0 :(得分:0)
您是否尝试过仅运行该函数并为其提供函数中所请求的参数?
将代码复制到WriteLog.ps1文件中,然后使用参数调用脚本文件。
即。
Writelog.ps1 $srcUrl $username $fromDate $toDate $folderPath $csvPath
很明显,输入数据代替变量。
FWIW,将相关代码片段从其他人的脚本中拉出来是一种很好的练习技巧。您想做的所有事情之前都已经做完了,但是您可能必须先分解别人的工作,然后才能适合您的确切环境。
答案 1 :(得分:0)
不幸的是,您似乎必须按照旧的方式进行操作。问题是在下载文件时作者正在输出到日志(csv)。与先下载到暂存区相反...
我建议在代码中设置一个早期的断点,然后逐步查看它的运行方式。那应该给您一个一般的想法,以及足够的信息来开始编写重构代码。
逆向工程总是很困难,至少要做好准备,这将是有条不紊的练习。
答案 2 :(得分:0)
坏消息:这将是一个反复的过程,而不是一个“解决”的过程。该代码没有什么“错”,但是有一些设计选择使这成为一个挑战。它不是一致地缩进,而是以略有不同的方式编织所有变量分配。看起来比我的大多数代码要好,我只是在告诉你是什么使它成为挑战。
好消息:至少WriteLog函数是独立的。它实际上只是向在此分配的变量中定义的.csv文件添加内容:
$global:OutFilePath = -join ($csvPath,"\Documents.csv")
(我的副本中的第20行)
*
建议:(这是一种方法,只是最终解决方案的指南)
采用现有代码并将其放入IDE中,以直观地帮助您。 Windows Powershell ISE足够了,但是我强烈建议您使用VSCode。
注释最后一行:
Get-SPODocLibraryFiles -SiteURL $srcUrl -LibraryName $LibraryName
因此,您可以从实际要保留的脚本中保留任何其他上下文。
function Get-FilesFromLocalFolder ($localdir, $SubWeb, $MTitle)
代替现有功能Get-FilesFromFolder。这样,您可以遍历所需的任何目录,获取文件并分配变量以作为参数传递。然后,当您调用WriteLog时,它将看起来非常相似。仅因为WriteLog需要它们,所以传递了最后两个参数($ SubWeb,$ MTitle)。您可以将它们设置为自己的标签,也可以将其删除并在WriteLog中使其可选。
这将需要您进行一些迭代(同意@Steven),并且绝对是一项有价值的练习(同意@TheIdesOfMark)。 :)