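I'm using the code below to convert a file from CSV to xlsx, but it only converts a single file at a time. I'd like it to convert all the files in a directory at once.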
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $true
$Workbook = $xl.Workbooks.Open("$loglocation\errors_$server.csv")
$Worksheets = $Workbook.Worksheets
$Workbook.SaveAs("$loglocation\errors_$server.xls",1)
$Workbook.Saved = $true
$xl.Quit()
Answer 0 (score: 1)
With the PSExcel module, you can use Export-XLSX, which makes this process very simple:
Export-XLSX
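To cover the "all files in a directory" part of the question, here is a minimal sketch of how Export-XLSX might be combined with Get-ChildItem and Import-Csv. It assumes the PSExcel module is already installed and that $loglocation (the variable from the question) holds the folder containing the CSV files. The $xlsxPath variable is a name introduced here for illustration only.

```powershell
# Assumption: PSExcel is installed and available to import.
Import-Module PSExcel

# Assumption: $loglocation is the folder that holds the CSV files, as in the question.
Get-ChildItem -Path $loglocation -Filter *.csv | ForEach-Object {
    # Build the output path: same folder and base name, with an .xlsx extension.
    $xlsxPath = [System.IO.Path]::ChangeExtension($_.FullName, '.xlsx')

    # Read the CSV rows as objects and write them to a new workbook.
    Import-Csv -Path $_.FullName | Export-XLSX -Path $xlsxPath
}
```

This approach does not start Excel through COM, so nothing needs to be closed with Quit() afterwards. If an .xlsx file with the same name already exists, the export will probably need extra handling, so check the module's help (Get-Help Export-XLSX) for the overwrite options.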