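I'm looking for the most efficient way to grab a given number of words (in order) from a string.
So, let's say I have this paragraph of text:
"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."
I want to be able to grab a variable number of words at a random position in the paragraph. So if 5 words are wanted, each output would be a run of five consecutive words starting at some random position.
What is the best way to do this?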
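Answer 0 (score: 5)
Split the data on spaces to get a list of words, then pick a random start position (at least 5 words before the end), and join the selected words back together.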
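The split, pick, and join steps described above can be sketched roughly as follows. This is a minimal illustration rather than the answerer's original code: the names WordSampler and GrabWords are hypothetical, and it assumes that a text with fewer words than requested should simply be returned whole.

```csharp
using System;

public static class WordSampler
{
    // One shared generator (assumption: single-threaded use, since Random is not thread-safe).
    private static readonly Random Rng = new Random();

    // Returns `count` consecutive words starting at a random position.
    // If the text has `count` words or fewer, all of them are returned.
    public static string GrabWords(string text, int count)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (count <= 0) return string.Empty;

        // A null separator array splits on any whitespace; RemoveEmptyEntries
        // drops the empty items produced by leading or repeated spaces.
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length <= count) return string.Join(" ", words);

        // Random.Next excludes its upper bound, so +1 keeps the last window reachable.
        int start = Rng.Next(0, words.Length - count + 1);
        return string.Join(" ", words, start, count);
    }
}
```

Calling WordSampler.GrabWords(paragraph, 5) repeatedly should give a different five-word run most of the time. The string.Join overload that takes a start index and a count avoids building the result in a loop.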
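Answer 1 (score: 1)
For the sequential variant, I would do this:
1. Put the words into an Array using split(' ').
2. Use Random to generate a random value from 0 to the Array's length minus 5.
3. Join the 5 words starting at that index into a sentence, separated by spaces.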
VB version + test results
(this is probably the part you are more interested in)
Imports System

Public Module Module1
    Public Sub Main()
        Randomize() ' seed Rnd() so that each run produces a different sequence
        Dim str As String = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."
        Console.WriteLine(GrabRandSequence(str))
        Console.WriteLine(GrabRandSequence(str))
        Console.WriteLine(GrabRandSequence(str))
        Console.ReadKey()
    End Sub

    Public Function GrabRandSequence(inputstr As String) As String
        Try
            Dim words As String() = inputstr.Split(New Char() {" "c})
            ' Rnd() returns a value in [0, 1), so index ranges over 0 .. words.Length - 5
            Dim index As Integer = CInt(Math.Floor((words.Length - 5 + 1) * Rnd()))
            Return String.Join(" ", words, index, 5)
        Catch e As Exception
            Return e.ToString()
        End Try
    End Function
End Module
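Results
C# version
string[] words = input.Split(' '); // step 1
// step 2: Next excludes its upper bound, so +1 keeps the last 5-word window reachable
int val = (new Random()).Next(0, words.Length - 5 + 1);
string result = string.Join(" ", words, val, 5); // step 3, improved by Enigmativity's suggestion
Extra attempt
For the random variant, I would do this: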
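1. Clean up the unnecessary characters (commas, periods, and so on).
2. Put the words into a List using LINQ and split(' ').
3. Keep only the Distinct words using LINQ (optional, to avoid results like Lorem Lorem Lorem Lorem Lorem).
4. Use Random to generate 5 distinct random values between 0 and the size of the List (pick again whenever a value is not distinct).
5. Pick the words at those positions from the List.
6. Put the picked words into a sentence, separated by spaces.
string input = "the input sentence, blabla";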
input = input.Replace(",", "").Replace(".", ""); // step 1: add as many Replace calls as you want
List<string> words = input.Split(' ').Distinct().ToList(); // steps 2 and 3
Random rand = new Random();
List<int> vals = new List<int>();
do { // step 4: loops forever if there are fewer than 5 distinct words
    int val = rand.Next(0, words.Count);
    if (!vals.Contains(val))
        vals.Add(val);
} while (vals.Count < 5);
string result = "";
for (int i = 0; i < 5; ++i)
    result += words[vals[i]] + (i == 4 ? "" : " "); // steps 5 and 6
Your result ends up in the variable result. This extra attempt is C# only.
Warning: the resulting sentence may not make any sense at all!
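As an alternative to retrying until five distinct indices turn up, the same random variant can be sketched with a partial Fisher-Yates shuffle, which never loops indefinitely. This swaps in a different technique from the answer above. The names RandomWordPicker and PickDistinctWords are hypothetical, and the sketch assumes commas and periods can be treated as separators instead of being removed with Replace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RandomWordPicker
{
    private static readonly Random Rng = new Random();

    // Picks up to `count` distinct words in random order.
    public static string PickDistinctWords(string text, int count)
    {
        // Assumption: only spaces, commas, and periods separate words.
        char[] separators = { ' ', ',', '.' };
        List<string> words = text
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Distinct()
            .ToList();

        // Never ask for more words than are available.
        int take = Math.Min(Math.Max(count, 0), words.Count);

        // Partial Fisher-Yates shuffle: each pass swaps a random remaining word into slot i.
        for (int i = 0; i < take; i++)
        {
            int j = Rng.Next(i, words.Count); // j is in [i, Count)
            string tmp = words[i];
            words[i] = words[j];
            words[j] = tmp;
        }

        return string.Join(" ", words.Take(take));
    }
}
```

Because every swap draws from the words not yet chosen, no index can repeat, and an input with fewer than five distinct words just yields all of them in random order.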
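Answer 2 (score: 0)
string input = "Your long sentence here";
int noOfWords = 5;
string[] arr = input.Split(' ');
Random rnd = new Random();
// Next excludes its upper bound, so +1 keeps the last window reachable
int start = rnd.Next(0, arr.Length - noOfWords + 1);
// Join the selected words directly; this avoids the trailing space a concatenation loop leaves behind
string output = string.Join(" ", arr, start, noOfWords);
Console.WriteLine(output);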
Answer 3 (score: 0)
string sentence = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";
string[] wordCollections = sentence.Split(' ');
Random rnd = new Random();
// Limit the start index so that 5 words always remain; otherwise String.Join throws ArgumentOutOfRangeException
int randomPos = rnd.Next(0, wordCollections.Length - 5 + 1);
string grabAttempt1 = String.Join(" ", wordCollections, randomPos, 5);
// Gives you a random string of 5 words
randomPos = rnd.Next(0, wordCollections.Length - 5 + 1);
string grabAttempt2 = String.Join(" ", wordCollections, randomPos, 5);
// Gives you another random string of 5 words
Answer 4 (score: 0)
This might do it for you