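Building a C++ translator with a given dictionary?

Time: 2018-06-15 11:02:02

Tags: c++ string word

I am trying to build a simple translator that translates a sentence based on a given dictionary. Let's assume we have two arrays of words: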

string ENG[] = {"black","coffee", "want","yesterday"};
string SPA[] = {"negro", "café", "quiero", "ayer"};

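If the user enters "I want a black coffee", the result should be "I? quiero a? negro café". That is, every word that has no translation in the dictionary should be followed by a question mark.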

#include <iostream>
using namespace std;

int main(int argc, char *argv[]) {

  string input;
  string ENG[] = {"black", "coffee", "want", "yesterday"};
  string SPA[] = {"negro", "café", "quiero", "ayer"};

  cout << "Enter a word";
  cin >> input;

  for (int i = 0; i < 4; ++i) {  // the arrays hold 4 entries each
    if (ENG[i] == input) {
      cout << "You entered " << SPA[i] << endl;
    }
  }
  return 0;
}

What I have written only translates single words. How can I write this code so that it works for whole sentences?

2 answers:

Answer 0: (score: 0)

Here you go.

#include <iostream>
#include <string>
#include <vector>

using namespace std;

vector <string> split_sentence(const string& arg)
{

    vector <string> ret;

    auto it = arg.begin();
    while (it != arg.end()) {

        string tmp;

        while (it != arg.end() && *it == ' ') ++it;
        while (it != arg.end() && *it != ' ')
            tmp += *it++;

        if (tmp.size())
            ret.push_back(tmp);
    }

    return ret;
}

int main(int argc, char *argv[])
{
    string input = "I want a black     coffee .";

    string ENG[4] = {"black","coffee", "want","yesterday"};
    string SPA[4] = {"negro", "café", "quiero", "ayer"};

    cout << "Enter sentence\n";
    /*
        cin >> input;
    */

    for (auto& str: split_sentence(input)) {

        bool found = false;

        for (int j=0; j<4 && !found; ++j) {

            if (ENG[j] == str) {
                cout << SPA[j] << " ";
                found = true;
            }
        }

        if (!found)
            cout << str << "? ";
    }

    cout << endl;
}

Output:

Enter sentence
I? quiero a? negro café .?

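Split the sentence on spaces, then look up each word in the dictionary. If your dictionary is big enough, you will need something faster than a linear scan: a tree-like data structure, or sorting or hashing.

Edit: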

A trie will be faster for this. For each query you can get the appropriate word in O(m), where m is the length of the query (the English word).

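Answer 1: (score: 0)

As suggested in the comments, the two separate arrays are very cumbersome to use and hard to update. Imagine inserting a new pair of values in the middle and messing up the offsets...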

So the better solution here is to use std::map, especially considering that this is supposed to be a simple 1:1 mapping.

So you can define a std::map with std::string as its key (the original word) and std::string as its value (the translation).

With modern C++, the initialization could look like this:

std::map<std::string, std::string> translations {
    {"black", "negro"},
    {"coffee", "café"},
    // ...
};

Now, to get the input string word by word, the quickest built-in way is to use std::istringstream:

std::istringstream stream(myInputText);
std::string word;
while (stream >> word) {
    // do something with each word
}

Looking up the actual translation also becomes trivial. The iteration over all translations happens in the background (inside the std::map class):

const auto res = translations.find(word);

if (res == translations.end()) // nothing found
    std::cout << "? ";
else
    std::cout << res->second << " "; // `res->second` is the value, `res->first` would be the key, i.e. `word`

As for a complete small example:

#include <iostream>
#include <string>
#include <sstream>
#include <map>

int main(int argc, char **argv) {
    std::map<std::string, std::string> translations {
        {"black", "negro"},
        {"coffee", "café"}
    };

    std::string source("I'd like some black coffee");
    std::istringstream stream(source);
    std::string word;

    while (stream >> word) {
        const auto t = translations.find(word);

        if (t != translations.end()) // found
            std::cout << word << ": " << t->second << "\n";
        else
            std::cout << word << ": ???\n";
    }

    return 0;
}

This specific example will produce the following output:

I'd: ???
like: ???
some: ???
black: negro
coffee: café