Android Text-To-Speech API听起来很机器人

时间:2018-07-25 02:50:25

标签: android text-to-speech voice speech-synthesis google-text-to-speech

我是第一次学习android开发,我的目标是创建一个简单的Hello World应用程序,该应用程序可以吸收一些文本,然后大声读出来。

我的代码基于我发现的示例,这是我的代码:

class MainFeeds : AppCompatActivity() {


    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main_feeds)



        card.setOnClickListener{
            Toast.makeText(this, "Hello", Toast.LENGTH_LONG).show()
            TTS(this, "Hello this is leo")
        }
    }

}


class TTS(private val activity: Activity,
          private val message: String) : TextToSpeech.OnInitListener {

          private val tts: TextToSpeech = TextToSpeech(activity, this, "com.google.android.tts")

    override fun onInit(i: Int) {
        if (i == TextToSpeech.SUCCESS) {

            val localeUS = Locale.US

            val result: Int
            result = tts.setLanguage(localeUS)

            if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                Toast.makeText(activity, "This Language is not supported", Toast.LENGTH_SHORT).show()
            } else {
                speakOut(message)
            }

        } else {
            Toast.makeText(activity, "Initilization Failed!", Toast.LENGTH_SHORT).show()
        }
    }

    private fun speakOut(message: String) {
        tts.speak(message, TextToSpeech.QUEUE_FLUSH, null, null)
    }
}

它运行得很好,我遇到的问题是,合成器发出的音频听起来非常机器人化,就像我在使用Google Maps并断开互联网连接时一样。使用Google Assistant语音功能是否需要利用其他一些我必须启用的API?

编辑:我已经尝试在像素2xl上运行该应用程序,但它听起来还是很机器人,因为它没有使用Google Assistant语音。

2 个答案:

答案 0 :(得分:0)

首先,语音质量取决于您创建的TextToSpeech对象使用的“语音引擎”:

private val tts: TextToSpeech = TextToSpeech(activity, this)

如果您改为输入:

private val tts: TextToSpeech = TextToSpeech(activity, this, "com.google.android.tts")

然后,您在其上运行该代码的任何设备都会尝试 来使用Google语音引擎...但是只有在该设备上存在该设备时,它才会实际使用。

类似地,使用“ com.samsung.SMT”将尝试使用三星语音引擎(它也是高质量的,但通常仅安装在三星[real]设备上)。

是否可以使用Google语音引擎在很大程度上取决于设备的Android API级别(只要它最近才足以运行Google引擎),而是实际的Google text-to -语音引擎完全安装在设备上。

要确保已安装Google引擎:

在Android Studio模拟器上:

创建一个新的模拟器,并在“目标”列中选择一个具有“ Google API”或“ Google Play”的系统映像。

在真实设备上:

转到Play商店并安装Google speech engine

我目前正在编写一个“ TTS诊断”应用程序,准备就绪后将通过Github发布。我了解到Android上的TTS(或至少试图预测其行为)可能是真正的野兽。

我也建议您使用文档,当然是:Java | Kotlin

答案 1 :(得分:0)

我已经制作了一个小测试程序,可以为您回答这个问题。

它会向您显示Google引擎中所有语音的列表,然后单击它们并收听!是的!

它的实际作用:

  • 如果设备上存在Google文本语音转换引擎,则将其初始化为TextToSpeech对象。
  • 让您从ListView中选择特定的语音,该语音包含与代码中指定的语言环境(在这种情况下为英语)相对应的所有潜在可用语音,并与您所使用的Google文本语音引擎的版本相对应已安装。

这样,您可以测试所有语音,以查看所寻找的“ Google Assistant”语音是否在其中,如果不可用,您可以继续检查是否为新版本的Google文本语音转换引擎被释放。在我看来,此测试中质量最高的声音均为quality:400,并指定需要网络连接。

注意:

  • 即使未“安装”,声音(尤其是英语)也很可能仍在“播放”。这是因为使用setVoice(Voice v)时,只要请求的语音不可用(!),只要手头上还有其他“备份”语音,(Google)引擎就会返回“成功” int。相同的语言。不幸的是,即使在使用getVoice()并比较对象时,它也会在后台执行所有这些操作,并且仍然偷偷地报告它使用的是您要求的相同声音。 :(。

  • 通常,如果有声音说已安装,则您听到的声音就是您要求的声音。

  • 由于这些原因,当您测试这些声音时,您需要确保自己在互联网上(以便当您请求不可用的声音时它们会自动安装)...还要确保要求网络连接不会“自动降级”。

  • 您可以滑动/刷新语音的列表视图,以检查是否已安装语音,或使用系统的下拉菜单来观看下载...或进入Google的语音合成设备系统设置中的设置。

  • 在列表视图中,“语音功能”(例如“需要网络”和“已安装”)只是Google引擎报告内容的回声,可能并不准确。 :(

  • Voice class documentation中指定的最大语音质量为500。在我的测试中,我只能找到质量为400的语音。这可能是因为我没有最新版本的Google文本-语音转换安装在我的测试设备上(并且我没有Play商店访问权限以进行更新)。如果您使用的是真实设备,建议您使用Google Play商店安装最新版本的Google TTS。您可以在日志中验证引擎版本。根据Wikipedia的说法,撰写本文时,最新版本为3.15.18.200023596。我的测试设备上的版本是3.13.1。

要重新创建此测试应用程序,请在Android Studio中创建一个空白Java项目,最低API为21。(getVoices()在21之前的版本中不起作用)。

清单:

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package=" [ your.package.name ] "
    android:windowSoftInputMode="stateHidden">

    <application
        android:allowBackup="true"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/AppTheme">
        <activity android:name=".MainActivity">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />
                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>

MainActivity:

package [ your package name ];

import android.content.Intent;
import android.content.pm.PackageManager;
import android.content.pm.ResolveInfo;
import android.graphics.Color;
import android.speech.tts.TextToSpeech;
import android.speech.tts.UtteranceProgressListener;
import android.speech.tts.Voice;
import android.support.v4.widget.SwipeRefreshLayout;
import android.support.v7.app.AppCompatActivity;
import android.os.Bundle;
import android.util.Log;
import android.view.View;
import android.view.inputmethod.InputMethodManager;
import android.widget.AdapterView;
import android.widget.EditText;
import android.widget.ListView;
import android.widget.Spinner;
import android.widget.TextView;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;

public class MainActivity extends AppCompatActivity {

    EditText textToSpeak;
    TextView progressView;
    TextToSpeech googleTTS;
    ListView voiceListView;
    SwipeRefreshLayout swipeRefreshLayout;
    Long timeOfSpeakRequest;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        textToSpeak = findViewById(R.id.textToSpeak);
        textToSpeak.setText("Do I sound robotic to you?  1,2,3,4... yabadabadoo.  "
                + "ooo! ahh! la-la-la-la-la!  num-num-dibby-dibby-num-tick-tock...  "
                + "Can I pronounce the word, Antidisestablishmentarianism?  "
                + "Gerp!  My pants are too tight!  "
                + "CODE RED!  CODE RED!  Initiate disassemble!  Ice Cream is cold "
                + "...in my pants.  Exterminate!  exterminate!  Directive 4 is "
                + "classified."
        );
        progressView = findViewById(R.id.progressView);
        voiceListView = findViewById(R.id.voiceListView);
        swipeRefreshLayout = findViewById(R.id.swipeRefresh);


        // Create the TTS and wait until it's initialized to do anything else
        if (isGoogleEngineInstalled()) {
            createGoogleTTS();
        } else {
            Log.i("XXX", "onCreate(): Google not installed -- nothing done.");
        }

    }

    @Override
    protected void onStart() {
        super.onStart();

        swipeRefreshLayout.setOnRefreshListener(new SwipeRefreshLayout.OnRefreshListener() {
            @Override
            public void onRefresh() {
                assignFullSetOfVoicesToVoiceListView();
            }
        });

    }

    // this is where the program really begins (when the TTS is initialized)
    private void onTTSInitialized() {

        setUpWhatHappensWhenAVoiceItemIsClicked();
        setUtteranceProgressListenerOnTheTTS();
        assignFullSetOfVoicesToVoiceListView();

    }

    // FACTORED/EXTRACTED METHODS ----------------------------------------------------------------
    // These are just pulled out to make onCreate() easier to read and the basic sequence
    // of events more obvious.

    private void createGoogleTTS() {

        googleTTS = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status != TextToSpeech.ERROR) {
                    Log.i("XXX", "Google tts initialized");
                    onTTSInitialized();
                } else {
                    Log.i("XXX", "Internal Google engine init error.");
                }
            }
        }, "com.google.android.tts");

    }

    private void setUpWhatHappensWhenAVoiceItemIsClicked() {
        voiceListView.setOnItemClickListener(new AdapterView.OnItemClickListener() {
            @Override
            public void onItemClick(AdapterView<?> parent, View view, int position, long id) {
                Voice desiredVoice = (Voice) parent.getAdapter().getItem(position);
                // if (setting the desired voice is "successful")...
                // in the case of google engine, this does not necessarily mean the voice you
                // want will actually be used. :(
                if (googleTTS.setVoice(desiredVoice) == 0) {
                    Log.i("XXX", "Speech voice set to: " + desiredVoice.toString());
                    // TTS did may "auto-downgrade" voice selection
                    // due to internal reason such as no data
                    // Unfortunately it will not tell you, and there seems to be no
                    // way of checking whether the presently selected voice (getVoice()) "equals"
                    // the desired voice.
                    speak();
                }
            }
        });
    }

    private void setUtteranceProgressListenerOnTheTTS() {

        UtteranceProgressListener blurp = new UtteranceProgressListener() {

            @Override // MIN API 15
            public void onStart(String s) {
                long timeSinceSpeakCall = System.currentTimeMillis() - timeOfSpeakRequest;
                Log.i("XXX", "progress.onStart() callback.  "
                        + timeSinceSpeakCall + " millis since speak() was called.");
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        progressView.setTextColor(Color.GREEN);
                        progressView.setText("PROGRESS: STARTED");
                    }
                });
            }

            @Override // MIN API 15
            public void onDone(String s) {
                long timeSinceSpeakCall = System.currentTimeMillis() - timeOfSpeakRequest;
                Log.i("XXX", "progress.onDone() callback.  "
                        + timeSinceSpeakCall + " millis since speak() was called.");
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        progressView.setTextColor(Color.GREEN);
                        progressView.setText("PROGRESS: DONE");
                    }
                });
            }

            // Getting an error can simply mean that the particular voice is not available
            // to the device yet... and still needs to be downloaded / is still downloading
            @Override // MIN API 15 (depracated at API 21)
            public void onError(String s) {
                long timeSinceSpeakCall = System.currentTimeMillis() - timeOfSpeakRequest;
                Log.i("XXX", "progress.onERROR() callback.  "
                        + timeSinceSpeakCall + " millis since speak() was called.");
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        progressView.setTextColor(Color.RED);
                        progressView.setText("PROGRESS: ERROR");
                    }
                });

            }
        };
        googleTTS.setOnUtteranceProgressListener(blurp);

    }

    // must happens AFTER tts is initialized
    private void assignFullSetOfVoicesToVoiceListView() {

        googleTTS.stop();

        List<Voice> tempVoiceList = new ArrayList<>();

        for (Voice v : googleTTS.getVoices()) {
            if (v.getLocale().getLanguage().contains("en")) { // only English voices
                tempVoiceList.add(v);
            }
        }

        // Sort the list alphabetically by name
        Collections.sort(tempVoiceList, new Comparator<Voice>() {
            @Override
            public int compare(Voice v1, Voice v2) {
                Log.i("XXX", "comparing item");
                return (v2.getName().compareToIgnoreCase(v1.getName()));
            }
        });

        VoiceAdapter tempAdapter = new VoiceAdapter(this, tempVoiceList);

        voiceListView.setAdapter(tempAdapter);
        swipeRefreshLayout.setRefreshing(false);
        progressView.setTextColor(Color.BLACK);
        progressView.setText("PROGRESS: ...");

    }

    private void speak() {
        HashMap<String, String> map = new HashMap<>();
        map.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, "merp");
        timeOfSpeakRequest = System.currentTimeMillis();
        googleTTS.speak(textToSpeak.getText().toString(), TextToSpeech.QUEUE_FLUSH, map);
    }

    // Checks if Google Engine is installed
    // ... (and gives more info in Logs).
    // The version number is going to dictate the quality of voices available
    private boolean isGoogleEngineInstalled() {

        final Intent ttsIntent = new Intent();
        ttsIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
        final PackageManager pm = getPackageManager();
        final List<ResolveInfo> list = pm.queryIntentActivities(ttsIntent, PackageManager.GET_META_DATA);

        boolean googleIsInstalled = false;

        for (int i = 0; i < list.size(); i++) {

            ResolveInfo resolveInfoUnderScrutiny = list.get(i);
            String engineName = resolveInfoUnderScrutiny.activityInfo.applicationInfo.packageName;

            if (engineName.equals("com.google.android.tts")) {
                String version = "null";
                try {
                    version = pm.getPackageInfo(engineName,
                            PackageManager.GET_META_DATA).versionName;
                } catch (Exception e) {
                    Log.i("XXX", "Error getting google engine verion: " + e.toString());
                }
                Log.i("XXX", "Google engine version " + version + " is installed!");
                googleIsInstalled = true;
            } else {
                Log.i("XXX", "Google Engine is not installed!");
            }

        }
        return googleIsInstalled;
    }
}

VoiceAdapter.java:

package [ your package name ];

import android.content.Context;
import android.graphics.Color;
import android.speech.tts.Voice;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.BaseAdapter;
import android.widget.TextView;

import java.util.List;

public class VoiceAdapter extends BaseAdapter {

    private Context mContext;
    private LayoutInflater mInflater;
    private List<Voice> mDataSource;

    public VoiceAdapter(Context context, List<Voice> voicesToDisplay) {
        mContext = context;
        mDataSource = voicesToDisplay;
        mInflater = (LayoutInflater) mContext.getSystemService(Context.LAYOUT_INFLATER_SERVICE);
    }

    @Override
    public int getCount() {
        return mDataSource.size();
    }

    @Override
    public Object getItem(int position) {
        return mDataSource.get(position);
    }

    @Override
    public long getItemId(int position) {
        return position;
    }

    @Override
    public View getView(int position, View convertView, ViewGroup parent) {

        // In a real app this method is not efficient,
        // and "View Holder Pattern" shoudl be used instead.
        View rowView = mInflater.inflate(R.layout.list_item_voice, parent, false);

        if (position%2 == 0) {
            rowView.setBackgroundColor(Color.rgb(245,245,245));
        }

        Voice voiceUnderScrutiny = mDataSource.get(position);

        // example output of Voice.toString() :
        // "Voice[Name: pt-br-x-afs#male_2-local, locale: pt_BR, quality: 400, latency: 200,
        // requiresNetwork: false, features: [networkTimeoutMs, notInstalled, networkRetriesCount]]"

        // Get title element
        TextView voiceTitleTextView =
                (TextView) rowView.findViewById(R.id.voice_title);

        TextView qualityTextView =
                (TextView) rowView.findViewById(R.id.voice_quality);

        TextView networkRequiredTextView =
                (TextView) rowView.findViewById(R.id.voice_network);

        TextView isInstalledTextView =
                (TextView) rowView.findViewById(R.id.voice_installed);

        TextView featuresTextView =
                (TextView) rowView.findViewById(R.id.voice_features);

        voiceTitleTextView.setText("VOICE NAME: " + voiceUnderScrutiny.getName());

        // Voice Quality...
        // ( https://developer.android.com/reference/android/speech/tts/Voice.html )
        // 100 = Very Low, 200 = Low, 300 = Normal, 400 = High, 500 = Very High
        qualityTextView.setText(  "QLTY: " + ((Integer) voiceUnderScrutiny.getQuality()).toString()  );
        if (voiceUnderScrutiny.getQuality() == 500) {
            qualityTextView.setTextColor(Color.GREEN); // set v. high quality to green
        }

        if (!voiceUnderScrutiny.isNetworkConnectionRequired()) {
            networkRequiredTextView.setText("NET_REQ?: NO");
        } else {
            networkRequiredTextView.setText("NET_REQ?: YES");
        }

        if (!voiceUnderScrutiny.getFeatures().contains("notInstalled")) {
            isInstalledTextView.setTextColor(Color.GREEN);
            isInstalledTextView.setText("INSTLLD?: YES");
        } else {
            isInstalledTextView.setTextColor(Color.RED);
            isInstalledTextView.setText("INSTLLD?: NO");
        }

        featuresTextView.setText("FEATURES: " + voiceUnderScrutiny.getFeatures().toString());

        return rowView;
    }
}

activity_main.xml:

<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:focusable="true"
    android:focusableInTouchMode="true"
    tools:context=".MainActivity">

    <EditText
        android:id="@+id/textToSpeak"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:ems="10"
        android:inputType="textPersonName"
        android:text="textToSpeak..."
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

    <android.support.v4.widget.SwipeRefreshLayout
        android:id="@+id/swipeRefresh"
        android:layout_width="0dp"
        android:layout_height="0dp"
        android:layout_marginBottom="8dp"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.0"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/progressView">

    <ListView
        android:id="@+id/voiceListView"
        android:layout_width="0dp"
        android:layout_height="0dp"
        android:layout_marginBottom="8dp"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp">

    </ListView>

    </android.support.v4.widget.SwipeRefreshLayout>

    <TextView
        android:id="@+id/progressView"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="UTTERANCE_PROGRESS:"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/textToSpeak" />

</android.support.constraint.ConstraintLayout>

list_item_voice.xml:

<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="wrap_content"

    android:layout_centerInParent="true"
    android:paddingBottom="8dp"
    android:paddingLeft="16dp"
    android:paddingRight="16dp"
    android:paddingTop="8dp"
    >

    <TextView
        android:id="@+id/voice_title"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="NAME:"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />


    <TextView
        android:id="@+id/voice_installed"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginTop="8dp"
        android:fontFamily="monospace"
        android:text="INSTALLED? "
        android:textAlignment="textStart"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toEndOf="@+id/voice_network"
        app:layout_constraintTop_toBottomOf="@+id/voice_title" />

    <TextView
        android:id="@+id/voice_quality"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="QUALITY:"
        app:layout_constraintEnd_toStartOf="@+id/voice_network"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/voice_title" />

    <TextView
        android:id="@+id/voice_features"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginBottom="8dp"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="FEATURES:"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/voice_quality" />

    <TextView
        android:id="@+id/voice_network"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="NET_REQUIRED?"
        app:layout_constraintEnd_toStartOf="@+id/voice_installed"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toEndOf="@+id/voice_quality"
        app:layout_constraintTop_toBottomOf="@+id/voice_title" />

</android.support.constraint.ConstraintLayout>