Android Text To Speech & Speech To Text

Hello friends, today I am going to discuss two important features in Android: Text To Speech and Speech To Text.

TextToSpeech

Text To Speech in Android is an engine that converts text into speech.

TextToSpeech is a built-in Android class, so first we have to create an instance of it and initialize it. We can use TextToSpeech.OnInitListener for the initialization; its callback also tells us whether the engine was initialized successfully or failed with an error.

When we no longer need the engine, we can call the shutdown() method to release the TextToSpeech instance.

To create an instance of TextToSpeech we can use the following constructor:

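// Declared as a field of the Activity so the listener below can reference it.
private TextToSpeech tts;

// Inside onCreate():
tts = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status == TextToSpeech.SUCCESS) {
                    Toast.makeText(context, "Initialization Success.", Toast.LENGTH_LONG).show();
                    tts.setPitch(0.5f);        // lower than the normal pitch of 1.0
                    tts.setLanguage(Locale.ENGLISH);
                    tts.setSpeechRate(0.5f);   // half the normal speed
                } else {
                    Toast.makeText(context, "Initialization Fail.", Toast.LENGTH_LONG).show();
                    // Hide the speak button because the engine is not usable.
                    binding.btnSpeak.setVisibility(View.GONE);
                }
            }
        });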

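In the code above, after successfully initializing the TextToSpeech engine, we have used the following methods:

  1. setPitch() : sets the pitch of the voice, that is, how high or low it sounds. 1.0 is the normal pitch; lower values lower the tone and higher values raise it.
  2. setLanguage() : sets the language of the speech.
  3. setSpeechRate() : sets the speed of the speech. 1.0 is the normal rate, 0.5 is half speed and 2.0 is double speed.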
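To make the multiplier idea concrete, here is a small plain-Java sketch that turns a slider position (0 to 100) into a value you could pass to setSpeechRate(). The class SpeechRateMapper, the slider and the 0.5 to 2.0 range are assumptions made for illustration only; they are not part of the Android API or a limit of the engine.

```java
/** Hypothetical helper: maps a 0..100 slider position to a speech-rate multiplier. */
class SpeechRateMapper {

    /**
     * 0 maps to 0.5 (half speed), 50 maps to 1.0 (normal), 100 maps to 2.0 (double speed).
     * Values outside 0..100 are clamped first.
     */
    static float rateForPercent(int percent) {
        int clamped = Math.max(0, Math.min(100, percent));
        // Exponential mapping so that the middle of the slider is the normal rate.
        return (float) Math.pow(2.0, (clamped - 50) / 50.0);
    }

    public static void main(String[] args) {
        System.out.println(rateForPercent(0));   // 0.5
        System.out.println(rateForPercent(50));  // 1.0
        System.out.println(rateForPercent(100)); // 2.0
    }
}
```

In an app, the returned value would be handed to tts.setSpeechRate(...) once the engine has been initialized.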

There are many other methods available, but these three are enough for now. You can read about the others in the TextToSpeech class documentation.

Once the engine has been initialized successfully, we can write the code that speaks a string by passing it to the TextToSpeech instance (tts) we created.

Here is the code that reads out the message the user typed into an EditText when a Button is clicked.

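@Override
public void onClick(View v) {
    String msg = binding.etMessage.getText().toString();
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
        // API 21+: speak(CharSequence, int, Bundle, String)
        tts.speak(msg, TextToSpeech.QUEUE_ADD, null, null);
    } else {
        // Older versions: speak(String, int, HashMap), deprecated since API 21
        tts.speak(msg, TextToSpeech.QUEUE_ADD, null);
    }
}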

In the code above you can see I have used two versions of the speak() method:

1)

speak(CharSequence text, int queueMode, Bundle params, String utteranceId)

This method is available only on Lollipop (API level 21) and higher. The params bundle and the utteranceId are not required here, so we pass null for both.

2)

speak(String text, int queueMode, HashMap<String, String> params)

This method is the one to use on versions below Lollipop; it has been deprecated since API level 21. The params are optional, so we pass null for them.

In both methods we pass the message (1st parameter) and the queueMode (2nd parameter). For queueMode we can pass either of the following constants.

QUEUE_ADD

Queue mode where the new entry is added at the end of the playback queue.

QUEUE_FLUSH

Queue mode where all entries in the playback queue (media to be played and text to be synthesized) are dropped and replaced by the new entry.
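The difference between the two modes is easy to see in a tiny plain-Java model of a playback queue. This is only a sketch of the behaviour described above, not how the Android engine is implemented; the class SpeechQueueModel and its methods are hypothetical names, and only the two constant values mirror TextToSpeech.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical model of a playback queue; not the Android implementation. */
class SpeechQueueModel {
    // Same values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
    static final int QUEUE_FLUSH = 0;
    static final int QUEUE_ADD = 1;

    private final Deque<String> pending = new ArrayDeque<>();

    /** Enqueues text according to the given queue mode. */
    void speak(String text, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            pending.clear(); // drop everything that is still waiting
        }
        pending.addLast(text); // the new entry always goes to the end
    }

    /** Returns the entries that would still be spoken, in order. */
    List<String> pendingEntries() {
        return new ArrayList<>(pending);
    }

    public static void main(String[] args) {
        SpeechQueueModel queue = new SpeechQueueModel();
        queue.speak("first", QUEUE_ADD);
        queue.speak("second", QUEUE_ADD);
        System.out.println(queue.pendingEntries()); // [first, second]

        queue.speak("urgent", QUEUE_FLUSH);
        System.out.println(queue.pendingEntries()); // [urgent]
    }
}
```

So QUEUE_ADD suits messages that should all be heard in order, while QUEUE_FLUSH suits a message that should replace whatever is still waiting.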

Speech To Text

For Speech To Text, Android provides a built-in class named RecognizerIntent.

Using an implicit intent with the RecognizerIntent.ACTION_RECOGNIZE_SPEECH constant, we can implement Speech To Text.

It starts an Activity that asks the user to speak and sends the audio through a speech recognizer.

So we have to override onActivityResult to receive the result.

Here is code you can call from anywhere you want speech input, which makes data entry easier for the user.

private void startSpeechToText() {
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT,
            getString(R.string.speech_prompt));
    try {
        startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
    } catch (ActivityNotFoundException a) {
        Toast.makeText(getApplicationContext(),
                getString(R.string.speech_not_supported),
                Toast.LENGTH_SHORT).show();
    }
}

Here

 intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
             RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);

is the required option that informs the recognizer which speech model to prefer when performing recognition.

We can pass one of the two available models:

  1. LANGUAGE_MODEL_FREE_FORM : Use a language model based on free-form speech recognition.
  2. LANGUAGE_MODEL_WEB_SEARCH : Use a language model based on web search terms.
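One more note on the intent extras: EXTRA_LANGUAGE is documented as taking an IETF BCP 47 language tag string such as "en-US", so if you want to request a specific language, a Locale has to be converted to that form first. The plain-Java sketch below shows the difference between the tag and Locale.toString(); the class LanguageTagDemo and the method tagFor are hypothetical names, and on Android Locale.toLanguageTag() needs API level 21 or higher.

```java
import java.util.Locale;

/** Hypothetical demo of turning a Locale into a BCP 47 language tag. */
class LanguageTagDemo {

    /** Returns the BCP 47 tag for the given locale, for example "en-US". */
    static String tagFor(Locale locale) {
        return locale.toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(tagFor(Locale.US));     // en-US
        System.out.println(Locale.US.toString());  // en_US (underscore, not a BCP 47 tag)
        System.out.println(tagFor(Locale.FRANCE)); // fr-FR
    }
}
```

If the extra is left out altogether, the recognizer should fall back to the device's default language.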

Next we have to implement the onActivityResult() method as below to receive the text spoken by the user:

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    switch (requestCode) {
        case REQ_CODE_SPEECH_INPUT: {
            if (resultCode == RESULT_OK && null != data) {
                // Typed list, so get(0) is a String that setText() accepts.
                ArrayList<String> result = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                if (result != null && !result.isEmpty()) {
                    binding.txtSpeechInput.setText(result.get(0));
                }
            }
            break;
        }
    }
}

Here you can see we receive an ArrayList<String> holding the recognizer's candidate transcriptions of what the user said; we only use the entry at index 0.
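Because the list can in principle be missing or empty, it can help to pick the text through a small helper instead of calling get(0) directly. The following is a minimal plain-Java sketch; the class RecognitionResults and the method bestMatch are hypothetical names, not part of the Android API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for choosing the text to show from the recognizer's results. */
class RecognitionResults {

    /**
     * Returns the first non-blank candidate (trimmed), or the fallback when the
     * list is null, empty, or contains only blank entries.
     */
    static String bestMatch(List<String> candidates, String fallback) {
        if (candidates == null) {
            return fallback;
        }
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        List<String> heard = new ArrayList<>(Arrays.asList("hello world", "hollow word"));
        System.out.println(bestMatch(heard, ""));         // hello world
        System.out.println(bestMatch(null, "(nothing)")); // (nothing)
    }
}
```

In onActivityResult you could then pass the list from EXTRA_RESULTS to bestMatch(...) and set the returned text on the TextView.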

Thank you for reading. Please leave a comment and share with your friends if you liked it.
