Zego + Mini Game ASR Integration (Android)

SudMGP provides interactive mini-games such as "Pictionary", "Guess the Word", and "Number Bomb" that support voice interaction, which makes them more playable and more social. This document outlines the steps to integrate the ASR (automatic speech recognition) feature of the SudMGP SDK.

I. Background

Mini-games can support voice interaction. To enable it, the App obtains PCM data in a specific format from Zego RTC and passes it to the SudMGP SDK in a specified way. This document uses hello-sud-plus-android as the example; its source code is at: https://github.com/SudTechnology/hello-sud-plus-android

hello-sud-plus-android wraps the SudMGP SDK integration in SudMGPWrapper. We recommend that clients integrate the SDK through SudMGPWrapper.

II. Integration Steps

SDK GitHub link: https://github.com/SudTechnology/sud-mgp-android

Please use the latest version for Maven integration, for example version V1.3.2.1154.

III. Starting ASR in the Mini Game

When the mini-game enters the ASR scene, it automatically starts the ASR capability. At that point it sends the MG_COMMON_GAME_ASR state to the App with isOpen == true, which the App receives in SudFSMMGListener.onGameMGCommonGameASR.

IV. App Starts Listening to RTC Audio Streams

Once the App receives the MG_COMMON_GAME_ASR state with isOpen == true, it calls the Zego interface to start capturing audio PCM data, which initiates local PCM data collection by Zego. To do so, implement an IZegoAudioDataHandler object, start the observer with ZegoExpressEngine.startAudioDataObserver(int bitmask, ZegoAudioFrameParam param), and set the original audio PCM data callback with ZegoExpressEngine.setAudioDataHandler(IZegoAudioDataHandler handler).

    1. Calling ZegoExpressEngine.startAudioDataObserver and ZegoExpressEngine.setAudioDataHandler:
      @Override
      public void startPCMCapture() {
          ZegoExpressEngine engine = getEngine();
          if (engine != null) {
              /* Enable PCM data capture: mono, 16 kHz, locally captured audio only */
              ZegoAudioFrameParam param = new ZegoAudioFrameParam();
              int bitmask = 0;
              param.channel = ZegoAudioChannel.MONO;
              param.sampleRate = ZegoAudioSampleRate.ZEGO_AUDIO_SAMPLE_RATE_16K;
              bitmask |= ZegoAudioDataCallbackBitMask.CAPTURED.value();
              engine.startAudioDataObserver(bitmask, param);
              /* Set the original audio data callback */
              engine.setAudioDataHandler(zegoAudioDataHandler);
          }
      }
      
      startAudioDataObserver() sets the PCM data format. The audio slices passed to pushAudio are the PCM data obtained from RTC. The PCM data format must be: sample rate 16000, sample bit depth 16, number of channels MONO. The length of the PCM data slices can be tuned for effect: longer slices give better accuracy but more delay, while shorter slices reduce delay at the cost of accuracy. Zego's audio slices default to 10 ms, but the length can be adjusted for better results when passed to pushAudio.
    2. Implementing the IZegoAudioDataHandler interface object:
      private final IZegoAudioDataHandler zegoAudioDataHandler = new IZegoAudioDataHandler() {
          @Override
          public void onCapturedAudioData(ByteBuffer data, int dataLength, ZegoAudioFrameParam param) {
              super.onCapturedAudioData(data, dataLength, param);
              ISudAudioEventListener listener = mISudAudioEventListener;
              if (listener != null) {
                  /* Wrap the captured slice and hand it to the registered listener */
                  AudioPCMData audioPCMData = new AudioPCMData();
                  audioPCMData.data = data;
                  audioPCMData.dataLength = dataLength;
                  listener.onCapturedPCMData(audioPCMData);
              }
          }
      };
      
      The onCapturedAudioData() callback method returns the local PCM data captured by RTC, with further processing explained in the next section.
    3. Passing the RTC-captured PCM data to the SDK: the onCapturedAudioData() callback method returns local PCM data slices, and the following method passes the PCM data to the SDK:
      // Audio stream data
      public void onCapturedAudioData(AudioPCMData audioPCMData) {
          sudFSTAPPDecorator.pushAudio(audioPCMData.data, audioPCMData.dataLength);
      }
      
      The pushAudio interface can be called from a worker thread.
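The slice-length trade-off described above can be sketched as a small buffer that merges the 10 ms slices into longer chunks before they are forwarded. This is a minimal sketch under stated assumptions, not part of the SudMGP or Zego SDKs: PcmSliceAggregator, bytesFor, and the 100 ms chunk size used below are hypothetical names and values chosen for illustration. At 16000 Hz, 16-bit, mono, one 10 ms slice is 16000 × 2 × 1 × 10 / 1000 = 320 bytes.

```java
import java.nio.ByteBuffer;
import java.util.function.BiConsumer;

/** Hypothetical helper: merges short PCM slices into fixed-size chunks. */
final class PcmSliceAggregator {
    static final int SAMPLE_RATE_HZ = 16000; // format required by this document
    static final int BYTES_PER_SAMPLE = 2;   // 16-bit samples
    static final int CHANNELS = 1;           // mono

    private final ByteBuffer pending;                     // bytes collected so far
    private final BiConsumer<ByteBuffer, Integer> sink;   // receives (chunk, length)

    PcmSliceAggregator(int chunkMillis, BiConsumer<ByteBuffer, Integer> sink) {
        this.pending = ByteBuffer.allocate(bytesFor(chunkMillis));
        this.sink = sink;
    }

    /** Number of PCM bytes that represent the given duration. */
    static int bytesFor(int millis) {
        return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * millis / 1000;
    }

    /** Accepts one slice; emits a chunk to the sink each time the buffer fills. */
    void onSlice(ByteBuffer data, int dataLength) {
        // Work on a duplicate so the caller's position and limit stay untouched.
        ByteBuffer src = data.duplicate();
        src.clear();
        src.limit(dataLength);
        while (src.hasRemaining()) {
            int n = Math.min(src.remaining(), pending.remaining());
            ByteBuffer part = src.duplicate();
            part.limit(part.position() + n);
            pending.put(part);
            src.position(src.position() + n);
            if (!pending.hasRemaining()) {
                pending.flip();
                // Copy into a fresh buffer so the sink may keep it after we reuse 'pending'.
                ByteBuffer chunk = ByteBuffer.allocate(pending.remaining());
                chunk.put(pending);
                chunk.flip();
                sink.accept(chunk, chunk.remaining());
                pending.clear();
            }
        }
    }
}
```

In the sample app, such a helper could sit between the capture callback and the SDK, for example new PcmSliceAggregator(100, (buf, len) -> sudFSTAPPDecorator.pushAudio(buf, len)), which would forward 3200-byte chunks. The class is not thread-safe, so it should be fed from a single thread, and the best chunk length has to be found by testing against the accuracy and delay the game actually needs.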

V. App Stops Listening to RTC Audio Streams

When the mini-game exits the ASR scene due to a hit or timeout, it will send a state notification to the App to stop capturing PCM data. Once the App receives the MG_COMMON_GAME_ASR state with isOpen == false, call the Zego interface ZegoExpressEngine.setAudioDataHandler(null) and ZegoExpressEngine.stopAudioDataObserver() to stop local PCM data collection by Zego.

@Override
public void stopPCMCapture() {
    ZegoExpressEngine engine = getEngine();
    if (engine != null) {
        /* Set the audio data callback to null */
        engine.setAudioDataHandler(null);
        /* Stop local PCM data collection */
        engine.stopAudioDataObserver();
    }
}

VI. Playing Games with ASR Only

When playing games with ASR only, the App only needs to handle the MG_COMMON_GAME_ASR state to enable/disable local PCM data collection. There is no need to send mg_common_key_word_to_hit to the game as in text-based hits.
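The enable/disable handling can be sketched as a small switch driven by the isOpen flag. This is a minimal sketch, assuming the App exposes its capture logic through an interface: PcmCapture and AsrCaptureSwitch are hypothetical names introduced here, while startPCMCapture() and stopPCMCapture() mirror the methods shown earlier in this document.

```java
/** Hypothetical abstraction over the startPCMCapture()/stopPCMCapture() methods. */
interface PcmCapture {
    void startPCMCapture();
    void stopPCMCapture();
}

/** Hypothetical helper: maps the isOpen flag of MG_COMMON_GAME_ASR to capture start/stop. */
final class AsrCaptureSwitch {
    private final PcmCapture capture;
    private boolean capturing;

    AsrCaptureSwitch(PcmCapture capture) {
        this.capture = capture;
    }

    /** Call with the isOpen value of each state notification; repeated states are ignored. */
    synchronized void onAsrState(boolean isOpen) {
        if (isOpen == capturing) {
            return; // already in the requested state
        }
        capturing = isOpen;
        if (isOpen) {
            capture.startPCMCapture();
        } else {
            capture.stopPCMCapture();
        }
    }

    synchronized boolean isCapturing() {
        return capturing;
    }
}
```

The App would call onAsrState(...) from its onGameMGCommonGameASR callback with the isOpen value it receives. Ignoring repeated states is a design choice of this sketch, intended to avoid starting the Zego observer twice if the same state arrives more than once.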

VII. Text-Based Hits for Mini Games with ASR

Mini-games with ASR scenes usually allow text input for hits alongside voice interaction. The game will notify the App of the hit scene using the mg_common_key_word_to_hit state. The App will receive this through the callback interface:

SudFSMMGListener.onGameMGCommonKeyWordToHit(ISudFSMStateHandle handle, SudMGPMGState.MGCommonKeyWordToHit model)

Text-based hit scenes in mini-games can be categorized into two types:

  1. Games where the App holds the keyword, like "Pictionary" and "Guess the Word". If model.word is not empty, the App needs to locally compare and determine the hit. After determining the hit, the App notifies the game through the sudFSTAPPDecorator.notifyAPPCommonSelfTextHitState method.
  2. Games where the App does not hold the keyword, like "Number Bomb". If model.word is empty, the App needs to send the text to the game each time for the game to determine the hit.
    public void sendMsgCompleted(String msg) {
        if (msg == null || msg.isEmpty()) {
            return;
        }
        // Number Bomb: the App holds no keyword, so forward numeric text for the game to judge
        if (sudFSMMGDecorator.isHitBomb() && HSTextUtils.isInteger(msg)) {
            sudFSTAPPDecorator.notifyAPPCommonSelfTextHitState(false, null, msg, null, null, null);
            return;
        }
        String keyword = gameKeywordLiveData.getValue();
        if (keyword == null || keyword.isEmpty()) {
            return;
        }
        // Pictionary: check if the keyword is hit. Here, we use a contains check. Implement based on specific business needs.
        if (msg.contains(keyword)) {
            sudFSTAPPDecorator.notifyAPPCommonSelfTextHitState(true, keyword, msg, null, null, null);
            gameKeywordLiveData.setValue(null);
        }
    }
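The two hit types above can also be expressed as a pure decision function, which makes the branching easy to unit test. This is a sketch under stated assumptions, not the sample app's implementation: TextHitRouter, Action, and route are hypothetical names, the word argument stands for model.word, a hit scene is assumed to be active when route is called, and the trimmed, case-insensitive comparison is just one possible alternative to the plain contains check.

```java
import java.util.Locale;

/** Hypothetical helper: decides what to do with a chat message during a text-hit scene. */
final class TextHitRouter {
    enum Action { IGNORE, FORWARD_TO_GAME, REPORT_HIT }

    /**
     * @param word keyword held by the App (model.word); null or empty if the game holds it
     * @param msg  text the user just sent
     */
    static Action route(String word, String msg) {
        if (msg == null || msg.trim().isEmpty()) {
            return Action.IGNORE;
        }
        if (word == null || word.isEmpty()) {
            // Type 2: the game decides, so every message is forwarded.
            return Action.FORWARD_TO_GAME;
        }
        // Type 1: the App compares locally (trimmed, case-insensitive contains).
        return normalize(msg).contains(normalize(word)) ? Action.REPORT_HIT : Action.IGNORE;
    }

    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }
}
```

For REPORT_HIT and FORWARD_TO_GAME the App would then call sudFSTAPPDecorator.notifyAPPCommonSelfTextHitState with isHit set to true or false respectively, as in the sample above. How strict the match should be (exact, contains, or normalized) depends on the business rules of the game.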
    
