Zego + Mini Game ASR Integration (iOS)

SudMGP provides interactive mini-games such as "Pictionary", "Guess the Word", and "Number Bomb" that support voice interaction, which makes them more playable and more social. This document outlines the steps for integrating the ASR feature of the SudMGP SDK.

I. Background

Mini-games can support voice interaction. To enable it, the App obtains PCM data in a specific format from Zego RTC and passes it to the SudMGP SDK in a specified way. This document uses hello-sud-plus-ios as an example; the source code is available at: https://github.com/SudTechnology/hello-sud-plus-ios

hello-sud-plus-ios wraps the SudMGP SDK integration in SudMGPWrapper. We recommend that clients integrate the SDK through SudMGPWrapper.

II. Integration Steps

SDK GitHub link: https://github.com/SudTechnology/sud-mgp-ios

Please use the latest version when integrating with CocoaPods. For example, for version V1.3.4.1290:

    1. Integrate the SudMGP SDK.

      Standard version:
      pod 'SudMGPWrapper', '~> 1.3.4.1'

      Lite version:
      pod 'SudMGPWrapper_Lite', '~> 1.3.4.1'

    2. Integrate the speech recognition library:
      pod 'MicrosoftCognitiveServicesSpeech-iOS', '1.23.0'

III. Starting ASR in the Mini Game

When the mini-game enters the ASR scene, it automatically starts its ASR capability and sends the MG_COMMON_GAME_ASR state to the App with isOpen set to true. The App receives this state through the following SudFSMMGListener protocol method:
- (void)onGameMGCommonGameASR:(nonnull id <ISudFSMStateHandle>)handle model:(MGCommonGameASRModel *)model;

Event callback

IV. App Starts Listening to RTC Audio Streams

Once the App receives the MG_COMMON_GAME_ASR state with isOpen == YES, it calls the Zego interface to start capturing audio PCM data, which starts Zego's local PCM data collection. To do this, implement the ZegoAudioDataHandler protocol and register the original audio PCM data callback using ZegoExpressEngine.startAudioDataObserver:(ZegoAudioDataCallbackBitMask)observerBitMask param:(ZegoAudioFrameParam *)param and ZegoExpressEngine.setAudioDataHandler:(nullable id)handler.

  1. Calling ZegoExpressEngine.startAudioDataObserver/ZegoExpressEngine.setAudioDataHandler:

```objc
/// Start capturing original audio
- (void)startPCMCapture {
    ZegoExpressEngine *engine = [ZegoExpressEngine sharedEngine];
    if (engine != nil) {
        // Enable PCM data capture
        ZegoAudioFrameParam *param = [[ZegoAudioFrameParam alloc] init];
        param.channel = ZegoAudioChannelMono;
        param.sampleRate = ZegoAudioSampleRate16K;
        ZegoAudioDataCallbackBitMask bitmask = ZegoAudioDataCallbackBitMaskCaptured;
        [engine startAudioDataObserver:bitmask param:param];
        // Set the original audio data callback
        [engine setAudioDataHandler:self];
    }
}
```

startAudioDataObserver is used to set the PCM data format. The audio slices passed to pushAudio are PCM data obtained from RTC, and the format must be: sample rate 16000, sample bit depth 16, channels MONO. The slice length can be tuned for effect: longer slices give better accuracy but more delay, while shorter slices reduce delay at the cost of accuracy. Zego's audio slices default to 10ms, but the length can be adjusted for better results when passed to pushAudio.

  2. Implementing the ZegoAudioDataHandler protocol:

```objc
- (void)onCapturedAudioData:(const unsigned char *)data dataLength:(unsigned int)dataLength param:(ZegoAudioFrameParam *)param {
    // Locally captured audio data; this callback fires after streaming starts
    if (self.mISudAudioEventListener != nil && [self.mISudAudioEventListener respondsToSelector:@selector(onCapturedPCMData:)]) {
        NSData *pcmData = [[NSData alloc] initWithBytes:data length:dataLength];
        [self.mISudAudioEventListener onCapturedPCMData:pcmData];
    }
}
```

The onCapturedAudioData callback returns the local PCM data captured by RTC; further processing is explained in the next step.

  3. Passing the RTC-Captured PCM Data to the SDK

The onCapturedAudioData callback method returns local PCM data slices, and the following method is used to pass the PCM data to the SDK:

/**
 * Audio stream data
 */
- (void)onCapturedPCMData:(NSData *)data {
    [self.sudFSTAPPDecorator pushAudio:data];
}

The pushAudio interface can be called from a worker thread.
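The slice-length trade-off can be made concrete with a little arithmetic: at 16000 Hz, 16-bit, mono, one second of PCM is 32000 bytes, so a default 10ms slice is 320 bytes. The following is a minimal sketch in plain C (which also compiles inside an Objective-C source file) of one way to merge small slices into larger chunks before handing them to pushAudio. The names pcm_bytes_for_ms, PcmAccumulator, pcm_accumulator_init, pcm_accumulator_push, and PcmChunkSink are hypothetical and are not part of the Zego or SudMGP SDKs; a suitable chunk length still has to be found by experiment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* PCM format required for pushAudio in this document: 16 kHz, 16-bit, mono. */
enum { PCM_SAMPLE_RATE = 16000, PCM_BYTES_PER_SAMPLE = 2, PCM_CHANNELS = 1 };

/* Hypothetical upper bound for one chunk: one second of audio (32000 bytes). */
enum { PCM_MAX_CHUNK_BYTES = PCM_SAMPLE_RATE * PCM_BYTES_PER_SAMPLE * PCM_CHANNELS };

/* Number of PCM bytes that represent `ms` milliseconds of audio. */
static size_t pcm_bytes_for_ms(unsigned int ms) {
    return (size_t)PCM_SAMPLE_RATE * PCM_BYTES_PER_SAMPLE * PCM_CHANNELS * ms / 1000;
}

/* Collects small slices until a full chunk of `chunk_bytes` is available. */
typedef struct {
    unsigned char buf[PCM_MAX_CHUNK_BYTES];
    size_t used;        /* bytes currently buffered */
    size_t chunk_bytes; /* size of each emitted chunk */
} PcmAccumulator;

/* Receives each completed chunk; the pointer is only valid during the call. */
typedef void (*PcmChunkSink)(const unsigned char *chunk, size_t len, void *ctx);

/* Sets the chunk length; out-of-range values are clamped to the maximum. */
static void pcm_accumulator_init(PcmAccumulator *acc, unsigned int chunk_ms) {
    size_t n = pcm_bytes_for_ms(chunk_ms);
    if (n == 0 || n > PCM_MAX_CHUNK_BYTES) {
        n = PCM_MAX_CHUNK_BYTES;
    }
    acc->used = 0;
    acc->chunk_bytes = n;
}

/* Appends one slice and calls `sink` (if not NULL) once per completed chunk.
 * Returns the number of chunks completed by this call. */
static int pcm_accumulator_push(PcmAccumulator *acc, const unsigned char *data,
                                size_t len, PcmChunkSink sink, void *ctx) {
    int emitted = 0;
    assert(acc != NULL && acc->chunk_bytes > 0);
    while (len > 0) {
        size_t room = acc->chunk_bytes - acc->used;
        size_t n = len < room ? len : room;
        memcpy(acc->buf + acc->used, data, n);
        acc->used += n;
        data += n;
        len -= n;
        if (acc->used == acc->chunk_bytes) {
            if (sink != NULL) {
                sink(acc->buf, acc->used, ctx);
            }
            acc->used = 0;
            emitted++;
        }
    }
    return emitted;
}
```

In onCapturedPCMData:, the bytes of each NSData slice could be fed to pcm_accumulator_push, with a sink that wraps each completed chunk in an NSData and calls pushAudio. The sink must copy the bytes it is given, because the internal buffer is reused for the next chunk.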

V. App Stops Listening to RTC Audio Streams

When the mini-game exits the ASR scene due to a hit or timeout, it will send a state notification to the App to stop capturing PCM data. Once the App receives the MG_COMMON_GAME_ASR state with isOpen == NO, call the Zego interface [ZegoExpressEngine setAudioDataHandler:nil] and [ZegoExpressEngine stopAudioDataObserver] to stop local PCM data collection by Zego.

/// Stop capturing original audio
- (void)stopPCMCapture {
    ZegoExpressEngine *engine = [ZegoExpressEngine sharedEngine];
    if (engine != nil) {
        /* Set the audio data callback to nil */
        [engine setAudioDataHandler:nil];
        [engine stopAudioDataObserver];
    }
}
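
The start and stop steps form a simple on/off protocol driven by isOpen. As a minimal sketch in plain C (usable from Objective-C), the decision can be kept in one small function that also ignores a repeated state, so that capture is never started or stopped twice. Whether the game can actually repeat a state is an assumption here, not something this document states, and the names AsrCaptureState, AsrCaptureAction, and asr_capture_on_state are hypothetical.

```c
#include <stdbool.h>

/* What the App should do in response to an MG_COMMON_GAME_ASR state. */
typedef enum {
    ASR_CAPTURE_NOOP,  /* state unchanged, nothing to do */
    ASR_CAPTURE_START, /* begin PCM capture */
    ASR_CAPTURE_STOP   /* end PCM capture */
} AsrCaptureAction;

/* Remembers whether PCM capture is currently running. */
typedef struct {
    bool capturing;
} AsrCaptureState;

/* Maps the isOpen flag to an action and updates the remembered state. */
static AsrCaptureAction asr_capture_on_state(AsrCaptureState *state, bool is_open) {
    if (is_open == state->capturing) {
        return ASR_CAPTURE_NOOP; /* repeated notification (assumed possible) */
    }
    state->capturing = is_open;
    return is_open ? ASR_CAPTURE_START : ASR_CAPTURE_STOP;
}
```

In the onGameMGCommonGameASR callback, the App could pass model.isOpen to this function and then call startPCMCapture on ASR_CAPTURE_START or stopPCMCapture on ASR_CAPTURE_STOP.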

VI. Playing Games with ASR Only

When playing games with ASR only, the App only needs to handle the MG_COMMON_GAME_ASR state to enable/disable local PCM data collection. There is no need to send mg_common_key_word_to_hit to the game as in text-based hits.

VII. Playing Mini Games with Text-Based Hits

In mini-games that use voice recognition, text input can usually be used for hits at the same time. The game informs the App that a hit scene is active by sending the mg_common_key_word_to_hit state, and the App receives this notification through the SudFSMMGListener protocol:

- (void)onGameMGCommonKeyWordToHit:(nonnull id <ISudFSMStateHandle>)handle model:(MGCommonKeyWrodToHitModel *)model;

Text-based hit scenarios in mini-games are divided into two categories:

    1. For games where the App holds the keyword, such as "Pictionary" and "Guess the Word", model.word is not empty, and the App must determine locally whether the hit succeeds. After determining the hit, the App notifies the game by calling [SudFSTAPPDecorator notifyAppComonDrawTextHit].
    2. For games where the App does not hold the keyword, such as "Number Bomb", model.word is empty, and the App must send each text message to the game so that the game can determine whether there is a hit.

```objc
- (void)handleGameKeywordHitting:(NSString *)content {
    // Number Bomb
    if (self.sudFSMMGDecorator.isHitBomb) {
        if ([self isPureInt:content]) {
            /// Forward the number; the game decides whether it is a hit
            [self.sudFSTAPPDecorator notifyAppComonDrawTextHit:false keyWord:@"" text:content];
        }
        return;
    }
    // Pictionary
    if (self.sudFSMMGDecorator.keyWordHiting == YES && [content isEqualToString:self.sudFSMMGDecorator.drawKeyWord]) {
        /// Keyword hit
        [self.sudFSTAPPDecorator notifyAppComonDrawTextHit:true keyWord:self.sudFSMMGDecorator.drawKeyWord text:self.sudFSMMGDecorator.drawKeyWord];
    }
}
```
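
The two categories above amount to a small decision rule. The sketch below expresses it in plain C, independent of the SDK, so the branching can be checked in isolation. The names HitAction, is_pure_int, and decide_text_hit are hypothetical. In particular, is_pure_int is only an assumed stand-in for the isPureInt: helper used in the code above, whose implementation this document does not show; here it accepts one or more decimal digits and nothing else.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* What the App should do with one piece of text input. */
typedef enum {
    HIT_ACTION_NONE,         /* nothing to send */
    HIT_ACTION_FORWARD_TEXT, /* send the text; the game decides (Number Bomb) */
    HIT_ACTION_KEYWORD_HIT   /* the App matched the keyword locally (Pictionary) */
} HitAction;

/* Assumption: "pure int" means one or more decimal digits, no sign or spaces. */
static bool is_pure_int(const char *s) {
    if (s == NULL || *s == '\0') {
        return false;
    }
    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s)) {
            return false;
        }
    }
    return true;
}

/* Mirrors the branching of handleGameKeywordHitting:
 * - Number Bomb: the App holds no keyword, so numeric text is forwarded.
 * - Keyword games: the App compares the text with the keyword it holds. */
static HitAction decide_text_hit(bool is_number_bomb, bool keyword_hitting,
                                 const char *keyword, const char *content) {
    if (content == NULL) {
        return HIT_ACTION_NONE;
    }
    if (is_number_bomb) {
        return is_pure_int(content) ? HIT_ACTION_FORWARD_TEXT : HIT_ACTION_NONE;
    }
    if (keyword_hitting && keyword != NULL && keyword[0] != '\0' &&
        strcmp(content, keyword) == 0) {
        return HIT_ACTION_KEYWORD_HIT;
    }
    return HIT_ACTION_NONE;
}
```

HIT_ACTION_FORWARD_TEXT corresponds to calling notifyAppComonDrawTextHit with false and an empty keyword, and HIT_ACTION_KEYWORD_HIT to calling it with true and the matched keyword.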
