Departing from HARK

We initially settled on HARK because it appeared to be the most complete and up-to-date software package that offered ROS modules and performed all the functions we needed. After several weeks of developing with the HARK modules, we decided to move to another solution for the reasons below.

Lack of supporting documentation
- While there was plenty of documentation on how to build the basic features of a HARK network, there was not enough on how to tune the system for adequate performance. HARKtool4, HARK's calibration tool, was especially difficult to use when trying to produce adequate transfer function and impulse response list files.
Voice recognition is a separate package
- HARK on its own performs only audio processing and outsources speech recognition to a separate program called Julius. Julius does not have a large English language model, which limited its usefulness to us.
Built for stationary system
- Calibrating HARK involves using HARKtool, either by recording audio files at points around the circumference of the microphone array or by calculating the calibration with some built-in tools. This calibration gives HARK a model of the background noise and room reverberation, on which localization and sound separation depend. Given this, HARK does not seem suitable for a dynamic environment, or for one in which our microphone array moves through areas with inconsistent acoustic properties.
Other solutions were more robust
- In looking for a replacement for Julius, we decided to test the Google speech API (GSA). We first tested it with HARK as a front end for sound localization and source separation, sending HARK's output to the GSA. The results were very inconsistent, whereas speech recognition with the GSA alone was very robust. Because HARK's primary functions were not performing as we had hoped, we opted to drop it. We decided that giving up sound localization in exchange for a functioning, resilient speech recognition system was an acceptable trade-off, and we have adopted the GSA as our sole audio processing system.
