Findy Phonie: An Android Phone Finder

We’ve been on an investigative spree lately, and one of the products of our research is a little experiment in signal processing on Android. This Android app runs in the background and responds to the sound of a whistle: should you lose your phone, you can simply whistle and have the phone make a noise to help you find it. Detecting a whistle is a fairly involved task, and developing an app to do so yielded the following insights.

Sampling

To start working with sounds, we used a discrete Fourier transform to convert the signal from the time domain into the frequency domain. We used the JTransforms library for this.

Sampling and converting audio is easy:

    short[] audioData = new short[mFFTSize];
    float[] fftData = new float[mFFTSize];

    mAudioRecord.startRecording();
    if (mAudioRecord.getRecordingState() != AudioRecord.RECORDSTATE_RECORDING) {
        Log.e(TAG, "Audio session couldn't be opened");
    }

    //Get required amount of audio samples
    mAudioRecord.read(audioData, 0, mFFTSize);

    //Convert shorts into floats needed by the library
    for (int i = 0; i < mFFTSize; ++i) {
        //Some examples converted audio to -1,1 range but it's not required
        //fftData[i] = audioData[i] / 32767.0f; //Short to -1,1
        fftData[i] = audioData[i];
    }

    //Do the FFT. Operation is done inplace with this library.
    mFFT.realForward(fftData);

Now the fftData array contains interleaved real and imaginary pairs, one pair per frequency bucket. Because the input was real numbers only, the spectrum above the Nyquist frequency is a mirror of the part below it, so only the lower buckets are needed.

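    //Loop over the first half of the array (mFFTSize / 4 pairs) and add
    //|R| + |I| for each pair to get an approximate magnitude per FFT bucket.
    //Only relative loudness matters here, so don't bother with square roots
    //and accurate magnitudes. Writing to index i in place is safe because
    //the pair at 2i and 2i + 1 has already been read by then.
    for (int i = 0; i < mFFTSize / 4; ++i) {
        int i2 = i * 2;
        fftData[i] = Math.abs(fftData[i2]) + Math.abs(fftData[i2 + 1]);
    }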
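The same step can also be written as a small standalone helper that leaves the FFT output untouched. This is only a sketch: the class and method names are hypothetical, and it assumes the packed layout that JTransforms documents for realForward on an even-length input, where element 2k holds the real part of bucket k, element 2k + 1 holds its imaginary part, and element 1 holds the real part of the Nyquist bucket rather than an imaginary part for bucket 0.

```java
// Hypothetical helper, not part of the app: approximate per-bucket magnitudes.
final class BucketMagnitudes {
    private BucketMagnitudes() {}

    /**
     * Returns |Re| + |Im| for the first bucketCount buckets of a packed
     * real-FFT output (assumed layout: packed[2k] = Re[k], packed[2k+1] = Im[k]).
     */
    static float[] approximate(float[] packed, int bucketCount) {
        if (bucketCount < 0 || bucketCount > packed.length / 2) {
            throw new IllegalArgumentException("bucketCount out of range: " + bucketCount);
        }
        float[] magnitudes = new float[bucketCount];
        for (int k = 0; k < bucketCount; ++k) {
            float re = packed[2 * k];
            // Assumption: packed[1] is the Nyquist term, so the DC bucket
            // (k == 0) is treated as having no imaginary part.
            float im = (k == 0) ? 0f : packed[2 * k + 1];
            magnitudes[k] = Math.abs(re) + Math.abs(im);
        }
        return magnitudes;
    }
}
```

Calling BucketMagnitudes.approximate(fftData, mFFTSize / 4) would cover the same buckets as the loop above. The |R| + |I| sum overestimates the true magnitude by at most a factor of √2, which is harmless when only relative loudness matters.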

Graphing

We needed to look into the data produced by this sampling mechanism, so we could create an algorithm to detect the whistles. For this, we used GraphView.

This was the output from ambient room noise. It fit what we expected from ambient noise: a large DC component and gradually diminishing high-frequency components. The low end was useless for our purposes, so we cut out everything below 500 Hz. To find the bucket index for a frequency:

    int bucket = (int) (frequency / ((float) mSampleRate / mFFTSize));
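As a worked example of that conversion, here is a minimal standalone sketch. The class and method names are hypothetical, and the 44100 Hz sample rate and 1024-point FFT are assumed figures chosen for illustration, not values taken from the app.

```java
// Hypothetical helper illustrating the frequency-to-bucket conversion.
final class FrequencyBuckets {
    private FrequencyBuckets() {}

    /** Width of one FFT bucket in Hz. */
    static float bucketWidthHz(int sampleRate, int fftSize) {
        return (float) sampleRate / fftSize;
    }

    /** Index of the bucket that contains the given frequency (truncated). */
    static int bucketIndex(float frequencyHz, int sampleRate, int fftSize) {
        return (int) (frequencyHz / bucketWidthHz(sampleRate, fftSize));
    }
}
```

With the assumed figures, each bucket is 44100 / 1024 ≈ 43.07 Hz wide, so FrequencyBuckets.bucketIndex(500f, 44100, 1024) returns 11 and everything below bucket 11 would be discarded.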

This was the output from a whistle: an obvious narrow spike, many times louder than the other frequencies. Below was the output from general noise:

Analysis

To detect a whistle programmatically, the plan was to find a spike confined to a certain frequency (500 Hz) that was a number of times louder than the amplitude at any other frequency. This approach mostly worked, but could be triggered by other sounds in the 500 Hz range. To make the test more specific, we needed to check that the tip of the spike was very narrow. We did this by fitting a triangle of two lines around the spike: if any buckets crossed these lines, there was a good chance that the spike was just noise rather than a whistle. Below is a test fitting the parameters. The main frequency test is the blue box, and the triangle test is in yellow:
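In code, the two spectral tests described above might look something like the following sketch. It is not the app's implementation: the class, the method names, and the halfWidth and ratio parameters are all hypothetical, and the triangle is assumed to fall linearly from the peak's height to zero at halfWidth buckets on either side.

```java
// Hypothetical sketch of the loudness test and the triangle test.
final class WhistleSpikeTest {
    private WhistleSpikeTest() {}

    /** Returns the index of the loudest bucket in [from, to). */
    static int loudestBucket(float[] mags, int from, int to) {
        int best = from;
        for (int i = from + 1; i < to; ++i) {
            if (mags[i] > mags[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Loudness test: the peak must be at least ratio times louder than every
     * bucket in [from, to) lying outside the triangle's base.
     */
    static boolean dominates(float[] mags, int from, int to,
                             int peak, int halfWidth, float ratio) {
        for (int i = from; i < to; ++i) {
            if (Math.abs(i - peak) < halfWidth) {
                continue; // inside the triangle, checked separately
            }
            if (mags[i] * ratio > mags[peak]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Triangle test: buckets next to the peak must stay under two straight
     * lines that drop from the peak to zero over halfWidth buckets.
     */
    static boolean fitsTriangle(float[] mags, int peak, int halfWidth) {
        for (int d = 1; d < halfWidth; ++d) {
            float limit = mags[peak] * (1f - (float) d / halfWidth);
            int left = peak - d;
            int right = peak + d;
            if (left >= 0 && mags[left] > limit) {
                return false;
            }
            if (right < mags.length && mags[right] > limit) {
                return false;
            }
        }
        return true;
    }
}
```

A candidate would pass only if both dominates and fitsTriangle return true for the bucket found by loudestBucket. Suitable values for halfWidth and ratio would have to be tuned against real recordings.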

To make the test even more specific, and preclude more false positives, the number of zero crossings had to be counted. Very noisy sounds can produce narrow spikes, but also produce many more zero crossings than a tone. We looked for 500 to 2800 crossings per second to confirm the whistle:

    //Get num zero crossings per second
    private int countZeroCrossings(short[] audioData) {
        boolean lastSign = audioData[0] > 0;
        int crossings = 0;

        for(int i = 1; i < audioData.length; ++i) {
            if(lastSign && audioData[i] < 0) {
                crossings++;
                lastSign = false;
            }
            else if(!lastSign && audioData[i] > 0) {
                crossings++;
                lastSign = true;
            }
        }

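        //Scale the count by the buffer's duration in seconds. Use the
        //buffer's own length so the rate stays correct for any buffer size.
        float sampleDuration = (float) audioData.length / mSampleRate;
        return (int) (crossings / sampleDuration);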
    }
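To get a feel for the numbers, the counter can be tried on a synthetic tone. The sketch below is a standalone variant with hypothetical names: it takes the sample rate as a parameter instead of reading mSampleRate, and the 44100 Hz rate, 4096-sample buffer and 1000 Hz tone are assumed test values rather than figures from the app.

```java
// Hypothetical demo: count zero crossings in a generated sine tone.
final class ZeroCrossingDemo {
    private ZeroCrossingDemo() {}

    /** Generates sampleCount samples of a sine tone at the given frequency. */
    static short[] sineTone(double frequencyHz, int sampleRate, int sampleCount) {
        short[] samples = new short[sampleCount];
        for (int i = 0; i < sampleCount; ++i) {
            double phase = 2.0 * Math.PI * frequencyHz * i / sampleRate;
            samples[i] = (short) (Math.sin(phase) * 16000);
        }
        return samples;
    }

    /** Same counting logic as above, with the sample rate passed in. */
    static int crossingsPerSecond(short[] audioData, int sampleRate) {
        boolean lastSign = audioData[0] > 0;
        int crossings = 0;
        for (int i = 1; i < audioData.length; ++i) {
            if (lastSign && audioData[i] < 0) {
                crossings++;
                lastSign = false;
            } else if (!lastSign && audioData[i] > 0) {
                crossings++;
                lastSign = true;
            }
        }
        float duration = (float) audioData.length / sampleRate;
        return (int) (crossings / duration);
    }
}
```

An ideal pure tone of f Hz crosses zero twice per cycle, so the 1000 Hz tone here should report roughly 2000 crossings per second. By the same arithmetic, a window of 500 to 2800 crossings per second corresponds to pure tones of roughly 250 to 1400 Hz.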

Conclusion

Thanks to the techniques above, detecting whistles whilst the phone is on the desk can be done with almost perfect accuracy. However, if the phone is hidden away, the microphone is covered up, or the phone is enclosed in a case, the detection rate drops sharply. This is to be expected: after all, trying to listen to a conversation whilst stuffed in a bag with your fingers in your ears is equally difficult. Likewise, the false positive trigger rate increases substantially if music is playing, or a microwave is microwaving. Our next attempt at noise detection might rely on neural networks, or some other technique to make the process ‘smarter’.