Fluency
Stuttering affects approximately 1% of the global population. Despite ongoing research, the underlying causes of stuttering remain only partially understood.
We are excited to introduce Fluency, an app paired with a connected device designed to help reduce stuttering. Fluency uses Delayed Auditory Feedback (DAF), a proven technique for reducing stuttering, alongside rhythm, music, and sound. Our motivation for this project stems from personal connections to the subject matter, coupled with a deep interest in exploring how different sensory inputs affect user behavior.
CONTRIBUTORS
Mohini Banerjee
Patrick Zhou
MY ROLE
UX Designer
UI Designer
Fluency-Specific Features
1. Customizable Audio Delay and Pitch
- Users can personalize their experience by adjusting the audio delay time and pitch, tailoring the therapy to their specific needs (a minimal DAF sketch follows this list).
2. Music Therapy Integration
- We leverage music therapy, encouraging speech processing through different neural pathways. By transforming users' spoken words into a choral arrangement, the app plays their own speech back to them in a harmonized format.
3. Visual Feedback
- As users speak, they can read the transcribed text in real time on the screen, providing visual cues for fluent speech. Research has shown that reading aloud can reduce stuttering, and having the text available visually can reinforce fluent speech patterns and allow for on-the-spot self-correction.
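As referenced in feature 1, the adjustable delay at the heart of DAF can be implemented as a short duplex audio loop. The sketch below is a minimal illustration assuming the third-party sounddevice library; the sample rate, delay value, and buffering scheme are our assumptions rather than Fluency's production code, and the adjustable pitch shift (typically done with a phase vocoder) is omitted for brevity.

```python
# Minimal sketch of adjustable Delayed Auditory Feedback (DAF).
# Assumes the third-party `sounddevice` library; parameters are illustrative.
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 16_000   # Hz (assumed)
DELAY_MS = 120         # user-adjustable delay; typical DAF settings are ~50-200 ms

delay_samples = int(SAMPLE_RATE * DELAY_MS / 1000)
pending = np.zeros(delay_samples, dtype="float32")  # samples waiting to be played

def callback(indata, outdata, frames, time, status):
    """Emit each microphone sample DELAY_MS after it was captured."""
    global pending
    buf = np.concatenate([pending, indata[:, 0]])
    outdata[:, 0] = buf[:frames]   # play the oldest samples first
    pending = buf[frames:]         # keep the remainder for later callbacks

with sd.Stream(samplerate=SAMPLE_RATE, channels=1, dtype="float32",
               callback=callback):
    input("DAF running - press Enter to stop.\n")
```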
Logic Behind the Interface
We use Python to capture speech and categorize speech patterns, training a machine learning model to identify, in real time, the three stuttering types: repetitions, prolongations, and blocks. The model generates a comprehensive report for users, accessible through the app's dashboard.
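As a rough illustration of how per-chunk model predictions could be rolled up into that dashboard report, here is a minimal sketch; the label names, percentage format, and `build_report` helper are illustrative assumptions, not Fluency's actual schema.

```python
from collections import Counter

# Hypothetical label set mirroring the three stuttering types plus fluent speech.
LABELS = ("fluent", "repetition", "prolongation", "block")

def build_report(predictions):
    """Aggregate per-chunk model predictions into dashboard-style percentages.

    `predictions` is a sequence of label strings, one per classified
    speech chunk (assumed format).
    """
    counts = Counter(predictions)
    total = sum(counts.values()) or 1
    return {label: round(100 * counts[label] / total, 1) for label in LABELS}

# Example: share of chunks per category for one session.
print(build_report(["fluent", "fluent", "block", "repetition", "fluent"]))
```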
Actual spectrograms of speech patterns we collected from users for data processing.
From the above spectrograms, we extract features that capture characteristics of stuttered speech, such as pitch, energy, formant frequencies, and timing information. The categorization of stuttering patterns is split into a signal processing component and a machine learning component that together distinguish stuttered from fluent speech: the signal processing stage retrieves and saves the audio file, applies a Hamming window to 20 ms chunks, and takes the Fourier transform of each chunk (a minimal sketch of this stage follows).
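The sketch below illustrates that signal-processing stage, assuming a saved mono WAV file and a 10 ms hop; the file name and hop size are our assumptions, while the 20 ms Hamming-windowed frames and FFT follow the description above.

```python
import numpy as np
from scipy.io import wavfile

rate, audio = wavfile.read("recording.wav")   # the retrieved audio file (name assumed)
if audio.ndim > 1:
    audio = audio[:, 0]                       # keep one channel
audio = audio.astype(np.float32)
audio /= max(np.abs(audio).max(), 1e-9)       # normalize to [-1, 1]

frame_len = int(0.020 * rate)                 # 20 ms chunks
hop = int(0.010 * rate)                       # 10 ms hop (assumed)
window = np.hamming(frame_len)

frames = np.stack([audio[i:i + frame_len] * window
                   for i in range(0, len(audio) - frame_len + 1, hop)])
spectra = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrogram
print(spectra.shape)                           # (num_frames, frame_len // 2 + 1)
```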
Diagram of the neural network we trained to classify speech.
We then apply the Mel filter bank coefficient approach to reduce dimensionality, representing each frame by a compact set of frequency-band components. Using the extracted features, we train a machine learning model to classify segments of speech (see the sketch below).
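Concretely, the feature-extraction and training step might look like the following sketch, assuming the third-party librosa and scikit-learn libraries; the file names, label encoding, and the small MLP are placeholders standing in for the neural network diagrammed above, which is not reproduced here.

```python
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def mel_features(path, n_mels=40):
    """Mel filter bank energies over 20 ms Hamming-windowed frames,
    averaged into one feature vector per speech segment."""
    y, sr = librosa.load(path, sr=16_000)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=int(0.020 * sr), hop_length=int(0.010 * sr),
        window="hamming", n_mels=n_mels)
    return librosa.power_to_db(mel).mean(axis=1)

# Placeholder training data: paths and labels are illustrative only.
segment_paths = ["seg_0001.wav", "seg_0002.wav", "seg_0003.wav"]
segment_labels = [0, 1, 3]   # 0=fluent, 1=repetition, 2=prolongation, 3=block

X = np.stack([mel_features(p) for p in segment_paths])
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
clf.fit(X, segment_labels)
```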