Fluency
Stuttering affects approximately 1% of the global population. Despite ongoing research, the underlying causes of stuttering remain only partially understood.
We are excited to introduce Fluency, an app paired with a connected device designed to help reduce stuttering. Fluency uses Delayed Auditory Feedback (DAF), a proven technique for reducing stuttering, alongside rhythm, music, and sound. Our motivation for this project stems from personal connections to the subject matter, coupled with a deep interest in exploring how different sensory inputs affect user behavior.
CONTRIBUTORS
Mohini Banerjee
Patrick Zhou
MY ROLE
UX Designer
UI Designer
Fluency-Specific Features
1. Customizable Audio Delay and Pitch
- Users can personalize their experience by adjusting the audio delay time and pitch, tailoring the therapy to their specific needs (a minimal DAF sketch follows this list).
2. Music Therapy Integration
- We leverage music therapy, encouraging speech processing through different neural pathways. By transforming users' spoken words into a choral arrangement, the app plays their own speech back to them in a harmonized format.
3. Visual Feedback
- As users speak, they can read the transcribed text in real time on the screen, providing visual cues for fluent speech. Research has shown that reading aloud can reduce stuttering, and having the text available visually can reinforce fluent speech patterns and allow for on-the-spot self-correction.
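As referenced in feature 1, the adjustable delay at the heart of DAF can be implemented as a short duplex audio loop. The sketch below is a minimal illustration assuming the third-party sounddevice library; the sample rate, delay value, and buffering scheme are our assumptions rather than Fluency's production code, and the adjustable pitch shift (typically done with a phase vocoder) is omitted for brevity.

```python
# Minimal sketch of adjustable Delayed Auditory Feedback (DAF).
# Assumes the third-party `sounddevice` library; parameters are illustrative.
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 16_000   # Hz (assumed)
DELAY_MS = 120         # user-adjustable delay; typical DAF settings are ~50-200 ms

delay_samples = int(SAMPLE_RATE * DELAY_MS / 1000)
pending = np.zeros(delay_samples, dtype="float32")  # samples waiting to be played

def callback(indata, outdata, frames, time, status):
    """Emit each microphone sample DELAY_MS after it was captured."""
    global pending
    buf = np.concatenate([pending, indata[:, 0]])
    outdata[:, 0] = buf[:frames]   # play the oldest samples first
    pending = buf[frames:]         # keep the remainder for later callbacks

with sd.Stream(samplerate=SAMPLE_RATE, channels=1, dtype="float32",
               callback=callback):
    input("DAF running - press Enter to stop.\n")
```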
Logic Behind the Interface
We use Python to capture speech and categorize speech patterns, training a machine learning model to identify, in real time, the three stuttering types: repetitions, prolongations, and blocks. The model generates a comprehensive report for users, accessible through the app's dashboard.
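As a rough illustration of how per-chunk model predictions could be rolled up into that dashboard report, here is a minimal sketch; the label names, percentage format, and `build_report` helper are illustrative assumptions, not Fluency's actual schema.

```python
from collections import Counter

# Hypothetical label set mirroring the three stuttering types plus fluent speech.
LABELS = ("fluent", "repetition", "prolongation", "block")

def build_report(predictions):
    """Aggregate per-chunk model predictions into dashboard-style percentages.

    `predictions` is a sequence of label strings, one per classified
    speech chunk (assumed format).
    """
    counts = Counter(predictions)
    total = sum(counts.values()) or 1
    return {label: round(100 * counts[label] / total, 1) for label in LABELS}

# Example: share of chunks per category for one session.
print(build_report(["fluent", "fluent", "block", "repetition", "fluent"]))
```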
Actual spectrograms of speech patterns we collected from users for data processing.
From the above spectrograms, we extract features that capture characteristics of stuttered speech, such as pitch, energy, formant frequencies, and timing information. The categorization of stuttering patterns is split into a signal processing component and a machine learning component that together distinguish stuttered from fluent speech: the signal processing stage retrieves and saves the audio file, applies a Hamming window to 20 ms chunks, and takes the Fourier transform of each chunk (a minimal sketch of this stage follows).
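The sketch below illustrates that signal-processing stage, assuming a saved mono WAV file and a 10 ms hop; the file name and hop size are our assumptions, while the 20 ms Hamming-windowed frames and FFT follow the description above.

```python
import numpy as np
from scipy.io import wavfile

rate, audio = wavfile.read("recording.wav")   # the retrieved audio file (name assumed)
if audio.ndim > 1:
    audio = audio[:, 0]                       # keep one channel
audio = audio.astype(np.float32)
audio /= max(np.abs(audio).max(), 1e-9)       # normalize to [-1, 1]

frame_len = int(0.020 * rate)                 # 20 ms chunks
hop = int(0.010 * rate)                       # 10 ms hop (assumed)
window = np.hamming(frame_len)

frames = np.stack([audio[i:i + frame_len] * window
                   for i in range(0, len(audio) - frame_len + 1, hop)])
spectra = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrogram
print(spectra.shape)                           # (num_frames, frame_len // 2 + 1)
```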
Diagram of the neural network we trained to classify speech.
We then apply the Mel filter bank coefficient approach to reduce dimensionality, representing each frame by a compact set of frequency-band components. Using the extracted features, we train a machine learning model to classify segments of speech (see the sketch below).
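Concretely, the feature-extraction and training step might look like the following sketch, assuming the third-party librosa and scikit-learn libraries; the file names, label encoding, and the small MLP are placeholders standing in for the neural network diagrammed above, which is not reproduced here.

```python
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def mel_features(path, n_mels=40):
    """Mel filter bank energies over 20 ms Hamming-windowed frames,
    averaged into one feature vector per speech segment."""
    y, sr = librosa.load(path, sr=16_000)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=int(0.020 * sr), hop_length=int(0.010 * sr),
        window="hamming", n_mels=n_mels)
    return librosa.power_to_db(mel).mean(axis=1)

# Placeholder training data: paths and labels are illustrative only.
segment_paths = ["seg_0001.wav", "seg_0002.wav", "seg_0003.wav"]
segment_labels = [0, 1, 3]   # 0=fluent, 1=repetition, 2=prolongation, 3=block

X = np.stack([mel_features(p) for p in segment_paths])
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
clf.fit(X, segment_labels)
```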