An experimental voice-to-text tool that stylizes spoken words based on how they are said, turning speech into a visual echo of your voice.

Tools: JavaScript, HTML, CSS, AFINN JSON, Web Speech API

Click here to try the tool yourself!

Interactive Web Tool

SPEAK TO SEE

WHAT IS SPEAK TO SEE?

When typesetting, we usually control every stylistic decision. But what if our speech styled itself, based on
the data of how we speak, not our taste?

WHAT IF WE LET SPEECH DECIDE?

The whole project is rooted in this question. Instead of a designer styling text based on their preferences, the tool turns spoken words into styled text based on how they're said: the time between words, the sentiment they carry, and even the length of pauses.
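This capture step can be sketched roughly as follows. The recognition wiring is browser-only, and `recordChunk` is a hypothetical helper invented here for illustration, not the project's actual code:

```javascript
// Chunks of heard speech, each stored with the time it arrived.
const chunks = [];

// Hypothetical helper: remember a transcribed chunk and its timestamp.
function recordChunk(text, timestamp) {
  chunks.push({ text, timestamp });
  return chunks[chunks.length - 1];
}

// Browser-only wiring (guarded so the sketch also loads outside a browser).
if (typeof window !== "undefined" && "webkitSpeechRecognition" in window) {
  const recognition = new webkitSpeechRecognition();
  recognition.continuous = true;       // keep listening across utterances
  recognition.interimResults = false;  // only act on finalized chunks
  recognition.onresult = (event) => {
    const result = event.results[event.results.length - 1];
    recordChunk(result[0].transcript, performance.now());
  };
  recognition.start();
}
```

The stored timestamps are what make the later pause calculations possible: the gap between two consecutive chunks is simply the difference of their timestamps.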

MATH + LOGIC BEHIND

The tool stores timestamp data, performs real-time calculations on pause length and sentiment, and applies styling based on those variables.

→ Timestamps are stored for chunks of heard speech

→ Pause durations are calculated in milliseconds

→ Sentiment scores are pulled from a JSON dictionary

→ Font size and letter spacing are generated through conditional formulas

→ Fading is handled by a gradual opacity function that simulates memory loss
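The steps above can be sketched as a few small functions. The tiny AFINN stand-in, the formula constants, and the function names are all illustrative assumptions; the real tool loads the full AFINN JSON dictionary and uses its own tuned formulas:

```javascript
// Tiny stand-in for the AFINN JSON dictionary (word → score, roughly -5..5).
const AFINN = { love: 3, great: 3, bad: -3, terrible: -3 };

// Sum the sentiment scores of the words in a chunk of heard speech.
function sentimentScore(text) {
  return text
    .toLowerCase()
    .split(/\s+/)
    .reduce((sum, word) => sum + (AFINN[word] || 0), 0);
}

// Pause before this chunk, in milliseconds, from two stored timestamps.
function pauseMs(prevTimestamp, currentTimestamp) {
  return currentTimestamp - prevTimestamp;
}

// Conditional formulas (assumed here): stronger sentiment grows the font,
// longer pauses widen the letter spacing.
function styleFor(pause, score) {
  const fontSize = 16 + Math.min(Math.abs(score) * 4, 32); // px, capped
  const letterSpacing = pause > 800 ? pause / 400 : 0;     // px
  return { fontSize: `${fontSize}px`, letterSpacing: `${letterSpacing}px` };
}

// Gradual fade: opacity decays toward zero to simulate memory loss.
function opacityAt(elapsedMs, lifespanMs = 10000) {
  return Math.max(0, 1 - elapsedMs / lifespanMs);
}

// Example: a chunk heard 1200 ms after the previous one.
const style = styleFor(pauseMs(3000, 4200), sentimentScore("love it"));
```

The resulting style object can be applied directly to a word's DOM element, while `opacityAt` is evaluated on each animation frame against the chunk's stored timestamp.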