created, $=dv.current().file.ctime
& modified, =this.modified
tags:speechlanguage
For work I’m using openai whisper asr to automatically transcribe lecturers. I’ll briefly scan the output, which is speech directly. What I notice is that these transcriptions of organic text sometimes feel more “dream-like” than something that is written only. It’s like the inversion of a script in a film, where the dialog feels literate and written (“nobody speaks like that”.)
Maybe on the page there tends to be more explicitness (not as a rule, you may still be vague) while in dialog between agents a kind of nebulous “thought cloud” emerges off the page, implicitly, between all of the speakers. The voice will meander, processing thoughts in real time without the same degree of pause and reflection. It’s all in real time, including mental edits because organic speech in our culture doesn’t prioritize pause.
What is that feeling? Perhaps a density or cleverness to words and thoughts.
I’m curious about this gulf between the written word (not uttered), reading text (as in forms of acting), and organically constructed language (a transcribed conversation between friends.) I’ve always felt most comfortable with writing my thoughts in general. I would perhaps still say though the most rewarding form of these three is a dialog between a small group of understanding participants.
I’ve noticed when I speak it’s like it takes a minute for me to “rev” up. If prompted to respond, I’ll be disoriented and unsure where to start. I might reach a groove in time but my initial response is stammering. Some part of this is my uncertainty with my conversation partner and I’ll pull in my language. I don’t ever want to be great.
- why do people sound different in text messages than in real life?
- whose voice and are there differences in how we interpret voices
- why does a writing voice sound different than a speaking voice
- we speak of an author’s voice or a writer finding their voice
- why do some people sound generic and they cannot escape this?
- I was looking at this tech blogger’s blog, and I was reading the pages and it felt like I was reading nothing
Breaks
I was watching a tutorial because I had to do some infrastructural work and needed a refresher. It was a standard, straight to business screen recording of the task.
The video creator goes to expand a list of cloud tables, which fails to respond. He shakes his mouse as Loading and other Sensations.
But I comment here about what happened. At that moment there was silence in the recording, but there was also this sublime ambience that emerged or was just revealed to be present the entire time but we were so fixated on the task. You could hear a gentle wind in the microphone and a birdsong. He must be outside or by a window.
In a different case I have this podcast saved which the podcasters later fixed but the original was this same kind of unintentional art. What happened was the different speaker audio streams got out of sync in the edit.
This resulted in really awkward interruptions where the guests spoke over one another (generally conversation on podcasts has some decorum, little speaker overlap - here almost every conversation started with cutting someone off and then the original speaker barreling through) There was this point where all you heard was movements of a chair, fidgets and breathing that are normally hidden by another speaker.
A lone responsive giggle, but without the joke. Islands of conversation colliding.