Skip to content

Agent transcribes its own voice and gets interrupted during user conversation #11

Open
@Naseem56

Description

@Naseem56

Title:
Agent transcribes its own voice and gets interrupted during user conversation

Description:
During a conversation session, the agent seems to transcribe its own generated speech as if it were user input. This causes unexpected interruptions and confusion in the dialog flow.

Expected Behavior:
The agent should ignore its own audio during speech recognition, and only process genuine user input.

Actual Behavior:
Agent’s voice is picked up by the speech recognition system and mistakenly transcribed as user input.

Logs:

20:04:02.544 INFO: Server: {"type":"ConversationText","role":"assistant","content":"Hello! I'm Sarah from TechStyle customer service. How can I help you today?"}
20:04:03.476 INFO: Server: {"type":"AgentAudioDone"}
20:04:03.862 INFO: Server: {"type":"UserStartedSpeaking"}
20:04:05.132 INFO: Server: {"type":"ConversationText","role":"user","content":"Can you hear me? Hello. I'm Sarah from"}
20:04:05.174 INFO: Server: {"type":"EndOfThought"}
20:04:06.333 INFO: Server: {"type":"ConversationText","role":"assistant","content":"It seems like there's been a little mix-up with names!"}
20:04:06.342 INFO: Server: {"type":"AgentStartedSpeaking","tts_latency":0.0413104,"ttt_latency":1.161309916,"total_latency":1.712456407}
20:04:07.157 INFO: Server: {"type":"UserStartedSpeaking"}
20:04:07.159 INFO: Server: {"type":"AgentAudioDone"}
20:04:08.111 INFO: Server: {"type":"ConversationText","role":"user","content":"It seems l

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions