Powered by Local AI

Capture Every
Inspiration, Intelligently

Owl Meeting is a lightweight yet powerful speech productivity tool: high-accuracy offline transcription, automatic speaker diarization, and local LLM integration make recording effortless.

Fast on CPU
100% Privacy
Fully Offline

A Toolbox for Efficiency

Smart Recognition

Assign dedicated recognition models to different speakers with automatic switching for higher accuracy.

  • Match optimal model based on voiceprint
  • Significantly reduce post-editing work

Complete Management

Manage transcription history, the speaker voiceprint library, and recognition results, all within easy reach.

  • Persistent voiceprint storage
  • Quick history search

Powerful Editing Features

Custom-dictionary auto-correction, batch delete and replace, and AI-assisted correction

  • Custom dictionary: text replacement, pinyin replacement, and deletion of specified words
  • Click any line to jump to the corresponding audio position and listen while you edit
Fast on CPU

No GPU Required,
Fast on CPU

Ultimate Performance

Built on multiple efficient ASR models, it transcribes 30 minutes of audio in just 1 minute on a CPU *

Say Goodbye to Complexity

One-click installation, one-click model downloads, and a fully graphical interface

Full Privacy Control

All data is processed locally with no internet connection required; sensitive information never leaves your device.

* Measured on an Intel Core i5-11400H
[App demo: transcription view for 2speakers.wav (00:00:51), 5 segments, 2 speakers; tabs for Online/Offline, History, Voiceprints, Dict, and AI, plus a speaker filter and search]
小白 00:00:00 - 00:00:23

Well, let's have a quick discussion about new student recruitment today...

小北 00:00:24 - 00:00:34

Well, for the venue, we currently have three options...

小白 00:00:34 - 00:00:40

As for the sports field, it has been too hot lately; I'm afraid foot traffic would be a bit low.

小北 00:00:41 - 00:00:50

True. Then how about the indoor gymnasium?

Local LLM (Ollama)

Deep Local LLM Integration,
Go Beyond "Transcription"

Easy to Use

One-click launch and intuitive model management make complex large models simple and easy to use

Pro-grade AI Templates

Pre-configured for one-click translation, summarization, and correction

Extensible Custom Prompting

Tailor AI intelligence to your unique workflow with custom prompts

[App demo: AI view for 2speakers.wav (00:00:51), 5 segments, 2 speakers; Source, Trans, Fix, and Custom tabs]
小白 00:00:00 - 00:00:23

嗯,那么今天我们就简单的进行一下新生招聘的讨论吧...

Well, let's have a quick discussion about new student recruitment today...

小北 00:00:24 - 00:00:34

嗯,地点的话我们现在可以有三个选择...

Well, for the venue, we currently have three options...

小白 00:00:35 - 00:00:45

我觉得我们可以把重点放在计算机学院那边...

I think we can focus on the Computer Science College...

FAQ

Is internet required?

No. Model inference and data storage are both performed locally.

Is a GPU required?

Not required. The built-in speech-to-text models are optimized for CPU inference; even a 10-year-old CPU can process 30 minutes of audio in about 3 minutes. When deploying large models with Ollama, a more capable GPU lets you run more advanced models.

Can I record microphone and system audio at the same time?

Yes. Dual-channel simultaneous recognition is supported.

Which languages are supported and how accurate is it?

Mandarin (97%), Chinese dialects (90%), English (95%), Korean, Japanese, Italian (97%), Spanish (96%), Portuguese (95%), German (95%), French (95%), Russian (94%), Ukrainian (93%), Polish (93%), Dutch (93%), plus 25 other European languages.

Does it support editing features?

Yes. It provides powerful editing features, including automatic custom-dictionary processing, click-to-play listen-and-edit mode, and batch modify/delete with automatic dictionary updates.

What formats are supported?

Supports major audio formats including MP3, WAV, FLAC, AAC, M4A, OGG, AIFF, ALAC, CAF, PCM, ADPCM, and WebM. Video or multi-channel audio can be converted with built-in tools before recognition.

Is it available for free?

The software may prompt for an activation code. If you have a reason to request free access, explain it in the feedback section and leave your email address.

Cross-platform Ready

Windows

Supports Windows 10 and above.

Download Now

macOS

Optimized for both Intel and Apple Silicon.

Coming Soon