RVMedia 12.0 has been released.
The trial version can be downloaded from https://www.trichview.com/download/
The full version can be found in the protected section of the forum. This update is free for customers who ordered or renewed RVMedia in 2024-2026, and for customers with an RVMedia subscription.
Main new features after the last released version (v11.2):
A reminder of other changes made after version 11.0:
See the RVMedia version history: https://www.trichview.com/help-media/version_history.htm
This release adds support for FFmpeg 8.
RVMedia now supports FFmpeg versions 1 through 8.
The RVMedia installation now includes an FFmpeg 8.1.1 build for Windows 64-bit with Whisper support (see below). This build is compatible with the LGPL license.
Options that require the GPL license have been removed, since they can only be used in open-source applications.
Whisper is a free, open-source AI model for speech recognition and transcription, developed by OpenAI. It is designed to convert spoken language into text.
RVMedia can use the Whisper implementation integrated into FFmpeg 8 and later.
The Whisper code is included in FFmpeg. However, a model file is also required.
Speech-to-text conversion is performed entirely on the user’s computer and does not require any online services or API keys. All that is needed is a speech recognition model file.
The RVMedia installation includes the smallest available English-only model. While it is not very suitable for real-world use, it lets you test the speech recognition functionality and can run even on relatively low-end computers.
Additional models can be downloaded from https://huggingface.co/ggerganov/whisper.cpp/tree/main
Larger model files provide better recognition accuracy, but they also require more powerful hardware. Ideally, the user should have a modern high-performance GPU. However, even without a GPU, the smaller models can be used on the CPU.
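To try a model file outside RVMedia, the same Whisper integration can be exercised from the FFmpeg command line. The Python sketch below only assembles such a command for FFmpeg's `whisper` audio filter. It is an illustration, not part of RVMedia, and it says nothing about how RVMedia calls FFmpeg internally. The helper names `whisper_filter` and `transcribe_command` and all file names are hypothetical. It assumes an FFmpeg 8+ build with Whisper support and paths that contain no `:` or other characters that need escaping in a filter string.

```python
import shlex


def whisper_filter(model_path, language="en", queue_seconds=3,
                   use_gpu=True, destination=None, fmt="text"):
    """Build the argument string for FFmpeg's `whisper` audio filter.

    Assumption: the paths are simple relative names. Windows paths such
    as C:\\models\\x.bin contain ':' and would need filter-graph escaping.
    """
    options = [
        ("model", model_path),        # whisper.cpp model file (required)
        ("language", language),       # "auto" lets Whisper detect the language
        ("queue", str(queue_seconds)),  # seconds of audio buffered per run
        ("use_gpu", "true" if use_gpu else "false"),
    ]
    if destination:
        options.append(("destination", destination))  # output file
        options.append(("format", fmt))               # "text", "srt" or "json"
    return "whisper=" + ":".join(f"{key}={value}" for key, value in options)


def transcribe_command(input_file, **filter_options):
    """Return an argv list that decodes the audio and discards the output,
    so the only useful result is the transcription written by the filter."""
    return ["ffmpeg", "-i", input_file, "-vn",
            "-af", whisper_filter(**filter_options),
            "-f", "null", "-"]


if __name__ == "__main__":
    # CPU-only run with a small English model (placeholder file names).
    cmd = transcribe_command("talk.mp4", model_path="ggml-tiny.en.bin",
                             use_gpu=False, destination="talk.srt", fmt="srt")
    print(shlex.join(cmd))
```

Setting `use_gpu` to false corresponds to the CPU-only case mentioned above. A larger `queue` value generally trades latency for accuracy.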
The available model files are divided into:
In addition to the main models that perform speech recognition, FFmpeg can optionally use VAD (Voice Activity Detection) AI models.
These models detect when speech starts and ends in the audio stream, allowing the main recognition model to run only when necessary. This provides two important benefits:
The drawback of this approach is that it requires significantly more audio to be buffered before recognition can begin. As a result, recognized text becomes available with greater latency.
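The gating idea and its latency cost can be illustrated with a toy example. The Python sketch below uses a plain energy threshold, not the neural VAD models that FFmpeg loads, and the function names, frame length, and threshold are assumptions chosen for illustration. It marks which parts of an audio buffer contain speech and closes a segment only after several quiet frames have been seen.

```python
import math


def frame_energy(frame):
    """Root-mean-square level of one frame of samples in the range -1..1."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_segments(samples, frame_len=160, threshold=0.1,
                    min_silence_frames=3):
    """Return (start, end) sample ranges judged to contain speech.

    A segment is emitted only after `min_silence_frames` consecutive quiet
    frames, so the recognizer cannot start on it until the speaker pauses.
    """
    segments = []
    start = None   # first sample of the segment currently being collected
    quiet = 0      # number of consecutive quiet frames inside a segment
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_energy(frame) >= threshold:
            if start is None:
                start = i * frame_len
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence_frames:
                # Drop the trailing quiet frames from the segment.
                segments.append((start, (i + 1 - quiet) * frame_len))
                start = None
                quiet = 0
    if start is not None:
        # The audio ended while a segment was still open.
        segments.append((start, (n_frames - quiet) * frame_len))
    return segments


if __name__ == "__main__":
    audio = [0.0] * 320 + [0.5] * 480 + [0.0] * 800
    print(speech_segments(audio))
```

Only the returned ranges would be passed to the recognition model. The wait for trailing silence before a range is emitted is the extra buffering, and therefore the extra latency, described above.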
Speech recognition is integrated into RVMedia in two places.
First, the TRVCamera component can perform speech recognition when it receives video with audio using FFmpeg (note that this requires FFmpeg 8 or later built with Whisper support). In this case, speech recognition runs simultaneously with receiving video. It can be enabled or disabled at any time while the video is being received.
Speech recognition settings are available in TRVCamera.FFmpegProperty.SpeechToText: TRVFFmpegSpeechToTextProperty. Recognized text is returned through the TRVCamera.OnSpeechRecognized event.
Second, speech recognition is available in the TRVAudioPlayer component, in addition to its audio playback and recording capabilities. In this case, the audio data may come from any RVMedia audio source, including:
In TRVAudioPlayer, speech recognition works independently of audio playback, but is tied to the recording functionality. If TRVAudioPlayer is recording audio to a file, speech recognition can be enabled in addition to recording. However, the component can also perform speech recognition without recording audio to a file.
Speech recognition settings are available in TRVAudioPlayer.SpeechToTextProperty: TRVFFmpegSpeechToTextProperty. Recognized text is returned through the TRVAudioPlayer.OnSpeechRecognized event.
This update optimizes the decoding of frames received from local webcams. This applies to Windows and Linux, where RVMedia performs frame decoding itself. (On macOS, RVMedia uses the operating system’s built-in decoding facilities.) As a result, CPU usage is significantly reduced, and in some cases a higher frame rate can be achieved.
Support for MJPEG modes of local cameras has also been added on Linux. These modes are typically more efficient than other camera formats.
Demos\Recording\SpeechToText\
A new speech recognition demo has been added in three versions: VCL, Lazarus, and FireMonkey.
List of sample cameras
The list of public cameras used in many demo projects has been updated. Non-working cameras have been removed, and new cameras have been added.
Compiled RVMedia demo projects (VCL for Windows) can be downloaded from https://www.trichview.com/download/mediademo.html