[vɔks]

Development of a voice interface for use in special-purpose machines and modern manufacturing systems

Date: 2024/2025
Programs: Visual Studio Code | Siemens TIA Portal | EPLAN
Technologies: Python | Poetry | GitHub | OpenAI Whisper | Suno AI Bark | Argos Open Tech Translate | OPC UA | WiFi | Ethernet | Siemens PLC

About the project
The [vɔks] project developed a voice-controlled interface for special-purpose machines and modern manufacturing systems. The aim was a novel, user-friendly, multilingual interface for human-machine interaction, in particular to support international teams in operation, maintenance, and monitoring. Voice control enables intuitive communication with machines and systems and can reduce the complexity of tasks for operators. The software prototype was built on state-of-the-art AI technologies and ran entirely locally, with no connection to external servers, which also addressed data protection concerns in industrial environments. The project combined technological innovation with practical applicability and contributed to the implementation of Industry 4.0 concepts.

Technical Approach
The prototype was built on a modular software architecture in Python and integrated OpenAI Whisper (speech-to-text), Argos Open Tech Translate (neural machine translation), and Suno AI Bark (text-to-speech). Communication with machine controls was handled via the open OPC UA standard. All processing was carried out locally to protect sensitive data. The technological framework was implemented iteratively: first as a proof of concept, then refined through evolutionary prototyping. Siemens programmable logic controllers were used to simulate industrial environments. Because user-friendliness was a priority, experts in UX/UI and production engineering were actively involved throughout development.
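The modular structure described above can be illustrated with a minimal sketch. It is not the project's actual code: the class VoicePipeline, the stage signatures, the command table, and the node id "ns=2;s=Conveyor.Start" are all hypothetical, and simple stubs stand in for Whisper, the translation engine, Bark, and the OPC UA client so that the flow can be tried without models or a PLC.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Each stage is a plain callable, so a real engine can replace a stub later.
Transcriber = Callable[[bytes], str]         # audio -> text (speech-to-text)
Translator = Callable[[str, str, str], str]  # (text, from_lang, to_lang) -> text
Synthesizer = Callable[[str], bytes]         # text -> audio (text-to-speech)
NodeWriter = Callable[[str, Any], None]      # (OPC UA node id, value) -> None


@dataclass
class VoicePipeline:
    """Hypothetical orchestration of the four stages named in the text."""
    transcribe: Transcriber
    translate: Translator
    synthesize: Synthesizer
    write_node: NodeWriter
    commands: dict[str, tuple[str, Any]]  # normalised phrase -> (node id, value)
    machine_lang: str = "en"

    def handle(self, audio: bytes, operator_lang: str) -> bytes:
        text = self.transcribe(audio)
        if operator_lang != self.machine_lang:
            text = self.translate(text, operator_lang, self.machine_lang)
        # Normalise before the lookup: trim, lower-case, drop final punctuation.
        phrase = text.strip().lower().rstrip(".!?")
        if phrase in self.commands:
            node_id, value = self.commands[phrase]
            self.write_node(node_id, value)
            reply = f"done: {phrase}"
        else:
            reply = "command not recognised"
        if operator_lang != self.machine_lang:
            reply = self.translate(reply, self.machine_lang, operator_lang)
        return self.synthesize(reply)


# --- Stub stages for a dry run (assumptions, not real engine calls) ---
GLOSSARY = {
    ("de", "en", "Förderband starten."): "Start conveyor.",
    ("en", "de", "done: start conveyor"): "erledigt: Förderband starten",
}


def stub_translate(text: str, src: str, dst: str) -> str:
    # Falls back to the unchanged text when the glossary has no entry.
    return GLOSSARY.get((src, dst, text), text)


written: list[tuple[str, Any]] = []  # records what would be sent to the PLC

pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),   # stands in for speech-to-text
    translate=stub_translate,                         # stands in for translation
    synthesize=lambda text: text.encode("utf-8"),     # stands in for text-to-speech
    write_node=lambda node_id, value: written.append((node_id, value)),
    commands={"start conveyor": ("ns=2;s=Conveyor.Start", True)},
)

reply = pipeline.handle("Förderband starten.".encode("utf-8"), "de")
```

Keeping every stage behind a plain callable means the speech, translation, and control layers can be tested and replaced independently, which fits the iterative prototyping approach described above. In a real setup the write_node stub would be replaced by a call into an OPC UA client library.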

Retrospective
[vɔks] demonstrated one possible approach to voice-controlled interfaces in industrial applications and showcased their potential. The iterative development process enabled targeted implementation and allowed technological and conceptual challenges to be identified and addressed early. Over the course of the project, it became clear that local integration of the AI components and user-centred adaptation were particularly important for success. The final evaluation of the prototype under realistic conditions confirmed its functionality and revealed potential for optimisation, for example in response speed and the handling of complex queries. Overall, [vɔks] represents a successful step towards intuitive, secure, and data-protection-compliant voice-based human-machine communication.
