1. Introduction

This project implements a professional video-to-audio conversion application for macOS using SwiftUI and FFmpeg. The application provides a modern, intuitive interface for extracting audio from video files with support for 19 different audio formats, customizable metadata, and real-time conversion progress tracking.

The project addresses the need for a native macOS tool that extracts audio from video files and goes beyond what basic converters offer. It drives FFmpeg's encoders from a SwiftUI interface, giving the user control over output format, bitrate, and metadata.

The implementation combines AVFoundation for media handling, Foundation's Process API for running FFmpeg, and SwiftUI for the user interface. The application supports batch processing, drag-and-drop file selection, and real-time video preview.

Core Features:

  • Support for 19 audio formats (MP3, AAC, FLAC, WAV, OGG, Opus, and more)
  • Real-time video preview with custom player controls
  • Drag-and-drop file selection
  • Customizable audio metadata (title, artist, album, year, genre)
  • Real-time conversion progress tracking
  • Batch file processing with individual status monitoring
  • 10-language localization support
  • Dark/Light theme switching
  • Video thumbnail generation
  • Professional UI with modern design patterns

2. Methodology / Approach

The application employs a multi-layered architecture combining SwiftUI for the user interface, AVFoundation for media handling, and FFmpeg for audio encoding. The system uses the MVVM (Model-View-ViewModel) pattern to separate business logic from UI presentation.

2.1 System Architecture

The video-to-audio conversion pipeline consists of:

  1. File Management: Drag-and-drop or file picker for video selection
  2. Video Analysis: AVFoundation extracts duration, thumbnails, and metadata
  3. Format Selection: User chooses from 19 supported audio formats
  4. Metadata Editing: Optional custom metadata for output files
  5. FFmpeg Processing: Process management for audio extraction
  6. Progress Monitoring: Real-time parsing of FFmpeg output
  7. File Output: Temporary storage with user-directed save location

2.2 Implementation Strategy

The implementation uses SwiftUI for reactive UI updates and the Combine framework for state management. The VideoConverter class manages FFmpeg processes and parses their standard error output to extract progress information. A custom VideoPlayerView provides frame-accurate preview with keyboard controls and hover interactions.

Processing Flow:

Video Input → Duration Analysis → Format Selection → Metadata Configuration → FFmpeg Conversion → Progress Tracking → Audio Output

FFmpeg Integration:

  • Process spawning with custom arguments
  • Real-time output parsing for progress calculation
  • Error handling with localized messages
  • Support for various audio codecs and containers
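The "error handling with localized messages" item could be modeled as an error type that carries a localization key instead of display text. The sketch below is an assumption, not the project's code: the ConversionError type, its case names, the key strings, and the validateInputSize helper are all hypothetical, and the 2 GB limit from the Error Handling section is read here as 2 × 10^9 bytes.

```swift
import Foundation

// Hypothetical error type; case names and localization keys are assumptions.
enum ConversionError: Error, Equatable {
    case ffmpegNotFound
    case unsupportedFormat
    case fileTooLarge
    case invalidVideo
    case permissionDenied
    case cancelled

    /// Key to look up in Localizable.strings (hypothetical key names).
    var messageKey: String {
        switch self {
        case .ffmpegNotFound:    return "error.ffmpeg_not_found"
        case .unsupportedFormat: return "error.unsupported_format"
        case .fileTooLarge:      return "error.file_too_large"
        case .invalidVideo:      return "error.invalid_video"
        case .permissionDenied:  return "error.permission_denied"
        case .cancelled:         return "error.cancelled"
        }
    }
}

/// Assumed reading of the documented 2 GB limit: 2 × 10^9 bytes.
let maxInputBytes: Int64 = 2_000_000_000

/// Throws `.fileTooLarge` when the input exceeds the assumed limit.
func validateInputSize(_ bytes: Int64) throws {
    if bytes > maxInputBytes { throw ConversionError.fileTooLarge }
}
```

Storing a key rather than a translated string fits the errorMessageKey property listed under State Management: the view can resolve the key in whatever language is active at display time.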

2.3 Localization Architecture

The application implements a comprehensive localization system supporting 10 languages:

  • English (en)
  • Turkish (tr)
  • German (de)
  • Dutch (nl)
  • Spanish (es)
  • Italian (it)
  • Portuguese Brazil (pt-BR)
  • Russian (ru)
  • Japanese (ja)
  • Arabic (ar)

The LocalizationManager class provides reactive language switching with automatic UI updates through Combine publishers.
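A minimal sketch of how reactive language switching might work, assuming one .lproj folder per language code in the app bundle. The LanguageStore name and its methods are hypothetical stand-ins and do not reproduce the actual LocalizationManager.

```swift
import Combine
import Foundation

// Hypothetical stand-in for LocalizationManager; names are assumptions.
final class LanguageStore: ObservableObject {
    static let supported = ["en", "tr", "de", "nl", "es", "it", "pt-BR", "ru", "ja", "ar"]

    // Changing this value notifies every SwiftUI view observing the store.
    @Published private(set) var currentLanguage = "en"

    /// Switches language; unsupported codes are ignored.
    func select(_ code: String) {
        guard Self.supported.contains(code) else { return }
        currentLanguage = code
    }

    /// Looks the key up in the current language's .lproj bundle,
    /// falling back to the key itself when no translation is found.
    func localizedString(_ key: String, in bundle: Bundle = .main) -> String {
        guard let path = bundle.path(forResource: currentLanguage, ofType: "lproj"),
              let languageBundle = Bundle(path: path) else { return key }
        return languageBundle.localizedString(forKey: key, value: key, table: nil)
    }
}
```

Because currentLanguage is a @Published property, a view that observes the store re-renders when select(_:) changes it, which is one way to get language switching without a restart.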

3. Mathematical Framework

3.1 Progress Calculation

The conversion progress is calculated by parsing FFmpeg's time output and comparing it to total video duration:

$$\text{Progress} = \frac{t_{\text{current}}}{t_{\text{total}}}$$

where $t_{\text{current}}$ is the current processing time and $t_{\text{total}}$ is the total video duration in seconds.
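As an illustration of the formula, the sketch below pulls the time=HH:MM:SS.xx field out of one line of FFmpeg's stderr output and converts it to a clamped progress value. The function names are hypothetical, and the project's actual parser may differ.

```swift
import Foundation

/// Extracts the `time=HH:MM:SS.xx` value from one line of FFmpeg stderr
/// output and returns it in seconds, or nil if the line has no timestamp
/// (for example `time=N/A`).
func parseFFmpegTime(from line: String) -> Double? {
    guard let range = line.range(of: #"time=(\d+):(\d{2}):(\d{2}(?:\.\d+)?)"#,
                                 options: .regularExpression) else { return nil }
    // Drop the "time=" prefix, then split into hours, minutes, seconds.
    let parts = line[range].dropFirst(5).split(separator: ":")
    guard parts.count == 3,
          let hours = Double(parts[0]),
          let minutes = Double(parts[1]),
          let seconds = Double(parts[2]) else { return nil }
    return hours * 3600 + minutes * 60 + seconds
}

/// Progress = t_current / t_total, clamped to the range 0...1.
func conversionProgress(current: Double, total: Double) -> Double {
    guard total > 0 else { return 0 }
    return min(max(current / total, 0), 1)
}
```

Clamping guards against FFmpeg reporting a time slightly past the duration that AVFoundation measured, which would otherwise push the progress bar above 100%.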

3.2 Time Formatting

Video and audio durations are formatted based on length:

  • Short format: MM:SS (for videos under 1 hour)
  • Long format: HH:MM:SS (for videos 1 hour or longer)
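The two formats above can be sketched as a single function. The name formatDuration is hypothetical, and truncating fractional seconds is an assumption about rounding behavior.

```swift
import Foundation

/// Formats a duration as MM:SS below one hour and HH:MM:SS otherwise.
/// Fractional seconds are truncated; non-finite or negative input is treated as 0.
func formatDuration(_ seconds: Double) -> String {
    let total = seconds.isFinite ? Int(max(seconds, 0)) : 0
    let hours = total / 3600
    let minutes = (total % 3600) / 60
    let secs = total % 60
    return hours > 0
        ? String(format: "%02d:%02d:%02d", hours, minutes, secs)
        : String(format: "%02d:%02d", minutes, secs)
}
```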

3.3 File Size Calculation

The application estimates output file size based on bitrate and duration. The default bitrate of 320 kbps is used for high-quality audio extraction in lossy formats like MP3 and AAC.
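For a constant-bitrate encode, the estimate reduces to bitrate times duration divided by 8. The sketch below assumes 1 kbps = 1000 bits per second and ignores container overhead and embedded metadata; the function name is hypothetical.

```swift
/// Estimated size in bytes for a constant-bitrate encode:
/// kbps → bits per second, times seconds, divided by 8 bits per byte.
func estimatedSizeBytes(bitrateKbps: Double, durationSeconds: Double) -> Double {
    return bitrateKbps * 1000 * durationSeconds / 8
}
```

For a 03:45 (225 s) file at 320 kbps this gives 9,000,000 bytes, or about 9.0 MB before container overhead and tags are added.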

4. Requirements

4.1 System Requirements

  • macOS 13.0 or later
  • Xcode 15.0 or later
  • Swift 5.9 or later
  • FFmpeg 6.0 or later, installed on the system

4.2 FFmpeg Installation

The application requires FFmpeg to be installed on the system. Install via Homebrew:

brew install ffmpeg

Supported FFmpeg Paths:

  • /opt/homebrew/bin/ffmpeg
  • /opt/homebrew/Cellar/ffmpeg/*/bin/ffmpeg
  • /usr/local/bin/ffmpeg
  • /usr/bin/ffmpeg

4.3 Supported Video Formats

The application accepts common video container formats, including:

  • MP4, MOV, AVI, MKV, WebM
  • FLV, WMV, MPEG, MPG, 3GP
  • MXF, TS, MTS, M2TS, DV
  • F4V, OGV, 3G2, and more

4.4 Supported Audio Output Formats

19 audio formats are supported:

  • Lossy: MP3, AAC, M4A, OGG, Opus, WMA, MP2, AMR
  • Lossless: FLAC, WAV, TTA, AIFF, AIF
  • Professional: AC3, EAC3, DTS, AU, MKA, ADTS

5. Installation & Configuration

5.1 Environment Setup

# Clone the repository
git clone https://github.com/kemalkilicaslan/Video-to-Audio-Converter.git
cd Video-to-Audio-Converter

# Install FFmpeg (required)
brew install ffmpeg

# Open project in Xcode
open Video-to-Audio-Converter.xcodeproj

5.2 Project Structure

Video-to-Audio-Converter/
├── Video-to-Audio-Converter/
│   ├── ContentView.swift                    # Main UI
│   ├── VideoConverter.swift                 # FFmpeg integration
│   ├── VideoPlayerView.swift                # Video preview
│   ├── LocalizationManager.swift            # Localization system
│   ├── Theme.swift                          # Theme management
│   ├── Video_to_Audio_ConverterApp.swift    # App entry point
│   ├── Assets.xcassets/                     # App resources
│   │   └── icon.png
│   └── Localizable.strings/                 # 10 language files
│       ├── en.lproj/
│       ├── tr.lproj/
│       ├── de.lproj/
│       ├── nl.lproj/
│       ├── es.lproj/
│       ├── it.lproj/
│       ├── pt-BR.lproj/
│       ├── ru.lproj/
│       ├── ja.lproj/
│       └── ar.lproj/
├── README.md
└── LICENSE

5.3 Xcode Configuration

Build Settings:

  • Deployment Target: macOS 13.0
  • Swift Language Version: Swift 5
  • Architecture: Apple Silicon, Intel

Required Capabilities:

  • File access permissions
  • User-selected files (read-only)
  • Temporary directory access

5.4 FFmpeg Path Configuration

The application automatically searches for FFmpeg in standard locations. If FFmpeg is installed in a custom location, update the findFFmpeg() method in VideoConverter.swift:

private func findFFmpeg() -> String? {
    let possiblePaths = [
        "/opt/homebrew/bin/ffmpeg",
        "/usr/local/bin/ffmpeg",
        "/your/custom/path/ffmpeg"  // Add custom path
    ]
    // ...
}
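The body of findFFmpeg() is elided above. Purely as an illustration of the lookup idea, and not as the project's implementation, the sketch below returns the first candidate path that exists and is executable, and expands the versioned Homebrew Cellar directory into concrete paths. Both function names are hypothetical.

```swift
import Foundation

/// Returns the first path in `candidates` that exists and is executable.
func firstExecutable(among candidates: [String],
                     fileManager: FileManager = .default) -> String? {
    return candidates.first { fileManager.isExecutableFile(atPath: $0) }
}

/// Expands `<cellar>/*/bin/ffmpeg` by listing the version directories.
/// Returns an empty array when the Cellar directory does not exist.
func homebrewCellarCandidates(cellar: String = "/opt/homebrew/Cellar/ffmpeg",
                              fileManager: FileManager = .default) -> [String] {
    let versions = (try? fileManager.contentsOfDirectory(atPath: cellar)) ?? []
    return versions.sorted().map { "\(cellar)/\($0)/bin/ffmpeg" }
}
```

A caller could concatenate the fixed paths with the Cellar candidates and pass the result to firstExecutable(among:), treating a nil result as the "FFmpeg not found" error.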

6. Usage / How to Run

6.1 Building the Application

Via Xcode:

  1. Open Video-to-Audio-Converter.xcodeproj
  2. Select target: "My Mac"
  3. Click Product → Run (⌘R)

Via Command Line:

xcodebuild -project Video-to-Audio-Converter.xcodeproj \
  -scheme Video-to-Audio-Converter \
  -configuration Release \
  build

6.2 Basic Workflow

Step 1: Add Video Files

  • Drag and drop video files into the application
  • Or click the "Select Files" button
  • Multiple files can be added for batch processing

Step 2: Select File for Preview

  • Click a file card in the sidebar
  • The video preview appears in the main panel
  • Player controls allow scrubbing and playback

Step 3: Choose Audio Format

  • Click the "Convert" button on the preview panel
  • Select the desired audio format from the grid (19 options)
  • Optionally add metadata (title, artist, album, year, genre)
  • Click "Confirm" to start the conversion

Step 4: Monitor Progress

  • A real-time progress bar shows the conversion status
  • The percentage indicator updates continuously
  • Cancel the conversion at any time with the "Cancel" button

Step 5: Save Output

  • Upon completion, a "Save" button appears
  • Choose the save location and filename
  • The original video remains unchanged

6.3 Advanced Features

Video Player Controls:

Action        Keyboard Shortcut   Mouse Action
Play/Pause    Space               Click center button
Rewind 10s    (not listed)        Click rewind button
Forward 10s   (not listed)        Click forward button
Volume Up     (not listed)        Drag volume slider
Volume Down   (not listed)        Drag volume slider
Mute/Unmute   M                   Click speaker icon

Format Selection:

// MP3 (Recommended for compatibility)
Bitrate: 320 kbps
Codec: libmp3lame
Quality: High

// FLAC (Lossless)
Codec: FLAC
Compression: Level 8

// AAC (Modern streaming)
Bitrate: 320 kbps
Profile: AAC Low Complexity

Metadata Editing:

Title: Song or video name
Artist: Creator or performer
Album: Collection name
Year: Release year (YYYY)
Genre: Music or content genre
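One plausible way to pass these fields to FFmpeg is its -metadata key=value option. The sketch below is an assumption about how the argument list could be built: AudioMetadata and ffmpegArguments are hypothetical names, and mapping Year to FFmpeg's date tag is an assumption rather than something the project documents.

```swift
// Hypothetical container for the five editable fields.
struct AudioMetadata {
    var title: String?
    var artist: String?
    var album: String?
    var year: String?
    var genre: String?
}

/// Builds an FFmpeg argument list that drops the video stream (-vn),
/// selects the audio codec (-c:a), optionally sets the bitrate (-b:a),
/// and adds one -metadata pair per non-empty field.
func ffmpegArguments(input: String, output: String, codec: String,
                     bitrateKbps: Int?, metadata: AudioMetadata) -> [String] {
    var args = ["-i", input, "-vn", "-c:a", codec]
    if let bitrateKbps { args += ["-b:a", "\(bitrateKbps)k"] }
    let tags: [(String, String?)] = [
        ("title", metadata.title),
        ("artist", metadata.artist),
        ("album", metadata.album),
        ("date", metadata.year),   // assumption: Year is written as the "date" tag
        ("genre", metadata.genre),
    ]
    for (key, value) in tags {
        if let value, !value.isEmpty { args += ["-metadata", "\(key)=\(value)"] }
    }
    args.append(output)
    return args
}
```

Because Process passes each array element as a separate argument with no shell involved, values containing spaces or quotes need no escaping.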

6.4 Localization

Changing Language:

  1. Click the language dropdown in the header (flag icon + language name)
  2. Select one of the 10 supported languages
  3. The UI updates immediately without a restart

Supported Languages:

  • 🇬🇧 English
  • 🇹🇷 Türkçe
  • 🇩🇪 Deutsch
  • 🇳🇱 Nederlands
  • 🇪🇸 Español
  • 🇮🇹 Italiano
  • 🇧🇷 Português
  • 🇷🇺 Русский
  • 🇯🇵 日本語
  • 🇸🇦 العربية

6.5 Theme Switching

Toggle Dark/Light Mode:

  • Click the theme button (sun/moon icon) in the header
  • The selected theme persists across app restarts
  • Both themes follow modern macOS design guidelines

7. Application / Results

7.1 User Interface

Main Interface - Light Mode:

The application features a modern, clean interface with intuitive controls and real-time preview.

Main Interface - Dark Mode:

Dark mode provides a comfortable viewing experience with high contrast and reduced eye strain.

Drop Zone:

Drag and drop interface for easy file selection with animated visual feedback.

File List with Thumbnails:

Sidebar displays video thumbnails, durations, and conversion status for each file.

Video Preview Panel:

Full-featured video player with custom controls and file information display.

Format Selection Sheet:

Choose from 19 audio formats with metadata editing capabilities.

Metadata Editor:

Customize audio file metadata including title, artist, album, year, and genre.

Conversion Progress:

Real-time progress tracking with percentage indicator and cancel option.

Conversion Success:

Success state showing converted file information with save and retry options.

Language Selection:

10 language support with instant UI updates without restart.

7.2 Interface Components

1. Header Bar

  • Application title with gradient icon
  • Language selector dropdown
  • Theme toggle button

2. Drop Zone (when no files selected)

  • Animated circular icon
  • "Drag files here or click to select"
  • Large "Select Files" button

3. Split View (with files selected)

  • Left sidebar: File list with thumbnails
  • Right panel: Video preview and controls
  • Divider for resizable layout

4. File Cards

  • Video thumbnail (70×52px)
  • Filename and duration
  • Format badge
  • Status indicator (converting/success/error)
  • Context menu actions

5. Video Preview Panel

  • Full video player with custom controls
  • File information chips (format, size, duration)
  • Conversion controls and status
  • Action buttons (Convert, Save, Retry)

7.3 Conversion Workflow Example

Example: Converting MP4 to MP3

Input File:
- Format: MP4
- Duration: 03:45
- Size: 125.4 MB

Conversion Settings:
- Output Format: MP3
- Bitrate: 320 kbps
- Metadata: Custom tags

Output File:
- Format: MP3
- Duration: 03:45 (identical)
- Size: 9.2 MB (~7.3% of original)
- Quality: High-fidelity audio

7.4 Performance Metrics

Metric                 Value            Notes
Conversion Speed       1-10x realtime   Depends on codec and system
Memory Usage           50-150 MB        Varies with video file size
CPU Usage              40-80%           Single-core FFmpeg process
Supported File Size    Up to 2 GB       Hard limit for stability
Thumbnail Generation   <1 second        Async processing
UI Responsiveness      60 FPS           SwiftUI rendering
Progress Update Rate   10 Hz            100 ms intervals

7.5 Format Comparison

Lossy Formats (320 kbps):

Format   File Size   Quality     Compatibility    Use Case
MP3      ~9 MB       Excellent   Universal        General purpose
AAC      ~8 MB       Excellent   Modern devices   Streaming
OGG      ~8 MB       Excellent   Open source      Linux/Web
Opus     ~7 MB       Excellent   Modern           Low-latency

Lossless Formats:

Format   File Size   Quality   Compatibility   Use Case
FLAC     ~40 MB      Perfect   Good            Archival
WAV      ~60 MB      Perfect   Universal       Professional
AIFF     ~60 MB      Perfect   macOS/iOS       Apple ecosystem

7.6 Error Handling

Common Error Messages:

Error                    Cause                            Solution
"FFmpeg not found"       FFmpeg not installed             Install via Homebrew
"Unsupported format"     Invalid output format            Choose from supported list
"File too large"         File exceeds 2 GB                Split video or use smaller file
"Invalid video format"   Corrupted or unsupported video   Try different source file
"Permission denied"      File access restricted           Check file permissions
"Conversion cancelled"   User cancelled process           Normal operation

7.7 Localization Coverage

All UI elements are localized in each of the 10 supported languages.

Localized Components:

  • Button labels and tooltips
  • Error and success messages
  • Format names and descriptions
  • Menu items and dialogs
  • Placeholder text
  • Time and file size formatting

Example Localized Strings:

English: "Conversion successful"
Turkish: "Dönüştürme başarılı"
German: "Konvertierung erfolgreich"
Spanish: "Conversión exitosa"
Japanese: "変換が成功しました"

7.8 Accessibility Features

  • Full keyboard navigation support
  • VoiceOver compatibility
  • High contrast theme support
  • Scalable UI elements
  • Tooltip descriptions for all controls
  • Progress announcements

8. Tech Stack

8.1 Core Technologies

  • Language: Swift 5.9+
  • UI Framework: SwiftUI
  • Reactive Programming: Combine
  • Media Framework: AVFoundation, AVKit
  • Process Management: Foundation.Process
  • Audio Processing: FFmpeg (external)

8.2 SwiftUI Components

Component              Purpose                  File
ContentView            Main application UI      ContentView.swift
VideoPlayerView        Custom video player      VideoPlayerView.swift
ModernFileCard         File list item           ContentView.swift
FormatSelectionSheet   Format picker modal      ContentView.swift
LanguageDropdownMenu   Language selector        ContentView.swift
ThemeToggleButton      Dark/light mode toggle   ContentView.swift

8.3 Architecture Pattern

MVVM (Model-View-ViewModel):

View (SwiftUI)
    ↓
ViewModel (ObservableObject)
    ↓
Model (Data structures)

Key Classes:

Class                  Type        Responsibility
ContentView            View        Main UI layout
VideoConverter         ViewModel   FFmpeg process management
VideoPlayerViewModel   ViewModel   Video playback state
LocalizationManager    Service     Language management
Theme                  Model       Color scheme definitions
FileConversionStatus   Model       Conversion state tracking
FileMetadata           Model       Audio metadata storage

8.4 FFmpeg Integration

Process Execution:

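import Foundation

// `input`, `codec`, and `output` are strings supplied by the caller.
let process = Process()
process.executableURL = URL(fileURLWithPath: "/opt/homebrew/bin/ffmpeg")
process.arguments = ["-i", input, "-vn", "-c:a", codec, output]  // -vn drops the video stream
process.standardError = Pipe()  // Progress monitoring
try process.run()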

Supported Codecs:

Format   FFmpeg Codec   Container   Quality
MP3      libmp3lame     .mp3        Lossy
AAC      aac            .m4a/.aac   Lossy
FLAC     flac           .flac       Lossless
Opus     libopus        .opus       Lossy
Vorbis   libvorbis      .ogg        Lossy
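The codec table could be captured in code as an enum, so that the encoder name and file extension are resolved in one place. This is a sketch covering only the five formats listed above; the AudioFormat type is hypothetical, and choosing .m4a over .aac for AAC output is an assumption.

```swift
// Hypothetical mapping of the formats in the table to FFmpeg encoder names.
enum AudioFormat: String, CaseIterable {
    case mp3, aac, flac, opus, vorbis

    /// Encoder name passed to FFmpeg's -c:a option.
    var codec: String {
        switch self {
        case .mp3:    return "libmp3lame"
        case .aac:    return "aac"
        case .flac:   return "flac"
        case .opus:   return "libopus"
        case .vorbis: return "libvorbis"
        }
    }

    /// Output file extension without the leading dot.
    var fileExtension: String {
        switch self {
        case .mp3:    return "mp3"
        case .aac:    return "m4a"   // assumption: .m4a chosen over .aac
        case .flac:   return "flac"
        case .opus:   return "opus"
        case .vorbis: return "ogg"
        }
    }

    /// Only FLAC is lossless among the formats in this table.
    var isLossless: Bool { self == .flac }
}
```

An exhaustive switch means that adding a new case without specifying its codec and extension fails at compile time.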

8.5 Localization System

Structure:

LocalizationManager (ObservableObject)
    ├── currentLanguage: String
    ├── supportedLanguages: [(code, name, flag, localizedKey)]
    └── localizedString(_ key: String) -> String

String Extension:
    var localized: String

Language Files:

Each language has a Localizable.strings file with 100+ translated keys.

8.6 State Management

Published Properties:

@Published var progress: Double
@Published var isConverting: Bool
@Published var errorMessageKey: String?
@Published var currentLanguage: String
@Published var isDarkMode: Bool

State Flow:

User Action → ViewModel Update → Published Property Changed → SwiftUI View Refresh

8.7 Dependencies

Dependency               Version   Purpose
SwiftUI                  Native    UI framework
Combine                  Native    Reactive programming
AVFoundation             Native    Media handling
AVKit                    Native    Video player
Foundation               Native    Core functionality
UniformTypeIdentifiers   Native    File type handling
FFmpeg                   6.0+      Audio encoding (external)

9. License

This project is open source and available under the Apache License 2.0.

Acknowledgments

This project utilizes FFmpeg for audio encoding and conversion capabilities. Special thanks to the FFmpeg development team for providing powerful, open-source multimedia processing tools. The application's UI design is inspired by modern macOS design guidelines and GitHub's design system. Localization contributions help make this tool accessible to users worldwide.


Note: This application requires FFmpeg to be installed on your system. Install via Homebrew (brew install ffmpeg) before first use. The application is designed for macOS 13.0 or later and leverages native Apple frameworks for optimal performance. All video processing occurs locally on your device—no data is sent to external servers. For production deployment, ensure FFmpeg licensing compliance for your use case. The application supports conversion for personal, educational, and commercial purposes within the bounds of copyright law and FFmpeg's licensing terms.