A speech data company built around custom collection

We build custom speech datasets for voice AI teams training models across regional accents, multilingual speakers, and challenging everyday environments. From hard-to-source dialects to noisy background conditions, we help teams capture the exact speech patterns that off-the-shelf datasets consistently miss.

Why we exist

Speech data projects break down in predictable places.

Mismatched recruitment, fluctuating audio quality, fragmented vendor workflows, and rigid delivery structures consistently delay machine learning pipelines.

We exist to eliminate these bottlenecks. We focus on sourcing hard-to-find speakers and managing the complete collection-to-delivery workflow under a single, rigorous quality standard.

How we work

The operational philosophy.

01

Specificity over scale

We deliver the precise linguistic data you need rather than overwhelming your team with bulk volume that misses the core specification.

02

Catch production issues live

Our internal reviewers sample audio while recording sessions are still in progress, so data faults are caught and resolved before final delivery.

03

Single coordinated workflow

Collection, transcription, and validation happen within a single continuous chain to eliminate multi-vendor handoff gaps.

Team

The team running your data collection project

Spirelight combines commercial project design, recruitment operations, platform engineering, transcription workflows, QA, and delivery management in one team.


Andreas Kromann

CEO · Commercial lead

Andreas works with clients to turn model requirements into concrete data collection projects.

He defines the project scope, speaker targets, recruitment approach, and delivery expectations before production starts.

Emil Thorsson

CFO · Operations

Emil supports operations, documentation, compliance coordination, and project delivery.

He helps structure the process so recruitment, consent, production, and handoff stay aligned.

Gustav Aggeboe

CTO · Platform architecture

Gustav leads the platform architecture behind Spirelight.

He builds the systems used to manage recording, transcription, QA, metadata, contributor workflows, and dataset delivery.

Joyi Ulfat

Senior Project Manager

Joyi manages project execution across contributors, reviewers, and delivery teams.

She keeps production moving, follows up on daily progress, and helps ensure each project meets its agreed requirements.

Mateo Thelen

Project Manager

Mateo coordinates contributors, recording workflows, and production tasks.

He helps translate project requirements into daily execution and keeps the different parts of the workflow aligned.

Pekka Larjovuori

Crowd Source Expert

Pekka supports recruitment strategy and contributor operations.

He helps source speakers for projects with specific language, dialect, regional, or profile requirements.

Victor Melchior

Sales · Market expansion

Victor leads sales and market research, mapping where Spirelight's speech data work fits new clients and regions.

He runs country expansion research and opens conversations in the markets we move into next.

Yusif Aliyev

Legal · Policy & compliance

Yusif leads legal, internal policy, and compliance at Spirelight.

He maintains the internal policies and compliance processes that keep consent, data handling, and contracts in order.

Get started

Tell us what training data you need

Tell us the languages, speech type, speakers, recording setup, transcript format, and metadata you need. We aim to respond within 48 hours with an initial workflow and data plan.

10,000+ contributors in our recruitment network

50+ languages and dialects recruited for

6 in-house roles in one workflow

48h target response for project briefs