Google has officially launched Magika 1.0, the first stable release of its AI-powered file type detection tool. The system has been rebuilt in Rust for better performance and memory safety. Magika now identifies more than 200 file types, up from roughly 100 in earlier versions, and it is considerably better at telling apart similar formats such as JSON and JSONL, TSV and CSV, and JavaScript and TypeScript.
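To see why pairs such as JSON and JSONL are easy to confuse, it helps to look at what a naive rule-based check would do. The sketch below is only an illustration and is not how Magika works (Magika relies on a trained model); `guess_json_flavor` is a hypothetical helper written for this article, using only Python's standard `json` module.

```python
import json


def guess_json_flavor(text: str) -> str:
    """Naive heuristic (not Magika's method): return 'json', 'jsonl', or 'unknown'."""
    # If the whole document parses as one value, call it plain JSON.
    try:
        json.loads(text)
        return "json"
    except json.JSONDecodeError:
        pass

    # Otherwise, treat it as JSONL only if every non-empty line parses on its own.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "unknown"
    try:
        for line in lines:
            json.loads(line)
    except json.JSONDecodeError:
        return "unknown"
    return "jsonl"
```

Note the built-in ambiguity: a JSONL file that happens to contain a single record is also a valid JSON document, so this heuristic reports it as `json`. Cases like that are one reason hand-written rules struggle with closely related formats.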
Under the hood, Magika uses ONNX Runtime for model inference and Tokio for parallel processing. This lets the tool scan approximately 1,000 files per second on a single modern laptop core and scale further as CPU cores are added. Google says this performance makes Magika suitable for security workflows, automated analysis pipelines, and general developer tooling.
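For a rough sense of what that throughput means in practice, the back-of-the-envelope sketch below estimates scan time from the figure quoted above. It assumes throughput scales in proportion to core count, which is an idealization; `estimated_scan_seconds` and its `parallel_efficiency` parameter are hypothetical names introduced here and are not part of Magika.

```python
def estimated_scan_seconds(
    num_files: int,
    cores: int = 1,
    files_per_sec_per_core: float = 1000.0,  # figure quoted in the article
    parallel_efficiency: float = 1.0,        # assumption: 1.0 means perfect scaling
) -> float:
    """Estimate wall-clock seconds to scan num_files under simple scaling assumptions."""
    if num_files < 0 or cores < 1:
        raise ValueError("num_files must be >= 0 and cores must be >= 1")
    if files_per_sec_per_core <= 0 or not 0 < parallel_efficiency <= 1:
        raise ValueError("rate must be positive and efficiency must be in (0, 1]")
    # Aggregate throughput across all cores, discounted by scaling efficiency.
    throughput = files_per_sec_per_core * cores * parallel_efficiency
    return num_files / throughput
```

Under these assumptions, one million files would take about 1,000 seconds (roughly 17 minutes) on one core and about 125 seconds on eight cores. Real numbers depend on file sizes, disk speed, and how well the workload parallelizes.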
The project is fully open source, and installation takes a single curl or PowerShell command. It also supports Python and TypeScript integrations and offers a native Rust command-line client. To train the model, the team used a 3TB dataset and relied on Gemini to generate synthetic samples for rare file types, which lets Magika handle formats that don’t have large, publicly available corpora.
The source code is available on GitHub, along with comprehensive documentation. Google’s decision to rebuild the system in Rust reflects its emphasis on performance and safety. The release should benefit software developers, cybersecurity professionals, and data analysts who depend on accurate file type detection in their workflows.