Google is moving more AI work onto the device itself
Google’s latest Gemma 4 release signals a more ambitious push toward local AI that runs directly on phones and other hardware rather than relying on the cloud. According to The Decoder, the new open-source model family can process text, images, and audio entirely on-device, and it can also use tools through built-in “agent skills” such as Wikipedia access, interactive maps, and QR code generation.
That combination matters because it changes the practical meaning of mobile AI. Many consumer systems already present themselves as assistants, but their core processing often still depends on remote servers. Gemma 4 is positioned differently. The appeal is not only speed or convenience; it is the ability to keep data on the device while still enabling a broader range of actions.
The timing fits a wider industry trend. As models become more efficient and mobile chips improve, companies are trying to shift more intelligence to edge hardware. That can reduce latency, lower server costs, and make privacy claims more credible. Google is now trying to turn that technical direction into a developer platform and a consumer-facing distribution channel at the same time.
Smaller models target mainstream smartphones
The Decoder says Gemma 4 arrives in four variants, two of which, E2B and E4B, are built specifically for smartphones. The “E” stands for effective parameters, meaning the parameters actually active during inference rather than the total number the model stores. Quantized, the E2B model takes up about 1.3 GB on-device, while E4B needs around 2.5 GB.
Those footprints are notable because they point to a practical deployment strategy rather than a showcase model meant only for premium hardware. The report says E2B and E4B can run on phones with 6 GB and 8 GB of RAM, respectively. If that holds in everyday use, it widens the addressable installed base considerably and makes local multimodal AI less dependent on flagship devices.
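As a rough sanity check on those numbers, weight storage scales with the parameter count times the bits stored per weight. The sketch below assumes roughly 2 billion and 4 billion effective parameters and 4-bit quantization (illustrative assumptions; the report specifies neither), which lands close to the reported footprints once embeddings and runtime overhead are accounted for:

```python
# Back-of-the-envelope weight footprint: parameters * bits-per-weight / 8 bytes.
# The parameter counts (2B / 4B effective) and 4-bit quantization below are
# illustrative assumptions, not confirmed details of the release.

def weight_footprint_gb(params: float, bits_per_weight: float = 4.0) -> float:
    """Approximate size of the quantized weights in GB (10**9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

for name, params, reported in [("E2B", 2e9, 1.3), ("E4B", 4e9, 2.5)]:
    raw = weight_footprint_gb(params)
    # Embedding tables and runtime buffers plausibly account for the remainder.
    print(f"{name}: ~{raw:.1f} GB raw weights vs ~{reported} GB reported on-device")
```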
Google also says the phone variants run up to four times faster than the previous generation while cutting battery drain by up to 60 percent. Arm’s own benchmarks, cited by The Decoder, show even larger processing gains on newer Arm chips. The exact real-world experience will vary by device, but the message is clear: model architecture and hardware optimization are starting to matter as much as raw size.
The bigger story is agentic capability without the cloud
What separates this release from a simple efficiency update is the emphasis on tool use. Gemma 4 is not described merely as a compact multimodal model. It is framed as an agentic system that can autonomously call on specific tools through bundled skills. In practical terms, that means a model running locally can do more than answer questions from a prompt; it can retrieve information, work with maps, or generate useful outputs without sending the interaction to a remote service.
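To make that concrete, a local agent loop typically alternates between model output and tool execution inside a single process on the handset. The sketch below shows the general shape of that pattern; the `local_model` callable, the message format, and the skill names are hypothetical stand-ins, not the actual Gemma 4 interface.

```python
# Generic shape of a local agent loop: the model either answers directly or
# requests a tool call; the runtime executes the tool on-device and feeds the
# result back. `local_model`, the message format, and the skill names are
# hypothetical stand-ins, not the actual Gemma 4 interface.
import json

SKILLS = {
    "wikipedia_lookup": lambda query: f"(cached summary for '{query}')",
    "generate_qr": lambda text: f"(QR code bytes for '{text}')",
}

def run_agent(local_model, user_prompt: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        reply = local_model(messages)   # inference happens on-device
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]     # final answer; nothing left the phone
        result = SKILLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "stopped: no final answer within max_steps"
```

The point the article emphasizes is that every step of that loop, including the tool results, can stay on hardware the user controls.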
That architecture has strategic implications. On-device agents promise a different balance between functionality and privacy. If the model, the user’s inputs, and the tool orchestration all stay on hardware the user controls, companies can offer a more private AI experience while still supporting richer workflows.
It also opens a path for customization. The Decoder reports that developers can create and share custom skills through GitHub. That suggests Google is not just shipping a model family but trying to seed an ecosystem around portable, local AI behaviors.
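The report does not describe the skill format itself, so the following is purely a guess at the general shape: a skill as a named, documented function that a local runtime can expose to the model. Everything in it (the decorator, the metadata, the registry) is hypothetical.

```python
# Hypothetical way a shared custom skill could slot into a registry like the
# one in the agent-loop sketch above; the decorator and metadata are
# illustrative, since the report says only that custom skills can be created
# and shared via GitHub.

SKILLS: dict = {}  # same registry idea as in the agent-loop sketch

def skill(name: str, description: str):
    """Register a function, plus a description the model can read, as a tool."""
    def register(fn):
        fn.description = description
        SKILLS[name] = fn
        return fn
    return register

@skill("unit_convert", "Convert kilometres to miles or kilograms to pounds.")
def unit_convert(value: float, conversion: str) -> str:
    factors = {"km_to_mi": 0.621371, "kg_to_lb": 2.20462}
    return f"{value * factors[conversion]:.2f}"
```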
Google is pairing open release with broad distribution
Gemma 4 is released under the Apache 2.0 license, which The Decoder describes as commercially friendly. That matters because licensing can determine whether a model family becomes a serious development foundation or remains mostly a research curiosity. A permissive license reduces friction for experimentation, adaptation, and commercial deployment.
Google is also distributing the experience through the free Google AI Edge Gallery app for Android and iOS. The Decoder says that since Gemma 4 launched, the app has climbed to fourth among the most-downloaded free productivity apps in Apple’s iOS App Store, behind Claude, Gemini, and ChatGPT. Even if rankings fluctuate, that data point suggests a meaningful level of early consumer curiosity about local AI experiences.
The report adds that Gemma 4 builds on the same research base as Google’s proprietary Gemini 3 model and that the smartphone variants will serve as the foundation for Gemini Nano 4 on Android. That linkage is significant. It implies Google is treating open and proprietary model lines as part of the same larger stack, with Gemma acting both as a developer platform and as a proving ground for mobile deployment.
Why this release matters in the competition for AI platforms
The AI market is increasingly splitting into several overlapping contests: frontier cloud models, enterprise deployment, developer ecosystems, and now device-native intelligence. Gemma 4 gives Google a stronger position in the last two categories. By combining open weights, mobile optimization, tool use, and a consumer app, the company is trying to make local AI more tangible to both builders and end users.
The move also reflects a competitive necessity. If AI is to become a default layer across phones and other personal devices, the companies that control efficient local models and the surrounding developer experience will gain an important advantage. Cloud access will remain central for larger workloads, but not every interaction needs a data-center-scale response.
Gemma 4 therefore points toward a more hybrid future. Some AI tasks will stay remote because they require bigger models or broader compute. Others will increasingly run where the user already is: on the handset, inside the operating system, and close to sensitive personal data.
For Google, the release is an attempt to shape that future early. For developers, it offers a more practical local foundation. For users, it suggests that “AI on your phone” may soon mean something more literal than a branded shortcut to the cloud.
This article is based on reporting by The Decoder. Read the original article.