I Built an ML Survey Analyzer Without the Cloud — Here's How
Ashton's Blog
2025-07-23

Nowadays, building AI-supercharged applications almost always involves an API. Whether it's HuggingFace, OpenAI, Perplexity—you name it—there are plenty of services out there to cover the AI needs of any digital product. While fetching from an endpoint is convenient, it doesn't fit every set of requirements.
What if the application needs to run offline? What if security requirements don't allow data to leave the client? What if there is no budget for API costs? What if there are no VMs to host an Ollama instance? If any of these constraints sound familiar, read on to see how I handled them myself.
I satisfied all of these requirements when I built an ML Survey Analyzer application, where all computation occurs on the client machine. You may be wondering how exactly this is possible. The solution has two facets: a desktop client built with Electron, and an embedded Python back end that runs a lightweight sentence transformer model directly within the application.
The Python back end, which uses ONNX Runtime, is launched from Electron as a Node child process, and that channel carries the inter-process communication between Python and the renderer process (HTML). If you'd like to check out the repository, it is linked below.
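To make that bridge concrete, here is a simplified sketch of the Python side of the channel: the script is spawned by Electron's child_process and exchanges newline-delimited JSON over stdin/stdout. The `analyze` command and the response fields are illustrative placeholders, not the app's actual protocol.

```python
import sys
import json

def handle(request):
    # Placeholder dispatch: the real app would run the embedding/clustering pipeline here.
    if request.get("command") == "analyze":
        return {"status": "ok", "clusters": []}
    return {"status": "error", "message": "unknown command"}

# Read newline-delimited JSON requests from Electron and reply on stdout.
for line in sys.stdin:
    if not line.strip():
        continue
    response = handle(json.loads(line))
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()  # flush so Electron's stdout listener sees the reply immediately
```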
What the App Does
The ML Survey Analyzer was designed to help instructors and departments understand large volumes of qualitative survey feedback without relying on manual sorting or cloud-based NLP platforms. It performs sentence embedding, clustering, and topic extraction—all locally.
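To give a feel for the embedding step, here is a hedged sketch of running a sentence-transformer ONNX export locally with ONNX Runtime and the `tokenizers` library. The file names, input names, and mean pooling are assumptions based on a typical MiniLM-style export, not the app's exact code.

```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

# Placeholder paths: a tokenizer.json and model.onnx exported from a
# MiniLM-style sentence transformer are assumed to sit next to this script.
tokenizer = Tokenizer.from_file("tokenizer.json")
tokenizer.enable_padding()
tokenizer.enable_truncation(max_length=256)
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

def embed(sentences):
    encoded = tokenizer.encode_batch(sentences)
    input_ids = np.array([e.ids for e in encoded], dtype=np.int64)
    attention_mask = np.array([e.attention_mask for e in encoded], dtype=np.int64)
    outputs = session.run(None, {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "token_type_ids": np.zeros_like(input_ids),  # input names depend on the export
    })
    hidden = outputs[0]                # (batch, seq, dim) token embeddings
    mask = attention_mask[:, :, None]  # mean-pool over real tokens only
    return (hidden * mask).sum(axis=1) / mask.sum(axis=1)
```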
Once survey responses are embedded using the transformer model, the app performs clustering using HDBSCAN, a density-based algorithm that doesn’t require you to predefine the number of clusters. This allows the system to discover natural groupings of similar responses. From there, it extracts representative keywords from each cluster to describe the underlying theme. This results in readable summaries of free-text feedback—even when analyzing hundreds of responses.
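And a rough sketch of the clustering and keyword step, assuming the embeddings from above. The HDBSCAN parameters and the TF-IDF keyword scoring are illustrative choices, not necessarily what the app ships with.

```python
import numpy as np
import hdbscan
from sklearn.feature_extraction.text import TfidfVectorizer

def cluster_and_label(responses, embeddings, top_k=5):
    # Density-based clustering: no need to predefine the number of clusters.
    clusterer = hdbscan.HDBSCAN(min_cluster_size=5, metric="euclidean")
    labels = clusterer.fit_predict(embeddings)

    themes = {}
    vectorizer = TfidfVectorizer(stop_words="english")
    for cluster_id in sorted(set(labels)):
        if cluster_id == -1:  # HDBSCAN marks outliers as -1
            continue
        docs = [r for r, l in zip(responses, labels) if l == cluster_id]
        tfidf = vectorizer.fit_transform(docs)
        scores = np.asarray(tfidf.sum(axis=0)).ravel()
        terms = np.array(vectorizer.get_feature_names_out())
        # Keep the top-k highest-scoring terms as the cluster's theme keywords.
        themes[cluster_id] = terms[scores.argsort()[::-1][:top_k]].tolist()
    return labels, themes
```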
Why I Built It This Way
I had three non-negotiable goals:
- No network dependency: The app should be fully functional offline, useful in classrooms, meetings, or at home where an internet connection is not guaranteed.
- No cloud uploads: Especially in educational or healthcare contexts, sending student or patient feedback to the cloud—even anonymously—can be a non-starter.
- Full ownership of the stack: I wanted a solution where I understood every piece of the puzzle, from tokenization to UI interaction.
By building the interface in Electron and running a lightweight local ML pipeline using ONNX in Python, I was able to meet all three goals while still delivering meaningful insights through modern NLP techniques.

What I Learned
- ONNX is underrated: Exporting models from PyTorch to ONNX was easier than expected (though it took a bit of debugging), and inference speed on CPU was solid; a sketch of the kind of export call involved follows this list.
- Electron and Python play surprisingly well together: Using Node's child_process module, I maintained a clean interface between the UI and backend logic.
- Embedded Python interpreters: Simply bundling my Python virtual environment into the Electron build worked on my machine, but I quickly realized that the virtual environment uses symlinks to the Python packages and executables installed locally. To run Python as a child_process on a machine without a Python installation, you must bundle an embedded interpreter and install packages without symlinks.
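For reference, here is the kind of PyTorch-to-ONNX export call I mean, sketched with a placeholder checkpoint; the model name, input set, and opset version are assumptions rather than my exact export script.

```python
import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "sentence-transformers/all-MiniLM-L6-v2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# torchscript=True makes the model return plain tuples, which traces cleanly.
model = AutoModel.from_pretrained(checkpoint, torchscript=True).eval()

dummy = tokenizer(["example sentence"], return_tensors="pt")
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"], dummy["token_type_ids"]),
    "model.onnx",
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["last_hidden_state"],
    dynamic_axes={
        n: {0: "batch", 1: "seq"}
        for n in ["input_ids", "attention_mask", "token_type_ids", "last_hidden_state"]
    },
    opset_version=14,
)
```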
Final Thoughts
There’s a growing demand for “cloudless AI” solutions—apps that respect privacy, work offline, and give users full control. While cloud APIs make prototyping faster, building a robust, standalone ML tool feels like forging your own path with full visibility. That’s something I think more developers will gravitate toward as privacy, security, and cost concerns grow.
If you're interested in building something similar—or just want to analyze student or customer feedback locally—feel free to dig into the repo and fork it for your needs.