Crowd-Sourced Computing on the Web - Proof of Concept

Years ago, services like SETI@home introduced us to crowd-sourcing computing cycles in order to analyze big data. Unfortunately, these programs require that you install some software on your computer and keep it running. Lately, I’ve been noodling over applying the same concept to the web, and I thought I’d share my thoughts.

When somebody is viewing a webpage, the browser is relatively idle most of the time, and their machine probably is as well. For years now, modern browsers have implemented the Web Worker API, exposing a way to tap into those unused CPU cycles. The missing piece is connecting big data to those computing cycles, and possibly rewarding web surfers or site owners for the cycles used.

Article updated on April 5, 2013

If you are seeing the "not supported" error in your browser (after the first worker starts), try Firefox. It seems the latest WebKit treats the data URI of a dynamically inserted JavaScript file differently than one included when the page loads.

How to do it…

For starters, we need a marketplace where somebody who has a lot of data to analyze can go to buy cycles. This is the hard part of the problem, and the part I hope one of you might solve. The gist is that a provider would upload a large amount of segmentable data along with a function for analyzing the segments.
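As a rough sketch (the shape and names here are hypothetical, not an existing API), a provider's submission to the marketplace might look something like this:

// hypothetical example of what a big data provider might upload to the
// marketplace: segmentable data plus a function for analyzing each segment
var submission = {
	segments: [
		[12, 7, 42],
		[3, 19, 8]
	],
	// must be self-contained, since it will run inside a web worker
	analyzeSegment: function(aSegment) {
		var total = 0, i;
		for (i = 0; i < aSegment.length; i += 1) {
			total += aSegment[i];
		}
		return total;
	}
};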

Sites or users who have opted in to the marketplace include two JavaScript files. The first is the static code to manage the web workers and process data URIs:

(function() {
	var i = 1,
		failCount = 0,
		MAX_FAIL = 10;

	function fetchMoreData(sData) {
		if (! window.WebAnalyzerStop) {
			// fetch more data using JSONP
			var el = document.createElement('script');
			el.async = true;
			el.src = '{PATH_TO_DATA_URL}?data=' + encodeURIComponent(sData);
			// append to the head so the JSONP request fires
			document.getElementsByTagName('head')[0].appendChild(el);
		}
	}

	function WebAnalyzerProcessData(sDataUri) {
		try {
			var oWorker = new Worker(sDataUri);

			oWorker.onmessage = function(e) {
				// worker finished. data should be reported back to the server
				//	where it can be recorded and more data can be loaded for processing
				fetchMoreData(e.data);
			};

			oWorker.onerror = function() {
				// worker broke, probably should report somewhere,
				// 	but also get more data for crunching
				failCount += 1;

				// if something is seriously wrong, let's just stop
				if (MAX_FAIL > failCount) {
					fetchMoreData('');
				}
			};

			console.log("WebWorker " + i++ + " started!");
		}
		catch(e) {
			if (window.console) {
				console.log('WebWorkers are not fully supported by your browser');
			}
		}
	}

	// exposed globally so data loading JSONP requests can start new workers
	window.WebAnalyzerProcessData = WebAnalyzerProcessData;
}());

The second would be a call to the marketplace server that returns a JavaScript data URI and calls the WebAnalyzerProcessData function (this is a JSONP system with a known callback function name):

(function() {
    var WebAnalyzerDataUri = "data:text/javascript;charset=US-ASCII,var%20i%20%3D%20500000000%2C%20n%20%3D%200%2C%20dTime%20%3D%20(new%20Date()).getTime()%3Bwhile%20(0%20%3C%20i)%20%7Bn%20%2B%3D%20i%3Bi%20-%3D%201%3B%7DdTime%20%3D%20(new%20Date()).getTime()%20-%20dTime%3BpostMessage(%22Calculated%20%22%20%2B%20n%20%2B%20%22%20in%20%22%20%2B%20dTime%20%2B%20%22ms%22)%3B";

    if (window.WebAnalyzerProcessData) {
        window.WebAnalyzerProcessData(WebAnalyzerDataUri);
    }
}());

If you heard your fan kick on (it will after ~30 seconds), it is because I am demonstrating the system on this page right now. The web worker is summing the numbers from 1 to 500,000,000, then calling the data URL again (this is a demo, so the data URL returns the same thing each time, but ideally it would be dynamic).
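For reference, decoding the data URI above yields the worker script below (re-indented here for readability); it simply sums the series and reports the result and elapsed time back via postMessage:

// decoded contents of the demo data URI
var i = 500000000,
	n = 0,
	dTime = (new Date()).getTime();

while (0 < i) {
	n += i;
	i -= 1;
}

dTime = (new Date()).getTime() - dTime;
postMessage("Calculated " + n + " in " + dTime + "ms");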

How it works…

The static code manages creating web workers and fetching new data from the dynamic data URL (provided by the marketplace). We expose a global callback function, WebAnalyzerProcessData, to which the dynamic data script can send data URI encoded JavaScript. Inside a try/catch block, a new web worker is created from this data URI and its message and error events are subscribed to. The try/catch block is used because some older browsers do not yet support web workers, and IE does not support data URI blobs. Whether the worker finishes with an error or a message, we fetch more data, stopping only when MAX_FAIL is reached or somebody sets window.WebAnalyzerStop = true;. Messages are passed back to the marketplace using the data query parameter so the analysis can be recorded.
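As a usage note, a participating page (or a curious user via the developer console) can halt the loop at any time:

// setting this flag prevents fetchMoreData from requesting another chunk,
// so processing stops after the current worker finishes
window.WebAnalyzerStop = true;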

The dynamic data URL needs to extract a chunk of data and combine it with the data analysis function, creating a JavaScript file. This file is then encoded as a string (I used encodeURIComponent, but base64 or another encoding could be used as well) and returned as a data URI. This data URI is passed into WebAnalyzerProcessData, where another web worker will be created to process the next chunk of data.
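To make that concrete, here is a minimal sketch of what the marketplace's data endpoint might do, assuming a Node.js server; the chunk queue and the stringified analysis function are hypothetical stand-ins, and a real service would also need to track which chunks are outstanding and record returned results:

// minimal sketch of the marketplace's dynamic data endpoint (Node.js)
var http = require('http');

// hypothetical stand-ins: the real marketplace would pull these from storage
var aChunks = [[1, 2, 3], [4, 5, 6]],
	iNext = 0,
	sAnalyzeSegmentSource =
		'function analyzeSegment(a) { return a.reduce(function(s, n) { return s + n; }, 0); }';

http.createServer(function(req, res) {
	var chunk = aChunks[iNext++ % aChunks.length],
		// build a self-contained worker script: the analysis function, the
		// chunk of data, and a call that posts the result back to the page
		sWorkerJs = sAnalyzeSegmentSource +
			'var data = ' + JSON.stringify(chunk) + ';' +
			'postMessage(analyzeSegment(data));',
		sDataUri = 'data:text/javascript;charset=US-ASCII,' +
			encodeURIComponent(sWorkerJs);

	res.writeHead(200, { 'Content-Type': 'text/javascript' });
	// JSONP response: hand the data URI to the known global callback
	res.end('window.WebAnalyzerProcessData(' + JSON.stringify(sDataUri) + ');');
}).listen(8080);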

If a popular site were to participate, millions of chunks of data could be processed every day, helping solve some big data problems. Besides having to build the marketplace, the onus is on the big data provider to supply the marketplace with segmentable data and a function that can process each chunk. The marketplace would need to combine all the results and provide them back to the big data provider.

There’s more…

This technology really excites me, because there are many large data problems that cannot yet be feasibly solved by a single modern computer, but could be solved by millions of computers crunching away at them continuously. Additionally, our computers sit idle most of the time, and this would be a great way to put some of those CPU cycles to good use. If the marketplace is well implemented, it should be profitable by charging the big data providers to use the system, and able to pay bounties to participating sites or users.

That said, if the marketplace is not well curated or is intentionally malicious, it could also become the biggest botnet in the world, or possibly be used to attack some of the intractable problems like encryption, and who knows, maybe even Skynet.

Overall though, I think the benefits outweigh the risks, and hope some entrepreneur takes on the task of building this system.