

Neuroph on Hadoop: Massively Parallel Neural Network System?

On the Apache mailing list there is an interesting Google Summer of Code project proposal: to implement neural networks with back-propagation learning on Hadoop. The idea is to create support for a massively parallel neural network system that will be able to work with huge amounts of data. Possible applications would be typical neural network problems involving:

  • classification
  • prediction
  • recognition
  • association
  • statistical modeling

All of these could be useful in personalization and improving search technologies!

Apache Hadoop is a Java software framework that supports data-intensive distributed applications and enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers.
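The MapReduce model behind Hadoop can be illustrated with a small conceptual sketch in plain Java: a map phase emits key/value pairs, the framework groups them by key, and a reduce phase folds each group into a result. This is an illustrative simulation only; the real Hadoop API (`org.apache.hadoop.mapreduce.Mapper`/`Reducer`) adds distribution, fault tolerance, and HDFS I/O on top of the same idea.

```java
import java.util.*;
import java.util.stream.*;

// Conceptual, in-memory sketch of the MapReduce model (word count).
// Not Hadoop code -- it only mirrors the map -> shuffle -> reduce flow.
public class MapReduceSketch {

    // Map phase: split each input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Shuffle + reduce phase: group pairs by key, sum values per key.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.summingInt(Map.Entry::getValue)));
    }

    // Run both phases over a tiny in-memory "dataset".
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) pairs.addAll(map(line));
        return reduce(pairs);
    }
}
```

In Hadoop, the same two functions run on many nodes at once, which is what would let a neural network learning rule process a training set far larger than one machine's memory.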

The proposal is inspired by the design of the existing neural network framework Neuroph, since it is intuitive, easy to use, and provides great flexibility for extensions:

"This architecture is inspired from that of the open source Neuroph neural network framework. This design of the base architecture allows for great flexibility in deriving newer NNs and learning rules. All that needs to be done is to derive from the NeuralNetwork class, provide the method for network creation, create a new training method by deriving from LearningRule, and then add that learning rule to the network during creation."
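The extension pattern described in the quote can be sketched with minimal stand-in classes. The names mirror Neuroph's `NeuralNetwork` and `LearningRule`, but these are simplified illustrative stand-ins, not the actual framework code:

```java
// A new training method: derive from LearningRule.
abstract class LearningRule {
    // A real rule would iterate over the training set and adjust weights.
    abstract void learn(double[][] inputs, double[][] targets);
}

abstract class NeuralNetwork {
    private LearningRule learningRule;

    void setLearningRule(LearningRule rule) { this.learningRule = rule; }
    LearningRule getLearningRule() { return learningRule; }

    // Subclasses define the network topology here.
    abstract void createNetwork();
}

// Hypothetical backpropagation variant, derived from LearningRule.
class MyBackpropagation extends LearningRule {
    @Override
    void learn(double[][] inputs, double[][] targets) {
        // ...gradient computation and weight updates would go here...
    }
}

// A new network type: derive, build the topology, and attach the
// learning rule during creation -- exactly the steps the quote lists.
class MyPerceptron extends NeuralNetwork {
    MyPerceptron() { createNetwork(); }

    @Override
    void createNetwork() {
        // ...create layers and connections here...
        setLearningRule(new MyBackpropagation());
    }
}
```

The appeal of the pattern is that a Hadoop-backed learning rule could slot in the same way: derive a new `LearningRule` whose `learn` method dispatches work to the cluster, and attach it to an otherwise unchanged network.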

An interesting approach could be to extract an existing Neuroph interface and to provide implementations of it on top of Hadoop. That way, all neural network models and learning rules that are currently supported by Neuroph, and any developed in the future, could be easily ported. This approach would also provide a lightweight development environment, where all algorithms could first be tested and tweaked.
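That porting approach can be sketched as an extracted interface with swappable backends. The interface and class names below are illustrative assumptions, not part of Neuroph or Hadoop:

```java
// Hypothetical extracted interface: the operations a learning rule needs,
// independent of where the computation actually runs.
interface TrainingBackend {
    // Run one weight-update pass over the data and return the new weights.
    double[] runEpoch(double[] weights, double[][] data);
}

// Lightweight in-memory backend: the development environment where
// algorithms can first be tested and tweaked.
class LocalBackend implements TrainingBackend {
    @Override
    public double[] runEpoch(double[] weights, double[][] data) {
        double[] updated = weights.clone();
        // ...apply the learning rule sequentially over `data` here...
        return updated;
    }
}

// A HadoopBackend would implement the same interface, distributing the
// epoch as a MapReduce job -- so every model written against
// TrainingBackend could run in either environment unchanged.
```

The design choice is the usual one: code against the interface, and the decision between a single machine and a cluster becomes a one-line configuration change rather than a rewrite.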

There are some positive comments on the proposal at the moment, but we'll see whether it gets accepted!

One interesting thing to note is that the IDE for Hadoop is based on the NetBeans Platform, and since Neuroph was also recently announced to be ported to the NetBeans Platform, it looks like some powerful tools are coming out in this area. This is a very good example of how the NetBeans Platform can provide synergy between different tools and projects.

Published at DZone with permission of its author, Zoran Sevarac.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)


Toni Epple replied on Sun, 2010/04/11 - 11:23am

Hi Zoran, that's excellent news. Would be great to see this kind of synergy between independent projects.

Greetings from Bergen



Zoran Sevarac replied on Sun, 2010/04/11 - 4:12pm

Hi Toni,

Yes indeed. As a matter of fact there are a few more projects based on the NetBeans Platform that are of interest to me. First is Gephi, which can provide nice graph visualisation for neural networks, and the second is Maltego, which is a data mining tool and could be an interesting application for neural networks. Both look like high-quality tools.



Zoran Sevarac replied on Wed, 2010/04/28 - 6:59am

Great news! This project proposal has been accepted for the GSoC!

Carla Brian replied on Mon, 2012/05/07 - 9:37am

It allows for the distributed processing of large data sets across clusters of computers using a simple programming model. This is really nice.

Mateo Gomez replied on Tue, 2012/06/19 - 3:00am

freakingly awesome...this is very innovative netbeans



Matt Coleman replied on Tue, 2012/07/17 - 1:35am in response to: Toni Epple

great news indeed especially for the fans

