A quick introduction to creating an elasticsearch plugin that lets you define new REST endpoints.
Friday, September 9, 2011
Monday, August 29, 2011
Machine Learning Ex2 - Benchmarks
In my previous post, I implemented linear regression with gradient descent in Scala using two different methods: the standard built-in mathematical methods and Scalala, a Scala linear algebra library.
Shortly after writing the solution, I started wondering whether using Scalala had any impact on the solution's runtime cost. While Scalala does have the overhead of object creation, it also makes heavy use of specialized classes, which should provide a considerable performance improvement.
I decided to do some naive benchmarking. These benchmarks are nowhere near scientific, but they should provide a general sense of each solution's runtime. Since I was benchmarking the two Scala solutions, I decided to also look at the MATLAB/Octave and R solutions.
Sunday, August 21, 2011
Machine Learning Ex2 - Linear Regression
Implementing linear regression using gradient descent in Scala based on Andrew Ng's machine learning course.
Tuesday, August 9, 2011
Steal this database? Don't mind if I do.
A while back, Meetup.com issued a pseudo-challenge: steal their database. Nothing that would result in the FBI knocking on your door, mind you, but a look into their streaming API. Meetup.com streams all their public events and RSVPs via HTTP streaming or HTML5 WebSockets, so all that is required to steal their database is a connection to a stream and the ability to save the content.
Sunday, July 24, 2011
Next NoSQL Meetup: Real time processing with In-Memory-Data-Grid and NoSQL
The NoSQL NYC Meetup has been enjoying quite a year. Great talks from across the spectrum of the NoSQL world: document stores, graph databases, key-value stores, you name it. What I have learned from all these talks is that nothing wows a crowd more than a live demo. Marko Rodriguez's demonstration of Gremlin using OpenRDF Sail data not even on his computer was particularly fascinating.
That is why I am excited to see what Shay Hassidim from GigaSpaces brings this week. He is promising live demos working with big data. The type of data that makes NoSQL systems a necessity and not merely a premature optimization. The talk will cover data access patterns, MapReduce, Cassandra, and MongoDB.
If you are in New York City, stop by!
http://www.meetup.com/nosql-nyc/events/25379481/