Stock Prediction

Over the last weekend, I spent some hours tuning and enhancing my new prediction model for stock BAC. The result is quite promising. I still need to do more experiments before replacing my existing model on
Stay tuned for more infos.

Take a look at the following chart, you can see some clusters of buy signals generated before the peak.


You can check out my stock prediction model at

Stock Clustering Part I

When come to building a well balanced stock portfolio, it is desired to have different mixture of stocks, potentially from different sectors to minimize the risk. Clustering is a great tool to discover these different types of clusters. You can use different features to categorize these stocks, eg. historical prices, returns, etc.

I know I could have used python or R to run the clustering. But as a big data explorer, I decided to use Mahout and Spark to do the job. I will post my findings and experiences of using these tools in the upcoming posts.

Here are the initial clusters found using the pre-clustering canopy algorithm in Mahout to determine the initial number of K clusters. I will perform the K-Means clustering algorithm next.






The tedious part of using Mahout clustering is you have to transform your data into sequence file format of mahout vector writables. I will share my codes soon.

Please check out my stock prediction model too 🙂

To be continued…