Concepton

a device that is generating concepts



NASDAQ Network – Part I

Yes, this is what I am doing at 1am, when market shifts remind me that there is a hidden beauty within the pseudo-chaotic signals that rule our lives. Then I go back to work, which is an attempt to systematize the stock market into a single network and to understand it better through the network's static and dynamic properties. What can we find there? Lots of cool stuff: hidden connections, clusters, internal and cross-sector connections. What can we possibly derive from it? Investment considerations, robustness properties of the market, pathologies and more…

I am going to post findings in small chunks, the way I am actually working on this 🙂

So how shall we start? Certainly from the raw data. We need trading records of financial instruments for some period of time. I’ve got 2 sets of data, based on time range:

  • 1 week of 3.5k NASDAQ company stocks and exchange-traded fund (ETF) indexes at 5-minute granularity
  • 10 years of daily granularity data for NASDAQ stocks

The next step is to choose the set (or a subset, based on the required scope), gather the data and probably clean up and arrange the formats. Once the data is ready, we need to run cross-correlation. This gives us an NxN matrix of correlation coefficients (R squared, Rsq) between each pair of stocks. From this point on we will call each company stock or ETF a “Node” and a connection between two of them an “Edge”. This is because, as I said, we are going to build a network, and those are the terms for its basic components.
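To make the cross-correlation step concrete, here is a minimal sketch in plain Python. It is not my actual pipeline: the tickers and numbers are made up, the helper names `pearson_r` and `rsq_edges` are only for illustration, and it simply correlates whatever series you pass in (prices or returns, that choice is up to you).

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat series has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rsq_edges(series):
    """Return {(ticker_a, ticker_b): R^2} for every unordered pair of tickers."""
    return {
        (a, b): pearson_r(series[a], series[b]) ** 2
        for a, b in combinations(sorted(series), 2)
    }


# Hypothetical toy data: three made-up tickers, five samples each.
prices = {
    "AAA": [10.0, 10.5, 11.0, 10.8, 11.4],
    "BBB": [20.0, 21.0, 22.0, 21.6, 22.8],  # exactly 2 * AAA
    "CCC": [5.0, 4.0, 6.0, 3.0, 7.0],
}
edges = rsq_edges(prices)
for pair, rsq in edges.items():
    print(pair, round(rsq, 3))
```

Only the upper triangle of the NxN matrix is computed, since the correlation of A with B equals that of B with A. Note that squaring the coefficient drops its sign, so a strongly anti-correlated pair gets a high Rsq as well.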

Applying an Rsq threshold significantly reduces the number of stock pairs that count as correlated. By how much? Well, exponentially. This is important, since it reduces the load on our system and makes the analysis faster. In addition, it gives us the required focus of investigation:


Number of edges (stock connections, Y axis) as a function of the applied Rsq threshold (X axis). The drop is exponential, so it is presented here on a logarithmic scale.
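The curve above is just a count of surviving edges per threshold. A small sketch of that count, assuming the Rsq values are already in a flat list (the values below are invented for illustration and are not the NASDAQ data; `edge_count_by_threshold` is a hypothetical helper name):

```python
def edge_count_by_threshold(rsq_values, thresholds):
    """For each threshold, count how many R^2 values are strictly above it."""
    return {t: sum(1 for v in rsq_values if v > t) for t in thresholds}


# Hypothetical R^2 values, for illustration only.
rsq_values = [0.05, 0.12, 0.33, 0.48, 0.61, 0.77, 0.91, 0.93, 0.96, 0.99]
counts = edge_count_by_threshold(rsq_values, [0.0, 0.5, 0.9, 0.95])
for threshold, count in counts.items():
    print(f"Rsq > {threshold:.2f}: {count} edges")
```

Plotting these counts against the threshold with a logarithmic Y axis gives the kind of chart shown in the figure.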

I prefer to work with highly correlated signals, Rsq > 0.9, on the 5-minute granularity data, and a bit lower for the annual-scale signals. This is enough data for a single person to dig into during the night. For example, the 0.9 Rsq cleanup gives 364 nodes (companies and funds) and 5778 edges (connections between them) on my data set.

To start working easily, we need some visualization software. I like Gephi. We import the Edge table, with the Rsq values defined as “Weights”. So what does it look like?
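For the import, the Edge table can be a simple CSV file. A minimal sketch, assuming edges are kept as a dictionary of ticker pairs to Rsq, and using the `Source`, `Target` and `Weight` column names that Gephi's spreadsheet import recognizes (the tickers, values and the `gephi_edge_csv` helper are made up for illustration):

```python
import csv
import io


def gephi_edge_csv(edges, threshold):
    """Render edges with R^2 above the threshold as CSV text for an edge table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), rsq in sorted(edges.items()):
        if rsq > threshold:
            writer.writerow([a, b, f"{rsq:.4f}"])
    return buffer.getvalue()


# Hypothetical edges: (ticker_a, ticker_b) -> R^2.
edges = {("AAA", "BBB"): 0.97, ("AAA", "CCC"): 0.42, ("BBB", "CCC"): 0.93}
csv_text = gephi_edge_csv(edges, threshold=0.9)
print(csv_text)

# To save it for import, write the text to a file, e.g.:
# with open("edges.csv", "w", newline="") as f:
#     f.write(csv_text)
```

Since correlation is symmetric, each pair appears once and the graph should be treated as undirected.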

Cross-correlation network of 364 NASDAQ company stocks and funds on 16 Apr 2014 with correlation higher than 0.9, based on sampling at 5-minute granularity. Colors are based on market sector, node size on capital value.


Beautiful, isn’t it?

Rsq > 0.95 gives a much more focused picture:

Cross-correlation network of 116 NASDAQ company stocks and funds on 16 Apr 2014 with correlation higher than 0.95, based on sampling at 5-minute granularity. Colors are based on market sector, node size on capital value.



Market2Book (M2B) project kick-off


Happy to announce the kick-off of a new project that I have been preparing for several years. Market2Book is Equity Research of consumer companies for production optimization and stock market investment optimization. I am happy to work with a team of BI and Advanced Analytics gurus who are making the magic behind the vision.

System components


The new landing page at www.Market2book.com (also redirected from http://www.markettobook.com) is hopefully a sign of coming progress… More (obviously) to come.

This is a WIP project. Some modules, like Lead generation, Risk modeling and Signal Presentation, are done manually or with scripts; some have automation that is developed and in the evaluation stage (such as Model generation, Model Simulation and Data processing); and some are planned for future development (e.g. Order application, Portfolio management).

What can I say – Good luck!