
Short 2 Long ETF Modeling



Long and Short ETFs are expected to move as exact opposites, yet empirical data shows an interesting duality in their dependency during rising and falling markets. Drilling down into this dependency helps to model ETF price and gain relations, see price distortion on the gain/loss tails, get insight into ETF behavior during a financial crisis, and more. This is pretty dense work and might need explanation and emphasis on details. You are welcome to invite me for a drink 🙂

As I am an engineer, not from the financial world, and doing this for fun, please take it with a grain of salt and of course do not make any investments based on the model without consulting a professional! Feedback and criticism are welcome. What more can I say? Enjoy! 😉

Empiric Data

Let’s take as an example a couple of ProShares ETFs – QLD (UltraLong) and QID (UltraShort) on NASDAQ.


Pic 1. QLD and QID time domain

We can see that an imaginary portfolio holding both has an average gain of zero over time, with some peaks:


Pic 2. Short + long bundle profit

A more interesting picture appears if we plot the short values versus the long values:


 Pic 3. Long vs Short (QLD vs QID) – we can see clusters

We can see lots of clusters with different variations of inverse proportion. To examine this empirically we can split the data into clusters and fit a regression to each:


Pic 4. High Rsq 1/x clusters of Long vs Short ETF data

Initially I thought that the dependency is a simple inverse (and that is what we see, with a very good Rsq):

PL = C/PS

where “PL” is the price of the Long ETF, “PS” is the price of the complementary Short ETF and “C” is some constant.

Just to make sure we’ve got an inverse proportion, I took the log of both sides and plotted one against the other:

PL = C/PS -> ln(PL) = ln(C) - ln(PS)

Now I had nice linear dependencies:


Pic. 5 log(long) vs log(short). Different colors represent different clusters with linear dependency

Then I tried to investigate 2 types of weird behaviors:

  1. During the time of 2008 crisis
  2. Change in log “skews”


Pic 6. Two types of behaviors: Changing skews and chart “blowing up” during crisis

At the start of the crisis the chart “blew up”: the log-log plot went from linear to chaotic, with a low Rsq for the linear regression. During that time the long-to-short dependency was heavily violated.

Next time we see a high variance around the linear regression and big squared errors, can we say that a crisis is coming and predict how explosive the market will be?

Skews and S/L dependency equation

Looking at the clusters of lines we can see that the skew (slope) varies, i.e. it is clearly not always “-1”.

If the slope of the lines varies after taking the log, the variation has to come from a power on PS. So we can write a corrected equation:

PL=C/PS^k (eq. 1)

This is an empirical equation based on the analysis of short-to-long behavior.

Taking the log of both sides we get ln(PL) = ln(C) - k*ln(PS): exactly the straight lines we have seen, with slope -k and intercept ln(C).
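
A minimal Python sketch of this fit, assuming NumPy is available. The function name fit_power_law and the synthetic price cluster are made up for illustration (they are not the data behind the charts). The point is only that regressing ln(PL) on ln(PS) recovers k as minus the slope and C as the exponential of the intercept.

```python
import numpy as np

def fit_power_law(ps, pl):
    """Fit ln(PL) = ln(C) - k*ln(PS) by ordinary least squares.

    Returns (k, C, r_squared).
    """
    x = np.log(np.asarray(ps, dtype=float))
    y = np.log(np.asarray(pl, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return -slope, np.exp(intercept), 1.0 - ss_res / ss_tot

# Synthetic cluster (NOT market data): PL = C / PS^k with C = 2000 and k = 1.05
ps = np.linspace(40.0, 60.0, 50)
pl = 2000.0 / ps ** 1.05

k, c, rsq = fit_power_law(ps, pl)
print(f"k = {k:.4f}, C = {c:.1f}, Rsq = {rsq:.6f}")
```

On real prices the fit would be run per cluster, since each cluster has its own k and C.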

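How does it match the “zero opposite portfolio profit” concept?

Let GL[n] and GS[n] be the gains on the Long and the Short ETF at step n, i.e. GL[n]+1 = PL[n]/PL[n-1] and GS[n]+1 = PS[n]/PS[n-1]. Then:

GL[n]+1 = PL[n]/PL[n-1] = {C[n]/PS[n]^k[n]} / {C[n-1]/PS[n-1]^k[n-1]} = {C[n]/ C[n-1]} / {PS[n]^k[n]/ PS[n-1]^k[n-1]}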

If C[n] = C[n-1] and k[n] = k[n-1] = 1, we would get:

GL[n]+1 = PS[n-1]/ PS[n] = 1/(GS[n]+1) –> (GL[n]+1)(GS[n]+1) = 1 –> GL[n] GS[n]+ GL[n]+ GS[n] = 0

i.e. GL[n] + GS[n] = -GL[n]*GS[n] > 0 (the two gains have opposite signs), which should not be possible: otherwise the opposite-pairs portfolio would offer an arbitrage.

But this is what we actually see. If we plot the actual GL[n] versus 1/(GS[n]+1)-1 we get a straight line, i.e. prices are set to fit C[n] = C[n-1] and k[n] = k[n-1] = 1, which is a mismatch between ETF pricing and the zero gain baseline of the opposite portfolio. This may be explained by the fact that these ETFs are “marketed” as very short-term financial instruments.


Pic 7. Actual data: GL[n]+ GS[n] = -GL[n] GS[n] > 0
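
The identity behind Pic 7 can be checked numerically. This is a small sketch for the constant-C, k = 1 case only. The function name long_gain_from_short and the sample gains are hypothetical.

```python
def long_gain_from_short(gs):
    """Gain on the long ETF implied by (GL+1)(GS+1) = 1, i.e. constant C and k = 1."""
    return 1.0 / (1.0 + gs) - 1.0

# Hypothetical short-ETF gains, chosen only to cover both signs
sample_gains = (-0.10, -0.02, 0.02, 0.10)

for gs in sample_gains:
    gl = long_gain_from_short(gs)
    pair_gain = gl + gs
    # GL + GS equals -GL*GS, which is positive for any non-zero GS > -1
    print(f"GS={gs:+.2f}  GL={gl:+.4f}  GL+GS={pair_gain:+.5f}  -GL*GS={-gl * gs:+.5f}")
```

Algebraically the sum is GS^2/(1+GS), so under this pricing the pair gain is never negative, whichever way the market moves.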

To build a correct pricing model we need to analyze and define a varying k and/or C. The actual C[n] and k[n] change all the time to keep the long-term zero gain equilibrium. We can see that C[n] goes down, while the slope -k[n] is very unstable around “-1” (filtered with Rsq>0.99):

Example of k for log(DDM) versus log(DXD) linear regression:


Pic 8. Log Short2Long Linear regression slope over time

Example of C for log(DDM) versus log(DXD) linear regression:


Pic 9. Log Short2Long Linear regression intercept over time

* Note: from here on, all calculations and examples are based on DDM and DXD data, but the results were verified on other ETF pairs as well.

We can see that C is changing and going down over time.

Let’s assume C[n] = C[n-1]*Cx[n] (call the ratio “Cx”) and k[n] = k[n-1] (call it “k”). Then:

GL[n]+1 = PL[n]/PL[n-1] = {C[n]/PS[n]^k} / {C[n-1]/PS[n-1]^k} = {C[n]/C[n-1]} / {PS[n]/PS[n-1]}^k = Cx[n]/(GS[n]+1)^k

GL[n] = Cx[n]/(GS[n]+1)^k - 1 (eq. 2)
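
As a quick sanity check of eq. 2, here is a sketch that builds two consecutive price points from eq. 1 with the same k but a changed C, and compares the directly computed long gain with the one eq. 2 gives. All numbers and the name gain_long are hypothetical.

```python
def gain_long(gs, cx, k):
    """Eq. 2: gain on the long ETF from the gain on the short ETF."""
    return cx / (1.0 + gs) ** k - 1.0

# Hypothetical values: same k on both days, C drifts slightly down
k = 1.05
c_prev, c_now = 2000.0, 1990.0
ps_prev, ps_now = 50.0, 52.0

# Long prices implied by eq. 1: PL = C / PS^k
pl_prev = c_prev / ps_prev ** k
pl_now = c_now / ps_now ** k

gs = ps_now / ps_prev - 1.0      # gain on short
cx = c_now / c_prev              # Cx = C[n] / C[n-1]
gl_direct = pl_now / pl_prev - 1.0
gl_model = gain_long(gs, cx, k)
print(f"direct GL = {gl_direct:+.6f}, eq. 2 GL = {gl_model:+.6f}")
```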

Taking the zero gain principle as a baseline, GS[n] = -GL[n]; call this common value “G” (so G = GS = -GL). Then:

1 - G = Cx/(G+1)^k (eq. 2a)

Modeling Cx[n]

From eq. 2a: Cx[n] = (1-G)(1+G)^k

Evaluated for the actual G and k (2-week linear regression, Rsq > 0.999) and plotted versus G, this function looks like:


Pic 10. Cx model as function of bi-weekly gain on Short

Red points are k > 1 (we will see later that this means a bearish market) and green points are k < 1 (bullish market).

That means C[n] is permanently decreasing, whether the market is rising or falling: C[n] = C[n-1]*Cx[n] < C[n-1]
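
To see why Cx stays below 1 for either sign of the gain, here is a small sketch of eq. 2a solved for Cx. It is evaluated for k = 1 only, where Cx reduces to 1 - G^2. For other k the value shifts with k. The function name cx_from_gain and the sample gains are hypothetical.

```python
def cx_from_gain(g, k=1.0):
    """Eq. 2a solved for Cx: Cx = (1 - G) * (1 + G)**k."""
    return (1.0 - g) * (1.0 + g) ** k

# Hypothetical bi-weekly gains on short, both signs
for g in (-0.10, -0.05, 0.05, 0.10):
    print(f"G={g:+.2f}  Cx={cx_from_gain(g):.4f}")
```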

This is how the actual decreasing C[n] looks over time (2-week linear regression, Rsq > 0.999) for k>1 (red) and k<1 (green):


 Pic 11. Log Short2Long Linear regression intercept over time (filtered by high Rsq, slope-based colorization) and k-driven behavioral duality

A quadratic regression (on the sample filtered by Short2Long linear regression Rsq > 0.999) gives:


Pic 12. Cx model as function of bi-weekly gain on Short (quadratic function regression)

Cx[n] = (1-G)(1+G)^k ≈ 1.0016 - 0.7595*G^2 (eq. 3)

Rsq = 0.97
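
A regression of this shape (constant plus a G^2 term, no linear term) can be sketched with NumPy least squares. The data below is synthetic, generated from made-up coefficients 1.0 and -0.75 with a little noise, so the output is not the eq. 3 result. The name fit_cx_quadratic is hypothetical.

```python
import numpy as np

def fit_cx_quadratic(g, cx):
    """Least-squares fit of Cx = a + b*G^2. Returns (a, b)."""
    g = np.asarray(g, dtype=float)
    design = np.column_stack([np.ones_like(g), g ** 2])
    (a, b), *_ = np.linalg.lstsq(design, np.asarray(cx, dtype=float), rcond=None)
    return a, b

# Synthetic sample (NOT market data): Cx = 1.0 - 0.75*G^2 plus small noise
rng = np.random.default_rng(0)
g = rng.uniform(-0.15, 0.15, size=200)
cx = 1.0 - 0.75 * g ** 2 + rng.normal(0.0, 0.0005, size=200)

a, b = fit_cx_quadratic(g, cx)
print(f"Cx ~ {a:.4f} + ({b:.4f})*G^2")
```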

* Note A: Strictly speaking Cx depends slightly on k, but to simplify the model we can assume it is independent.

* Note B: The chosen sample has an impact on the model. For example, later I will fine-tune it with a larger number of points by relaxing the Rsq filter of the Log Short2Long linear regression from >0.999 to >0.997.

If we take this approximate equation for Cx[n] and assume k=1, we get:

GL = Cx/(1+GS)^k - 1 = (1.0016 - 0.7595*GS^2)/(1+GS) - 1

A linear regression between the actual GL and the GL calculated by this model as a function of GS shows good correlation (total Rsq = 0.995) around zero, while the two differ at high gains (as we will see, this comes from distortions at big gains/losses):


Pic 13. Actual Gain on Long versus modeled one (function of Gain on Short)
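
For reference, a sketch of this model next to the plain Cx = 1, k = 1 relation. The coefficients are the ones from eq. 3. The function names and the sample gains are hypothetical.

```python
def modeled_gain_long(gs, a=1.0016, b=-0.7595, k=1.0):
    """GL = Cx/(1+GS)^k - 1 with the fitted Cx = a + b*GS^2 (eq. 3)."""
    cx = a + b * gs ** 2
    return cx / (1.0 + gs) ** k - 1.0

def naive_gain_long(gs):
    """Same relation with Cx = 1 and k = 1."""
    return 1.0 / (1.0 + gs) - 1.0

# Hypothetical gains on short: small and large, both signs
for gs in (-0.20, -0.05, 0.05, 0.20):
    print(f"GS={gs:+.2f}  model GL={modeled_gain_long(gs):+.4f}  "
          f"Cx=1 GL={naive_gain_long(gs):+.4f}")
```

The two versions are close for small gains and drift apart as |GS| grows, because the G^2 term in Cx only matters on the tails.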

Examination of “k”

Let’s assume k[n] = k[n-1], call it “k” (not necessarily equal to “1”), and again take the zero gain principle GS[n] = -GL[n] = G as a baseline:

GL[n]+1 = Cx/(GS[n]+1)^k -> 1 - G = Cx/(G+1)^k (eq. 2a)

Solving for k, with log_(1+GS) denoting the logarithm to base (1+GS):

k = log_(1+GS)(Cx) - log_(1+GS)(1 - GS) (eq. 4)

Substituting eq. 3:

k = log_(1+GS)[(1.0016 - 0.7595*GS^2)/(1 - GS)] = log_(1+GS)(1.0016 - 0.7595*GS^2) - log_(1+GS)(1 - GS)

In the same way we can model k as a function of the gain on Long (using GS = -GL):

k = log_(1-GL)(Cx) - log_(1-GL)(1 + GL) (eq. 5)

This is an interesting function. It behaves nicely for GS between -1 and 1 (100%), “explodes” as GS approaches 1, i.e. GL approaches -1 (total market fallout), and drops to zero as GS approaches -1 (strong growth). Plotting the function:


Pic 14. Simulation of k as function of Gain (short)

In addition, the function shows a “singularity” point around GS = 0 and a “duality” of “k”:


Pic 15. Simulation of k as function of Gain (short) – zoom in on singularity point

Finally, the model shows that for positive GS (bearish market) k is greater than 1, while for negative GS (rising market) k is less than 1.
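
The model of k can be evaluated directly. This sketch rewrites the base-(1+GS) logarithm with natural logs (change of base) and uses the Cx coefficients from eq. 3. The function names cx_model and k_model are hypothetical.

```python
import math

def cx_model(gs, a=1.0016, b=-0.7595):
    """Quadratic Cx approximation (eq. 3)."""
    return a + b * gs ** 2

def k_model(gs):
    """k = ln(Cx / (1 - GS)) / ln(1 + GS), defined for -1 < GS < 1 and GS != 0."""
    if not -1.0 < gs < 1.0 or gs == 0.0:
        raise ValueError("GS must lie in (-1, 1) and be non-zero")
    return math.log(cx_model(gs) / (1.0 - gs)) / math.log(1.0 + gs)

# Both sides of zero, plus two points very close to the singularity
for gs in (-0.10, -0.001, 0.001, 0.10):
    print(f"GS={gs:+.3f}  k={k_model(gs):+.4f}")
```

With these coefficients k comes out above 1 at GS = +0.10 and below 1 at GS = -0.10. Very close to zero the values run away in opposite directions, because the intercept 1.0016 is not exactly 1 while ln(1+GS) goes to zero.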

To examine the empirical difference between “k” coefficients, let’s run a linear regression over a 10-day moving window of log(ETF) data and keep only the windows with Rsq > 0.999 (i.e. those that lie on a near-perfect line). If we plot their slopes we see a dual distribution:

Actual “k” of QLD vs QID:


Pic 16. Actual slope of log(QLD) vs log(QID) linear regression (filtered by Rsq > 0.999)

Actual “k” of DDM vs DXD:


Pic 17. Actual slope of log(DDM) vs log(DXD) linear regression (filtered by Rsq > 0.999)
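
The moving-window procedure can be sketched as follows, assuming NumPy. The window length and Rsq threshold follow the text. The function name rolling_k and the synthetic price series are hypothetical and only exercise the code.

```python
import numpy as np

def rolling_k(ps, pl, window=10, min_rsq=0.999):
    """Regress ln(PL) on ln(PS) in a sliding window; keep k = -slope when Rsq > min_rsq.

    Returns a list of (end_index, k, r_squared).
    """
    x = np.log(np.asarray(ps, dtype=float))
    y = np.log(np.asarray(pl, dtype=float))
    kept = []
    for end in range(window, len(x) + 1):
        xs, ys = x[end - window:end], y[end - window:end]
        slope, intercept = np.polyfit(xs, ys, 1)
        ss_res = np.sum((ys - (intercept + slope * xs)) ** 2)
        ss_tot = np.sum((ys - ys.mean()) ** 2)
        if ss_tot == 0.0:
            continue  # flat window, Rsq undefined
        rsq = 1.0 - ss_res / ss_tot
        if rsq > min_rsq:
            kept.append((end - 1, -slope, rsq))
    return kept

# Synthetic series (NOT market data): short ETF drifts down, long follows eq. 1 with k = 0.95
ps = 50.0 * np.exp(np.linspace(0.0, -0.2, 60))
pl = 2000.0 / ps ** 0.95

results = rolling_k(ps, pl)
print(len(results), "windows kept, first k =", round(results[0][1], 4))
```

A histogram of the kept k values is what would show one hump or two on real data.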

We can see that instead of a single power of “1” we’ve got a duality of powers around “1”: [0.92, 0.98] and [1.04, 1.1].

If we look at what is special about the clusters of each skew type, we can see that skew [1.04, 1.1] belongs to a mid-period falling market, while [0.92, 0.98] belongs to a rising edge (red dots mark k in [1.04, 1.1]):


Pic 18. Dependency between k and time-domain of ETF. Falling Short for high k (red)

Because of the moving window and the high-Rsq data cleanup, short-period changes do not show up as a swap of skew (the slope stays close to “-1”).

Taking the high-Rsq data, we assumed Cx = 1 at the beginning and then developed a model of the slope. To double-check the model, let’s compare the calculated k (from the equation) with the actual one. First with Cx = 1 (not the actual case):


Pic 19. Log Short 2 Long regression Slope model for Cx = 1 versus actual slope (Rsq = 0.74, poor intercept match)

The full model of k, compared with the actual one, shows the duality very well. The Cx used here was fitted on the Rsq > 0.997 sample and plugged into eq. 4 (log_(1+GS) is the logarithm to base 1+GS):

k = log_(1+GS)[(1.0011496 - 0.7028197*GS^2)/(1 - GS)]


Pic 20. Log Short 2 Long regression Slope model for Cx(2 week gain on Short) versus actual slope (Rsq = 0.825, good match, duality)

We should also check the actual slope (2-week regression) against the actual 2-week gain on Short, and overlay the modeled slope as a function of that gain:


Pic 21. Log Short 2 Long regression Slope (actual) versus 2 week gain on Short (actual). Red points are high Rsq (>0.998). On top: charts of model of the dependency while Cx = 1 (blue) and Cx as function of Gain (red).

So we get a pretty accurate model of the Short to Long dependency.

Short and Long opposite Portfolio Gain

Going back to the importance of the model, we can examine the gain of a portfolio that holds both the Short and the Long ETF at once (see Pic 2). The distribution of this gain is of course centered around zero, but its positive tail comes from the tails of distorted ETF pricing, i.e. pricing based on Cx=1 and k=1, which is probably what is used in practice (it matches the actual data):


Pic 22. Distribution of Short+Long portfolio Gain (left) and dependency between actual 2 weeks Gain on Short versus Cx=1;k=1 model (top right) or versus modeled Cx and k (bottom right). Red points are common and showing Short price distortion on gain/loss tails.
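
A sketch of how such a bundle gain can be computed from two price series, assuming NumPy. Taking the bundle gain as the plain sum GL + GS is my simplification here, and the prices are synthetic, built with the Cx = 1, k = 1 pricing. The name horizon_gain is hypothetical.

```python
import numpy as np

def horizon_gain(prices, horizon=10):
    """Simple gain over `horizon` trading days: P[n] / P[n - horizon] - 1."""
    p = np.asarray(prices, dtype=float)
    return p[horizon:] / p[:-horizon] - 1.0

# Synthetic prices (NOT market data): short ETF wanders, long ETF priced as PL = C/PS
rng = np.random.default_rng(1)
ps = 50.0 * np.exp(np.cumsum(rng.normal(0.0, 0.02, size=250)))
pl = 2000.0 / ps

gs = horizon_gain(ps)   # 2-week gain on short
gl = horizon_gain(pl)   # 2-week gain on long
pair = gl + gs          # bundle gain, taken here as the plain sum GL + GS
print(f"min={pair.min():.5f}  mean={pair.mean():.5f}  max={pair.max():.5f}")
```

With this pricing the bundle gain equals GS^2/(1+GS), so it sits near zero for small moves and grows only on the tails, in both directions of the market.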

2008 Crisis chart explosion investigation

We have just seen that the corners of the gain distribution are distorted. During the crisis, when gains and losses are big, the ETF short/long dependency control becomes unstable.


Pic 23. Short (red) and Long (blue) ETFs versus time. Highlighted 2008 crisis period

Looking at the 2008 crisis period, we can see that there are big gains and losses, but what is unique about this period is that the Long+Short pair portfolio gains and losses were abnormal:


Pic 24. Right upper: Log(Long ETF) versus Log(Short ETF) and red dots during 2008 crisis period. Left: Short + Long pair biweekly gain. Right lower: Long biweekly gain versus Short biweekly gain. (DDM vs DXD)

Looking at Short vs Long gains, you can see that during the crisis the points fall outside the “standard behavior” curve even when the gains are not big.

This is very clear on the long/mid-term gain (here 2 weeks), but much less “detectable” on a daily gain basis: there the Short+Long gain may be close to zero while we are still within the crisis. Even so, on a daily basis the crisis still added the biggest “noise” to the GL vs GS correlation:


Pic 25. Right upper: Log(Long ETF) versus Log(Short ETF) and red dots during 2008 crisis period. Left: Short + Long pair Daily gain. Right lower: Long daily gain versus Short daily gain. (DDM vs DXD)


We have seen that although ETF pricing behaves as if C were constant (Cx = 1) and k were equal to “1”, pricing consistent with the zero gain principle has to be based on varying C[n] and k[n] in the short/long dependency PL[n] = C[n]/PS[n]^k[n]. Both depend on the ETF gain: for small gains the step ratio Cx[n] = C[n]/C[n-1] and k[n] stay close to “1”, while for big gains they deviate significantly. Cx[n] has been found to be close to 1.0011496 - 0.7028197*GS[n]^2, while k[n] has been calculated from the zero gain principle as k[n] = log_(1+GS[n])(Cx[n]) - log_(1+GS[n])(1 - GS[n]), where log_(1+GS[n]) is the logarithm to base 1+GS[n].

We have also seen unstable ETF pricing during the financial crisis. I have not yet come up with a proposal for more stable and correct pricing that would hold in any conditions, but I feel that the direction taken in this work might help in finding the path.


Author: Andrey Gabdulin Product Development
