I have to confess I have not used the Excel spreadsheet version of the DF test I gave the link for. I just found this one day while on the Web and thought it might be useful for Excel-based folk. I use the MatLab test which is quite sophisticated and sorts out all the details about what regresses on what etc for you!
Some more on co-integration to follow...
Signal Processing for Finance (Part 3)
Fund managers have always had to justify their fees to investors, but never more so than when stories got around that monkeys sticking pins in the financial pages of the Wall Street Journal had built portfolios that performed just as well as, if not better than, the ‘experts’. So are the city suits no better than monkeys? The answer is ‘Yes’, and ‘No’. The problem that the fund managers face (and that the monkeys exploit) is that all the skill and experience in the world will not allow them to break the laws of mathematics.
Harry Markowitz was an economist who won the Nobel prize in economics for his Modern Portfolio Theory (MPT). MPT says that risk and return are related in a simple way and that portfolios can be optimised for the best return/risk ratio if we know the correlation between the returns on the different assets making up the portfolio.
Except that it doesn’t work. The reason that the monkeys do as well as the fund managers is due to one of the consequences of the Law of Large Numbers. The global properties of any matrix can be characterised by a set of numbers called the eigenspectrum of the matrix. Two matrices with similar spectra will behave very similarly. The problem with correlation matrices is that the eigenspectrum of one that has been carefully optimised cannot readily be distinguished from one that has just been thrown together. This means that the monkey’s portfolio stands every chance of doing as well as the fund manager’s.
The big problem behind this whole approach is that correlation between financial price data is meaningless. Correlation is a concept from a branch of mathematics called linear algebra and it measures similarity between things. One interpretation is that it measures the angle between two straight regression lines drawn on the data. In doing the regression calculation, the mean value of the data is subtracted from each data point. And here lies the rub. Only data that has the property of being stationary has a fixed mean that can be subtracted. Indeed, one of the properties demanded of data in the definition of stationarity is that the mean be the same everywhere! But financial time series, which can be closely modelled as random walks, are non-stationary by definition, and so their means are not really defined. They are different wherever you look. This is why the whole idea of correlation breaks down when applied to financial data. Yes – you can put the numbers into the formula and turn the handle and something sensible-looking comes out. But the result is meaningless in any real sense.
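To see how misleading this can get, here is a small Python sketch (my own illustration, not from the original post): two deterministic series with no causal connection at all, but which both trend, show a near-perfect Pearson correlation purely because of the shared trend.

```python
# Two deterministic series with no causal link: a linear trend and a
# quadratic trend. Pearson correlation is computed from mean-subtracted
# data, and the shared trend alone produces a near-perfect score.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

t = range(100)
linear = [float(i) for i in t]         # "asset one": straight line
quadratic = [float(i * i) for i in t]  # "asset two": parabola

r = pearson(linear, quadratic)
print(round(r, 2))  # close to 1.0 despite zero causal connection
```

The number looks authoritative, which is exactly the trap: the formula happily "works" on non-stationary inputs and reports a relationship that is an artefact of the trends.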
Another problem with correlation is that when two things are very closely correlated, it is tempting to try to read some causal relationship into them. This is sometimes true – the close correlation between umbrella sales and rainy weather is causally related – but very often it is misleading. Two data sets that display extraordinary correlation are the sea levels in Venice and UK bread prices. But one in no way causes the other!
Non-stationary series like financial price data need quite different tools. The most powerful to emerge for the analysis of financial data in particular is the concept of cointegration. A data set is said to be integrated to order n, written I(n), if the data is stationary after differencing n times. Most financial data such as stock prices are I(1), that is, the first differences of stock prices are stationary.
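As a concrete (and heavily simplified) illustration of the I(1) idea, the sketch below runs a bare-bones Dickey-Fuller-style regression in Python. This is NOT the full Augmented DF test from the MatLab toolbox mentioned later in the thread; it omits the intercept, trend and lag terms, and the toy series are assumptions of mine.

```python
# Bare-bones Dickey-Fuller-style regression: dy_t = b * y_{t-1} + e_t.
# A strongly negative t-statistic on b suggests stationarity; a value of
# modest magnitude means a unit root cannot be ruled out. The toy series
# (a Gaussian random walk and its increments) are illustrative only.
import random

def df_tstat(y):
    x = y[:-1]                               # lagged levels y_{t-1}
    dy = [b - a for a, b in zip(y, y[1:])]   # first differences
    sxx = sum(v * v for v in x)
    b = sum(u * v for u, v in zip(x, dy)) / sxx
    resid = [d - b * v for v, d in zip(x, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)
    return b / (s2 / sxx) ** 0.5

random.seed(42)
shocks = [random.gauss(0.0, 1.0) for _ in range(500)]
walk = [0.0]
for e in shocks:
    walk.append(walk[-1] + e)   # an I(1) series: a random walk

print(df_tstat(walk))    # modest magnitude: unit root not rejected
print(df_tstat(shocks))  # strongly negative: the differences are stationary
```

The levels of the walk fail the stationarity check while its first differences pass it, which is exactly the "difference once to reach stationarity" definition of I(1).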
Two assets are said to be cointegrated if there is a linearly weighted combination of them which is stationary. For example, if the price data series for AstraZeneca minus 3.85 times the price data series for GlaxoSmithKline is stationary, then the two prices series are cointegrated. This has very powerful ramifications.
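A minimal sketch of how such a stationary combination might be found in practice, in the spirit of the Engle-Granger two-step: regress one series on the other, then test the residual for stationarity. The synthetic "prices", the shared random-walk driver and the bare-bones unit-root statistic are my own illustrative assumptions; a real test would use proper Engle-Granger critical values, not the plain DF ones.

```python
# Engle-Granger-style two-step on synthetic data: both "prices" share a
# common random-walk driver, so one regression recovers the weight and
# the residual spread is stationary even though each price alone is not.
import random

def df_tstat(y):
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    sxx = sum(v * v for v in x)
    b = sum(u * v for u, v in zip(x, dy)) / sxx
    resid = [d - b * v for v, d in zip(x, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)
    return b / (s2 / sxx) ** 0.5

def ols(y, x):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (c - my) for a, c in zip(x, y)) \
        / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope   # intercept, slope

random.seed(7)
driver = [0.0]
for _ in range(500):
    driver.append(driver[-1] + random.gauss(0.0, 1.0))

price_a = [v + random.gauss(0.0, 1.0) for v in driver]        # e.g. "stock A"
price_b = [0.5 * v + random.gauss(0.0, 1.0) for v in driver]  # e.g. "stock B"

alpha, beta = ols(price_a, price_b)
spread = [a - (alpha + beta * b) for a, b in zip(price_a, price_b)]
print(beta)               # near 2: the cointegrating weight
print(df_tstat(spread))   # strongly negative: the spread is stationary
```

Each price wanders like a random walk, but the weighted difference hugs a fixed mean, which is the whole point of the "AstraZeneca minus 3.85 times GlaxoSmithKline" style relationship described above.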
One of the most important consequences of cointegration is that it is possible to prove mathematically that a genuine causal relationship must exist between two data series that are cointegrated.
"I wanted to post a thought i had...
In rereading this post, CB you described your process as taking the OHLC stream, using the ULLMA on it, then doing a HA Calc, and finally smoothing it again with the ULLMA.
The OHLC you refer to i'm guessing is the "matched filter" OHLC that we create from the original OHLC ( pulling out the noise)?
That's right. We smooth the OHLC prices using a filter that leaves the "noise" stationary. This filter has very low lag and excellent price fidelity. So you can get a nice HA indicator but without the 4-bar lag of the conventional indicator.
I am working on putting this into a robot rather like Don Steinitz's HAS MTA robot. But I hope it will perform better because of the <0.5-bar lag.
Original Data Series I(n) - Trend Filter = Stationary Noise I(0)
So.... our Original Data Series - Stationary Noise = our Trend Filter....
The Trend Filter is then smoothed by the Smoothing Technique, HA is applied to it....and it's smoothed again.
To compare to your first picture, the lines you have as your "trends" are the trend filters smoothed, while the bars you have is the original data series?
One thing I have noticed, and maybe I am doing something wrong (I know you don't use Excel, but for 5-minute data it should be easy to see), is that the extracted trend filter, not smoothed, is more "volatile" than the original data series. Please see my pic and comment. DataSeries is just the work I did to get the lines. Stationary vs Non Stationary is a pic of the lines. Series 1 refers to the original data series. Series 2 refers to the original data series minus the second diff.
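On the "more volatile" observation: differencing amplifies the fastest-flipping noise in a series, so a differenced (or diff-subtracted) line can easily swing harder than the original. A deterministic toy example in Python (my own illustration, not from the thread):

```python
# Differencing amplifies the highest-frequency noise. With alternating
# +1/-1 noise, each first difference is +/-2, so the variance roughly
# quadruples even though no "signal" was added.
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

noise = [float((-1) ** t) for t in range(200)]
diffed = [b - a for a, b in zip(noise, noise[1:])]

print(variance(noise), variance(diffed))  # roughly 1 vs roughly 4
```

So a more volatile-looking extracted series is not necessarily a mistake in the spreadsheet; it can be a property of the differencing itself.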
Thanks for your time,
In the post #42 CB wrote : "Here we have to deal with not additive noise (which is easy) but MULTIPLICATIVE noise which is VERY different".
I think it's the reason why you can't de-noise by subtracting the "differenced version" from the signal.
The Hodrick-Prescott filter works perfectly to get the trend (the Dickey-Fuller test shows that the residual is stationary), but it's non-causal!
If someone has an idea for how to use it anyway...
The first figure shows the GBP/JPY M15 bid (open price) with its "optimal trend" in red.
The second figure shows the residual (MarketData - Trend).
Attached is what i've been working on ( Jurik DF).
Non-causality means working with future data, which is impossible. In other terms, you can apply a "nearly" perfect non-causal filter (or smoother) to a time series of finite length (in theory, non-causal filters need infinite knowledge of future and past values), but the result is unusable because you obtain the perfect trend only for past values.
The Hodrick-Prescott filter behaves this way, but I'm currently working on this problem and my results seem interesting. I'll post them soon (today if I can).
Apart from this, your system looks interesting. I see that you are using the JMA; the problem for me with the JMA is that the source code of the filter is unavailable and I don't like "black boxes".
It would be interesting to subtract the JMA output from the market data and perform the Dickey-Fuller test to check for the presence of a unit root, to know whether the residual is stationary. If it is, that would mean the JMA is a good "de-trender".
Ahhh... Now we are getting towards some interesting stuff!
The HP filter is a second-backward-difference matrix filter. I have implemented this using a fast Ridder-root method to avoid the matrix inversion. (My DLL can fit the HP series to 100,000 points in 40 microseconds -- try doing that by matrix inversion!!)
This filter is interesting on a number of fronts. It is not strictly non-causal! A NC filter requires peeks at future data (i.e. has a two-sided impulse response about zero time). The HP filter will fit the (provably) optimum trend through the data for any given smoothing parameter without need for unseen data. The result is absolutely stationary (by definition of the equation for the filter).
The problem with the HP filter from a trading perspective is end-point bias. This occurs due to the symmetry of the 2BD matrix which is simply (del squared)(transpose) x (del squared) -- ie obviously symmetric.
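For anyone wanting to experiment, here is a minimal dense-matrix HP filter sketch in Python. This is emphatically NOT the fast Ridder-root implementation described above, just the textbook normal equations (I + lam * D'D) tau = y solved by plain Gaussian elimination, fine for a few hundred points. It also demonstrates the "fits the optimum trend" property: a perfectly linear input comes back unchanged for any smoothing parameter.

```python
# Textbook HP filter: trend = argmin sum (y - tau)^2 + lam * sum (d2 tau)^2,
# i.e. solve (I + lam * D'D) tau = y with D the second-difference operator.
# Dense O(n^3) solve: a sketch for small n, not the fast production method.

def hp_trend(y, lam=1600.0):
    n = len(y)
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):  # row r of D has entries 1, -2, 1
        stencil = [(r, 1.0), (r + 1, -2.0), (r + 2, 1.0)]
        for i, vi in stencil:
            for j, vj in stencil:
                A[i][j] += lam * vi * vj
    b = list(y)
    for col in range(n):            # Gaussian elimination, partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    tau = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tau[r] = (b[r] - sum(A[r][c] * tau[c] for c in range(r + 1, n))) / A[r][r]
    return tau

# A perfectly linear series has zero second difference, so the HP trend
# reproduces it exactly, whatever the smoothing parameter:
line = [2.0 * t + 3.0 for t in range(30)]
trend = hp_trend(line)
print(max(abs(a - b) for a, b in zip(trend, line)))  # ~0, rounding error only
```

The end-point bias discussed above lives in the first and last rows of D'D, which have fewer stencil contributions than the interior rows; that asymmetry is what the tweaked-diagonal variant addresses.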
I have a VERY interesting trading system where I have tweaked the matrix diagonal end-points to reduce the bias. I have used this for futures trading with considerable success. We have also applied this to extraction of temperature profiles in neutropenic patients undergoing chemotherapy (we have a forthcoming paper on this in the Journal of Medical Engineering -- look out for this on-line -- it contains some useful stuff for you).
BTW, don't get too hung up on Jurik. If you just take a root-Nyquist filter and tweak the coefficients for minimum phase lag via the Remez exchange algorithm (a trivial exercise in MatLab) then you will get a filter every inch equal to Jurik or better.
Thanks for your time,
To help us understand your system better, I tried to make a complete synthesis from your posts, coupled with a complete set of questions. Thanks in advance for your time and your light, and don't hesitate to correct me if I'm wrong or if I forgot anything.
Also I put some MatLab references for the algorithms, for those who are interested in your thread.
Basic synopsis :
#1 : Select a financial instrument.
#2 : Determine the "integration order" by differencing the time series at the given timeframe until the "Dickey-Fuller Test" rejects the hypothesis of a unit root (i.e. the differenced series is stationary).
#3 : Find the best timeframe (can be "fractional") using Shannon's "mutual information" method.
#4 : Design a "derivative filter" that matches the integration order and compute two different versions of the filter, one smoother than the other.
#5 : Use the "Radar Clutter Filter" as a "Trend Strength" indicator to disallow trading in ranging markets and prevent whipsaws.
#6 : The filters' crossover determines the trading signals, which the "trading range indicator" must approve.
#7 : Run the system in realtime for a time length (or a number of trades per day) defined by the "Chi-Squared Distribution" to check the reliability before playing with real money.
Basic questions :
From point #3 :
a) In the literature, "Stochastic Volatility" is related to "Maximum Likelihood", but you said it's useless because the market isn't a Gaussian process (which is true, naturally). However, the mutual information method "compares" two time series and returns a coefficient. Precisely what are you comparing to decide whether a timeframe is tradable?
b) You said that the coefficient must be away from 0.5 (either up or down), but you also said that you use "an algorithm that computes things upside-down", so is the threshold you indicate (0.5) valid for the original mutual information implementation or for your modified version?
c) How far from 0.5 must the coefficient be? In other terms, what's the threshold (either up or down) according to you?
From point #4 :
a) I suppose the principal filter you call the "ULL-MA" is a first-order derivative because Forex time series are generally I(1)? But maybe it's fractional? So in TopCat you don't use the HP filter you tweaked for futures trading?
b) The ULL-MA is applied only to the High and Low, so do you do some computation to combine them?
c) I don't understand the term "Phase-Cyclic"; what does it mean? Is there no relationship with "Hilbert analytic-signal" techniques?
d) What are the differences between the master filter (blue) and the trigger one (red)? Are they the ULL-MA applied twice with different settings?
From point #5 :
a) I know that the "Radar Clutter Filter" is used to cancel unwanted reflections; it can be implemented as an "adaptive lattice filter" or as the "Moving Target Indicator" (MTI). Could you clarify this point much more? Which implementation are you using? Have you customized it?
From point #7 :
a) As with point #5, do you have more details about this statistic?
"Augmented Dickey-Fuller Test" from the "GARCH Toolbox"
"Hodrick-Prescott Filter" from the "GARCH Toolbox"
"Chi-Squared Distribution" from the "Statistics Toolbox"
"Mutual Information" from the "Information Theory Toolbox 1.0" (http://www.mathworks.com/matlabcentr...bjectType=FILE)
"Radar Clutter Filter" (various toolboxes including others radar tools)
http://www.mathworks.com/matlabcentr...bjectType=FILE (Require the "Signal Processing Toolbox")
Please don't take my previous post into consideration, there are a lot of mistakes, sorry... I will work more on these subjects and post my results soon if possible (if I find ).
Currently I de-trend with the Savitzky-Golay filter but it's hard to get a good result.
How is your EA doing? You have not posted in a while...
Just some light reading material for those interested. Hope to "bump" the thread along. Been deserted as of late.
I am still alive!!! I am buried under a mountain of courseworks and exam papers at the moment, both from here and from our partner universities in the Far East.
Will get back to "the job" when I re-surface!
topping this thread....hoping for some new life :-)
I have not forgotten the thread! I have my last Exam Boards, committees,
blah blah blah etc this week and next, then I can return to some serious
Little reading material on the subject of noise filtering for those interested. Bumping the thread as well
Fourier versus wavelet analysis : http://www.amara.com/IEEEwave/IW_wave_vs_four.html
Interesting, thanks for posting.
Just hope this excellent thread with solid mathematics discussion won't die.
I am new to DSP and their filters, but I think they are definitely interesting. Any experience of successful application of them that someone can share?
After the mother of all exam marking sessions, assessment committees, exam boards, graduation ceremonies, etc (and IMHO a damned well deserved holiday!!), I'm back.
Many thanks to all those good folks who PM'd me almost every day wanting to keep the thread alive.
So... where to go from here. As the thread has evolved, we might have wandered a little off topic. I am assuming that the good folk reading this are here because (1) you are mathematically and computer literate and (2) highly motivated.
The thread had started to wander into stuff like Fourier analysis and spectral methods. These are great but MUST be used with caution! I will talk more about these and their correct use for FX in the future. But to get the thread back onto the main tack....
Let us firstly remember that financial time series are NON-STATIONARY for the reasons we have discussed, and NON-LINEAR in that they are non-Gaussian. Hence, we should put RIGHT OUT OF OUR MINDS STRAIGHT AWAY any notions that arise from linear systems theory. This includes ALL the filters you will find in text books such as moving averages etc. It also includes ALL ideas from linear algebra, such as correlation.
It literally makes me weep sometimes when I see even brand new text books in econometrics banging on about "correlations between assets". Correlation is an idea that arises from LINEAR algebra. To ask if two non-stationary time series are correlated is like asking "Who married the fridge?". The question is quite meaningless. Asking if two non-stationary series are CO-INTEGRATED is quite a different matter and a highly valid question. If you ask this, you can REALLY learn something about the future!!! More later.
This may well go some way to explaining why the hedge fund I consult with is up 34% since last August while the rest of the world appears to be going to hell in a handbasket. If you base risk measures on correlation , you are DOOMED as Northern Rock, Bear Stearns, etc etc etc etc etc etc are finding out the hard way.
So, to approach FX from a hard, numerate, scientific standpoint, we argue as follows:
- ARE MARKETS RANDOM? If so, then we are screwed. No amount of signal processing, linear or otherwise, can tell you anything at all about a random signal. So we stop right here and take up another hobby like goatkeeping and sell the cheese instead of trading the markets.
If they are not random, then we are in with a chance. The Efficient Market Hypothesis (EMH) that is STILL in the damned textbooks :-(( says the market is random, that all the information is in the current price, and that you can't predict the future. Luckily, everyone except economists has realised that the EMH is DEFINITELY FALSE. There is loads of empirical evidence to show that it is not true. So we have a chance.
- As we have argued before in this thread, what we then have is a genuine signal that we can exploit, buried in a ton of noise due to market makers buggering about and a host of other factors. So the main task in hand is to get a good (non-linear) filter to separate out our precious signal from all the noise.
Once we have that signal we can really move. I have two main trading techniques. One is high frequency trading which is perfectly suited to FX (indeed it is only applicable to FX) and uses the tick-by-tick data to uncover meta-beliefs held by market makers. That is, by looking at trade prices and MM reactions to them (up/down/same) and the time taken to revise their quote, we can learn a lot about what THEY think WE think that THEY think. More on this in future.
My other main market technique - be it FX or anything else - is to exploit co-integrated time series. These series have a genuine causal link (even if that cause is unknown) - see Granger's Nobel Prize address on this topic. So I look for deviations from the norm and "pairs trade" - that is, go long one currency and short the other, with a very high degree of certainty that the pair will revert to their long-term cointegral relationship. With a little bit of mathematical savvy, one can remain pretty much market-neutral and just reap the profit.
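A hedged sketch of the mechanical part of such a pairs trade. The oscillating toy spread, the z-score construction and the 1-sigma entry threshold are my own illustrative choices, not CB's actual rules.

```python
# Spread z-score and entry signals for a pairs trade: sell the spread when
# it is unusually wide, buy it when unusually narrow, stay flat otherwise.
import math

def zscores(spread):
    n = len(spread)
    m = sum(spread) / n
    sd = (sum((s - m) ** 2 for s in spread) / n) ** 0.5
    return [(s - m) / sd for s in spread]

def pair_signal(z, entry=1.0):
    # spread too wide -> short A / long B; too narrow -> the reverse
    if z > entry:
        return "short_spread"
    if z < -entry:
        return "long_spread"
    return "flat"

spread = [3.0 * math.sin(0.1 * t) for t in range(400)]  # mean-reverting toy
signals = [pair_signal(z) for z in zscores(spread)]
print(signals.count("short_spread"), signals.count("long_spread"))
```

With the weighting chosen from the cointegrating relationship rather than guessed, the combined position is close to market-neutral, which is what keeps the risk tightly controlled.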
There - so now you know the secret (or mine anyway). I all but live off this income (I still go into the Uni to get out from under my wife's feet when she is doing her own thing).
If there is interest in this, we will discuss details and methods.
Have a great day,
So happy to see you return and to know that this thread is alive again.
I am definitely interested in this topic. I am very impressed by your mathematics background (and also your humor ). Please do tell us more.
Wish you have fun in trading, in mathematics, and in life.
High Frequency Trading reference
For those folk interested in the high frequency trading strategies, some very useful background can be found in...
Reading this will save me re-hashing what is already known.
I wish to read more of what you have to say as you appear to be a very experienced and open minded man. Not to mention the fact that you actually know what you are talking about and you are very smart.
A very successful trade
I want to share with you a very successful trade I did over the last couple of months. This one went right as planned and looking out my window, I have a really nice "ocean blue" brand new car on my drive to remind me!!
While this one was not FX, the principle can be used identically. I have developed a working variant of the Pairs Trade, at one time one of the best kept secrets on Wall St.
If you look at charts of the FTSE100 and Crude Oil, you will notice something quite curious. Since the FTSE100 has about a 15% weighting in oil, the index is heavily influenced by the oil price and generally moves with it. But since mid-May the FTSE started to fall while oil rocketed upwards.
To measure how things move together or otherwise, most folk use correlation and, indeed, this is the method detailed in the current books on pairs trading. But as we have seen, correlation is meaningless for non-stationary time series because the first step in the corr computation is to subtract the mean of the series. But the mean is everywhere different in a non-stationary series (that is the definition of non-stationary). So it doesn't work. NOTE: Anyone who has forgotten this stuff, please re-read post #83 in this thread.
So, instead we use CO-INTEGRATION. Unlike correlation, two data sets that are co-integrated can be proved to have a genuine causal connection. Granger (who developed cointegration) quotes the example that two of the best correlated data sets ever found are the mean sea level in Venice and UK bread prices. But there is no causal link :-)) While the correlation is about 0.97, the two sets are NOT cointegrated.
So I found that there was a cointegral relationship between the FTSE and the oil price and determined the weighting coefficients. This told me that the FTSE and oil do indeed move together (like the drunkard and his dog in the Granger papers). While they can wander apart at times, in the long term they always revert to being closer together again.
So when the maths told me that the FTSE and oil had got too far apart, I kinda knew that (i) the FTSE must rise, (ii) oil must fall, or (iii) both. So I BOUGHT the FTSE and SOLD oil. Note, that by getting the weighting right, the resultant "portfolio" is essentially market neutral as if they both rise or fall together, almost no money is made or lost. So the risk is very tightly controlled.
Then, sure enough, in the middle of July, the oil price comes tumbling down from 140 to 125 and the FTSE goes up from 5150 to 5500. So I made a profit on the LONG position on the FTSE and on the SHORT position in oil.
Moral - if you can find two assets that are cointegrated, and they wander apart, you have a good chance of making money when they revert to the long-term relationship. This could equally well be applied to currencies.
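One practical add-on (my suggestion, not from the post): before trading a cointegrated pair, estimate how FAST the spread mean-reverts, by fitting ds_t = a + b * s_{t-1} and converting the slope into a half-life. A toy Python sketch using a spread that halves every bar by construction:

```python
# Half-life of mean reversion: fit ds_t = a + b * s_{t-1} by OLS and
# convert the slope into "bars until a deviation halves". The geometric
# toy spread decays by half each bar, so the answer should be 1 bar.
import math

def half_life(spread):
    x = spread[:-1]
    dy = [b - a for a, b in zip(spread, spread[1:])]
    n = len(x)
    mx, my = sum(x) / n, sum(dy) / n
    slope = sum((u - mx) * (v - my) for u, v in zip(x, dy)) \
        / sum((u - mx) ** 2 for u in x)
    return -math.log(2.0) / math.log(1.0 + slope)

s = [2.0 * 0.5 ** t for t in range(20)]  # halves every bar by construction
print(half_life(s))  # 1 bar, to floating-point accuracy
```

A spread that takes months to close its gap ties up margin far longer than one that reverts in days, so the half-life is a useful sanity check on whether the reversion is tradable at all.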
All the best
Can anyone explain to me how to put the 40MA with 0.3 lag thing? Thanks a lot.
@codebreaker, wonderful observation. i'll put this in my treasure chest. thanks!
Great, what about Fuzzy,...
I have a question regarding a great point you have addressed. Since the rates in forex or any other markets are not random, can we manage them with fuzzy logic? Take clustering a currency's data in a way that each cluster has a centroid, and the price movement would be like going from one centroid (a price) to another.
What's your point of view? This is my first post; I believe your topic and this discussion are very, very useful.
I have thought about building a FAM matrix for currencies but it is difficult to see how to optimally handle the data. What would we represent? For stationarity, I would go for price change which we could classify as [down] [none] [up] in a simple scheme. But then how would we drive the matrix and what would be the set functions?
If you have any ideas on this, I would be interested to hear them. But I am not sure this is a goer!!
Great System - Thanks for Sharing
This system looks very interesting. Hope you don't mind if I ask a few questions. I was in the process of developing an FX trading system using filters to determine underlying trends when I happened upon your posts.
So far I've found that filters do a reasonable job extracting trends, but your system has several components that mine does not.
Basically, if I understand what you're doing, it's as follows:
Your Main Filter and Range Gate Trigger seem to do the following:
(1) Matched Filter of raw Open, High, Low, and Close prices
(2) Heiken Ashi of (1)
(3) Matched Filter of the Heiken Ashi High and Low from (2)
When the Range Gate Trigger crosses the Main Filter, it's the signal to go long or short.
My two questions are:
(1) What's the difference between the Main Filter and the Range Gate Trigger? Is the Range Gate Trigger just a slightly 'faster' version of the Main Filter? Or are they filtering a different underlying price -- Close versus (High+Low)/2?
(2) You also have something you referred to as a "Radar Clutter Filter", but I don't see any description of that -- unless I missed it. Would you mind describing this a bit -- how you determine whether the market is trending or ranging?
I hope you guys don't mind but I asked Twee to take this thread out of the Rookie forum. It's pretty obvious it belongs almost anywhere but there!
Anyway, carry on gentlemen. Fascinating stuff!
Sorry for the long silence! I got married last month to a rather great lady and have had my hands full :-)) with other things!
Wowser! This credit crunch has livened things up a bit! I know we thrive on volatility but this is crazy. One of the reasons I have always been keen that Culpability Brown remain formerly as Chancellor and now as Prime Minister is that he is sooooo good for business. Far from abolishing "boom and bust" he has been the biggest cause for a century. So long live Gordon and the big swings...
Now, how to capture them. I would strongly urge folk on this thread to read the "Swingtum" theory of Heping Pan in China. You can Google his papers easily enough. When you have read and inwardly digested, I think we can use this to advantage. I have been researching and applying some of his ideas and they look fruitful.
On another note, does anyone monitor the Credit Default Swap spreads? I always take a peek. The spreads on a safe bank should be around 150bp. Back in March the spreads on Landsbanki, Kaupthing etc widened to more than 500! So I switched all cash out of these (I had a fair bit...) Just as well in the event....
Siman Tov &
hi codebreaker great to see you back
you speak often in your contributions about cointegration.
So I tried to figure something out, but it's not easy to find information on currency pairs and their cointegration. The only thing I found is that gbp/usd and eur/usd are cointegrated over the last years. I don't have the statistical and mathematical ability to prove such things. So can you be so nice and explain more about this, and especially how I can figure out which pairs are cointegrated? Is it also possible to do this just by eye, or do I need some complex mathematical ability? If that's the case I can't do it. So maybe you can say what you know about this.
© Forex Factory