Relation between binary and continuous variables? (In MATLAB)

  • Post #1
  • First Post: Feb 14, 2011 5:07am
  •  mfurlend
  • | Joined Apr 2010 | Status: Member | 165 Posts
I have an EA that I regularly backtest and optimize. I want to determine a relationship between the probability of a given trade being profitable and a certain statistical measure called kurtosis at the time the trade was initiated.

I programmed my EA to generate a .csv file containing each trade's profit/loss alongside the proper kurtosis value.

Excerpt:
Inserted Code
PROFIT     KURTOSIS
-41.82     2.97530011
-61.82     3.12553938
-45.82     2.32907776
-100.82    1.9039837
-14.82     2.35415757
-30.82     2.23130752
-75.82     2.76931501
-62.82     1.60114237
28.18      3.29114443
12.18      2.99822995
6.18       2.27906841
-76.82     1.59931087
1.18       6.21691918
54.18      3.15778944
24.18      4.98645291
2.18       2.84275644
33.18      2.36483006
18.18      2.17924754
-160.82    1.86165136
51.18      1.90035084
...
etc

I am not interested in the size of each profit or loss, only in whether the trade made money, so I replaced all p/l values with binary values: 1 if the trade was profitable and 0 if it was not.

Excerpt:
Inserted Code
PROFIT    KURTOSIS
0         2.97530011
0         3.12553938
0         2.32907776
0         1.9039837
0         2.35415757
0         2.23130752
0         2.76931501
0         1.60114237
1         3.29114443
1         2.99822995
1         2.27906841
0         1.59931087
1         6.21691918
1         3.15778944
1         4.98645291
1         2.84275644
1         2.36483006
1         2.17924754
0         1.86165136
1         1.90035084

Now what I want to do is to determine the relationship (if any).
Normally I would use a polynomial regression, but this makes no sense when one of the variables is binary.

I have read about something called logistic regression, but I can't seem to figure out how to do it in MATLAB.


If someone could give me some instructions and possibly some tips on interpreting the results I would really appreciate it. If you know of another way to achieve a relation between the two variables I'd like to hear that too.
  • Post #2
  • Feb 15, 2011 9:31pm | Edited Feb 16, 2011 12:45am
  •  OldQuant
  • | Joined Jan 2011 | Status: Member | 2,078 Posts
Quoting mfurlend
I have an EA that I regularly backtest and optimize. I want to determine a relationship between the probability of a given trade being profitable and a certain statistical measure called kurtosis at the time the trade was initiated.
A couple of things, off the top of my head, with no warranty of usefulness.

What you have done, in effect, is transform your problem into the comparison of two samples: the kurtosis measures (which I'll shorthand as m4 for reasons I'm sure you know) for the "0" group and the "1" group.

Now you have an expectation that the samples - and so, the m4_0 and m4_1 populations as well - are different (whether certain levels of m4 correlate with, or are cointegrated with, profits or losses is probably not your concern).

So, you have a classic "comparison of the properties of two populations" problem. Lots of stat books, as well as R, Stata, et al. (MATLAB too, I assume; I don't use it), have those kinds of tests cook-booked.

The one thing you might want to look up is the standard error of the sample kurtosis for a normal population. I know that Kendall and Stuart gave the standard error calcs for all normal moments and cumulants in their classic "Advanced Theory of Statistics".

(BTW: they didn't mean to use 'advanced' in the sense of complicated or high-falutin'. Rather, as proper Englishmen, they used it in the sense of "this is what has been brought forth to us." Should have called it the "Received" Theory of Statistics.)
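
For reference, a rough sketch of the standard error mentioned above. It assumes the kurtosis is computed as b2 = m4 / m2^2 (not the "excess" version) from n independent, normally distributed observations, which may not hold for price data. The function names and the window lengths in the loop are made up for illustration.

```python
import math

def kurtosis_se_normal(n):
    """Standard error of b2 = m4 / m2**2 for n i.i.d. normal observations."""
    if n < 4:
        raise ValueError("need at least 4 observations")
    var = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    return math.sqrt(var)

def kurtosis_se_approx(n):
    """Large-sample approximation sqrt(24 / n)."""
    return math.sqrt(24.0 / n)

# Hypothetical look-back window lengths (number of bars used per kurtosis value).
for n in (20, 100, 1000):
    print(n, round(kurtosis_se_normal(n), 4), round(kurtosis_se_approx(n), 4))
```

If the kurtosis is computed over a short window, this standard error is large, so small differences in m4 between the two groups may be mostly noise.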

Anyway, this is kind of an interesting idea. I have no a priori as to what you're going to find, and will be interested in anything you post.
 
 
  • Post #3
  • Feb 15, 2011 10:09pm | Edited at 10:49pm
  •  jamjamjam
  • | Joined Apr 2010 | Status: Member | 96 Posts
This is a typical 'binary' regression problem; aside from logistic regression, there are dozens of machine-learning-type regressions you could apply.
It's common to normalize the input data first (for example, rescaling the values by their standard deviation).
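
A minimal sketch of one such normalization, assuming a plain z-score (subtract the mean, divide by the sample standard deviation) is what is wanted. The numbers are the 20 kurtosis values from the excerpt in the first post.

```python
import statistics

# Kurtosis values from the 20-row excerpt in post #1.
kurt = [2.97530011, 3.12553938, 2.32907776, 1.9039837, 2.35415757,
        2.23130752, 2.76931501, 1.60114237, 3.29114443, 2.99822995,
        2.27906841, 1.59931087, 6.21691918, 3.15778944, 4.98645291,
        2.84275644, 2.36483006, 2.17924754, 1.86165136, 1.90035084]

mu = statistics.mean(kurt)
sd = statistics.stdev(kurt)          # sample standard deviation (n - 1)
z = [(k - mu) / sd for k in kurt]    # z-scores: mean 0, stdev 1

print([round(v, 2) for v in z])
```

With a single input this does not change whether a relationship exists; it only changes the scale on which the fitted coefficient is read.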

A simple example can be run in R:

http://psychweb.psy.umt.edu/denis/da...c_R/index.html
 
 
  • Post #4
  • Feb 15, 2011 11:05pm
  •  mbkennel
  • Joined Nov 2009 | Status: Member | 245 Posts
Yes, logistic regression is an option. The underlying model for logistic regression is that the logarithm of the odds, i.e. log(prob(true)/prob(false)), is a linear function of the inputs. It is the simplest generalization of linear regression to binary targets.
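
To make the log-odds model concrete, here is a rough sketch in plain Python rather than MATLAB, fitted by Newton's method on the 20 excerpt rows from the first post. The function names (sigmoid, fit_logistic, prob_win) are made up for this example, and 20 rows are far too few to trust the coefficients; it only illustrates the mechanics.

```python
import math

# 20 rows from the excerpt in post #1: win flag (1/0) and kurtosis.
wins = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]
kurt = [2.97530011, 3.12553938, 2.32907776, 1.9039837, 2.35415757,
        2.23130752, 2.76931501, 1.60114237, 3.29114443, 2.99822995,
        2.27906841, 1.59931087, 6.21691918, 3.15778944, 4.98645291,
        2.84275644, 2.36483006, 2.17924754, 1.86165136, 1.90035084]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit log(p / (1 - p)) = b0 + b1 * x by Newton's method."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix X'WX with weights w = p * (1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system directly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def prob_win(b0, b1, k):
    """Model probability that a trade opened at kurtosis k is profitable."""
    return sigmoid(b0 + b1 * k)

b0, b1 = fit_logistic(kurt, wins)
print("intercept:", b0, "slope:", b1)
print("P(win | kurtosis = 3):", prob_win(b0, b1, 3.0))
```

Reading the result: each unit increase in kurtosis multiplies the odds of a winning trade by exp(slope), so a slope near zero means kurtosis tells you little.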

But if you are trying to see if there is ANY statistically significant relationship, then what you want may be even simpler.

Try a t-test, or a Mann-Whitney test on the continuous values, split by profitable or not profitable.

If you don't get any statistically significant difference, then the value of the continuous variable is not likely to be useful (on its own) in predicting the binary tag (1/0).
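
A rough sketch of both tests in plain Python, using the 20 excerpt rows from the first post. It assumes Welch's unequal-variance form of the t-test and a normal approximation (without tie correction) for the Mann-Whitney p-value, which is crude at 10 observations per group. All function names are made up for this example.

```python
import math

wins = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]
kurt = [2.97530011, 3.12553938, 2.32907776, 1.9039837, 2.35415757,
        2.23130752, 2.76931501, 1.60114237, 3.29114443, 2.99822995,
        2.27906841, 1.59931087, 6.21691918, 3.15778944, 4.98645291,
        2.84275644, 2.36483006, 2.17924754, 1.86165136, 1.90035084]

# Split the continuous values by profitable / not profitable.
g1 = [k for k, w in zip(kurt, wins) if w == 1]
g0 = [k for k, w in zip(kurt, wins) if w == 0]

def mean(v):
    return sum(v) / len(v)

def sample_var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)

def welch_t(a, b):
    """Welch t statistic and its approximate degrees of freedom."""
    va, vb = sample_var(a) / len(a), sample_var(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def mann_whitney(a, b):
    """U statistic for a, z-score, two-sided p (normal approximation)."""
    ranks = average_ranks(a + b)
    u_a = sum(ranks[:len(a)]) - len(a) * (len(a) + 1) / 2.0
    n_pairs = len(a) * len(b)
    sd = math.sqrt(n_pairs * (len(a) + len(b) + 1) / 12.0)
    z = (u_a - n_pairs / 2.0) / sd
    return u_a, z, math.erfc(abs(z) / math.sqrt(2.0))

t, df = welch_t(g1, g0)
u, z, p = mann_whitney(g1, g0)
print("Welch t:", t, "df:", df)   # compare t against a t-table at df
print("Mann-Whitney U:", u, "z:", z, "approx p:", p)
```

The t statistic is left without a p-value because the standard library has no t-distribution; any stats package or a printed table will supply it.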

How many observations do you have?

Have you tried something really simple? Look at the profit---is there any statistically significant correlation between the input and the profitability? Try ordinary Pearson correlation and Spearman rank correlation (doesn't assume Gaussianity).
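
A rough sketch of both correlations in plain Python. The profit column comes from splitting the fused PROFIT/KURTOSIS excerpt in the first post into a two-decimal profit and a kurtosis value, which is an assumption about how that file was laid out. Function names are made up for this example.

```python
import math

profit = [-41.82, -61.82, -45.82, -100.82, -14.82, -30.82, -75.82, -62.82,
          28.18, 12.18, 6.18, -76.82, 1.18, 54.18, 24.18, 2.18, 33.18,
          18.18, -160.82, 51.18]
kurt = [2.97530011, 3.12553938, 2.32907776, 1.9039837, 2.35415757,
        2.23130752, 2.76931501, 1.60114237, 3.29114443, 2.99822995,
        2.27906841, 1.59931087, 6.21691918, 3.15778944, 4.98645291,
        2.84275644, 2.36483006, 2.17924754, 1.86165136, 1.90035084]

def pearson(x, y):
    """Ordinary product-moment correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    """Rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

print("Pearson :", pearson(kurt, profit))
print("Spearman:", spearman(kurt, profit))
```

Because Spearman only uses ranks, the single large kurtosis value (6.2) cannot dominate it the way it can dominate Pearson.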

Think first, then compute.


glmfit in the Statistics Toolbox in MATLAB will do logistic regression (and other stuff too); for example, b = glmfit(x, y, 'binomial') uses the logit link by default.
 
 
  • Post #5
  • Last Post: Feb 16, 2011 11:14am
  •  mfurlend
  • | Joined Apr 2010 | Status: Member | 165 Posts
Quote
How many observations do you have?
I intend to run this process in 3 dimensions, with the 3rd dimension being different optimizations of another variable. This will, of course, yield results with different numbers of trades, anywhere between 500 and 10,000.

Here is a similar process that I applied to 3 continuous values: # bars, kstd, and profit factor. Each point is a different optimization result.

Initial scatterplot:
http://furlender.com/forex/scatterplot.avi

Fitted with polynomial regression:
http://furlender.com/forex/polyfit.avi

I want to achieve something similar to this, except with one of the values being binary.

I will definitely try the statistical tests you mentioned first, and if a relationship is found it looks like the way to go for further information is logistic regression.

Thanks everyone for your answers!

P.S.: OldQuant, when I'm done I'll post the results, since you're interested.
 
 
Forex Factory® is a brand of Fair Economy, Inc.
