[RT] Pre-processing for Neural Networks (apology for length)

To: <realtraders@xxxxxxxxxxxxxxx>
Subject: [RT] Pre-processing for Neural Networks (apology for length)
From: "Robert Hodge" <r-hodge@xxxxxxxxxxxxxxx>
Date: Tue, 18 Jan 2000 08:01:50 -0800

As a neural net newbie I'd like a little nudging in the "right" direction
from more experienced members of this group who are willing to share their
knowledge.

I am trying to work out what pre-processing to perform on the inputs to my
neural nets (using NeuroShell Trader Pro).

According to Klimasauskas in Mark Jurik's Computerized Trading, this process
should be followed:

1) Identify domain specific transformations that could improve the signal
content of each input (eg technical indicators)
2) Make data from 1) stationary by subtracting the mean from each data point.
3) Normalise data from 2) by applying a shaping transformation (eg use log
of stock prices). Keep original 2) datastream as candidate input.
4) Extract a training set of data from current constructed data streams THAT
UNIFORMLY COVERS THE INPUT SPACE.
5) Select a small synergistic set of inputs to train the network...
6) Convert "CONFLICTING" data sets to expected values...

OK. OK. Now here's what I'm thinking for step...

1) I think these "transformations" should be inputs which summarise as well
as possible the current "state" of that input. This sounds like wavelet
territory to me as I believe they are ideal at "compressing" time-series
information into a few coefficients. [WORRY 1: if I have 4 intermarket or
non-price inputs going into an S&P forecasting net, each of these inputs will
need to be broken into several sub-inputs of X wavelet coefficients, yes?
Doesn't this break the "rule" of aiming for less than 10 inputs per neural
net?]
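
To make WORRY 1 concrete, here is a rough sketch of the kind of compression I
have in mind, using the Haar wavelet (the simplest one) in plain Python. The
function names, the 8-point window and the choice of 3 levels are my own
assumptions for illustration, not anything from the book or from NeuroShell.

```python
import math

def haar_step(values):
    """One Haar level: scaled pairwise sums (approximation) and differences (detail)."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail

def haar_summary(window, levels):
    """Keep only the coarsest approximation and coarsest detail coefficients."""
    if len(window) % (2 ** levels) != 0:
        raise ValueError("window length must be divisible by 2**levels")
    approx = list(window)
    detail = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
    return approx + detail

# Toy 8-point window (made-up numbers), squeezed down to 2 coefficients.
window = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeffs = haar_summary(window, levels=3)
```

With these settings each series is reduced to 2 coefficients, so 4 series
would give 8 inputs; keeping more detail per series would push the count past
10 quickly, which is exactly my worry.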

OR perhaps I could accept a large number of candidate inputs and pass them
through some kind of decorrelation/dimension reduction algorithm before they
reach the net? (I don't know what I'm talking about there.)
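
The crudest version of this that I can picture is correlation pruning: keep a
candidate only if it is not highly correlated with one already kept. This is
a simple stand-in of my own, not a proper dimension reduction method such as
principal components, and the series names, numbers and 0.9 threshold below
are all made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def prune_correlated(candidates, threshold=0.9):
    """Keep each candidate only if |r| with every already-kept input is below threshold."""
    kept = {}
    for name, series in candidates.items():
        if all(abs(pearson(series, other)) < threshold for other in kept.values()):
            kept[name] = series
    return kept

# Hypothetical candidate inputs (toy numbers).
candidates = {
    "bond_change": [0.1, -0.2, 0.3, 0.0, -0.1, 0.2],
    "bond_change_x2": [0.2, -0.4, 0.6, 0.0, -0.2, 0.4],  # rescaled duplicate
    "dollar_change": [0.3, 0.1, -0.2, 0.2, -0.3, 0.0],
}
selected = prune_correlated(candidates)
```

The result depends on the order of the candidates, since whichever of a
correlated pair comes first is the one that survives.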

OR, forgoing wavelets, I could use something like a MACD (where long and
short term are encoded in one moving average difference). Does that sound
like a useful track to take?
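
For reference, the MACD idea as I understand it is just a short exponential
moving average minus a long one. A minimal sketch, assuming the usual 12 and
26 period settings and seeding each average with the first price (both are
my assumptions; the function names are mine too):

```python
def ema(values, period):
    """Exponential moving average with smoothing 2/(period+1), seeded with the first value."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd_line(prices, short=12, long=26):
    """Short EMA minus long EMA: one number per bar carrying both horizons."""
    fast = ema(prices, short)
    slow = ema(prices, long)
    return [f - s for f, s in zip(fast, slow)]

# Toy steadily rising price series (made-up numbers).
prices = [100.0 + 0.5 * i for i in range(60)]
macd = macd_line(prices)
```

On a steadily rising series the short average sits above the long one, so the
line is positive; on a flat series it stays at zero.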

2) Take the average of the whole data series and subtract it from each point.
[WORRY 2: if the distribution of input values is highly skewed I may end up
mis-training the network because I have given it "biased" data that will not
generalise well into the future. Even after step 3) the bias will still be
there. Should I not de-trend the original data fed into 1) and then apply
the transformations?]
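
A toy check of what worries me here: on a trending series, subtracting the
overall mean leaves the trend in place, whereas taking first differences (one
simple way to de-trend, and my own choice for this sketch) removes it. The
series and helper names are made up for illustration.

```python
def demean(values):
    """Subtract the whole-series average from every point (step 2 as written)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def first_differences(values):
    """Change from one point to the next: a simple de-trending transform."""
    return [b - a for a, b in zip(values, values[1:])]

def half_means(values):
    """Average of the first half and of the second half, to expose drift."""
    mid = len(values) // 2
    first, second = values[:mid], values[mid:]
    return sum(first) / len(first), sum(second) / len(second)

# Toy series rising by 2 per bar (made-up numbers).
trending = [100.0 + 2.0 * i for i in range(20)]
demeaned_halves = half_means(demean(trending))
differenced_halves = half_means(first_differences(trending))
```

Here the de-meaned series averages -10 in its first half and +10 in its
second, so the net would still see different values early and late, while
the differenced series averages 2 in both halves.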

3) Normalise data. If I de-trended before step 1) then it should either be
already "normalised" or should normalise with log of changes (??). Otherwise
not sure what I can do to normalise...
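
What I mean by "log of changes", as a rough sketch: take the log of each
price ratio, then rescale to zero mean and unit spread. The z-score rescaling
step is my own addition, and the prices are made up.

```python
import math

def log_changes(prices):
    """Log of each bar-to-bar price ratio."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def zscore(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

# Toy price series (made-up numbers).
prices = [100.0, 102.0, 101.0, 105.0, 104.0, 108.0]
scaled = zscore(log_changes(prices))
```

For a real test I assume the mean and spread would have to come from the
training period only, otherwise the scaling itself uses future data.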

4) Sounds tough...Sounds like I need some technique that would tell me what
clusters there are in my N inputs (N dimensions). Dunno how to do this...
not good at imagining N-dimensional space...Common sense says that I could
look at the original inputs to my net and if they seem to trend in one
direction a lot then I should avoid using the trending bit for training the
network.
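
One crude way I can imagine doing this without picturing N dimensions: chop
each input's range into a few bins, treat each combination of bins as a cell,
and cap how many training rows any one cell may contribute. This is only my
guess at what "uniformly covers the input space" could look like; the bin
count, the cap and the data are made up.

```python
from collections import defaultdict

def bin_index(value, low, high, bins):
    """Which of `bins` equal-width bins between low and high the value falls in."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * bins)
    return min(idx, bins - 1)  # the maximum value goes in the last bin

def balanced_sample(rows, bins=3, per_cell=2):
    """Keep at most `per_cell` rows (earliest first) from each cell of the grid."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    cells = defaultdict(list)
    for row in rows:
        key = tuple(bin_index(v, lo, hi, bins)
                    for v, lo, hi in zip(row, lows, highs))
        if len(cells[key]) < per_cell:
            cells[key].append(row)
    return [row for kept in cells.values() for row in kept]

# Toy 2-input data set: four rows crowd one corner, three are spread out.
rows = [(0.1, 0.1), (0.12, 0.09), (0.11, 0.12), (0.13, 0.1),
        (0.5, 0.5), (0.9, 0.9), (0.88, 0.1)]
sample = balanced_sample(rows)
```

The crowded corner is thinned to 2 rows while the sparse cells keep all of
theirs. With many inputs the number of cells explodes, so I doubt this scales
far beyond a handful of inputs.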

5) Haha. Only select inputs that work! Now there's the rub...

6) I'm not sure what this means but let me guess (based on a section earlier
in the chapter). Where I have inputs that are potentially contradictory (eg
shorter period v longer period stochastic) I should reduce the 2 inputs to
one input by working out the expected value of their forecast and feed this
into the network instead...
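
To pin down my guess (and it is only my guess, not necessarily what the
chapter means): for each indicator, look up the average next-period return
that has historically followed its current state, then average the two
expectations into a single input. The "high"/"low" states, the returns and
the equal weighting are all invented for illustration.

```python
from collections import defaultdict

def expected_by_state(states, next_returns):
    """Average next-period return observed after each indicator state."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for s, r in zip(states, next_returns):
        totals[s] += r
        counts[s] += 1
    return {s: totals[s] / counts[s] for s in totals}

def combined_input(short_state, long_state, short_table, long_table):
    """Equal-weight average of the two indicators' expected returns (my assumption)."""
    return (short_table[short_state] + long_table[long_state]) / 2.0

# Toy history: states of a short and a long stochastic, and what followed.
short_states = ["high", "low", "high", "low"]
long_states = ["low", "low", "high", "high"]
next_returns = [-1.0, 2.0, -3.0, 4.0]

short_table = expected_by_state(short_states, next_returns)
long_table = expected_by_state(long_states, next_returns)

# Today the two indicators disagree: short is "high", long is "low".
value = combined_input("high", "low", short_table, long_table)
```

In this toy history the short stochastic carries all the information and the
long one none, so the single combined input leans the short one's way.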



I'd appreciate any input you can make to my comments, especially the
nonsense and/or howlers that make you laugh. It's clear I have a long way to
go but if you can help make the journey that little bit less zig-zaggy I'd
appreciate it.


Thanks,

Robert
