Rolling Average vs Gradient Descent with Softmax & Cross Entropy
Fragment of a discussion from Talk:Rolling Averages
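Going one step further, you no longer use VCS at all; instead, you add together the logits from the velocity bins, acceleration bins, distance bins, and so on. Because softmax exponentiates the summed logits, this structure essentially estimates the probability as a product of per-attribute probabilities, e.g. the probability given that velocity is high multiplied by the probability given that distance is close, then renormalized.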
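To make the "sum of logits equals product of probabilities" point concrete, here is a minimal sketch. The logit values and the three-bin output are made up for illustration; only the identity itself comes from the softmax definition.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits over 3 output bins (e.g. guess factor bins),
# one vector for the active velocity bin, one for the active distance bin.
velocity_logits = [0.2, 1.0, -0.5]
distance_logits = [-0.3, 0.4, 0.9]

# Approach from the discussion: add the logits, then apply softmax once.
summed = [v + d for v, d in zip(velocity_logits, distance_logits)]
p_sum = softmax(summed)

# Equivalent view: multiply the per-attribute probabilities, then renormalize.
p_vel = softmax(velocity_logits)
p_dist = softmax(distance_logits)
product = [a * b for a, b in zip(p_vel, p_dist)]
norm = sum(product)
p_prod = [x / norm for x in product]
```

Both `p_sum` and `p_prod` hold the same distribution up to floating-point error, since exp(a + b) = exp(a) * exp(b) and the normalizing constants cancel.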
If the effects of velocity, distance, etc. on the movement profile are independent, this approach gives mostly the same result as traditional segmented VCS, but with more data points.
Note that this approach is essentially a neural network without hidden units, or multiclass logistic regression.
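As a rough sketch of what that looks like in practice, the following trains one weight vector per attribute bin with plain gradient descent on softmax cross entropy. The bin counts, the learning rate, the names `velocity_w`, `distance_w`, `predict` and `train_step`, and the restriction to two attributes are all assumptions made for the example, not something specified in this discussion.

```python
import math

# Hypothetical sizes, chosen only for illustration.
NUM_OUTPUT_BINS = 5      # e.g. guess factor bins
NUM_VELOCITY_BINS = 3
NUM_DISTANCE_BINS = 4

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One logit vector per attribute bin; no hidden layer anywhere.
velocity_w = [[0.0] * NUM_OUTPUT_BINS for _ in range(NUM_VELOCITY_BINS)]
distance_w = [[0.0] * NUM_OUTPUT_BINS for _ in range(NUM_DISTANCE_BINS)]

def predict(vel_bin, dist_bin):
    """Add the logits of the active bins, then apply softmax."""
    logits = [velocity_w[vel_bin][k] + distance_w[dist_bin][k]
              for k in range(NUM_OUTPUT_BINS)]
    return softmax(logits)

def train_step(vel_bin, dist_bin, target_bin, learning_rate=0.1):
    """One gradient descent step on cross entropy; returns the loss before the update."""
    probs = predict(vel_bin, dist_bin)
    for k in range(NUM_OUTPUT_BINS):
        # Gradient of cross entropy w.r.t. logit k is (p_k - y_k).
        grad = probs[k] - (1.0 if k == target_bin else 0.0)
        # Only the weights of the active bins receive a gradient.
        velocity_w[vel_bin][k] -= learning_rate * grad
        distance_w[dist_bin][k] -= learning_rate * grad
    return -math.log(probs[target_bin])
```

Calling `train_step(vel_bin, dist_bin, target_bin)` once per observation updates only the rows for the bins that were active, which is why each weight sees every observation that shares its velocity bin or its distance bin, rather than only those in one joint segment.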