Awesome entry
Fragment of a discussion from Talk:BeepBoop
It feels like a cascade model (widely used in ads and recommendation): one deep model is put on top of a simple, very fast model, with the output of the latter serving as the input of the former. This architecture cuts computation by orders of magnitude, but it also restricts the power of the deep model. A best practice then is to fit the simple model to the output of the deep one, retrain the deep model on top of the new input, and repeat...
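To make the alternating loop concrete, here is a rough sketch in plain Python. Everything in it is an assumption made for illustration and not taken from the entry: the toy data, the linear scorer standing in for the simple model, the one-hidden-layer tanh network standing in for the deep model, and the names `fit_simple`, `fit_deep` and `cascade_mse`. It only shows the shape of the loop, not how BeepBoop or any production cascade actually works.

```python
import math
import random

random.seed(0)

# Hypothetical toy data: two features and a nonlinear target.
X = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
Y = [math.tanh(2.0 * (a + 0.5 * b)) for a, b in X]

def simple_forward(w, x):
    # Fast stage: a single linear score per example.
    return w[0] * x[0] + w[1] * x[1] + w[2]

def fit_simple(w, targets, epochs=50, lr=0.05):
    # SGD on squared error between the linear score and the targets.
    for _ in range(epochs):
        for x, t in zip(X, targets):
            err = simple_forward(w, x) - t
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            w[2] -= lr * err
    return w

def deep_forward(p, z):
    # Stand-in for the deep stage: it sees only the scalar z from the fast stage.
    return sum(c * math.tanh(a * z + b) for a, b, c in p["units"]) + p["bias"]

def fit_deep(p, zs, targets, epochs=50, lr=0.05):
    # SGD on squared error, with gradients written out by hand.
    for _ in range(epochs):
        for z, t in zip(zs, targets):
            err = deep_forward(p, z) - t
            for u in p["units"]:
                a, b, c = u
                h = math.tanh(a * z + b)
                g = err * c * (1.0 - h * h)  # gradient through the tanh
                u[0] -= lr * g * z
                u[1] -= lr * g
                u[2] -= lr * err * h
            p["bias"] -= lr * err
    return p

def cascade_mse(p, w):
    # Error of the full cascade (simple -> deep) against the true labels.
    total = sum((deep_forward(p, simple_forward(w, x)) - y) ** 2
                for x, y in zip(X, Y))
    return total / len(X)

# Start with the simple model fit directly to the labels.
w = fit_simple([0.0, 0.0, 0.0], Y)
p = {"units": [[random.uniform(-1, 1), 0.0, random.uniform(-1, 1)]
               for _ in range(4)],
     "bias": 0.0}

history = []
for _ in range(3):
    # 1) Train the deep model on top of the current simple-model output.
    zs = [simple_forward(w, x) for x in X]
    fit_deep(p, zs, Y)
    history.append(cascade_mse(p, w))
    # 2) Fit the simple model to the deep model's output, then repeat.
    deep_out = [deep_forward(p, z) for z in zs]
    fit_simple(w, deep_out)

print(history)
```

Whether the error in `history` actually drops from round to round depends on the data and on both models; the sketch makes no claim about that, it only mirrors the "fit simple to deep, retrain deep, repeat" order.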