Sparse coding based anomaly detection has shown promising performance; its key components are feature learning, sparse representation, and dictionary learning. In this work, we propose a new neural network for anomaly detection (termed AnomalyNet) that jointly performs feature learning, sparse representation, and dictionary learning in three neural processing blocks. Specifically, to learn better features, we design a motion fusion block accompanied by a feature transfer block, which together eliminate noisy background, capture motion, and alleviate data deficiency. Furthermore, to address some disadvantages (e.g., non-adaptive updating) of existing sparse coding optimizers and to embrace the merits of neural networks (e.g., parallel computing), we design a novel recurrent neural network that learns the sparse representation and the dictionary by proposing an adaptive iterative soft-thresholding algorithm (adaptive ISTA) and reformulating the adaptive ISTA as a new long short-term memory (LSTM). To the best of our knowledge, this is one of the first works to bridge the ℓ1-solver and the LSTM, and it may provide new insight into understanding LSTMs and model-based optimization (also known as differentiable programming), as well as sparse coding based anomaly detection. Extensive experiments show the state-of-the-art performance of our method on the abnormal event detection task.