The success of deep learning is largely due to the availability of big training data nowadays. However, data privacy could be a big concern, especially when the training or inference is done on untrusted third-party servers. Fully Homomorphic Encryption (FHE) is a powerful cryptography technique that enables computation on encrypted data in the absence of decryption key, thus could protect data privacy in an outsourced computation environment. However, due to its large performance and resource overheads, current applications of FHE to deep learning are still limited to very simple tasks. In this paper, we first propose a neural network training framework on FHE encrypted data, namely PrivGD. PrivGD leverages the Single-Instruction Multiple-Data (SIMD) packing feature of FHE to efficiently implement the Gradient Descent algorithm in the encrypted domain. In particular, PrivGD is the first to support training a multi-class classification network with double- precision float-point weights through approximated Softmax function in FHE, which has never been done before to the best of our knowledge. Then, we show how to apply FHE with transfer learning for more complicated real-world applications. We consider outsourced diagnosis services, as with the Machine-Learning-as-a-Service paradigm, for multi- class machine faults on machine sensor datasets under different operating conditions. As directly applying the source model trained on the source dataset (collected from source operating condition) to the target dataset (collect from the target operating condition) will lead to degraded diagnosis accuracy, we propose to transfer the source model to the target domain by retraining (fine-tuning) the classifier of the source model with data from the target domain. The target domain data is encrypted with FHE so that its privacy is preserved during the transfer learning process. We implement the secure transfer learning process with our PrivGD framework. Experiments results show that by fine-tuning a source model for fewer than 10 epochs with encrypted target domain data, the model can converge to an increased diagnosis accuracy by up to 20%, while the whole fine-tuning process takes approximate 3.85h on our commodity server.