In real-world problems, the following semi-supervised domain adaptation scenario is often encountered: we have full access to the source data, which is usually very large; the target data distribution results from an unknown transformation of the source data distribution; and only a small fraction of the target instances come with labels. The goal is to learn a prediction model that incorporates information from the source domain and generalizes well to the target test instances. We consider an explicit form of the transformation functions, in particular linear transformations that map examples from the source domain to the target domain, and we argue that, with proper preprocessing of the data from both domains, the feasible transformation functions can be characterized by a set of rotation matrices. This naturally leads to an optimization formulation under special orthogonal group constraints. We present an iterative coordinate descent solver that jointly learns the transformation and the model parameters, while a geodesic update ensures that the manifold constraints are always satisfied. Our framework is sufficiently general to work with a variety of loss functions and prediction problems. Empirical evaluations on synthetic and real-world data demonstrate that our method is competitive with the state of the art.
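To make the idea of a geodesic update on the special orthogonal group concrete, the following is a minimal sketch and not the solver described here. It assumes a hypothetical least-squares alignment loss f(R) = 0.5 * ||R X - Y||_F^2 with the model parameters held fixed, and it moves R along the geodesic R exp(-t A), where A is the skew-symmetric part of R^T G and G is the Euclidean gradient. All function names (`geodesic_step`, `alignment_loss`, `alignment_grad`) and the toy data are illustrative assumptions. Only the Python standard library is used so that the sketch is self-contained; in practice a routine such as `scipy.linalg.expm` would replace the hand-written matrix exponential.

```python
import math

def matmul(A, B):
    # Plain nested-list matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=20, squarings=8):
    # Matrix exponential via scaling and squaring with a truncated Taylor series.
    n = len(A)
    scaled = [[a / 2 ** squarings for a in row] for row in A]
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def geodesic_step(R, G, step):
    # Project the Euclidean gradient G onto the tangent space at R:
    # A = (R^T G - G^T R) / 2 is skew-symmetric, so exp(-step * A) is a rotation
    # and R stays on the manifold after the update.
    RtG = matmul(transpose(R), G)
    GtR = transpose(RtG)
    A = [[(x - y) / 2 for x, y in zip(rx, ry)] for rx, ry in zip(RtG, GtR)]
    return matmul(R, expm([[-step * a for a in row] for row in A]))

def alignment_loss(R, X, Y):
    # Hypothetical loss (an assumption of this sketch): 0.5 * ||R X - Y||_F^2.
    RX = matmul(R, X)
    return 0.5 * sum((p - q) ** 2 for rp, rq in zip(RX, Y) for p, q in zip(rp, rq))

def alignment_grad(R, X, Y):
    # Euclidean gradient of the loss above: (R X - Y) X^T.
    RX = matmul(R, X)
    residual = [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(RX, Y)]
    return matmul(residual, transpose(X))

# Toy data: target points are source points rotated by an unknown angle.
theta = 0.7
R_true = [[math.cos(theta), -math.sin(theta)],
          [math.sin(theta), math.cos(theta)]]
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]          # columns are source examples
Y = matmul(R_true, X)          # columns are the transformed examples

R = identity(2)
initial_loss = alignment_loss(R, X, Y)
for _ in range(100):
    R = geodesic_step(R, alignment_grad(R, X, Y), step=0.1)
final_loss = alignment_loss(R, X, Y)
```

Because each update multiplies R by the exponential of a skew-symmetric matrix, the iterate remains a rotation up to floating-point error, with no separate re-projection step. A joint solver would alternate a step like this with an update of the prediction model's parameters, which this sketch omits.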