Telecommunication companies possess mobility information of their phone users, containing accurate locations and velocities of commuters travelling in public transportation system. Although the value of telecommunication data is well believed under the smart city vision, there is no existing solution to transform the data into actionable items for better transportation, mainly due to the lack of appropriate data utilization scheme and the limited processing capability on massive data. This paper presents the first ever system implementation of real-time public transportation crowd prediction based on telecommunication data, relying on the analytical power of advanced neural network models and the computation power of parallel streaming analytic engines. By analyzing the feeds of caller detail record (CDR) from mobile users in interested regions, our system is able to predict the number of metro passengers entering stations, the number of waiting passengers on the platforms and other important metrics on the crowd density. New techniques, including geographical-spatial data processing, weight-sharing recurrent neural network, and parallel streaming analytical programming, are employed in the system. These new techniques enable accurate and efficient prediction outputs, to meet the real-world business requirements from public transportation system.