The capability to learn super big models is becoming crucial in this big data era. For example, one may need to learn an LDA model with millions of topics, or a word embedding model with billions of parameters. However, training such big models turns out to be very challenging: with state-of-the-art machine learning technologies, one has to use a huge number of machines for this purpose, which is clearly beyond the reach of common machine learning practitioners. In this research, we want to answer the question of whether it is possible to train super big machine learning models using just a modest computer cluster. To achieve this goal, we focus on two kinds of innovation. First, we make important modifications to the training procedures of existing machine learning algorithms to make them much more cost-effective. Second, we develop a new parameter-server-based distributed machine learning framework that specifically targets the efficient training of super big models.
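To make the parameter server idea concrete, the following is a minimal in-process sketch of the pattern, not the framework developed in this research: a server holds the global parameters, and workers repeatedly pull a snapshot, compute gradients on their local data, and push updates back. All names here (`ParameterServer`, `pull`, `push`, `worker`) and the toy objective are illustrative assumptions; a real system would shard parameters across machines and communicate over the network.

```python
import threading

class ParameterServer:
    """Minimal in-process parameter server: holds the global model and
    applies updates pushed asynchronously by workers (illustrative sketch)."""

    def __init__(self, params, lr=0.05):
        self.params = dict(params)
        self.lr = lr
        self.lock = threading.Lock()

    def pull(self):
        # Workers fetch a snapshot of the current parameters.
        with self.lock:
            return dict(self.params)

    def push(self, grads):
        # Workers send gradients; the server applies an SGD step.
        with self.lock:
            for name, g in grads.items():
                self.params[name] -= self.lr * g

def worker(ps, steps):
    # Each worker minimizes the toy objective f(w) = (w - 3)^2,
    # standing in for a gradient computed on a local data shard.
    for _ in range(steps):
        w = ps.pull()["w"]
        grad = 2.0 * (w - 3.0)  # df/dw
        ps.push({"w": grad})

ps = ParameterServer({"w": 0.0}, lr=0.05)
workers = [threading.Thread(target=worker, args=(ps, 50)) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
print(round(ps.params["w"], 2))  # converges near the optimum w = 3
```

Even with asynchronous, slightly stale gradients, the small learning rate keeps each applied update contractive, so the shared parameter converges to the optimum; this tolerance to staleness is what lets parameter-server systems scale model training across many workers.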