As computationally powerful hardware has become available to many users, the number of speech processing devices such as smartphones, tablets, and notebooks has increased. As a consequence, speech plays an important role in many applications, e.g., hands-free telephony, digital hearing aids, speech-based computer interfaces, and home entertainment systems. Speech enhancement algorithms attempt to improve the performance of communication systems when their input or output signals are corrupted by noise. To address these issues, we present a hierarchical extreme learning machine (H-ELM) framework aimed at the effective and fast removal of background noise from a single-channel speech signal; it is based on a set of randomly chosen hidden units with analytically determined output weights, and is built by stacking sparse autoencoders. Multi-task learning and transfer learning have recently been adopted to improve the performance of deep learning models. In this study, we adopt these two approaches to build H-ELM model adaptation, in order to investigate the compatibility of H-ELM with such adaptation and to achieve further performance improvements. We train on Aurora-4 and then adapt to TIMIT with the help of the previously trained model. We also compare the Ideal Ratio Mask (IRM) as a training target against direct feature mapping in our experiments. The experimental results indicate that both the H-ELM and the H-ELM model adaptation based speech enhancement techniques consistently outperform the conventional deep denoising autoencoder (DDAE) framework, and that model adaptation further improves the performance of H-ELM when adapted to TIMIT, in terms of standardized objective evaluations under various testing conditions. In addition, the IRM target performs slightly better than direct feature mapping.
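The core ELM idea mentioned above (randomly chosen hidden units whose output weights are determined analytically rather than by iterative backpropagation) can be sketched as follows. This is a minimal single-layer illustration, not the paper's H-ELM implementation; all sizes, names, and the toy regression task are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_elm(X, T, n_hidden=64):
    """Fit an ELM: random hidden weights, analytic output weights.

    The hidden-layer weights W and biases b are drawn at random and never
    trained; only the output weights beta are solved in closed form by
    least squares via the Moore-Penrose pseudoinverse.
    """
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    beta = np.linalg.pinv(H) @ T                     # analytic output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: recover y = sin(x) from samples (stand-in for mapping
# noisy speech features to clean targets or to an IRM).
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = train_elm(X, T, n_hidden=64)
Y = predict_elm(X, W, b, beta)
print(np.max(np.abs(Y - T)))  # small fitting error
```

Because the only trained parameters are obtained in one pseudoinverse computation, training is fast, which is the motivation for using (hierarchies of) such units instead of backpropagation-trained networks.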