Fault diagnosis plays a pivotal role in ensuring production safety and improving the operational efficiency of industrial equipment. However, traditional fault diagnosis methods often extract features poorly and generalize badly in real-world scenarios, where labeled data are scarce and working conditions are complex and variable. These limitations lead to unsatisfactory diagnostic accuracy and reliability. To address these challenges, this paper proposes a deep transfer learning-based fault diagnosis framework that integrates multi-scale feature extraction and multi-layer domain adaptation (MS-ML). First, a multi-scale convolutional neural network (MSCNN) processes vibration signals to extract hierarchical features that capture both local details and global patterns across multiple scales, which strengthens the model's capacity to identify complex fault signatures. Second, a multi-layer domain adaptation strategy minimizes the distribution discrepancy between source-domain and target-domain data, thereby improving generalization under diverse operating conditions. The proposed method is validated on two benchmark datasets: the Jiangnan University dataset and the Paderborn University dataset. Comparative experiments with existing approaches show that the proposed method achieves higher diagnostic accuracy and strong cross-domain adaptability, maintaining stable performance in transfer scenarios. These findings provide theoretical support and a technical basis for developing intelligent diagnostic systems in industrial applications.
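
To make the multi-layer domain adaptation idea concrete, the following is a minimal sketch of a discrepancy loss summed over several feature layers. The abstract does not name the discrepancy measure, so the use of maximum mean discrepancy (MMD) with a Gaussian kernel is an assumption made for illustration, as are the function names (`mmd_squared`, `multi_layer_mmd`), the `bandwidth` and `weights` parameters, and the toy feature values. It is not the paper's implementation.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def gaussian_kernel(x: Vector, y: Vector, bandwidth: float) -> float:
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], bandwidth: float) -> float:
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source: Sequence[Vector], target: Sequence[Vector],
                bandwidth: float = 1.0) -> float:
    """Biased estimate of the squared MMD between two sets of features."""
    return (mean_kernel(source, source, bandwidth)
            + mean_kernel(target, target, bandwidth)
            - 2.0 * mean_kernel(source, target, bandwidth))


def multi_layer_mmd(source_layers: Sequence[Sequence[Vector]],
                    target_layers: Sequence[Sequence[Vector]],
                    weights: Optional[List[float]] = None,
                    bandwidth: float = 1.0) -> float:
    """Weighted sum of per-layer MMD terms (hypothetical adaptation loss).

    source_layers[i] and target_layers[i] hold the features that layer i
    produces for a batch of source and target samples, respectively.
    """
    if len(source_layers) != len(target_layers):
        raise ValueError("source and target must expose the same number of layers")
    if weights is None:
        weights = [1.0] * len(source_layers)
    return sum(w * mmd_squared(s, t, bandwidth)
               for w, s, t in zip(weights, source_layers, target_layers))


# Toy features from two hypothetical layers (values are made up).
source_feats = [
    [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]],   # layer 1, source batch
    [[1.0], [1.1], [0.9]],                  # layer 2, source batch
]
target_feats = [
    [[0.9, 1.0], [1.1, 1.2], [1.0, 0.8]],   # layer 1, target batch (shifted)
    [[2.0], [2.2], [1.9]],                  # layer 2, target batch (shifted)
]

adaptation_loss = multi_layer_mmd(source_feats, target_feats)
print(f"multi-layer MMD loss: {adaptation_loss:.4f}")
```

In a training loop, a term of this kind would typically be added to the classification loss on labeled source data, so that the network learns features that are both discriminative and similarly distributed across domains. The per-layer weights and the kernel bandwidth are tuning choices that the abstract does not specify.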