Despite advances in modern medicine, cardiovascular disease (CVD) remains the leading cause of death worldwide, accounting for 32% of mortality. Although predictive models have already achieved a great deal, currently available systems appear to rely on single-modality data, which may limit their accuracy when assessing comprehensive risk in clinical cases. This study addresses a crucial unmet need for multi-modal CVD prediction by creating an integrated model that harmonizes clinical, imaging, genetic, lifestyle, and environmental data through transfer learning, with the aim of improving both early detection and risk stratification. The research employed a hybrid architecture that combines pre-trained ResNet models for medical imaging analysis with BERT transformers for genetic and textual data, integrated through CNN and fully connected layers. We created and structured a dataset of 100 patient records. The model was trained with 10-fold stratified cross-validation to handle class imbalance. Built in Python using TensorFlow and PyTorch, the system reached 93.06% accuracy in predicting 10-year cardiovascular disease risk, slightly above previously reported results, with a Mean Absolute Error (MAE) of 0.0694. These results exceed those of traditional single-modality methods, demonstrating the value of combining multiple data sources for improved prediction and suggesting promising prospects for early intervention and personalized patient care.
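The evaluation protocol described above (10-fold stratified cross-validation, scored by accuracy and MAE) can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: the function names `stratified_k_fold`, `accuracy`, and `mean_absolute_error`, the round-robin assignment scheme, and the 20%-positive label vector in the demo are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds roughly preserve
    the class proportions of `labels` (hypothetical helper, not the paper's code)."""
    rng = random.Random(seed)

    # Group record indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal each class round-robin across the folds, continuing where the
        # previous class stopped so that fold sizes stay balanced.
        for j, idx in enumerate(members):
            folds[(offset + j) % k].append(idx)
        offset += len(members)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Mean of |true - predicted| over all records."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Assumed class balance for illustration only: 100 records, 20 positive.
    demo_labels = [1] * 20 + [0] * 80
    for fold_no, (train, test) in enumerate(stratified_k_fold(demo_labels), start=1):
        positives = sum(demo_labels[i] for i in test)
        print(f"fold {fold_no}: {len(train)} train, {len(test)} test, {positives} positive")
```

Stratifying the folds keeps the minority class represented in every test split, which is why the technique is commonly paired with imbalanced data. If predictions are hard 0/1 labels, MAE equals one minus accuracy. The reported 93.06% accuracy and MAE of 0.0694 sum to 1 and so are consistent with that reading, although the abstract does not state how MAE was computed.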