Enterprise AI Analysis
DTF-STCANet: A Dual Time-Frequency Swin Transformer and ConvNeXt Attention Network for Heart Sound Classification
Cardiovascular diseases (CVDs) are the leading global cause of mortality, necessitating early and accurate diagnosis. Traditional stethoscope auscultation is subjective and prone to human error. This study introduces DTF-STCANet, a novel AI-driven framework for heart sound classification that addresses the limitations of single-domain representations and monolithic feature extraction. DTF-STCANet integrates spectrogram (STFT) and continuous wavelet transform (CWT) inputs, processed by parallel Swin Transformer and ConvNeXt attention networks, respectively. This dual time-frequency fusion captures both stationary and transient phonocardiogram (PCG) signal characteristics, and a channel attention mechanism refines the fused features, emphasizing discriminative regions. The model achieves a remarkable 99.29% accuracy on the PhysioNet/CinC 2016 dataset, significantly outperforming state-of-the-art methods. A Weighted KNN classifier applied to the learned embedding manifold further enhances class separation and reduces misclassification rates, particularly false negatives, which is crucial for clinical screening.
Executive Impact at a Glance
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
- **Dual Time-Frequency Fusion**: The DTF-STCANet model leverages both Spectrograms (STFT) and Continuous Wavelet Transforms (CWT) of Phonocardiogram (PCG) signals. This dual-input approach is crucial for capturing both stationary (S1, S2 sounds) and transient (murmurs) cardiac events, overcoming the limitations of single-domain representations.
- **Heterogeneous Backbone**: The architecture integrates a Swin Transformer (for global contextual dependencies and long-range relationships in Spectrograms) and a ConvNeXt (for local texture patterns and fine-grained acoustic features in CWT images). This parallel processing enhances representational diversity and robustness.
- **Attention Mechanism**: A channel-wise attention mechanism is applied post-fusion to refine the concatenated features. This mechanism selectively prioritizes discriminative features and suppresses noise, allowing adaptive feature recalibration crucial for sparse pathological acoustic signatures.
- **Weighted KNN Classifier**: Instead of a traditional Softmax layer, a Weighted K-Nearest Neighbor (KNN) algorithm is used for final classification. This geometry-aware decision mechanism operates directly on the learned embedding manifold, creating stronger decision boundaries and improving classification consistency, especially for overlapping or imbalanced feature distributions.
- **High Classification Accuracy**: The DTF-STCANet + KNN model achieved an outstanding 99.29% overall accuracy on the PhysioNet/CinC 2016 Challenge dataset, which represents a significant improvement over existing state-of-the-art methods.
- **Robustness Across Folds**: Performance was validated using a stratified 10-fold cross-validation, demonstrating stable and consistent results (99.29% ± 0.35 accuracy, 95% CI: [98.67–99.51]), confirming the reliability and generalizability of the model.
- **Reduced Misclassification Rates**: The total number of incorrectly classified samples drastically dropped to 35, an 84.2% error reduction compared to the backbone-only model. Crucially, it achieved a very low false-negative rate (22 instances), vital for pathological screening.
- **Superior Sensitivity & Specificity**: The model achieved balanced sensitivity (0.9901) and specificity (0.9894), and an F1-score of 0.9901, indicating strong discriminative capacity for both healthy and unhealthy heart sounds.
- **Early Diagnosis of CVDs**: The high accuracy and low false-negative rate make DTF-STCANet a powerful tool for early and critical diagnosis of cardiovascular diseases, potentially leading to improved patient outcomes and reduced long-term complications.
- **Automated Decision Support**: By integrating AI, the framework overcomes limitations of traditional auscultation, such as reliance on physician expertise, auditory sensitivity, and susceptibility to environmental noise, offering a reliable automated decision support system.
- **Interpretability and Stability**: The geometry-aware Weighted KNN classifier provides stable and interpretable classification boundaries, enhancing trust in the AI system's diagnostic recommendations in a clinical setting.
- **Addressing Data Heterogeneity**: The dual time-frequency fusion and heterogeneous backbone effectively capture complex, non-stationary PCG signal characteristics from varied clinical and non-clinical conditions, making it adaptable to real-world data.
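As a concrete illustration of the dual time-frequency inputs described in the first module, the sketch below builds an STFT spectrogram and a Morlet scalogram from a toy PCG-like signal using plain NumPy. The window length, hop, scales, and wavelet are illustrative assumptions, not the paper's published settings; only the 2000 Hz sampling rate matches the PhysioNet/CinC 2016 recordings.

```python
import numpy as np

def stft_spectrogram(x, fs, win_len=256, hop=64):
    """Magnitude spectrogram via framed FFT with a Hann window."""
    win = np.hanning(win_len)
    frames = [x[i:i + win_len] * win
              for i in range(0, len(x) - win_len + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T   # (freq_bins, time_frames)

def morlet_scalogram(x, scales, w0=6.0):
    """CWT magnitude using a real Morlet wavelet at each scale."""
    rows = []
    for s in scales:
        t = np.arange(-4 * s, 4 * s + 1)
        wav = np.exp(-0.5 * (t / s) ** 2) * np.cos(w0 * t / s)
        rows.append(np.abs(np.convolve(x, wav / np.sqrt(s), mode="same")))
    return np.array(rows)                           # (n_scales, time)

fs = 2000                                           # PhysioNet PCG sampling rate (Hz)
t = np.arange(0, 2.0, 1 / fs)
pcg = np.sin(2 * np.pi * 30 * t) + 0.3 * np.sin(2 * np.pi * 120 * t)  # toy tone mix

spec = stft_spectrogram(pcg, fs)                    # input for the Swin Transformer branch
scal = morlet_scalogram(pcg, scales=np.arange(2, 34))  # input for the ConvNeXt branch
print(spec.shape, scal.shape)
```

The two representations are complementary: the fixed-window spectrogram resolves the quasi-stationary S1/S2 bands, while the multi-scale scalogram localizes short transients such as murmurs.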
Source: Model 4, including Weighted KNN
DTF-STCANet Core Process
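One step of this core process, the post-fusion channel-wise attention described above, can be sketched in squeeze-and-excitation style. The layer sizes, random weights, and reduction ratio below are illustrative assumptions, not the model's actual parameters.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Channel-wise recalibration of a fused feature map (NumPy sketch).

    feat: (C, H, W) concatenated feature map; w1, w2: the two FC weights.
    """
    z = feat.mean(axis=(1, 2))                 # squeeze: global average pool -> (C,)
    h = np.maximum(0.0, w1 @ z)                # excitation: FC + ReLU bottleneck
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))        # FC + sigmoid -> per-channel weights
    return feat * s[:, None, None]             # reweight each channel in place

rng = np.random.default_rng(0)
C, r = 8, 2                                    # channels and reduction ratio (assumed)
fused = rng.normal(size=(C, 4, 4))             # stand-in for Swin+ConvNeXt concat
w1 = rng.normal(size=(C // r, C))              # reduction FC weights (random here)
w2 = rng.normal(size=(C, C // r))              # expansion FC weights (random here)
out = channel_attention(fused, w1, w2)
print(out.shape)                               # same shape, channels rescaled
```

Because each sigmoid weight lies in (0, 1), uninformative channels are attenuated while discriminative ones pass through largely unchanged, which is the adaptive recalibration the module description refers to.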
| Method | Accuracy (%) | Specificity (%) | Sensitivity (%) | F1-Score (%) |
|---|---|---|---|---|
| Ours (DTF-STCANet + KNN) | 99.29 | 98.94 | 99.01 | 99.01 |
| RAMM + NRBMI + SVM [14] | 98.80 | 98.30 | 98.90 | 99.20 |
| Params [13] | 96.40 | 99.10 | 86.50 | 89.30 |
| HS-based Vectors [31] | 95.62 | 97.72 | 87.61 | 89.20 |
| Res2Net-CNN [12] | 91.03 | 95.01 | 74.51 | 77.52 |
Clinical Efficacy: Reducing False Negatives in Heart Sound Screening
In early cardiovascular disease screening, minimizing false negatives is paramount: a missed pathology means delayed diagnosis and potentially severe patient outcomes. The DTF-STCANet model, especially with its Weighted KNN classifier, performed exceptionally on this measure. During cross-validation on the PhysioNet/CinC 2016 dataset (3541 samples), it produced only 22 false negatives out of 816 unhealthy recordings, corresponding to a sensitivity of 97.30% for the unhealthy class, fewer than the alternative classifiers tested (SVM: 25 false negatives; EBT: 112). This low false-negative rate, combined with high overall accuracy, establishes DTF-STCANet as a clinically reliable tool for identifying abnormal heart sounds and a robust decision support system that helps healthcare professionals make timely, accurate referrals.
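The distance-weighted KNN vote on the embedding manifold can be sketched minimally in NumPy. The toy 2-D "embeddings", the 1/distance weighting, and k=5 are illustrative choices, not the model's actual feature space or hyperparameters.

```python
import numpy as np

def weighted_knn_predict(train_X, train_y, query, k=5, eps=1e-9):
    """Distance-weighted KNN vote in an embedding space (NumPy sketch)."""
    d = np.linalg.norm(train_X - query, axis=1)    # Euclidean distance to all points
    idx = np.argsort(d)[:k]                        # k nearest neighbours
    w = 1.0 / (d[idx] + eps)                       # closer neighbours vote harder
    classes = np.unique(train_y)
    scores = np.array([w[train_y[idx] == c].sum() for c in classes])
    return classes[np.argmax(scores)]

rng = np.random.default_rng(0)
# Toy embeddings: class 0 (healthy) near the origin, class 1 (unhealthy) offset
healthy = rng.normal(0.0, 0.5, size=(50, 2))
unhealthy = rng.normal(3.0, 0.5, size=(50, 2))
X = np.vstack([healthy, unhealthy])
y = np.array([0] * 50 + [1] * 50)

print(weighted_knn_predict(X, y, np.array([2.8, 3.1])))  # falls in the unhealthy cluster
```

Because the decision depends only on local geometry around the query, boundary behaviour stays stable for overlapping or imbalanced clusters, which is the property the text credits for the reduced false-negative count.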
Calculate Your Potential ROI
Estimate the efficiency gains and cost savings your enterprise could achieve by implementing DTF-STCANet.
Your DTF-STCANet Implementation Roadmap
A phased approach to integrate advanced AI into your diagnostic workflow, ensuring seamless adoption and measurable impact.
Phase 1: Data Integration & Preprocessing
Securely integrate existing PCG datasets and establish a robust pipeline for Spectrogram and CWT generation. Define normalization and augmentation strategies tailored to your clinical data.
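A Phase 1 normalization-and-segmentation step might look like the following sketch. The z-score normalization and 5-second segment length are illustrative assumptions, not a prescribed configuration; only the 2000 Hz rate matches the PhysioNet/CinC 2016 recordings.

```python
import numpy as np

def preprocess_pcg(x, fs, seg_seconds=5.0):
    """Normalise a PCG recording and split it into fixed-length segments.

    Sketch only: the segment length and z-score normalisation are
    illustrative choices for a Phase 1 pipeline, not a published recipe.
    """
    x = (x - x.mean()) / (x.std() + 1e-9)      # z-score amplitude normalisation
    seg_len = int(seg_seconds * fs)
    n = len(x) // seg_len                      # drop the trailing partial segment
    return x[: n * seg_len].reshape(n, seg_len)

fs = 2000                                      # PhysioNet/CinC 2016 rate (Hz)
rec = np.random.default_rng(1).normal(size=fs * 12)  # 12 s toy recording
segs = preprocess_pcg(rec, fs)
print(segs.shape)                              # two 5 s segments of 10000 samples
```

Each fixed-length segment would then feed the spectrogram and CWT generators, so segment length should be chosen jointly with the time-frequency window settings.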
Phase 2: Model Adaptation & Fine-tuning
Adapt the pre-trained Swin Transformer and ConvNeXt backbones to your specific data characteristics. Fine-tune the DTF-STCANet architecture to optimize performance on your internal validation sets.
Phase 3: Integration with Clinical Workflow
Deploy the DTF-STCANet model into your diagnostic systems, ensuring seamless integration with existing EHRs and PACS. Develop user interfaces for clinicians to interpret AI-driven insights.
Phase 4: Continuous Monitoring & Improvement
Implement real-time performance monitoring and feedback loops for continuous model improvement. Establish a framework for regular updates and retraining with new clinical data to maintain diagnostic accuracy.
Unlock Advanced Diagnostic Capabilities
Ready to transform your healthcare diagnostics? Schedule a personalized strategy session to explore how DTF-STCANet can integrate with your existing systems and improve patient outcomes.