No Labels: Hard to evaluate model performance.
Interpretability: Discovered patterns may lack clear meaning.
Scalability: Large datasets require high computing power.
Algorithm Selection: The right method depends on the data.
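The "no labels" challenge can be partly addressed with internal metrics that score a clustering using only the data itself. As a minimal sketch (assuming scikit-learn is available, on synthetic data invented for illustration), the silhouette score rates how well-separated clusters are without any ground truth:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Two well-separated synthetic blobs; no ground-truth labels are used anywhere.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Silhouette ranges from -1 to 1; higher means tighter, better-separated clusters.
score = silhouette_score(X, labels)
print(round(score, 2))
```

Comparing silhouette scores across different values of K is one common way to pick the number of clusters when no labels exist.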
Customer Segmentation: Grouping users based on behaviour.
Fraud Detection: Spotting anomalies in transactions.
Topic Modeling: Extracting hidden themes from text.
Gene Expression Analysis: Finding patterns in biological data.
PCA: Reduces features while keeping variance.
t-SNE: Maps high-dimensional data into 2D/3D.
Auto-encoders: Neural networks that learn compressed data representations.
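PCA is the simplest of these to demonstrate. The sketch below (assuming scikit-learn; the 3-D dataset is synthetic and chosen so most variance lies along one direction) reduces three features to one while reporting how much variance survives:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic 3-D data that mostly varies along a single direction, plus small noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t, -t]) + rng.normal(scale=0.05, size=(200, 3))

pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)   # shape (200, 1) instead of (200, 3)

# Fraction of the total variance captured by the single retained component.
print(X_reduced.shape, round(pca.explained_variance_ratio_[0], 3))
```

The `explained_variance_ratio_` attribute is the usual check that reduction has not discarded the signal.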
K-Means: Assigns data points to K clusters.
Hierarchical Clustering: Creates nested clusters.
DBSCAN: Detects clusters of varying density.
Gaussian Mixture Models (GMM): Assigns probability distributions to clusters.
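A key practical difference among these algorithms is whether the number of clusters must be specified up front. This sketch (assuming scikit-learn; blob positions and `eps` are illustrative choices) contrasts K-Means, which is told K, with DBSCAN, which infers the count from density:

```python
import numpy as np
from sklearn.cluster import KMeans, DBSCAN

# Three compact synthetic blobs at different centres.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, (40, 2)) for c in ((0, 0), (5, 5), (0, 5))])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
db = DBSCAN(eps=1.0, min_samples=5).fit(X)

print(len(set(km.labels_)))           # K-Means always returns exactly K clusters
print(len(set(db.labels_) - {-1}))    # DBSCAN infers the count; label -1 marks noise
```

On well-separated blobs both agree; DBSCAN's advantage shows on irregular shapes and when outliers should be left unassigned.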
Clustering: Finds natural groupings (e.g., customer segmentation).
Dimensionality Reduction: Simplifies data while retaining patterns (e.g., PCA, t-SNE).
Anomaly Detection: Identifies unusual data points (e.g., fraud detection).
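Anomaly detection can be sketched with an Isolation Forest, which scores points by how easily random splits isolate them. The data below is synthetic (normal points plus a few planted outliers) and the `contamination` value is an illustrative choice, assuming scikit-learn:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# 200 synthetic "normal" points plus 5 planted, obviously unusual points.
rng = np.random.default_rng(0)
normal = rng.normal(0, 1, (200, 2))
outliers = rng.uniform(8, 10, (5, 2))
X = np.vstack([normal, outliers])

iso = IsolationForest(contamination=0.025, random_state=0).fit(X)
pred = iso.predict(X)            # +1 = inlier, -1 = anomaly

# How many of the 5 planted outliers were flagged as anomalies.
print(int((pred[-5:] == -1).sum()))
```

In a fraud setting, the flagged points would be the transactions routed for manual review.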
The model processes input data (X) without labeled outcomes (Y). It detects similarities, differences, and structures. Two main tasks:
Clustering: Grouping similar data points.
Dimensionality Reduction: Compressing data while preserving key features.
Data Dependency: Requires large, high-quality labeled datasets.
Overfitting: Models may memorise training data instead of generalising.
Bias and Variance: Finding the right balance is crucial.
Scalability: Training large models demands high computational power.
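Overfitting shows up as a gap between training and test performance, which is why data is split before fitting. As a sketch (assuming scikit-learn; the noisy synthetic labels are constructed so that memorisation cannot generalise), an unconstrained decision tree scores perfectly on data it has seen and much worse on data it has not:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: only the first feature carries signal, and labels are noisy.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + rng.normal(scale=1.0, size=300) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unconstrained tree can memorise the training set exactly.
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print(round(deep.score(X_tr, y_tr), 2), round(deep.score(X_te, y_te), 2))
```

Constraining the model (e.g. limiting tree depth) trades variance for bias, which is the balance the notes above refer to.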
Fraud detection in finance.
Medical diagnosis using patient data.
Customer churn prediction in business.
Sentiment analysis in NLP.
Image recognition for security and automation.
Neural Networks - Inspired by the human brain; excel at complex pattern recognition.
Gradient Boosting (XGBoost, LightGBM) - Powerful for structured data and competitions.
K-Nearest Neighbours (KNN) - A simple but effective method for classification.
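KNN is the easiest of these to demonstrate end to end, since it has no training phase beyond storing the data. A minimal sketch, assuming scikit-learn and using its bundled Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each test point is classified by majority vote among its 5 nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
acc = knn.score(X_te, y_te)
print(round(acc, 2))
```

The choice of `n_neighbors` controls the bias-variance trade-off: small values fit noise, large values oversmooth.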
Linear Regression - Models linear relationships between variables.
Logistic Regression - Used for binary classification.
Decision Trees - Split data based on feature conditions.
Random Forests - An ensemble of decision trees.
SVM (Support Vector Machine) - Finds the maximum-margin decision boundary between classes.
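Logistic regression is a good first example from this list: despite the name, it classifies by modelling the probability of the positive class. A sketch on synthetic two-class data, assuming scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two synthetic classes separated along the first feature.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)

preds = clf.predict([[-2.0, 0.0], [2.0, 0.0]])   # one point from each side
proba = clf.predict_proba([[2.0, 0.0]])[0, 1]    # model's probability of class 1
print(preds, round(proba, 2))
```

`predict_proba` is what makes the model useful for ranked decisions (e.g. flagging the riskiest cases first) rather than hard labels alone.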
Classification: Predicts categorical labels (e.g., spam detection, medical diagnosis).
Regression: Predicts continuous values (e.g., stock prices, temperature forecasting).
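Regression's continuous output is easiest to see on synthetic data where the true relationship is known. This sketch (assuming scikit-learn; the slope and intercept are invented for illustration) fits a line to noisy samples of y = 3x + 2 and recovers the coefficients:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noisy synthetic samples of the line y = 3x + 2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=100)

reg = LinearRegression().fit(X, y)

# The fitted slope and intercept should be close to the true 3 and 2.
print(round(reg.coef_[0], 1), round(reg.intercept_, 1))
```

Unlike a classifier, the model's `predict` returns arbitrary real numbers, which is what stock-price or temperature forecasting requires.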
The model is trained on input-output pairs.
It identifies patterns and learns a function that generalizes to unseen data.
Performance is measured using metrics like accuracy, mean squared error, precision, and recall.
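The metrics named above can be computed directly from predictions. A sketch with tiny hand-made label vectors (assuming scikit-learn), showing accuracy, precision, and recall for classification and mean squared error for regression:

```python
from sklearn.metrics import (accuracy_score, mean_squared_error,
                             precision_score, recall_score)

# Toy classification labels: one false negative at position 2.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

acc = accuracy_score(y_true, y_pred)     # fraction of predictions that are correct
prec = precision_score(y_true, y_pred)   # of predicted positives, how many are real
rec = recall_score(y_true, y_pred)       # of real positives, how many were found

# Toy regression error: mean of squared differences.
mse = mean_squared_error([2.0, 4.0], [2.5, 3.5])

print(round(acc, 3), prec, rec, mse)
```

Precision and recall matter when classes are imbalanced (e.g. fraud), where accuracy alone can look deceptively high.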