Curated, Ready-to-Use Labeled Datasets

High-quality annotated datasets across text, image, video, and time series — built to accelerate your model development.

Text & NLP

Coming Soon
Sentiment Analysis Corpus

Product and service reviews across 12 industry verticals, labeled positive, negative, or neutral at both sentence and document level. Includes aspect-level annotations for fine-grained sentiment tasks.

200K samples
JSON, CSV
sentiment, classification, reviews, aspect-level

Coming Soon
Intent Classification Dataset

User utterances across 40 intent classes for conversational AI and virtual assistant training. Includes paraphrase variants and out-of-scope examples for robust classifier evaluation.

60K utterances
CSV, JSON
intent, chatbot, conversational AI, classification

Coming Soon
Text Summarisation Pairs

Article–summary pairs sourced from news and academic publications, human-validated for faithfulness and coverage. Suitable for abstractive and extractive summarisation model training.

120K pairs
JSON, Parquet
summarisation, NLG, news, academic

Computer Vision

Coming Soon
Multi-Class Object Detection Dataset

Diverse real-world images annotated with tight bounding boxes across 80 object categories. Includes crowd annotations, occlusion flags, and difficulty ratings per instance.

120K images
COCO JSON, Pascal VOC, YOLO
bounding box, detection, multi-class, occlusion

Coming Soon
Retail Product Recognition

Product images captured across lighting conditions and angles, annotated with category labels and bounding boxes. Covers 2,000+ SKUs from FMCG, electronics, and apparel sectors.

300K images
COCO JSON, CSV
retail, e-commerce, product detection, SKU

Time Series

Coming Soon
Financial Market Events

Five years of equities tick data with annotated microstructure events including momentum bursts, spoofing signals, and liquidity gaps. Suitable for trading signal and market surveillance models.

5-year history
CSV, HDF5
finance, events, trading, market microstructure

Coming Soon
Clinical Vitals Monitoring

ICU patient vital signs (HR, SpO2, BP, RR) with annotated clinical events including deterioration episodes and intervention timestamps. De-identified per HIPAA guidelines.

500K episodes
CSV, Parquet
healthcare, vitals, ICU, clinical events

Specialized Data

Coming Soon
Autonomous Driving LiDAR

3D point cloud scenes from urban and highway environments with 3D bounding box annotations for vehicles, pedestrians, cyclists, and road furniture across 12 object classes.

8K scenes
PCD, LAS, JSON
LiDAR, autonomous driving, 3D, point cloud

Coming Soon
Scientific Figure Classification

Figures extracted from peer-reviewed papers, annotated by type (graph, diagram, microscopy, table, chart) and sub-type. Useful for academic document AI and paper-understanding models.

90K figures
JSON, CSV
scientific, document AI, figures, classification

Need a Custom Dataset?

We build bespoke labeled datasets to your exact schema, quality bar, and delivery timeline.