Standard Dataset
Large-scale, high-quality and scenario-based AI datasets have become the key to AI algorithm research and technology development. On the one hand, the AI academic research field is placing higher requirements on dataset scale and data annotation
Datasets List (22)
Dataset ID
MD-Fashion-1
DESCRIPTION
[ Source ] Web collection, covering typical scenarios such as e-commerce, fashion shows, social networking and offline user-generated content.
[ Annotation ] Classification label, bounding box.
More than 80 categories of labels, covering gender, clothing types and styles, scenarios, etc.
VOLUME
~2M
APPLICATION SCENARIO
E-commerce, Smart Retail, O2O, Social Media, Visual Entertainment
Dataset ID
MD-Fashion-2
DESCRIPTION
[ Source ] Web collection, covering typical scenarios such as e-commerce, fashion shows, social networking and offline user-generated content.
[ Annotation ] Classification label.
More than 30 categories of common patterns.
VOLUME
~200K
APPLICATION SCENARIO
E-commerce, Smart Retail, O2O, Social Media, Visual Entertainment
Dataset ID
MD-Fashion-3
DESCRIPTION
[ Source ] Web collection, covering typical scenarios such as e-commerce, fashion shows, social networking and offline user-generated content.
[ Annotation ] Contour segmentation
Including the main human parts, clothing, accessories, etc.
VOLUME
~500K
APPLICATION SCENARIO
E-commerce, Smart Retail, O2O, Social Media, Visual Entertainment
Dataset ID
MD-Fashion-4
DESCRIPTION
[ Source ] Web collection, covering typical scenarios such as e-commerce, fashion shows, social networking and offline user-generated content.
[ Annotation ] Segmentation, label.
Region segmentation of 11 common fabric categories across 80 clothing types.
VOLUME
~200K
APPLICATION SCENARIO
E-commerce, Smart Retail, O2O, Social Media, Visual Entertainment
Dataset ID
MD-Fashion-5
DESCRIPTION
[ Source ] Web collection, covering typical scenarios such as e-commerce, fashion shows, social networking and offline user-generated content.
[ Annotation ] Classification label, bounding box, key point.
Includes key points for 80 clothing types.
VOLUME
~1M
APPLICATION SCENARIO
E-commerce, Smart Retail, O2O, Social Media, Visual Entertainment
Dataset ID
MD-Fashion-6
DESCRIPTION
[ Source ] Real scenes. The collection equipment includes phones, cameras and tablet PCs. The image resolution is above 4000*3000, and the image format is JPG.
[ Annotation ] Polygon+OCR
VOLUME
~20K
APPLICATION SCENARIO
E-commerce, Smart Retail, O2O, Social Media, Visual Entertainment
Dataset ID
MD-OCR-001
DESCRIPTION
[ Source ] Real-scene signboards in English and Chinese. The collection equipment includes phones, cameras and tablet PCs.
[ Annotation ] Polygon + Text.
Annotated at both the single-character and sentence level.
[ Definition ] Includes enterprise name, branch name, slogan, business scope, address and telephone number, etc.
VOLUME
~30K
APPLICATION SCENARIO
Retail, Tourism, Catering
Dataset ID
MD-OCR-002
DESCRIPTION
[ Source ] Bills in different scenes. The collection equipment includes phones, cameras and tablet PCs.
It covers over 10 kinds of common bills in mainland China, including flight itineraries, train tickets, hotel bills, tickets, taxi receipts, quota invoices, value-added tax invoices, toll invoices and coach ticket invoices.
[ Annotation ] Polygon+OCR
[ Definition ] Over 20 label types, including category, province, quality, code, number, invoice date, enterprise/certificate number, tax, telephone, car license, ID, boarding time, drop-off time, price, mileage, wait time, surcharge, service charge and official receipts, etc.
VOLUME
~6K
APPLICATION SCENARIO
Retail, Tourism, Catering
Dataset ID
MD-OCR-003
DESCRIPTION
[ Source ] Real scenes. The collection equipment includes phones, cameras and tablet PCs. The image resolution is above 4000*3000, and the image format is JPG.
[ Annotation ] Polygon+OCR
VOLUME
~12K
APPLICATION SCENARIO
Mobile
Dataset ID
MD-OCR-004
DESCRIPTION
[ Source ] Real scenes. The collection equipment includes phones, cameras and tablet PCs.
[ Annotation ] Polygon+OCR. Includes IP address and password.
VOLUME
~1K
APPLICATION SCENARIO
Mobile, Tourism, Catering
Dataset ID
MD-OCR-005
DESCRIPTION
[ Source ] Real scenes. The collection equipment includes phones, cameras and tablet PCs. The image resolution is above 4000*3000, and the image format is JPG.
[ Annotation ] Polygon+OCR
VOLUME
~20K
APPLICATION SCENARIO
Catering, Tourism
Dataset ID
MD-OCR-006
DESCRIPTION
[ Source ] Web collection, covering indoor, outdoor, natural, garden and other typical scenes.
[ Annotation ] Classification label.
Thousands of different animals, including mammals, aquatic animals and amphibians, distributed across Asia, Europe, Africa, North America and South America.
VOLUME
~33K
APPLICATION SCENARIO
Tourism, Retail, Catering
Dataset ID
MD-OCR-007
DESCRIPTION
[ Source ] Real scenes. The collection equipment includes phones, cameras and tablet PCs.
The images include product packaging, signboards, signposts, posters, parking lots, vehicle body advertising, food packaging, text on buildings, signs and book covers, etc.
[ Annotation ] Polygon+OCR
The text includes simplified Chinese, English, Arabic numerals and common symbols (commas, periods, spaces, etc.).
VOLUME
~38K
APPLICATION SCENARIO
Tourism, Retail, Catering
Dataset ID
MD-OCR-008
DESCRIPTION
[ Source ] Real scenes. The collection equipment includes phones, cameras and tablet PCs. The image resolution is above 4000*3000, and the image format is JPG.
It covers over 10 scenes with text in Arabic, Thai, Vietnamese, Hindi, English and Chinese, including product packaging, signboards, signposts, posters, text on electrical appliances, parking lots, text on clothing, text on buildings, signs, menus, book covers, shopping prompts and text at tourist spots.
[ Annotation ] Polygon+OCR
VOLUME
~150K
APPLICATION SCENARIO
Tourism, Retail, Catering
Dataset ID
MD-OCR-009
DESCRIPTION
[ Source ] Web collection, covering indoor, outdoor, natural, garden and other typical scenes.
[ Annotation ] Classification label.
Thousands of different animals, including mammals, aquatic animals and amphibians, distributed across Asia, Europe, Africa, North America and South America.
VOLUME
~40K
APPLICATION SCENARIO
Tourism, Retail, Catering
Dataset ID
MD-OCR-010
DESCRIPTION
[ Source ] Screenshots of web pages and manuscripts. The image format is JPG.
[ Annotation ] Polygon+OCR
VOLUME
~1K
APPLICATION SCENARIO
News, Tourism
Dataset ID
MD-OCR-011
DESCRIPTION
[ Source ] Common medical documents, including medical invoices, billing lists, expense lists and medical records.
[ Annotation ] Polygon+OCR, personal privacy information preprocessing.
[ Definition ] Text and category information of medical invoices, billing lists, expense lists and case reports.
VOLUME
~10K
APPLICATION SCENARIO
Healthcare
Dataset ID
MD-OCR-012
DESCRIPTION
[ Source ] Health examination reports.
[ Annotation ] Polygon+OCR, personal privacy information preprocessing.
VOLUME
~3K
APPLICATION SCENARIO
Healthcare
Dataset ID
MD-OCR-013
DESCRIPTION
[ Source ] Photographs of handwritten compositions and screenshots of manuscripts.
[ Annotation ] Polygon+OCR
VOLUME
~1K
APPLICATION SCENARIO
Education
Dataset ID
MD-OCR-014
DESCRIPTION
[ Source ] Real scenes. The collection equipment includes phones, cameras and tablet PCs. The image resolution is above 4000*3000, and the image format is JPG.
Irregularly shaped text in Chinese and English, such as dense text, vertical text, curved text, artistic lettering and difficult samples, covering posters, advertisements, commodity packaging, book pages, magazine pages, newspapers, clothing, logos, jerseys, home emblems, seals, shop signs, home decoration, etc.
[ Annotation ] Polygon + Text
VOLUME
~30K
APPLICATION SCENARIO
Tourism, Retail, Mobile
Dataset ID
MD-Image-001
DESCRIPTION
[ Source ] Web collection, covering indoor, outdoor, natural, garden and other typical scenes.
[ Annotation ] Classification label.
Thousands of different animals, including mammals, aquatic animals and amphibians, distributed across Asia, Europe, Africa, North America and South America.
VOLUME
~280K
APPLICATION SCENARIO
Tourism, Entertainment, Education
Dataset ID
MD-Image-002
DESCRIPTION
[ Source ] Web collection, indoor scenes.
[ Annotation ] Contour segmentation
Nearly 200 kinds of food, including Chinese food, western food, Japanese food, fast food, bread and dessert, etc.
VOLUME
~30K
APPLICATION SCENARIO
Catering, Tourism
