Dataset

Forgespectrum 114k

by hardiksharma6555 hardiksharma6555/forgespectrum-114k
Free2AITools Nexus Index
59.6
S: Semantic 50

Query-time baseline · scored live at search

A: Authority 62
P: Popularity 52
R: Recency 99
Q: Quality 50
Tech Context
Vital Performance
Data Integrity 59.6 FNI Score
Dataset Information Summary
Entity Passport
Registry ID hardiksharma6555/forgespectrum-114k
License CC-BY-NC-4.0
Provider huggingface

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset_hardiksharma6555_forgespectrum_114k,
  author = {hardiksharma6555},
  title = {Forgespectrum 114k Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/hardiksharma6555/forgespectrum-114k}},
  note = {Accessed via Free2AITools.}
}
APA Style
hardiksharma6555. (2026). Forgespectrum 114k [Dataset]. Free2AITools. https://huggingface.co/datasets/hardiksharma6555/forgespectrum-114k

Technical Deep Dive

βš–οΈ Free2AITools Nexus Index V2.0

Index Insight

FNI V2.0 for Forgespectrum 114k: Authority (A:62), Popularity (P:52), Recency (R:99), Quality (Q:50). Semantic (S) is a query-time baseline scored live at search.

Data Sources / Provenance

Open data · Updated: Live data
Downloads: 33,477

Task Categories

image-classification

πŸ‘οΈ Data Preview

πŸ“Š

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

πŸ”— Explore Full Dataset β†—

🧬 Field Logic

🧬

Schema not yet indexed for this dataset.

Dataset Specification

ForgeSpectrum (v3): AI-Generated Image Detection with Reasoning Traces

ForgeSpectrum is a multi-domain corpus for AI-generated / manipulated image detection, annotated by Gemini-2.5-Pro with structured forensic reasoning traces (<fast>/<planning>/<reasoning>/<reflection>/<conclusion> patterns) plus per-image attributes and suspicious-region notes.
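The tagged trace structure described above could be split into sections along these lines. This is a minimal sketch that assumes each section is wrapped in matching open and close tags; the on-disk annotation format is not shown in this listing, and the `parse_trace` helper and the example trace text are hypothetical.

```python
import re

# Section tags named in the dataset description. How traces are actually
# stored is an assumption made for this sketch.
TRACE_TAGS = ("fast", "planning", "reasoning", "reflection", "conclusion")


def parse_trace(trace: str) -> dict[str, str]:
    """Return the text of each tagged section found in a trace string."""
    sections = {}
    for tag in TRACE_TAGS:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", trace, flags=re.DOTALL)
        if match:
            sections[tag] = match.group(1).strip()
    return sections


# Hypothetical trace text, invented for illustration only.
example = (
    "<planning>Check skin texture and background edges.</planning>"
    "<reasoning>Texture is overly smooth near the jawline.</reasoning>"
    "<conclusion>fake</conclusion>"
)
sections = parse_trace(example)
print(sections["conclusion"])
```

Sections that are absent from a given trace are simply omitted from the returned dictionary, so callers can check which pattern a trace follows by inspecting its keys.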

v3: what changed

v3 is the cleaned, balanced release:

  • 3 domains: faces, scenes, id_cards (docs and scene_text removed; see below).
  • id_cards rebalanced: real IDs sourced from MIDV-2020 (2,938 genuine passport/ID images: Finnish ID, Latvian passport, Russian internal passport, Slovak ID), raising id_cards reals from 94 to 2,913.
  • docs dropped: the synthetic tampered-document fakes were lost from source and are not redistributable; only real docs remained, so the domain was removed.
  • scene_text dropped: corpus contained no real scene-text images (binary task ill-posed).

Splits (`disjoint_v3/`)

| Split | Images | Real | Fake | Real % | Notes |
|---|---|---|---|---|---|
| train | 67,306 | 31,992 | 35,314 | 47.5% | |
| val | 8,880 | 4,179 | 4,701 | 47.1% | |
| test_clean | 8,425 | 4,184 | 4,241 | 49.7% | |
| test_divergent | 14,658 | 3,180 | 11,478 | 21.7% | agreement-only |
| test_protocol2 | 5,238 | 2,913 | 2,325 | 55.6% | leave-domain-out |
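The Real % column follows directly from the real and fake counts. The short sketch below recomputes it from the figures in the split table; the `SPLITS` dictionary and `real_share` helper are illustrative names, not part of the dataset.

```python
# (real, fake) counts copied from the split table.
SPLITS = {
    "train": (31_992, 35_314),
    "val": (4_179, 4_701),
    "test_clean": (4_184, 4_241),
    "test_divergent": (3_180, 11_478),
    "test_protocol2": (2_913, 2_325),
}


def real_share(real: int, fake: int) -> float:
    """Percentage of real images in a split."""
    return 100.0 * real / (real + fake)


for name, (real, fake) in SPLITS.items():
    total = real + fake
    print(f"{name:15s} {total:>7,d} images  {real_share(real, fake):.1f}% real")
```

The heavy fake majority in test_divergent means plain accuracy on that split is dominated by the fake class, so a balanced metric may be more informative there.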

Split protocol: fakes are generator-disjoint across train/val/test (a generator seen in train never appears in val or test), so evaluation measures cross-generator generalization. Reals are split at the image level (each domain has a single real-capture source).
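A generator-disjoint split can be sketched by assigning whole generators, rather than individual images, to a split. The code below is one possible approach under assumed inputs, not the procedure the dataset authors used: the `generator_disjoint_split` function, the 80/10/10 fractions, and the generator names are all hypothetical.

```python
import random
from collections import defaultdict


def generator_disjoint_split(fakes, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole generators to train/val/test so no generator is shared.

    `fakes` is a list of (image_id, generator_name) pairs. The fractions apply
    to the number of generators, not the number of images.
    """
    generators = sorted({gen for _, gen in fakes})
    random.Random(seed).shuffle(generators)
    n_train = int(len(generators) * fractions[0])
    n_val = int(len(generators) * fractions[1])

    # Map each generator to exactly one split.
    assignment = {}
    for i, gen in enumerate(generators):
        if i < n_train:
            assignment[gen] = "train"
        elif i < n_train + n_val:
            assignment[gen] = "val"
        else:
            assignment[gen] = "test"

    # Every image follows its generator into that split.
    splits = defaultdict(list)
    for image_id, gen in fakes:
        splits[assignment[gen]].append(image_id)
    return dict(splits), assignment


# Hypothetical data: 10 generators with 5 images each.
fakes = [(f"img_{g}_{i}", f"gen_{g}") for g in range(10) for i in range(5)]
splits, assignment = generator_disjoint_split(fakes)
print({name: len(ids) for name, ids in splits.items()})
```

Because assignment happens per generator, split sizes in images can drift from the requested fractions when generators contribute unequal numbers of images.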

Per-domain totals (supervised splits)

| Domain | Real | Fake | Fake generators |
|---|---|---|---|
| faces | 9,564 | 17,606 | 36 |
| scenes | 27,878 | 24,325 | 31 |
| id_cards | 2,913 | 2,325 | 5 |
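Summing the per-domain rows reproduces the combined train, val, and test_clean counts from the split table. That suggests those three splits are what "supervised splits" refers to, although this reading is inferred from the numbers rather than stated on the card. A quick arithmetic check, with illustrative variable names:

```python
# (real, fake) counts per domain, copied from the per-domain table.
DOMAINS = {
    "faces": (9_564, 17_606),
    "scenes": (27_878, 24_325),
    "id_cards": (2_913, 2_325),
}

# (real, fake) counts for the splits ASSUMED to be the "supervised splits".
SUPERVISED = {
    "train": (31_992, 35_314),
    "val": (4_179, 4_701),
    "test_clean": (4_184, 4_241),
}

domain_real = sum(real for real, _ in DOMAINS.values())
domain_fake = sum(fake for _, fake in DOMAINS.values())
split_real = sum(real for real, _ in SUPERVISED.values())
split_fake = sum(fake for _, fake in SUPERVISED.values())

print(domain_real, split_real)  # real totals from each table
print(domain_fake, split_fake)  # fake totals from each table
print(domain_real + domain_fake)  # images across the three domains
```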

Files

  • `disjoint_v3/{train,val,test_clean,test_d

Social Proof

HuggingFace Hub
33.5K Downloads
Updated daily

Source summary: Based on Hugging Face metadata. Not a recommendation.

πŸ›‘οΈ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

Identity & Source

id
hf-dataset--hardiksharma6555--forgespectrum-114k
slug
hardiksharma6555--forgespectrum-114k
source
huggingface
author
hardiksharma6555
license
CC-BY-NC-4.0
tags
task_categories:image-classification, task_categories:visual-question-answering, language:en, license:cc-by-nc-4.0, size_categories:10k<n<100k, format:imagefolder, modality:image, library:datasets, library:mlcroissant, region:us, deepfake-detection, ai-generated-image-detection, forensics, vlm, reasoning-traces

βš™οΈ Technical Specs

architecture
null
params billions
null
context length
116,736
pipeline tag

Engagement & Metrics

downloads
33,477
stars
null
forks
null

Data indexed from public sources. Updated daily.