Paper

EDGAR-CORPUS: Billions of Tokens Make The World Go Round

by Independent / Community 00b2de0153e32de9b86da772ad5cdc2b0cac1002
Free2AITools Nexus Index
68.7
S: Semantic 50

Query-time baseline · scored live at search

A: Authority 84
P: Popularity 60
R: Recency 100
Q: Quality 65

We release EDGAR-CORPUS, a novel corpus comprising annual reports from all the publicly traded companies in the US spanning a period of more than 25 years. To the best of our knowledge, EDGAR-CORPUS is the largest financial NLP corpus available to date. All the reports are downloaded, split into their corresponding items (sections), and provided in a clean, easy-to-use JSON format. We use EDGAR-CORPUS to train and release EDGAR-W2V, which are WORD2VEC embeddings for the financial domain. We e...
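As a rough illustration of what "split into items and provided in JSON" could look like in practice, the sketch below parses one report and keeps its non-empty item sections. The field names (`filename`, `year`, `item_1`, `item_1A`, `item_7`) and the sample text are assumptions made for this example; this page does not document the actual schema, so check the official release before relying on them.

```python
import json

# Hypothetical record: keys and values are illustrative assumptions,
# not a schema confirmed by this page.
sample = json.dumps({
    "filename": "example_10K.json",
    "year": "2020",
    "item_1": "Business overview text.",
    "item_1A": "Risk factors text.",
    "item_7": "Management discussion text.",
})


def extract_items(raw: str) -> dict:
    """Return only the non-empty item sections from one JSON report."""
    record = json.loads(raw)
    return {
        key: value
        for key, value in record.items()
        if key.startswith("item_") and value.strip()
    }


items = extract_items(sample)

# Whitespace token counts per section, a crude proxy for corpus size.
token_counts = {name: len(text.split()) for name, text in items.items()}
print(token_counts)
```

Filtering on a key prefix keeps the sketch independent of how many items a given report contains; a real loader would likely also validate the metadata fields.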

Semantic Scholar 47 Citations
Paper Information Summary
Entity Passport
Registry ID 00b2de0153e32de9b86da772ad5cdc2b0cac1002
License ArXiv
Provider semantic_scholar

Cite this paper

Academic & Research Attribution

BibTeX
@misc{00b2de0153e32de9b86da772ad5cdc2b0cac1002,
  author = {Unknown},
  title = {EDGAR-CORPUS: Billions of Tokens Make The World Go Round},
  year = {2026},
  howpublished = {\url{https://api.semanticscholar.org/00b2de0153e32de9b86da772ad5cdc2b0cac1002}},
  note = {Accessed via Free2AITools.}
}
APA Style
Unknown. (2026). EDGAR-CORPUS: Billions of Tokens Make The World Go Round [Paper]. Free2AITools. https://api.semanticscholar.org/00b2de0153e32de9b86da772ad5cdc2b0cac1002

Index Insight

FNI V2.0 for EDGAR-CORPUS: Billions of Tokens Make The World Go Round: Authority (A:84), Popularity (P:60), Recency (R:100), Quality (Q:65). Semantic (S) is a query-time baseline scored live at search.

Data Sources / Provenance

Open data Updated: Live data

Cite Node

@article{Unknown2026EDGAR-CORPUS:,
  title={EDGAR-CORPUS: Billions of Tokens Make The World Go Round},
  author={},
  note={Indexed by Free2AITools},
  year={2026}
}

Full Paper

Free2AITools indexes the abstract and factual metadata for this paper. Read the complete, authoritative paper on the official source.

Read the full paper on arXiv

Research Signals

Citations: 47 (Semantic Scholar)
Authority: 84 (FNI pillar)
Recency: 100 (FNI pillar)
Quality: 65 (FNI pillar)
Field: automation workflow

🏷️ Research Topics

embeddings
Data Source: semantic_scholar
Updated daily

Source summary: Based on semantic_scholar metadata. Not a recommendation.

Paper Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

Identity & Source

source: semantic_scholar
author: Unknown
license: ArXiv
tags: paper, research, academic

Technical Specs

architecture: null
params billions: null
context length: null
pipeline tag:

Engagement & Metrics

downloads: 0
stars: null
forks: null
citations: 47

Data indexed from public sources. Updated daily.