MS MARCO MiniLM-L12 Hindsight Reranker

| Entity Passport | |
|---|---|
| Registry ID | hf-model--jasperhg90--ms-marco-minilm-l12-hindsight-reranker |
| License | MIT |
| Provider | huggingface |
Cite this model

Academic & Research Attribution

```bibtex
@misc{hf_model__jasperhg90__ms_marco_minilm_l12_hindsight_reranker,
  author       = {JasperHG90},
  title        = {Ms Marco Minilm L12 Hindsight Reranker Model},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/jasperhg90/ms-marco-minilm-l12-hindsight-reranker}},
  note         = {Accessed via Free2AITools Knowledge Fortress}
}
```
Quick Commands

```shell
huggingface-cli download jasperhg90/ms-marco-minilm-l12-hindsight-reranker
pip install -U transformers
```
Hindsight Memory Reranker
A fine-tuned cross-encoder reranking model optimized for ranking documents in Hindsight-formatted agent memory systems.
Model Description
This model is a fine-tuned version of cross-encoder/ms-marco-MiniLM-L12-v2, specifically trained to rerank memory documents formatted according to the Hindsight memory architecture.
The model is trained with Quantization-Aware Training (QAT) and exported to ONNX format with INT8 dynamic activation and INT4 weight quantization for efficient inference.
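To make the INT8 side of this concrete, here is a minimal numpy sketch of symmetric per-tensor INT8 quantization and dequantization. This is an illustration of the general technique only, not the torchao QAT pipeline the model actually uses:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    # Symmetric per-tensor quantization: map [-max|x|, +max|x|] onto [-127, 127].
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover a float approximation of the original tensor.
    return q.astype(np.float32) * scale

weights = np.array([0.5, -1.0, 0.25, 0.0], dtype=np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
```

The quantization error here is bounded by half a quantization step (`scale / 2`); QAT trains the model with this rounding simulated in the forward pass so accuracy survives the export.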
Key Features
- Date-Range Aware: distinguishes between ongoing facts (`[End: ongoing]`), completed events (`[End: <date>]`), and timeless facts (no date prefix)
- Hindsight-Optimized: trained on documents with temporal anchors, fact types, and contextual metadata
- Quantized: INT8/INT4 quantization for efficient CPU deployment
- ONNX Export: ready for production deployment without a PyTorch dependency
Usage
With ONNX Runtime
```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("tokenizer.json")
tokenizer.enable_truncation(max_length=512)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]")

session = ort.InferenceSession("model.onnx")

def score(query: str, documents: list[str]) -> list[float]:
    pairs = [(query, doc) for doc in documents]
    encodings = tokenizer.encode_batch(pairs)
    input_ids = np.array([e.ids for e in encodings], dtype=np.int64)
    attention_mask = np.array([e.attention_mask for e in encodings], dtype=np.int64)
    token_type_ids = np.array([e.type_ids for e in encodings], dtype=np.int64)
    logits = session.run(None, {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "token_type_ids": token_type_ids,
    })[0]
    return logits.flatten().tolist()
```
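A cross-encoder scorer like this is typically wrapped in a reranking step that sorts candidates by score. The sketch below shows that wiring; the word-overlap stub scorer is purely illustrative and stands in for the ONNX `score` function so the example is self-contained:

```python
def rerank(query: str, documents: list[str], scorer) -> list[tuple[str, float]]:
    # Score every (query, document) pair, then sort by descending relevance.
    scores = scorer(query, documents)
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)

def overlap_scorer(query: str, documents: list[str]) -> list[float]:
    # Illustrative stub: counts shared whitespace-delimited tokens.
    q_tokens = set(query.split())
    return [float(len(q_tokens & set(doc.split()))) for doc in documents]

ranked = rerank(
    "What framework does the Memex API use?",
    ["[World] The Memex application uses the FastAPI framework.",
     "[Event] A Rust migration spike was abandoned."],
    overlap_scorer,
)
```

In production you would pass the real `score` function in place of `overlap_scorer`.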
Document Formatting
Documents must be formatted with date ranges before ranking:
```
[Start: January 01, 2024 (2024-01-01)] [End: ongoing] [World] Ruby Martinez is the Department Head of the Engineering Department at TechCo Global.
```
Format cases:
| Scenario | Format |
|---|---|
| Ongoing fact (started, still true) | [Start: Month DD, YYYY (YYYY-MM-DD)] [End: ongoing] [Type] Text |
| Completed fact (started and ended) | [Start: Month DD, YYYY (YYYY-MM-DD)] [End: Month DD, YYYY (YYYY-MM-DD)] [Type] Text |
| Timeless fact (no temporal anchor) | [Type] Text |
| End-only (rare) | [End: Month DD, YYYY (YYYY-MM-DD)] [Type] Text |
Where:
- Type: `World` (enduring fact), `Event` (bounded occurrence), or `Observation` (synthesized insight)
- Context (optional): category prefix before the text (e.g., `Architecture Decision: ...`)
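The four format cases above can be produced by one small helper. This is a sketch based on the table, not code shipped with the model; the function name and signature are hypothetical:

```python
from datetime import date

def format_memory(text: str, fact_type: str, start=None, end=None, context: str = "") -> str:
    # Build the "[Start: ...] [End: ...] [Type] Text" prefix from optional date anchors.
    def anchor(label: str, d: date) -> str:
        return f"[{label}: {d.strftime('%B %d, %Y')} ({d.isoformat()})]"

    parts = []
    if start:
        parts.append(anchor("Start", start))
    if end == "ongoing":
        parts.append("[End: ongoing]")
    elif end:
        parts.append(anchor("End", end))
    parts.append(f"[{fact_type.capitalize()}]")
    parts.append(f"{context}: {text}" if context else text)
    return " ".join(parts)
```

Timeless facts simply pass no dates, which yields the bare `[Type] Text` form.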
Training Details
Training Data
The training split contains 315 hand-crafted triplets covering diverse domains: software engineering, data engineering, DevOps, personal life, health, finance, hobbies, work culture, and consumer products. All names and organizations are fictional.
Each triplet contains:
- query: natural language question
- positive: the memory unit that correctly answers the query
- negative: a plausible but incorrect memory unit (a hard negative: topically related but wrong)

Both positive and negative include occurred_start and occurred_end fields to train date-range awareness.
| Split | Samples |
|---|---|
| Train | 315 |
| Eval | 70 |
| Test | 60 |
Example:
```json
{
  "query": "What framework does the Memex API use?",
  "positive": {
    "text": "The Memex application remains a Python-based system utilizing the FastAPI framework. | When: As of March 20, 2026 | Involving: The development team | Following the abandonment of the Rust migration spike...",
    "occurred_start": "2026-03-20T00:00:00",
    "occurred_end": null,
    "type": "world",
    "context": ""
  },
  "negative": {
    "text": "A Rust migration of the Memex server was initiated as a proof-of-concept spike but was ultimately abandoned and never completed. | When: Before March 20, 2026 | Involving: The development team...",
    "occurred_start": "2026-02-01T00:00:00",
    "occurred_end": "2026-03-20T00:00:00",
    "type": "event",
    "context": ""
  }
}
```
Training Configuration
| Parameter | Value |
|---|---|
| Base Model | cross-encoder/ms-marco-MiniLM-L12-v2 |
| Epochs | 10 |
| Batch Size | 8 |
| Learning Rate | 6e-6 |
| Loss Function | Margin Ranking Loss (margin=1.0) |
| Max Sequence Length | 512 |
| Quantization | INT8 dynamic activation, INT4 weights (QAT via torchao) |
Loss Function
Uses Margin Ranking Loss to ensure positive documents receive higher relevance scores than negative documents:
```
loss = max(0, margin - score_positive + score_negative)
```
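As a sketch, the per-pair loss above is just a hinge on the score gap; the loss goes to zero once the positive outscores the negative by at least the margin:

```python
def margin_ranking_loss(score_positive: float, score_negative: float, margin: float = 1.0) -> float:
    # Penalize pairs where the positive does not beat the negative by `margin`.
    return max(0.0, margin - score_positive + score_negative)
```

With margin=1.0, a well-separated pair (e.g., scores 2.0 vs. 0.5) contributes zero loss, while a near-tie still produces gradient pressure.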
Evaluation Results
v2 (date-range format, 2026-03-22)
| Metric | Baseline (Eval) | Trained (Eval) | Baseline (Test) | Trained (Test) |
|---|---|---|---|---|
| Accuracy@1 | 0.8857 | 0.9571 | 0.8667 | 0.9333 |
| Accuracy@3 | 1.0000 | 1.0000 | 0.9833 | 0.9833 |
| Accuracy@5 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Accuracy@10 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| MRR@10 | 0.9429 | 0.9786 | 0.9208 | 0.9597 |
Note: The v2 baseline scores are higher than v1 because the v1 dataset contained low-quality examples (single-word triplets, trivially short texts) that confused even a strong baseline. The v2 data was fully rewritten with natural memory unit text. The relative improvement (baseline to trained) is the meaningful signal.
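For reference, the MRR@10 figures in these tables can be reproduced with a few lines. This sketch assumes each query is represented by a list of binary relevance flags in ranked order (an assumption about the evaluation setup, not code from the repository):

```python
def mrr_at_k(relevance_lists: list[list[int]], k: int = 10) -> float:
    # Mean reciprocal rank: average of 1/rank of the first relevant hit per query.
    total = 0.0
    for rels in relevance_lists:
        for rank, rel in enumerate(rels[:k], start=1):
            if rel:
                total += 1.0 / rank
                break  # only the first relevant document counts
    return total / len(relevance_lists)
```

Accuracy@k is the same loop with `1.0` added whenever any relevant document appears in the top k.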
v1 (single-date format, previous)
| Metric | Baseline (Eval) | Trained (Eval) | Baseline (Test) | Trained (Test) |
|---|---|---|---|---|
| Accuracy@1 | 0.6286 | 0.6714 | 0.5167 | 0.7167 |
| Accuracy@3 | 0.8143 | 0.8714 | 0.8000 | 0.8167 |
| Accuracy@5 | 0.8571 | 0.9000 | 0.8500 | 0.8833 |
| MRR@10 | 0.7290 | 0.7728 | 0.6681 | 0.7788 |
Model Architecture
- Type: Cross-Encoder for sequence classification
- Base: MiniLM-L12 (12 transformer layers)
- Output: Single relevance score (logits)
- Quantization: INT8 dynamic activation + INT4 weight (via torchao)
Changelog
v2 (2026-03-22)
- Breaking: input format changed from [Date: ...] to [Start: ...] [End: ...] date ranges
- Added occurred_end support for distinguishing ongoing vs. completed facts
- Timeless facts (no dates) now carry no date prefix instead of a forced date
- Training data fully rewritten: 315 hand-crafted triplets across diverse domains
- Removed low-quality single-word examples from the v1 dataset
v1 (initial)
- Single [Date: ...] timestamp format
- 300 training triplets (mixed quality)
Limitations
- Training data is synthetic; the model may not generalize to all real-world retrieval patterns
- Optimized for Memex memory unit format; may not perform well on arbitrary document formats
- English language only
- Small eval/test corpus (140/120 docs); absolute accuracy numbers may not reflect production performance at larger scale
License
MIT License
Model Transparency Report
Technical metadata sourced from upstream repositories.

Identity & Source
- id: hf-model--jasperhg90--ms-marco-minilm-l12-hindsight-reranker
- slug: jasperhg90--ms-marco-minilm-l12-hindsight-reranker
- source: huggingface
- author: JasperHG90
- license: MIT
- pipeline tag: text-classification
- tags: transformers, onnx, bert, text-classification, cross-encoder, reranker, hindsight, agent-memory, quantization, en, arxiv:2512.12818, base_model:cross-encoder/ms-marco-minilm-l12-v2, license:mit, text-embeddings-inference, endpoints_compatible, region:us

Engagement & Metrics
- downloads: 387
- stars: 0
- forks: 0