About
HANSEL is a companion to GRETIL, focused on the open exchange of new and newly modified Sanskrit e-text material, especially diplomatic transcriptions of printed editions produced using OCR. It is a key part of the Kalpataru Grove ecosystem of digital Sanskrit projects.
See the Progress Page for the latest changes and next steps.
FAQ
- No, HANSEL complements GRETIL by focusing on new or newly modified material. HANSEL has particular requirements of structure and quality that make a mass import of GRETIL impossible. That said, anyone wishing to improve upon a given GRETIL item can do so with HANSEL. In the meantime, GRETIL continues to exist statically online through its Uni Göttingen site, any given mirror thereof, and its official archive on TextGrid (see post on Indology).
- Like GRETIL, HANSEL is for Sanskrit literature as a whole, and contributing is as easy as sending an email. Like SARIT, HANSEL has an explicit data model, versions everything with Git and GitHub, and requires thorough metadata. Like Muktabodha, HANSEL makes its corpus readily available in plain text for day-to-day use and facilitates search by metadata. Like DCS, HANSEL anticipates computational applications.
- The getting-started videos should answer most questions. Let us know if not!
- If it's your first time getting in touch, send us a message using the contact form. Whatever state your project is in, the library curator will help you from there. You are encouraged to reach out before hand-typing or running OCR, as HANSEL is specially designed to accept certain kinds of OCR data, and we can help you work more efficiently.
- HANSEL’s sustainability rests on several strengths: its creator is a technically skilled Sanskritist with the time, resources, and commitment to maintain it; its codebase is open-source, well-documented, and built on standard, low-cost technologies; and its independence from any single institution ensures long-term flexibility and resilience.
Backronym and Logo
HANSEL is the "Human-Accessible and NLP-ready Sanskrit E-text Library".
HANSEL's text data will always exist in forms that the greatest number of students and scholars can use directly, such as plain text and HTML.
The data will also be structured in anticipation of computational applications. NLP stands for "natural language processing"; examples include n-gram analysis, scansion and meter identification, sandhi and compound splitting, TF-IDF, LDA topic modeling, word embeddings, and machine translation.
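As a rough illustration of the simplest item in that list, the sketch below counts word-level n-grams in a plain-text string. The function name `word_ngrams` and the sample line are illustrative assumptions, not part of HANSEL's code or corpus, and splitting on whitespace is a simplification that ignores sandhi and compounds.

```python
from collections import Counter


def word_ngrams(text: str, n: int = 2) -> Counter:
    """Count word-level n-grams in whitespace-tokenized text.

    Hypothetical helper for illustration; not a HANSEL function.
    """
    tokens = text.split()
    # Slide a window of width n across the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Illustrative input only (Bhagavadgītā 1.1ab in IAST), not drawn from HANSEL.
sample = "dharmakṣetre kurukṣetre samavetā yuyutsavaḥ"
bigrams = word_ngrams(sample, n=2)

for gram, count in bigrams.most_common():
    print(" ".join(gram), count)
```

On a real corpus file, the same function could be applied to the file's contents read as UTF-8 text, ideally after a sandhi-splitting step.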
In Sanskrit literature, the swan or bar-headed goose (haṃsa) symbolizes both grace and discernment. A world away, in the Grimms' tale, Hänsel and Gretel succeed by working together.
Technical
Collection size: 5 items (1.6 MB*)
Web App: v0.9.1 (GitHub)
Text Data: v2026-02-06 (GitHub)
Transform Bundle: v0.10.1 (GitHub)
Stack: Python (Flask), JavaScript, Docker, DigitalOcean
License: CC BY-NC-SA 4.0. Please share and share alike!