Uplug corpus tools (original) (raw)
Various tools for creating annotated parallel corpora including pre-trained tagging and parsing models for various languages, sentence alignment tools and word alignment tools.
Uplug also includes a web-based interface for interactive sentence and word alignment and scripts for indexing and querying parallel corpora using the Corpus Work Bench CWB.
Download 'uplug-main' first and then add other packages.
Project Samples
License
GNU General Public License version 3.0 (GPLv3)
Never Get Blocked Again | Enterprise Web Scraping
Enterprise-Grade Proxies • Built-in IP Rotation • 195 Countries • 20K+ Companies Trust Us
Get unrestricted access to public web data with our ethically-sourced proxy network. Automated session management and advanced unblocking handle the hard parts. Scale from 1 to 1M requests with zero blocks. Built for developers with ready-to-use APIs, serverless functions, and complete documentation. Used by 20,000+ companies including Fortune 500s. SOC2 and GDPR compliant.