Bangla NLP Toolkit
BanglaNLPToolkit is a Python package that provides classic text preprocessing and augmentation utilities for Bangla NLP tasks.
Keywords: NLP, Deep Learning, PyPI
Brief Description
BanglaNLPToolkit bundles classic text preprocessing and augmentation utilities for Bangla NLP tasks: normalization, punctuation restoration, text augmentation, and tokenization.
Key features:
- Bangla Text Normalization
  - Unicode normalization of Bangla text for preprocessing, using bnunicodenormalizer and csebuetnlp/normalizer.
  - Removal of punctuation, or replacement of punctuation with a character chosen by the user.
- Bangla Punctuation
  - Adds punctuation to Bangla text that has none, using deep-learning-based Named Entity Recognition models.
- Bangla Text Augmentation
  - Generates similar but distinct texts for augmenting Bangla datasets.
  - Uses paraphrasing, cross translation, and masked word prediction to generate the augmented text.
- Simple Bangla Tokenizer
  - A simple, robust word-level and sentence-level tokenizer for Bangla text.
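The normalization, punctuation handling, and tokenization features listed above can be illustrated with a minimal standard-library sketch. This is not the BanglaNLPToolkit API: the function names (`normalize_unicode`, `replace_punctuation`, `sentence_tokenize`, `word_tokenize`) and the `PUNCTUATION` set are hypothetical, chosen only for illustration, and plain Unicode NFC normalization stands in for the more thorough rules that bnunicodenormalizer and csebuetnlp/normalizer apply.

```python
import re
import unicodedata
from typing import List

# Assumed punctuation set for this sketch: the Bangla danda and double danda
# plus common ASCII marks. The toolkit's actual set may differ.
PUNCTUATION = "।॥,;:!?.\"'()-"
_PUNCT_RE = re.compile("[" + re.escape(PUNCTUATION) + "]")


def normalize_unicode(text: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def replace_punctuation(text: str, replacement: str = "") -> str:
    """Remove punctuation (default) or replace each mark with `replacement`."""
    return _PUNCT_RE.sub(replacement, text)


def sentence_tokenize(text: str) -> List[str]:
    """Split after a danda, question mark, or exclamation mark followed by whitespace."""
    parts = re.split(r"(?<=[।?!])\s+", text.strip())
    return [part for part in parts if part]


def word_tokenize(text: str) -> List[str]:
    """Pad each punctuation mark with spaces, then split on whitespace."""
    spaced = _PUNCT_RE.sub(lambda m: " " + m.group(0) + " ", text)
    return spaced.split()


if __name__ == "__main__":
    sample = "আমি ভাত খাই।   তুমি কি খাও?"
    clean = normalize_unicode(sample)
    print(sentence_tokenize(clean))
    print(word_tokenize(clean))
    print(replace_punctuation(clean))
```

The regular-expression approach keeps the sketch dependency-free; the punctuation restoration and augmentation features rely on trained models and are not reproduced here.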
Project Link: BanglaNLPToolkit