Add Wikipedia Persian Dataset by pourmand1376 · Pull Request #3629 · LAION-AI/Open-Assistant
Currently, the Open-Assistant model doesn't support Farsi. This is a text-only dataset for learning Farsi (Persian).
One of my friends fine-tuned LLaMA on this dataset, and the resulting model could understand Farsi grammar and word usage very well. If the Open-Assistant team wants to add support for Farsi, this should be the first step.
I have converted the dataset to the standard format mentioned here and uploaded it to my Hugging Face account.
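For reviewers, here is a rough sketch of the kind of conversion involved. It is not the exact script used for this PR. The column names `TEXT`, `SOURCE`, and `METADATA`, the `wikipedia-fa` source label, and the helper name `to_text_only_row` are all assumptions made for illustration; the linked standard is the authority on the real schema.

```python
import json


def to_text_only_row(title: str, body: str, source: str = "wikipedia-fa") -> dict:
    """Map one article to a flat text-only row.

    The column names and the default source label are assumed for
    illustration; they are not confirmed by this PR.
    """
    clean_title = title.strip()
    # Join title and body into a single text field, separated by a blank line.
    text = f"{clean_title}\n\n{body.strip()}"
    # Store metadata as a JSON string; ensure_ascii=False keeps Persian readable.
    metadata = json.dumps({"language": "fa", "title": clean_title}, ensure_ascii=False)
    return {"TEXT": text, "SOURCE": source, "METADATA": metadata}


# Hypothetical sample article, used only to show the shape of a row.
row = to_text_only_row(" تهران ", "تهران پایتخت ایران است.")
print(row["TEXT"])
```

Rows in this shape could then be collected and written to whatever file format the standard asks for before uploading.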