esbatmop/MNBVC
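MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus that aims to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, including "Martian script" (火星文) text. It collects plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts and dialogue lines, forum posts, wikis, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.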
Stars: 4.2k · Forks: 288 · Weekly growth: +3
GitHub
chinese chinese-language chinese-nlp chinese-simplified corpus-data nlp nlp-machine-learning
Star & Fork Trend (17 data points; series: Stars, Forks)
Multi-Source Signals
Growth Velocity
esbatmop/MNBVC gained 3 stars this period. 7-day velocity: 0.1%.
| Metric | MNBVC | lmql | oasis | Awesome-ChatGPT |
|---|---|---|---|---|
| Stars | 4.2k | 4.2k | 4.2k | 4.2k |
| Forks | 288 | 219 | 450 | 385 |
| Weekly Growth | +3 | -1 | +14 | +1 |
| Language | N/A | Python | Python | N/A |
| Sources | 1 | 1 | 1 | 1 |
| License | MIT | Apache-2.0 | Apache-2.0 | N/A |
Capability Radar: MNBVC vs lmql
Maintenance Activity 100
Last code push 2 days ago.
Community Engagement 35
Fork-to-star ratio: 6.9%. Lower fork ratio may indicate passive usage.
Issue Burden 70
Issue data not yet available.
Growth Momentum 44
+3 stars this period — 0.07% growth rate.
License Clarity 95
Licensed under MIT. Permissive — safe for commercial use.
Risk scores are computed from real-time repository data. Higher scores indicate healthier metrics.
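The two percentages quoted above (fork-to-star ratio and growth rate) can be reproduced from the displayed counts. The sketch below is only an illustration: `parse_count` and `percent` are hypothetical helper names, the page does not state how its scores are actually computed, and "4.2k" is a rounded display value, so the exact star count is an assumption.

```python
def parse_count(text: str) -> float:
    """Convert a display count such as '4.2k' or '288' to a number."""
    text = text.strip().lower()
    multipliers = {"k": 1_000, "m": 1_000_000}
    if text and text[-1] in multipliers:
        return float(text[:-1]) * multipliers[text[-1]]
    return float(text)


def percent(part: float, whole: float, digits: int) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Values as displayed on the page; 4.2k is assumed to mean about 4200.
stars = parse_count("4.2k")
forks = parse_count("288")
weekly_gain = 3

fork_to_star = percent(forks, stars, 1)       # 288 / 4200, about 6.9%
growth_rate = percent(weekly_gain, stars, 2)  # 3 / 4200, about 0.07%

print(f"Fork-to-star ratio: {fork_to_star}%")
print(f"Growth rate: {growth_rate}%")
```

Rounded to one decimal place, the same growth rate reads 0.1%, which matches the 7-day velocity figure shown under Growth Velocity.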