Usage Examples of the Python Language Detection Modules langid and langdetect

2021-05-30 00:39 Together_CZ Python


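Until now I have mostly used chardet, a module that detects the character encoding of data. The two modules covered today detect the language of a text instead, for example whether it is Chinese or English. Both are simple to use. I only tried them briefly because a project needed this feature, and I have not studied either module in depth. The repository link for each module is given in the docstring of the corresponding function, in case you want to look further: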


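    import langid
    from langid.langid import LanguageIdentifier, model
    from langdetect import detect, detect_langs


    def langidFunc():
        '''
        https://github.com/yishuihanhan/langid.py
        '''
        # classify() returns a (language code, score) tuple; by default the score is not normalized
        print(langid.classify("We Are Family"))
        print(langid.classify("Questa e una prova"))
        print(langid.classify("我们都有一个家"))
        # With norm_probs=True the score is normalized to the 0-1 range
        identifier = LanguageIdentifier.from_modelstring(model, norm_probs=True)
        print(identifier.classify("We Are Family"))


    def langdetectFunc():
        '''
        https://github.com/yishuihanhan/langdetect
        '''
        s1 = "本篇文章主要介绍两款语言探测工具，用于区分文本到底是什么语言，"
        s2 = 'We are pleased to introduce today a new technology'
        print(detect(s1))
        print(detect(s2))
        print(detect_langs(s2))  # detect_langs() returns every detected language with its probability
        print(detect_langs("Otec matka syn."))


    if __name__ == '__main__':
        langidFunc()
        langdetectFunc()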

The output is as follows (the first four lines come from langidFunc, the last four from langdetectFunc):

('en', 9.061840057373047)
('it', -35.41771221160889)
('zh', -85.79573845863342)
('en', 0.16946150595865334)
zh-cn
en
[en:0.999998109575]
[pl:0.571426592237, fi:0.428568772028]
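Two things in this output are worth a closer look. The first three langid scores are unnormalized, while the fourth lies between 0 and 1. The last langdetect line splits its probability between two candidates for a very short input. The sketch below illustrates both points under stated assumptions: it assumes that normalizing langid's scores is equivalent to a softmax over per-language log-scores, and the names `normalize_scores`, `pick_language`, `detect_confident` and `min_prob` are hypothetical helpers, not part of either library. The only library features used are `detect_langs()` and the `lang` and `prob` attributes of its results.

```python
import math


def normalize_scores(log_scores):
    """Turn per-language log-scores into probabilities that sum to 1 (a softmax).

    Assumption: this mirrors what a normalized score means; it is not langid's own code.
    """
    top = max(log_scores.values())
    # Subtract the maximum before exponentiating to avoid overflow
    weights = {lang: math.exp(score - top) for lang, score in log_scores.items()}
    total = sum(weights.values())
    return {lang: weight / total for lang, weight in weights.items()}


def pick_language(candidates, min_prob=0.9):
    """Return the most probable language code, or None if it is below min_prob.

    candidates: iterable of (language_code, probability) pairs.
    """
    best = max(candidates, key=lambda pair: pair[1], default=None)
    if best is None or best[1] < min_prob:
        return None
    return best[0]


def detect_confident(text, min_prob=0.9):
    """Run langdetect on text and keep the answer only when it is confident."""
    # Imported lazily so the helpers above work without langdetect installed
    from langdetect import detect_langs
    return pick_language([(r.lang, r.prob) for r in detect_langs(text)], min_prob)


# Hypothetical log-scores, for illustration only (not produced by langid)
probs = normalize_scores({"en": 9.0, "it": -35.0, "zh": -85.0})
print(max(probs, key=probs.get))  # en

# Probabilities copied from the langdetect output above
print(pick_language([("en", 0.999998109575)]))                          # en
print(pick_language([("pl", 0.571426592237), ("fi", 0.428568772028)]))  # None
```

The 0.9 threshold is an arbitrary choice for this sketch. A lower value accepts more guesses on short texts at the cost of more mistakes.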


Original link: https://blog.csdn.net/Together_CZ/article/details/86678423