Reading Files in PHP: Fixing Garbled Chinese Text by Converting to UTF-8

2021-09-28 11:10, luyaran, PHP Tutorials

This article shows how to read files in PHP without garbling Chinese text by converting the content to UTF-8. It compares several file-reading and encoding-conversion approaches through examples, as follows:

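    // First attempt: request UTF-8 through a stream context option.
    $opts = array(
      'file' => array(
        'encoding' => "utf-8"
      )
    );
    // Note: this second assignment overwrites the 'file' options above.
    $opts = array('http' => array('encoding' => 'utf-8'));
    $ctxt = stream_context_create($opts);
    $content = file_get_contents($filePath, FILE_TEXT, $ctxt);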
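
The snippet above tries to let the stream layer take care of the encoding. One stream-level mechanism that PHP does document is the `convert.iconv.*` filter, which can be applied through a `php://filter` URL. The following is only a minimal sketch: it assumes the iconv extension is enabled and that the file really is GBK, and the temporary file and its contents are made up for illustration.

```php
<?php
// Hypothetical sample file: the GBK bytes for the two characters "中文".
$path = tempnam(sys_get_temp_dir(), 'gbk');
file_put_contents($path, "\xD6\xD0\xCE\xC4");

// convert.iconv.<from>.<to> converts the bytes while they are being read.
$content = file_get_contents(
    'php://filter/read=convert.iconv.GBK.UTF-8/resource=' . $path
);
unlink($path);

// Compare against the UTF-8 bytes for the same two characters.
var_dump($content === "\xE4\xB8\xAD\xE6\x96\x87");
```

Like a plain `iconv()` call, this only helps when the source encoding is known in advance.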

The simplest approach is to convert from GB2312 to UTF-8:

    $str = iconv("gb2312", "utf-8", $str);
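
A fixed conversion like this is only correct when the input really is GB2312/GBK, which is probably why it is not a general solution. As a small sketch of the happy path, assuming the iconv extension is available, the bytes below are written as hex escapes so the result does not depend on how the script itself is saved:

```php
<?php
// GBK (and GB2312) bytes for the two characters "中文".
$gbk = "\xD6\xD0\xCE\xC4";

$utf8 = iconv('GBK', 'UTF-8', $gbk);

if ($utf8 === false) {
    // iconv() returns false when the input is not valid in the source encoding.
    echo "conversion failed\n";
} else {
    echo bin2hex($utf8), "\n"; // prints e4b8ade69687
}
```

Checking the return value matters, because input that is not valid in the assumed source encoding makes `iconv()` return `false`.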

This does not work either:

    $content = mb_convert_encoding($content, "UTF-8", "auto");
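
A likely reason `"auto"` fails is that mbstring expands it to a detection list that depends on the `mbstring.language` setting, and with the default setting that list does not appear to include any Chinese encodings. Naming the source encoding explicitly avoids the guesswork. This is a sketch that assumes the mbstring extension is loaded and that the input is GBK; `CP936` is mbstring's name for the Windows GBK code page.

```php
<?php
// GBK bytes for "中文", written as hex escapes.
$gbk = "\xD6\xD0\xCE\xC4";

// State the source encoding instead of relying on "auto".
$utf8 = mb_convert_encoding($gbk, 'UTF-8', 'CP936');

echo bin2hex($utf8), "\n"; // prints e4b8ade69687
```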

****** An ugly dividing line to tell you that the approaches above fall short; what follows is the correct method... haha ******

    define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
    define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
    define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
    define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
    define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));

    $text = file_get_contents($newPath);
    $first2 = substr($text, 0, 2);
    $first3 = substr($text, 0, 3);
    $first4 = substr($text, 0, 4); // the UTF-32 BOMs are 4 bytes long
    $encodType = "";
    // The UTF-32 BOMs are checked before the UTF-16 ones because the
    // UTF-32LE BOM begins with the same two bytes as the UTF-16LE BOM.
    if ($first3 === UTF8_BOM)
      $encodType = 'UTF-8 BOM';
    else if ($first4 === UTF32_BIG_ENDIAN_BOM)
      $encodType = 'UTF-32BE';
    else if ($first4 === UTF32_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-32LE';
    else if ($first2 === UTF16_BIG_ENDIAN_BOM)
      $encodType = 'UTF-16BE';
    else if ($first2 === UTF16_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-16LE';
    // Caveat: 'UTF-8 BOM' is not a valid iconv encoding name, and '' (no BOM)
    // leaves the source encoding unspecified; the final version handles both.
    $content = iconv($encodType, "utf-8", $text);
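
The BOM check can also be written as a table-driven helper that hands back the remaining bytes without the BOM. This is a sketch rather than the article's code: `splitBom` is a hypothetical name, and stripping the BOM is an added step, on the assumption that `iconv()` with an explicit-endian name such as `UTF-16LE` would otherwise typically carry the BOM over into the output as a U+FEFF character.

```php
<?php
// Hypothetical helper: returns [encoding name, bytes after the BOM].
// The encoding name is '' when the input does not start with a known BOM.
function splitBom(string $bytes): array
{
    // 4-byte BOMs come first: the UTF-32LE BOM starts with the UTF-16LE BOM.
    $boms = [
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    ];
    foreach ($boms as $bom => $name) {
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            // The cast guards against substr() returning false on old PHP versions.
            return [$name, (string) substr($bytes, strlen($bom))];
        }
    }
    return ['', $bytes];
}

// Usage: UTF-16LE BOM followed by the letter "A".
[$encoding, $rest] = splitBom("\xFF\xFEA\x00");
echo $encoding, ' ', iconv($encoding, 'UTF-8', $rest), "\n"; // prints UTF-16LE A
```

One ambiguity remains: a UTF-16LE file whose first character is U+0000 starts with the same four bytes as the UTF-32LE BOM. That case is rare in ordinary text files.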

The final version:

    $text = file_get_contents($filePath);
    //$encodType = mb_detect_encoding($text);
    define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
    define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
    define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
    define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
    define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));

    $first2 = substr($text, 0, 2);
    $first3 = substr($text, 0, 3);
    $first4 = substr($text, 0, 4); // the UTF-32 BOMs are 4 bytes long
    $encodType = "";
    if ($first3 === UTF8_BOM)
      $encodType = 'UTF-8 BOM';
    else if ($first4 === UTF32_BIG_ENDIAN_BOM)
      $encodType = 'UTF-32BE';
    else if ($first4 === UTF32_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-32LE';
    else if ($first2 === UTF16_BIG_ENDIAN_BOM)
      $encodType = 'UTF-16BE';
    else if ($first2 === UTF16_LITTLE_ENDIAN_BOM)
      $encodType = 'UTF-16LE';

    // The checks below mainly deal with ANSI-encoded files.
    if ($encodType === '') { // no BOM: e.g. a .txt file created with the default ANSI encoding
      $content = iconv("GBK", "UTF-8", $text);
    } else if ($encodType === 'UTF-8 BOM') { // already UTF-8, no conversion needed
      $content = $text;
    } else { // every other format is simply converted to UTF-8
      $content = iconv($encodType, "UTF-8", $text);
    }
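
One case the BOM check cannot see is a UTF-8 file saved without a BOM: it falls into the no-BOM branch and would be converted as if it were GBK. A possible refinement, sketched here under the assumption that the mbstring extension is loaded, is to test whether the bytes are already valid UTF-8 before falling back. `decodeBomless` and its `$fallback` parameter are hypothetical names, not part of the original code.

```php
<?php
// Hypothetical helper for text that has no BOM.
function decodeBomless(string $text, string $fallback = 'GBK'): string
{
    // Valid UTF-8 (which includes plain ASCII) is returned unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise assume the fallback encoding, as the article does with GBK.
    $converted = iconv($fallback, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $fallback to UTF-8");
    }
    return $converted;
}

echo bin2hex(decodeBomless("\xD6\xD0\xCE\xC4")), "\n";         // GBK input, prints e4b8ade69687
echo bin2hex(decodeBomless("\xE4\xB8\xAD\xE6\x96\x87")), "\n"; // UTF-8 input, prints e4b8ade69687
```

This is a heuristic: a short GBK byte sequence can occasionally be valid UTF-8 by coincidence, so the fallback encoding should still match where the files actually come from.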