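The Android game we picked this time is called "单词英雄" (Word Hero); you may want to download it first.
The game's interface looks like this:
You attack by choosing the meaning of a word: a correct choice lands a normal attack, and a wrong one lands only a token attack. After playing for a while I figured this could be automated: grab the word and the options from the screen with PIL, run OCR on them, translate the English into Chinese, and pick the option by fuzzy-matching the Chinese text.
After digging through a lot of material I got started. The program uses the following: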
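PIL: Python Imaging Library, the well-known image-processing module
pytesser: a Python module that drives tesseract-ocr to perform recognition
Tesseract-OCR: an OCR engine that turns images into text; it can recognize English, Chinese, and other languages
autopy: a Python module for simulating mouse and keyboard input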
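Installation steps (Windows 7):
(1) Install PIL. Download it from http://www.pythonware.com/products/pil/ and install Python Imaging Library 1.1.7 for Python 2.7.
(2) Install pytesser. Download it from http://code.google.com/p/pytesser/, unzip it into C:\Python27\Lib\site-packages, and in that folder create a file named pytesser.pth whose content is C:\Python27\Lib\site-packages\pytesser_v0.0.1
(3) Install the Tesseract OCR engine. From https://github.com/tesseract-ocr/tesseract/wiki/Downloads, download the "Windows installer of tesseract-ocr 3.02.02 (including English language data)" and run it.
(4) Install the language pack. Download the Simplified Chinese pack chi_sim.traineddata from https://github.com/tesseract-ocr/tessdata and put it in the tessdata folder under the Tesseract OCR install directory so that Simplified Chinese can be recognized.
(5) Edit pytesser.py in C:\Python27\Lib\site-packages\pytesser_v0.0.1: add a language-selection parameter, language, to the original image_to_string function. Passing language='chi_sim' recognizes Chinese; the default is eng (English).
The modified pytesser.py: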
    """OCR in Python using the Tesseract engine from Google
    http://code.google.com/p/pytesser/
    by Michael J.T. O'Kelly
    V 0.0.1, 3/10/07"""

    import Image
    import subprocess

    import util
    import errors

    tesseract_exe_name = 'tesseract'  # Name of executable to be called at command line
    scratch_image_name = "temp.bmp"  # This file must be .bmp or other Tesseract-compatible format
    scratch_text_name_root = "temp"  # Leave out the .txt extension
    cleanup_scratch_flag = True  # Temporary files cleaned up after OCR operation

    def call_tesseract(input_filename, output_filename, language):
        """Calls external tesseract.exe on input file (restrictions on types),
        outputting output_filename+'txt'"""
        args = [tesseract_exe_name, input_filename, output_filename, "-l", language]
        proc = subprocess.Popen(args)
        retcode = proc.wait()
        if retcode != 0:
            errors.check_for_errors()

    def image_to_string(im, cleanup=cleanup_scratch_flag, language="eng"):
        """Converts im to file, applies tesseract, and fetches resulting text.
        If cleanup=True, delete scratch files after operation."""
        try:
            util.image_to_scratch(im, scratch_image_name)
            call_tesseract(scratch_image_name, scratch_text_name_root, language)
            text = util.retrieve_text(scratch_text_name_root)
        finally:
            if cleanup:
                util.perform_cleanup(scratch_image_name, scratch_text_name_root)
        return text

    def image_file_to_string(filename, cleanup=cleanup_scratch_flag, graceful_errors=True, language="eng"):
        """Applies tesseract to filename; or, if image is incompatible and graceful_errors=True,
        converts to compatible format and then applies tesseract. Fetches resulting text.
        If cleanup=True, delete scratch files after operation."""
        try:
            try:
                call_tesseract(filename, scratch_text_name_root, language)
                text = util.retrieve_text(scratch_text_name_root)
            except errors.Tesser_General_Exception:
                if graceful_errors:
                    im = Image.open(filename)
                    # Pass the language through so the fallback path honors it too
                    text = image_to_string(im, cleanup, language)
                else:
                    raise
        finally:
            if cleanup:
                util.perform_cleanup(scratch_image_name, scratch_text_name_root)
        return text

    if __name__ == '__main__':
        im = Image.open('phototest.tif')
        text = image_to_string(im)
        print text
        try:
            text = image_file_to_string('fnord.tif', graceful_errors=False)
        except errors.Tesser_General_Exception, value:
            print "fnord.tif is incompatible filetype. Try graceful_errors=True"
            print value
        text = image_file_to_string('fnord.tif', graceful_errors=True)
        print "fnord.tif contents:", text
        text = image_file_to_string('fonts_test.png', graceful_errors=True)
        print text
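To make the effect of the added language argument concrete, here is a minimal sketch that rebuilds the command line call_tesseract hands to subprocess.Popen. The helper name build_tesseract_args is made up for illustration and is not part of pytesser; it only mirrors the args list in the listing above.

```python
def build_tesseract_args(input_filename, output_root, language="eng"):
    """Return the argument list for the tesseract command-line tool.

    Hypothetical helper: it mirrors the args list built inside
    call_tesseract, so the command can be inspected without running OCR.
    """
    return ["tesseract", input_filename, output_root, "-l", language]


english_args = build_tesseract_args("temp.bmp", "temp")
chinese_args = build_tesseract_args("temp.bmp", "temp", language="chi_sim")
print(" ".join(chinese_args))
```

With the modified module, the matching call is image_to_string(im, language='chi_sim'); leaving the argument out keeps the English default.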
(6) Install autopy. Download autopy-0.51.win32-py2.7.exe from https://pypi.python.org/pypi/autopy and install it; it is used to simulate mouse actions.
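Here is how the program works:
1. First, run the Android app on Windows in an emulator, then take a screenshot with PicPick, measure the regions of the battle screen that are needed, and record their positions on screen. Region 1 in the figure is used to decide whether a battle has started (it is saved for comparison), and regions 2, 3, 4, 5, and 6 are captured and recognized as text.
The code that computes the image fingerprint: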
    def get_hash(self, img):
        # Compute the image's hash: one bit per pixel, "1" if brighter than average
        image = img.convert("L")
        pixels = list(image.getdata())
        avg = sum(pixels) / len(pixels)
        return "".join(map(lambda p: "1" if p > avg else "0", pixels))
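The post does not show how two fingerprints are compared, so the following is only a sketch under an assumption: that the "distance" is the Hamming distance, that is, the number of positions where the two bit strings differ. The function names are illustrative; the threshold of 40 is the value quoted in the walkthrough.

```python
def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length fingerprint strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("fingerprints must have the same length")
    return sum(1 for a, b in zip(hash_a, hash_b) if a != b)


def is_same_screen(hash_a, hash_b, threshold=40):
    """Treat two fingerprints as the same screen when fewer than `threshold` bits differ."""
    return hamming_distance(hash_a, hash_b) < threshold


# Stand-in fingerprints; real ones would come from get_hash on two screenshots.
saved = "1100" * 100
current = "1100" * 95 + "0011" * 5  # the last 20 bits differ from `saved`
print(hamming_distance(saved, current), is_same_screen(saved, current))
```

Because get_hash emits one bit per pixel without resizing, both images must be grabbed from the same screen region so that the strings have equal length.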
Recognizing the images as text:
    # Recognize the text in each screen region and hand it to chose()
    def getWordMeaning(self):
        pic_up = ImageGrab.grab((480, 350, 480+300, 350+66))
        pic_aws1 = ImageGrab.grab((463, 456, 463+362, 456+45))
        pic_aws2 = ImageGrab.grab((463, 530, 463+362, 530+45))
        pic_aws3 = ImageGrab.grab((463, 601, 463+362, 601+45))
        pic_aws4 = ImageGrab.grab((463, 673, 463+362, 673+45))

        str_up = image_to_string(pic_up).strip().lower()

        # If the word is the same as the one recognized last time, skip recognition
        if str_up != self.lastWord:
            # If the question word is English, recognize the options as Chinese
            if str_up.isalpha():
                eng_up = self.dt[str_up].decode('gbk') if str_up in self.dt else ''
                chs1 = image_to_string(pic_aws1, language='chi_sim').decode('utf-8').strip()
                chs2 = image_to_string(pic_aws2, language='chi_sim').decode('utf-8').strip()
                chs3 = image_to_string(pic_aws3, language='chi_sim').decode('utf-8').strip()
                chs4 = image_to_string(pic_aws4, language='chi_sim').decode('utf-8').strip()
                print str_up, ':', eng_up
                self.chose(eng_up, (chs1, chs2, chs3, chs4))
            # If the question word is Chinese, recognize the options as English
            else:
                chs_up = image_to_string(pic_up, language='chi_sim').decode('utf-8').strip()
                eng1 = image_to_string(pic_aws1).strip()
                eng2 = image_to_string(pic_aws2).strip()
                eng3 = image_to_string(pic_aws3).strip()
                eng4 = image_to_string(pic_aws4).strip()

                e2c1 = self.dt[eng1].decode('gbk') if eng1 in self.dt else ''
                e2c2 = self.dt[eng2].decode('gbk') if eng2 in self.dt else ''
                e2c3 = self.dt[eng3].decode('gbk') if eng3 in self.dt else ''
                e2c4 = self.dt[eng4].decode('gbk') if eng4 in self.dt else ''
                print chs_up
                self.chose(chs_up, (e2c1, e2c2, e2c3, e2c4))
            self.lastWord = str_up
        return str_up
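2. A screenshot of region 1 is saved ahead of time. The program computes the distance between the current screen and the saved image; a distance below 40 means the choice screen has been reached. It then recognizes regions 2, 3, 4, 5, and 6 as text. If region 2 is recognized as English, the program looks up that word's Chinese meaning in the dictionary and matches the meaning against the text of regions 3, 4, 5, and 6. The option that shares the most Chinese characters with the meaning is chosen; if nothing matches, the last option is returned by default. I originally planned to use Fuzzywuzzy for fuzzy matching and similarity scoring, but in testing it did poorly on Chinese, so I switched to scoring similarity one Chinese character at a time.
Matching the text and making a choice: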
    # Match the question against the options and make a choice
    def chose(self, g, chs_list):
        j, max_score = -1, 0
        same_list = None
        # Strip special characters from the question
        re_list = [u'~', u',', u'.', u';', u' ', u'a', u'V', u'v', u'i', u'n', u'【', u')', u'_', u'W', u'd', u'j', u'-', u't']
        for i in re_list:
            g = g.replace(i, '')
        print type(g)
        # Find the characters the two strings share; the option with the most shared characters wins
        for i, chsWord in enumerate(chs_list):
            print type(chsWord)
            l = [x for x in g if x in chsWord and len(x) > 0]
            score = len(l) if l else 0

            if score > max_score:
                max_score = score
                j = i
                same_list = l
        # If nothing matched, fall back to the last option by default
        if j == -1:
            print '1. %s; 2. %s; 3. %s; 4. %s; Not found choice.' % (chs_list[0], chs_list[1], chs_list[2], chs_list[3])
        else:
            print '1. %s; 2. %s; 3. %s; 4. %s; choice: %s' % (chs_list[0], chs_list[1], chs_list[2], chs_list[3], chs_list[j])
            for k, v in enumerate(same_list):
                print str(k) + '.' + v,
        order = j + 1
        self.mouseMove(order)
        return order
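The character-overlap scoring can also be tried on its own, without the screen grabbing or the mouse. The sketch below is a standalone restatement of that idea, not the author's method verbatim: best_option is an illustrative name, the sample strings are made up, and it skips the step that strips OCR noise characters.

```python
# -*- coding: utf-8 -*-
def best_option(question, options):
    """Return the 0-based index of the option that shares the most characters
    with `question`, or -1 when no option shares any character."""
    best_index, best_score = -1, 0
    for index, option in enumerate(options):
        # One point for every character of the question found in this option
        score = sum(1 for ch in question if ch in option)
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Made-up sample data: a dictionary meaning and four OCR'd options
meaning = u"英雄;勇士"
options = (u"苹果", u"英雄人物", u"跑步", u"天空")
choice = best_option(meaning, options)
```

Here the second option shares two characters with the meaning and the others share none, so choice is 1. Ties go to the earliest option because the comparison is a strict greater-than, the same as in the listing above.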
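3. Finally, mouseMove calls autopy to move the mouse and click the corresponding position to make the choice.
A recording of the program running: http://v.youku.com/v_show/id_XMTYxNTAzMDUwNA==.html
The finished program works fine. Because of OCR accuracy and gaps in the dictionary, it answers correctly about 70% of the time, which I'm fairly happy with. Overall the program is quite simple and was made purely for fun; it chains together image recognition, Chinese fuzzy matching, and simulated mouse input, so call it a simple little game bot. The source code and the files it uses are here:
http://git.oschina.net/highroom/My-Project/tree/master/Word%20Hero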
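mouseMove itself is not shown in the post, so what follows is only a guess at its shape. The sketch maps an option number to the center of that option's screen region, reusing the ImageGrab boxes from getWordMeaning; option_center and the two constants are illustrative names, and the autopy calls appear only in comments.

```python
# Top-left corners of the four option regions, taken from the ImageGrab
# boxes in getWordMeaning; each region is 362 x 45 pixels.
OPTION_ORIGINS = [(463, 456), (463, 530), (463, 601), (463, 673)]
OPTION_SIZE = (362, 45)


def option_center(order):
    """Return the screen point at the center of option `order` (1 to 4)."""
    if not 1 <= order <= len(OPTION_ORIGINS):
        raise ValueError("order must be between 1 and 4")
    left, top = OPTION_ORIGINS[order - 1]
    width, height = OPTION_SIZE
    return (left + width // 2, top + height // 2)


x, y = option_center(2)
# With autopy installed, the click would then be something like:
#   autopy.mouse.move(x, y)
#   autopy.mouse.click()
```

Clicking the center rather than a corner leaves some slack if the emulator window sits a few pixels off from where it was measured.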