This article was originally published at http://www.binss.me/blog/how-i-realize-quick-macro-by-python/. Please credit the source when reposting.
I have been playing a certain mobile game for a year now. Being a casual player, my attitude toward in-game resources is that enough is enough, so my stockpile has always hovered just above the subsistence line. That leisurely life lasted until the latest objective came out, which outrageously demands 50 heavy-construction rolls, forcing this lazybones to start manually running short logistics missions, tapping the screen a few times every hour or two.
One day a friend saw me doing this and exclaimed: "Hardly anyone still grinds this by hand these days!" I asked how, and the answer was the time-honored 按键精灵 (Quick Macro).
That brought back memories of using 按键精灵 to idle in a turn-based game back in middle school. All these years later, I had not expected it to still be around. But times have changed: as a programmer, could I not write a macro script of my own?
No sooner said than done; for the language I picked an old friend, Python.

Analysis

Simulating Clicks

The first problem to solve is simulating clicks. After some searching I found the autopy library, which can simulate mouse movement and clicks.
The project homepage is:
The version that pip installs is this one. However, the project has fallen into disrepair and installation fails on current versions of Mac OS, so I recommend a fork maintained by someone else:
pip3 install git+https://github.com/potpath/autopy.git
Once installed, usage is simple:
import autopy
autopy.mouse.smooth_move(x, y)  # move the cursor smoothly to (x, y)
autopy.mouse.click()            # click at the current cursor position

Image Matching

Now that simulated mouse clicks work, we still need to locate the targets to click, and that requires image matching.
autopy ships with an image-matching component of its own, but it matches pixel by pixel, which is extremely slow and inaccurate, so I decided to use the more specialized opencv together with PIL instead.
Installation:
pip3 install Pillow
pip3 install imutils
pip3 install opencv-python
The idea for finding a small image inside a large one is as follows:
  1. Convert both images to grayscale
  2. Extract edges
  3. Run template matching
  4. Take the result with the highest similarity
  5. That gives the small image's starting point in the large image and the size of the matched region
The code:
import cv2

def match(small_path, large_path):
    # convert both images to grayscale and extract edges
    small = cv2.imread(small_path)
    small = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    small = cv2.Canny(small, 50, 200)
    large = cv2.imread(large_path)
    large = cv2.cvtColor(large, cv2.COLOR_BGR2GRAY)
    large = cv2.Canny(large, 50, 200)
    # template matching, then take the location with the highest similarity
    result = cv2.matchTemplate(large, small, cv2.TM_CCOEFF)
    _, max_value, _, max_loc = cv2.minMaxLoc(result)
    return (max_value, max_loc, 1, result)
Here max_loc is the coordinate of the starting point (the top-left corner of the match), and the matched region has the same size as small, i.e. height, width = small.shape[:2]
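To make the coordinate handling concrete (max_loc is an (x, y) pair, while shape gives (height, width)), here is a minimal sketch built on the match() above; template.png and screen.png are hypothetical file names:
max_value, max_loc, _, _ = match('template.png', 'screen.png')
height, width = cv2.imread('template.png').shape[:2]
top_left = max_loc                                         # (x, y)
bottom_right = (max_loc[0] + width, max_loc[1] + height)   # (x, y)
print(max_value, top_left, bottom_right)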

Image Matching with Scaling

However, the method above only works when the small image is cropped directly from the large one; as soon as the large image changes size, for example when the game window is shrunk a little, the match fails. We need an approach that still works after the image has been scaled.
After some searching I found this article:
Its basic idea is to shrink the large image proportionally, step by step, running one match at each scale until the large image becomes too small, and then to return the match with the highest similarity across all scales.
The code:
import cv2
import imutils
import numpy

def scale_match(small_path, large_path):
    small = cv2.imread(small_path)
    small = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    small = cv2.Canny(small, 50, 200)
    height, width = small.shape[:2]
    large = cv2.imread(large_path)
    large = cv2.cvtColor(large, cv2.COLOR_BGR2GRAY)
    current_max = None
    # shrink the large image step by step, from full size down to 20%
    for scale in numpy.linspace(0.2, 1.0, 20)[::-1]:
        resized = imutils.resize(large, width=int(large.shape[1] * scale))
        r = large.shape[1] / float(resized.shape[1])
        # if the resized image is smaller than the small one, stop
        if resized.shape[0] < height or resized.shape[1] < width:
            break
        resized = cv2.Canny(resized, 50, 200)
        result = cv2.matchTemplate(resized, small, cv2.TM_CCOEFF_NORMED)
        _, max_value, _, max_loc = cv2.minMaxLoc(result)
        # keep the best match seen across all scales
        if current_max is None or max_value > current_max[0]:
            current_max = (max_value, max_loc, r, result)
    return current_max
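The ratio r returned alongside the match is what maps coordinates found on the shrunken image back to the original one. A minimal usage sketch (again with hypothetical file names) that draws the matched rectangle on the original large image:
max_value, max_loc, r, _ = scale_match('template.png', 'screen.png')
height, width = cv2.imread('template.png').shape[:2]

# max_loc lives in the resized image's coordinates, so scale it back up by r
top_left = (int(max_loc[0] * r), int(max_loc[1] * r))
bottom_right = (int((max_loc[0] + width) * r), int((max_loc[1] + height) * r))

large = cv2.imread('screen.png')
cv2.rectangle(large, top_left, bottom_right, (0, 0, 255), 2)
cv2.imwrite('debug.png', large)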

Multiple Matches

What if the small image appears several times in the large one and we need to match every occurrence?
The approach given in the official opencv tutorial is to set a threshold: unlike the earlier section, where only the single most similar location was taken as the match, every location whose similarity exceeds the threshold counts as a match.
The code:
points = []
loc = numpy.where(result >= threshold)
for point in zip(*loc[::-1]):  # loc[::-1] flips (row, col) into (x, y)
    points.append((numpy.float32(point[0]), numpy.float32(point[1])))
This approach has a drawback, though: it is almost impossible to find a threshold that guarantees "similarity above this value means a match". It also tends to hit the same region many times, so in practice some regions get matched repeatedly while others are not matched at all.
In my use case the number of regions to match is fixed (4). Thinking back to my undergraduate data-mining course, this is exactly what k-means is for: set a fairly low threshold to collect a large number of candidate matches, then run k-means on them to find the 4 centers.
Fortunately, opencv implements k-means, so there is no need to pull in another library:
points = numpy.array(points)
# terminate when the centers move by less than 0.1; 10 attempts with random initial centers
term_crit = (cv2.TERM_CRITERIA_EPS, 30, 0.1)
ret, labels, centers = cv2.kmeans(points, 4, None, term_crit, 10, 0)
The resulting centers are the 4 center points we need, i.e. the 4 regions we want to match.
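Putting the two snippets together, here is a minimal sketch of the whole multi-match step as a standalone function; the result comes from the scale_match() above, and the file names and the 0.2 threshold (the same value used in the example section below) are assumptions:
def cluster_matches(result, threshold=0.2, cluster_num=4):
    # every location scoring above the threshold becomes a candidate point
    points = []
    loc = numpy.where(result >= threshold)
    for point in zip(*loc[::-1]):
        points.append((numpy.float32(point[0]), numpy.float32(point[1])))
    points = numpy.array(points)
    # k-means needs at least cluster_num candidate points to work with
    term_crit = (cv2.TERM_CRITERIA_EPS, 30, 0.1)
    ret, labels, centers = cv2.kmeans(points, cluster_num, None, term_crit, 10, 0)
    return centers

_, _, r, result = scale_match('line.png', 'frame.png')
centers = cluster_matches(result)
print(centers)  # cluster_num (x, y) centers, still in resized-image coordinates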

OCR

After matching an image region, we sometimes also want to run OCR on it to read the text currently on screen. Here I use Google's open-source OCR engine, tesseract.
Installation:
brew install tesseract
pip3 install pytesseract
Find tesseract's installation directory:
brew info tesseract
tesseract: stable 3.05.01 (bottled), HEAD
OCR (Optical Character Recognition) engine
https://github.com/tesseract-ocr/
/usr/local/Cellar/tesseract/3.05.01 (80 files, 98.6MB) *
...
Then place the language data you need (here chi_sim, the Simplified Chinese data used below) under the relative directory ./share/tessdata/ inside it.
The OCR flow:
  1. Compute the region to crop from the match result
  2. Crop that region out of the large image
  3. Feed the crop into tesseract
The code:
def ocr(self, matchings):
    # a method of the Recognizer class defined in the next section;
    # self.large_gray is the grayscale screenshot being searched
    import pytesseract
    pytesseract.pytesseract.tesseract_cmd = "/usr/local/bin/tesseract"
    texts = []
    if type(matchings) is not list:
        matchings = [matchings]
    for m in matchings:
        # map the match back to coordinates in the original (unscaled) image
        start_x, start_y = (int(m["loc"][0] * m["ratio"]), int(m["loc"][1] * m["ratio"]))
        end_x, end_y = (int((m["loc"][0] + m["size"][1]) * m["ratio"]), int((m["loc"][1] + m["size"][0]) * m["ratio"]))
        clip = self.large_gray[start_y:end_y, start_x:end_x]
        image = Image.fromarray(clip)
        texts.append(pytesseract.image_to_string(image, lang='chi_sim'))
    return texts
Since OCR is an optional component, the import is placed inside the function.

Implementation

Wrapping the functions above up gives the Recognizer class:
import cv2
import imutils
import numpy
from PIL import ImageGrab, Image
from time import sleep


class Recognizer():
    def __init__(self, large):
        # "large" is the screenshot we search in; accept either a path or an array
        if isinstance(large, str):
            large = cv2.imread(large)
        self.large_origin = large
        self.large_gray = cv2.cvtColor(large, cv2.COLOR_BGR2GRAY)
        self.large_size = large.shape[:2]

    def match(self, small, scale=False):
        if isinstance(small, str):
            small = cv2.imread(small)
        small = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
        small = cv2.Canny(small, 50, 200)
        size = small.shape[:2]
        print("match: [{}x{}] in [{}x{}]".format(size[0], size[1], self.large_size[0], self.large_size[1]))
        if scale:
            # try matching against the large image at multiple scales
            current_max = None
            for ratio in numpy.linspace(0.2, 1.0, 20)[::-1]:
                resized = imutils.resize(self.large_gray, width=int(self.large_size[1] * ratio))
                r = self.large_size[1] / float(resized.shape[1])
                # if the resized image is smaller than the small, then break
                if resized.shape[0] < size[0] or resized.shape[1] < size[1]:
                    break
                resized = cv2.Canny(resized, 50, 200)
                result = cv2.matchTemplate(resized, small, cv2.TM_CCOEFF_NORMED)
                _, max_value, _, max_loc = cv2.minMaxLoc(result)
                if current_max is None or max_value > current_max['value']:
                    current_max = {"value": max_value, "loc": max_loc, "size": size, "ratio": r, "result": result}
            return current_max
        else:
            large = cv2.Canny(self.large_gray, 50, 200)
            result = cv2.matchTemplate(large, small, cv2.TM_CCOEFF)
            _, max_value, _, max_loc = cv2.minMaxLoc(result)
            return {"value": max_value, "loc": max_loc, "size": size, "ratio": 1, "result": result}

    def multi_match(self, small, scale=False, cluster_num=1, threshold=0.8):
        m = self.match(small, scale)
        matchings = []
        points = []
        # collect every location above the threshold, then cluster them down
        loc = numpy.where(m["result"] >= threshold)
        for point in zip(*loc[::-1]):
            points.append((numpy.float32(point[0]), numpy.float32(point[1])))
        points = numpy.array(points)
        term_crit = (cv2.TERM_CRITERIA_EPS, 30, 0.1)
        ret, labels, centers = cv2.kmeans(points, cluster_num, None, term_crit, 10, 0)
        for point in centers:
            matchings.append({"value": m["value"], "loc": point, "size": m["size"], "ratio": m["ratio"], "result": m["result"]})
        print('K-Means: {} -> {}'.format(len(loc[0]), len(matchings)))
        return matchings

    def draw_rect(self, matchings, output_path):
        # draw the matched rectangles onto a copy of the original image
        large_origin = self.large_origin.copy()
        if not isinstance(matchings, list):
            matchings = [matchings]
        for m in matchings:
            start_x, start_y = (int(m["loc"][0] * m["ratio"]), int(m["loc"][1] * m["ratio"]))
            end_x, end_y = (int((m["loc"][0] + m["size"][1]) * m["ratio"]), int((m["loc"][1] + m["size"][0]) * m["ratio"]))
            cv2.rectangle(large_origin, (start_x, start_y), (end_x, end_y), (0, 0, 255), 2)
        cv2.imwrite(output_path, large_origin)

    def draw_clip(self, clips, output_path):
        if not isinstance(clips, list):
            cv2.imwrite(output_path, clips)
        else:
            for index, clip in enumerate(clips):
                path = output_path.format(index)
                cv2.imwrite(path, clip)

    def clip(self, matchings):
        # crop the matched regions out of the original image
        clips = []
        if not isinstance(matchings, list):
            matchings = [matchings]
        for m in matchings:
            start_x, start_y = (int(m["loc"][0] * m["ratio"]), int(m["loc"][1] * m["ratio"]))
            end_x, end_y = (int((m["loc"][0] + m["size"][1]) * m["ratio"]), int((m["loc"][1] + m["size"][0]) * m["ratio"]))
            clip = self.large_origin[start_y:end_y, start_x:end_x]
            clips.append(clip)
        return clips

    def ocr(self, matchings):
        # OCR is optional, so pytesseract is only imported when it is used
        import pytesseract
        pytesseract.pytesseract.tesseract_cmd = "/usr/local/bin/tesseract"
        texts = []
        if not isinstance(matchings, list):
            matchings = [matchings]
        for m in matchings:
            start_x, start_y = (int(m["loc"][0] * m["ratio"]), int(m["loc"][1] * m["ratio"]))
            end_x, end_y = (int((m["loc"][0] + m["size"][1]) * m["ratio"]), int((m["loc"][1] + m["size"][0]) * m["ratio"]))
            clip = self.large_gray[start_y:end_y, start_x:end_x]
            image = Image.fromarray(clip)
            texts.append(pytesseract.image_to_string(image, lang='chi_sim'))
        return texts

    def center(self, matching):
        # the final / 2 presumably maps the 2x Retina screenshot back to screen coordinates
        x = int((matching["loc"][0] + matching["size"][1] / 2) * matching["ratio"] / 2)
        y = int((matching["loc"][1] + matching["size"][0] / 2) * matching["ratio"] / 2)
        return x, y
A screenshot helper, capture_screen:
def capture_screen():
    # grab the screen with PIL and convert it into an opencv-style BGR array
    screenshot = ImageGrab.grab().convert('RGB')
    screenshot = numpy.array(screenshot)
    return cv2.cvtColor(screenshot, cv2.COLOR_RGB2BGR)
Note that PIL can only convert to RGB while opencv works in BGR, so one extra conversion is needed.

Example

Recognize the three regions fight, clock, and frame in the large image:

fight, clock, and frame are defined as follows:
Circle them in the large image:
screenshot = capture_screen()
main_rgz = Recognizer(screenshot)
fight_path = '/Users/binss/Desktop/opencv/templates/fight.png'
clock_path = '/Users/binss/Desktop/opencv/templates/clock.png'
frame_path = '/Users/binss/Desktop/opencv/templates/frame.png'
fight = main_rgz.match(fight_path, True)
clock = main_rgz.match(clock_path, True)
frame = main_rgz.match(frame_path, True)
matchings = [fight, clock, frame]
output_path = '/Users/binss/Desktop/debug.png'
main_rgz.draw_rect(matchings, output_path)
The result:

Extracting the frame region on its own

clips = main_rgz.clip(frame)
main_rgz.draw_clip(clips[0], '/Users/binss/Desktop/frame_clip.png')
The result:

Matching multiple lines in the frame region and running OCR on them

line is defined as follows:
That is, the region is cut into four rows and each row is OCR'd separately:
line_path = '/Users/binss/Desktop/opencv/templates/line.png'
time_rgz = Recognizer(clips[0])
matchings = time_rgz.multi_match(line_path, True, 4, 0.2)
texts = time_rgz.ocr(matchings)
print(texts)
The result:
K-Means: 8 -> 4
['后勤支援中 8 一 1 OO:19:44', '后勤支援中 8 一 1 OO:19:44', '后勤支援中 1 一 4 01:17:26', '后勤支援中 0 一 1 00:48:57']
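Finally, to close the loop back to the original goal, a matched region's center can be handed to autopy for the actual click. A minimal sketch tying the pieces together, reusing the fight template from the example above (and assuming, as center() does, a 2x Retina screenshot):
import autopy

screenshot = capture_screen()
rgz = Recognizer(screenshot)
fight = rgz.match('/Users/binss/Desktop/opencv/templates/fight.png', True)

# center() already scales the match back and halves it into screen coordinates
x, y = rgz.center(fight)
autopy.mouse.smooth_move(x, y)
autopy.mouse.click()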

Summary

Since I have never taken any courses in computer vision or related fields, fiddling with opencv and writing this post was purely a whim, so treat it as a conversation starter; better solutions are very welcome.
That said, after building the whole thing I found it not practical at all:
  1. Matching is far too slow; even without multiprocessing it already consumes considerable resources, and CPU usage shoots straight up the moment it starts running
  2. I tried porting it to Windows 10, but both opencv and autopy threw all sorts of errors and never ran successfully, so I gave up
  3. It cannot run in the background
In the end I crawled back to 按键精灵 (Quick Macro). Let this post commemorate my lost weekend.