
After Go, Can AI Get the Better of Humans in Esports Too?

By Maxwell Kao

A Chinese-English parallel reading exercise

Source: DeepMind.com

The age of artificial intelligence is almost upon us. Having humbled humanity's best Go players with AlphaGo, its engineers now want to challenge humans in a far tougher arena: esports. DeepMind, the Google sister company behind AlphaGo, says it is working with the famed game studio Blizzard on SC2LE, a StarCraft II research environment for training AI that could one day go toe-to-toe with top human players. But before we get to StarCraft, let's review how AlphaGo managed to beat humans in the first place.

Beating Human Go Players?

1

AlphaGo is the first computer program to defeat a professional human Go player, the first program to defeat a Go world champion, and arguably the strongest Go player in history. ——From DeepMind.com

AlphaGo is the first computer program to beat a professional human Go player, the first to defeat a Go world champion, and arguably the strongest Go player in history.

(Image source: google.com)

2

The game of Go originated in China 3,000 years ago. The rules of the game are simple: players take turns to place black or white stones on a board, trying to capture the opponent's stones or surround empty space to make points of territory. As simple as the rules are, Go is a game of profound complexity. There are an astonishing 10 to the power of 170 possible board configurations - more than the number of atoms in the known universe - making Go a googol times more complex than Chess. ——From DeepMind.com

Go originated in China 3,000 years ago. The rules are actually quite simple: the board's 19 horizontal and 19 vertical lines form 361 intersections, and players take turns placing black or white stones on them, trying to capture the opponent's stones or wall off territory of their own. Simple as the rules are, Go is a game of staggering complexity. There are an astonishing 10^170 possible board configurations (more than the number of atoms in the known universe), making Go 10^100 times (a googol times; in any case, more than you could ever count) more complex than chess.
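The 10^170 figure passes a quick back-of-envelope check (a sketch only; the precise legal-position count is a separately computed result, not derived here). Each of the 361 intersections is empty, black, or white, which gives a raw upper bound of 3^361:

```python
# Back-of-envelope upper bound on Go board configurations.
# Each of the 19x19 = 361 intersections is empty, black, or white.
intersections = 19 * 19          # 361
raw = 3 ** intersections         # upper bound; not every arrangement is legal

# Python integers have arbitrary precision, so we can just count digits.
digits = len(str(raw))           # 173 digits, i.e. roughly 10^172
print(f"3^{intersections} is about 10^{digits - 1}")
```

The ~10^170 legal positions quoted above sit just below this 10^172 bound, since most random arrangements contain stones that would already have been captured.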

(Image source: google.com)

3

The complexity of Go means it has long been viewed as the most challenging of classical games for artificial intelligence. Despite decades of work, the strongest computer Go programs were only able to play at the level of human amateurs.

Traditional AI methods, which construct a search tree over all possible positions, don't have a chance in Go. This is because of the sheer number of possible moves and the difficulty of evaluating the strength of each possible board position. ——From DeepMind.com

Go's complexity is why it has long been regarded as the classical game most challenging for artificial intelligence. Despite decades of work, even the strongest computer Go programs could only play at the level of human amateurs. Because of the sheer number of possible moves, and the difficulty of evaluating the payoff of each possible placement, the traditional brute-force approach of building a "search tree" over every possibility stands no chance in Go.

(Image source: google.com)

4

In order to capture the intuitive aspect of the game, we knew that we would need to take a novel approach. AlphaGo therefore combines an advanced tree search with deep neural networks. These neural networks take a description of the Go board as an input and process it through a number of different network layers containing millions of neuron-like connections. One neural network, the "policy network", selects the next move to play. The other neural network, the "value network", predicts the winner of the game.

We showed AlphaGo a large number of strong amateur games to help it develop its own understanding of what reasonable human play looks like. Then we had it play against different versions of itself thousands of times, each time learning from its mistakes and incrementally improving until it became immensely strong, through a process known as reinforcement learning. ——From DeepMind.com

To capture the intuitive side of the game, we (DeepMind) knew we would need a novel approach. AlphaGo therefore combines an advanced tree search with deep neural networks. These networks take a description of the Go board as input and process it through many different network layers containing millions of neuron-like connections. One network, the "policy network", selects the next move to play; the other, the "value network", predicts the winner of the game.
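The two-headed idea can be sketched in a heavily simplified form (illustrative only: AlphaGo's real networks are deep convolutional networks, and the single hidden layer, layer sizes, and random weights below are all made up):

```python
import numpy as np

rng = np.random.default_rng(0)

BOARD = 19 * 19                       # 361 intersections
board = rng.standard_normal(BOARD)    # stand-in for an encoded board position

# One shared hidden layer (the real networks are much deeper CNNs).
W_hidden = rng.standard_normal((BOARD, 128)) * 0.1
hidden = np.tanh(board @ W_hidden)

# Policy head: a probability for each of the 361 possible moves.
W_policy = rng.standard_normal((128, BOARD)) * 0.1
logits = hidden @ W_policy
policy = np.exp(logits - logits.max())
policy /= policy.sum()                # softmax over moves

# Value head: a single scalar in (-1, 1) predicting the game's winner.
W_value = rng.standard_normal(128) * 0.1
value = np.tanh(hidden @ W_value)

print("suggested move:", policy.argmax(), " predicted outcome:", float(value))
```

The point of the split is that move selection (policy) and position evaluation (value) are different questions, so each gets its own output head over a shared representation of the board.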

We (DeepMind) showed AlphaGo a large number of games by strong amateur players to help it build its own understanding of what reasonable human play looks like. Then, through a process called reinforcement learning, we had it play against different versions of itself thousands of times, learning from its mistakes and improving incrementally each time, until it became immensely strong.
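The self-play loop can be sketched with a toy game invented purely for illustration (nothing here is AlphaGo's actual algorithm): two agents each pick a number from 0 to 2, the higher number wins, and a REINFORCE-style update nudges the learner toward actions that won against a frozen older copy of itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

weights = np.zeros(3)          # learner's preferences over actions 0, 1, 2
frozen = weights.copy()        # an older copy of itself to play against
lr = 0.1

for step in range(2000):
    a = rng.choice(3, p=softmax(weights))   # learner's move
    b = rng.choice(3, p=softmax(frozen))    # past self's move
    reward = 1 if a > b else (-1 if a < b else 0)
    weights[a] += lr * reward               # reinforce winning actions
    if step % 200 == 0:                     # periodically refresh the opponent
        frozen = weights.copy()

print(softmax(weights).round(2))            # action 2 comes to dominate
```

The structure mirrors the description above: play against versions of yourself, score the outcome, and shift the policy toward whatever won.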

5

In short, AlphaGo avoided the old road of brute-forcing the game with raw computing speed; instead, through the techniques above, it cleverly kept getting stronger until it crushed human Go players. But Go is, after all, a genteel duel fought on a tabletop. When the game is a heart-pounding, shells-flying affair, can AI still beat top human players?

StarCraft and StarCraft II are among the biggest and most successful games of all time, with players competing in tournaments for more than 20 years. The original game is also already used by AI and ML researchers, who compete annually in the AIIDE bot competition. Part of StarCraft's longevity is down to the rich, multi-layered gameplay, which also makes it an ideal environment for AI research. ——From DeepMind.com

StarCraft I and II are among the biggest and most successful games of all time, with players competing in tournaments for more than 20 years. The original game is already used by AI researchers, who hold the AIIDE bot competition every year. Much of StarCraft's longevity comes down to its rich, multi-layered gameplay, which also makes it an ideal environment for AI research.

Editor's aside: that said, the built-in AI opponents in StarCraft II are pitifully weak; even a Silver-league scrub like me can crush them...

(Image source: StarCraft II in-game screenshot)

For example, while the objective of the game is to beat the opponent, the player must also carry out and balance a number of sub-goals, such as gathering resources or building structures. In addition, a game can take from a few minutes to one hour to complete, meaning actions taken early in the game may not pay-off for a long time. Finally, the map is only partially observed, meaning agents must use a combination of memory and planning to succeed. ——From DeepMind.com

For example, while the main objective is to defeat the opponent, the player must also carry out and balance a number of sub-goals, such as gathering resources or building structures. In addition, a game can take anywhere from a few minutes to an hour, meaning actions taken early may not pay off until much later. Finally, only part of the map is visible at any time, so the player must rely on a combination of memory and planning to win. (That's right, StarCraft players don't need eyesight.)
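The partial-observability point ("fog of war") can be sketched in a few lines (a toy illustration: the map size, vision radius, and unit positions are made up, and the real game's observation interface is far richer):

```python
import numpy as np

MAP = 16                      # toy 16x16 map (made-up size)
FOG = -1                      # sentinel value for unseen terrain

world = np.arange(MAP * MAP).reshape(MAP, MAP)   # the "true" map state

def observe(world, units, radius=3):
    """Return the map with everything outside unit vision replaced by fog."""
    ys, xs = np.mgrid[0:MAP, 0:MAP]
    visible = np.zeros((MAP, MAP), dtype=bool)
    for (uy, ux) in units:
        visible |= (ys - uy) ** 2 + (xs - ux) ** 2 <= radius ** 2
    return np.where(visible, world, FOG)

obs = observe(world, units=[(2, 2), (12, 13)])
print((obs == FOG).sum(), "of", MAP * MAP, "cells are fogged")
```

An agent that only ever sees `obs` rather than `world` must remember what it scouted earlier and plan around the gaps, which is exactly the memory-plus-planning demand described above.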

The game also has other qualities that appeal to researchers, such as the large pool of avid players that compete online every day. This ensures that there is a large quantity of replay data to learn from - as well as a large quantity of extremely talented opponents for AI agents. ——From DeepMind.com

StarCraft also has other qualities that appeal to researchers, such as the huge pool of avid players competing online every day. This guarantees a large quantity of replay data to learn from, as well as plenty of extremely talented opponents for AI agents to practice against.

(Image source: google.com)

2

The editor found a website that studies strategy games and charts a number of mainstream titles by strategy (horizontal axis) and tension (vertical axis), in other words, how much each game demands of your brain and of your reflexes. According to the chart, StarCraft ranks high on both dimensions at once, which may be one reason it was chosen as a subject for AI research.

(Image source: quanticfoundry.com/)

3

(Image source: DeepMind.com)

Even StarCraft's action space presents a challenge with a choice of more than 300 basic actions that can be taken. On top of this, actions in StarCraft are hierarchical, can be modified and augmented, with many of them requiring a point on the screen. Even assuming a small screen size of 84x84 there are roughly 100 million possible actions available. ——From DeepMind.com

StarCraft's action space is another challenge, with more than 300 basic actions to choose from. On top of that, actions in StarCraft are hierarchical and can be modified and augmented, and many of them require specifying a point on the screen. Even assuming a small 84x84 screen, there are roughly 100 million possible actions available.
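A rough reconstruction of the arithmetic behind the ~100 million figure (the exact breakdown of which actions take which arguments is not spelled out here, so this is only a plausibility check):

```python
SCREEN = 84 * 84                 # 7,056 targetable points on an 84x84 screen

# An action taking one screen point (e.g. "move here") has ~7k variants;
# one taking two points (e.g. drag-selecting a rectangle) has ~50 million.
one_point = SCREEN               # 7_056
two_point = SCREEN ** 2          # 49_787_136, already ~5 * 10^7

print(f"{one_point:,} single-point targets, {two_point:,} two-point targets")
# With 300+ base actions, many parameterised by screen coordinates like
# this, a combined action space on the order of 10^8 is plausible.
```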

4

After all that, there is one more important point: StarCraft is cool! (even though this has nothing to do with AI)

(Image source: StarCraft II in-game screenshot)

Isn't that slick!

(Image source: StarCraft II in-game screenshot)

Good old Blizzard and its quality games!

(Image source: StarCraft II in-game screenshot)

5

However, the StarCraft scene has a famously jinx-prone caster, Huang Xudong, whose praise is said to doom whoever receives it. Whether SC2LE can beat human players may ultimately rest on his jinx: no AI, however strong, can withstand a dose of the old sage's "poison milk". (Only half serious.)

(Image source: baidu.com)

Where AI Goes Next

Finally, on a more serious note. In the editor's view, at this rate StarCraft will soon fall into AI's clutches as well. What comes after that? Many people fear that AI developing this way could threaten humanity, even surpass and rule over us. Personally, I find that view rather far-fetched. Today's electronic machines all think in binary and can only make mechanical judgments; in some respects they simply cannot be compared to the human brain, let alone surpass or rule it. Moreover, I believe the future development of AI will bring humanity enormous benefits: if the human brain could one day be combined with the computer, the strengths of the two together could bring unimaginable advances to human society.




This article was provided by yidianzixun (original article link).
