Periodic Table (Mine) - Periodic table of the elements

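In a sense, this is the problem I have struggled with most so far.
I previously listed English, math, and algorithms as the painful parts of CheckiO, but maybe chemistry belongs on that list too. All I know about the periodic table is the Japanese mnemonic "sui-hei-ri-be" (H, He, Li, Be), so this one was too hard for me.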

What's the problem?

Periodic Table
http://www.checkio.org/mission/periodic-table/

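Given an element's symbol, find its atomic number, electron configuration, and orbital model.

The argument is the element symbol, passed as a string.
The return value is a list of str: [atomic number, electron configuration, orbital model].

For example, for oxygen the atomic number is 8,
the electron configuration is written in noble-gas notation, as '[He] 2s² 2p⁴',
and the orbital model is written as '2 2 211'.

Examples: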

assert( checkio( 'H' ) == [ "1", u"1s¹", "1" ] ), "First Test - 1s¹"
assert( checkio( 'He' ) == [ "2", u"1s²", "2" ] ), "Second Test - 1s²"
assert( checkio( 'Al' ) == [ "13", u"[Ne] 3s² 3p¹", "2 2 222 2 100" ] ), "Third Test - 1s² 2s² 2p6 3s² 3p¹"
assert( checkio( 'O' ) == ["8", u"[He] 2s² 2p⁴", "2 2 211"] ), "Fourth Test - 1s² 2s² 2p⁴"
assert( checkio( 'Li' ) == [ "3", u"[He] 2s¹", "2 1" ] ), "Fifth Test - 1s² 2s¹"

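How do I solve it?

To begin with, I had no idea what electron configurations or orbitals are. After some reading, here is what I roughly understood.

The atomic number is the number of protons, which equals the number of electrons in a neutral atom.
Electrons move around the nucleus in orbitals of types s, p, d, ...
The s orbitals come as 1s, 2s, 3s, ..., the p orbitals as 2p, 3p, 4p, ... (there is no 1p), and each kind holds a fixed maximum number of electrons.
An electron configuration can be written as the nearest preceding noble gas plus the remaining orbitals.
- The noble gases are He, Ne, Ar, Kr, Xe, Rn, Uuo.
- For example, He is "1s²" and Li is "1s² 2s¹", so Li is written "[He] 2s¹".
How many electrons sit in which orbital is mostly regular, but there are occasional exceptions.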
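
As a small illustration of the "nearest noble gas" idea above, here is a minimal sketch that picks the noble-gas core for a given atomic number. The function name noble_gas_core and the NOBLE_GASES table are my own for illustration; they are not part of the mission.

```python
from bisect import bisect_left

# Atomic numbers of the noble gases that can serve as a core.
NOBLE_GASES = [(2, 'He'), (10, 'Ne'), (18, 'Ar'), (36, 'Kr'), (54, 'Xe'), (86, 'Rn')]


def noble_gas_core(atomic_number):
    """Return the heaviest noble gas strictly lighter than the element, or None."""
    numbers = [z for z, _ in NOBLE_GASES]
    # bisect_left counts the noble gases whose atomic number is < atomic_number
    i = bisect_left(numbers, atomic_number)
    return NOBLE_GASES[i - 1][1] if i > 0 else None


print(noble_gas_core(3))   # Li is written with a [He] core
print(noble_gas_core(13))  # Al is written with a [Ne] core
print(noble_gas_core(2))   # He itself has no core and is written out as 1s²
```

The comparison is strict on purpose: a noble gas is not abbreviated with itself, which matches the 'He' example in the mission's tests.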

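I gave up early on solving this problem properly.

Instead, I would build a dict with the answers for every element and have checkio() just look things up. But to build the dict, I need the answers, which I don't have...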

After a lot of googling, I found this page:
Electron configurations of the elements (data page) - Wikipedia, the free encyclopedia
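It lists the atomic number and electron configuration outright, and also something that looks like the orbital model. For oxygen (O) it says 1s² 2s² 2p⁴, so it should be enough to turn the superscripts 2, 2, 4 into '2 2 211'. I don't fully get it, but the rule seems to be 4 → 211, 5 → 221, 6 → 222: each orbital gets one electron first, and only then do they pair up.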
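
The "one electron each first, then pair up" rule can be sketched like this. It is only an illustration of the pattern seen in the mission's examples; the name fill_orbitals is hypothetical.

```python
def fill_orbitals(capacity, electrons):
    """Spread electrons over capacity // 2 orbitals: singles first, then pairs."""
    if not 0 <= electrons <= capacity:
        raise ValueError('electron count does not fit in this subshell')
    slots = capacity // 2          # a p subshell (capacity 6) has 3 orbitals
    counts = [0] * slots
    for i in range(electrons):
        counts[i % slots] += 1     # walk the orbitals round-robin
    return ''.join(str(c) for c in counts)


print(fill_orbitals(6, 4))  # oxygen's 2p⁴
print(fill_orbitals(6, 1))  # aluminium's 3p¹
```

Going round-robin over the orbitals gives the same strings as the mission's examples ('211' for 2p⁴, '100' for 3p¹) without a special case for the half-filled point.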

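All that's left is to pull the data from this page and build the dict. Doing it by hand is tedious, so I decided to extract it from the HTML with BeautifulSoup. At this point it has little to do with CheckiO anymore.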

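Looking at the page source:

The first TABLE tag is the target.
The first two rows are not needed (skip them).
From the third row on, the rows repeat in groups of three:
- The first row holds the atomic number, symbol, name, and electron configuration, in that order.
- The second row lists 1s, 2s, 2p, 3s, 3p, 3d, ... in that order, with the electron count as the superscript. → orbital model
- The third row holds the total number of electrons per shell 1, 2, 3, ... (not needed here).

That is how it was laid out.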
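
Before touching real HTML, the "groups of three rows" idea can be tried on plain strings. The rows list below is made-up stand-in data that only mimics the layout described above; it is not copied from the actual page.

```python
# Stand-in for the text of each table row (hypothetical sample data).
rows = [
    'header 1', 'header 2',
    '1 H hydrogen : 1s1', '1s1', '1',
    '3 Li lithium : [He] 2s1', '1s2 2s1', '2 1',
    '8 O oxygen : [He] 2s2 2p4', '1s2 2s2 2p4', '2 6',
]

data = rows[2:]  # skip the two header rows
records = []
# data[::3] is the 1st row of each group, data[1::3] the 2nd; the 3rd is ignored
for first, second in zip(data[::3], data[1::3]):
    words = first.split()
    number, symbol = words[0], words[1]
    configuration = ' '.join(words[4:])  # everything after the name and ':'
    records.append((number, symbol, configuration, second))

for record in records:
    print(record)
```

Pairing two strided slices with zip avoids index arithmetic, and the third row of each group simply never gets selected.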

from bs4 import BeautifulSoup
from urllib.request import urlopen

URL = "https://en.wikipedia.org/wiki/Electron_configurations_of_the_elements_(data_page)"
# Name the parser explicitly so bs4 does not warn about guessing one
soup = BeautifulSoup(urlopen(URL), 'html.parser')

tr_list = soup.find('table').find_all('tr')[2:]  # skip the first two rows
for tr1, tr2 in zip(tr_list[::3], tr_list[1::3]):
    words = tr1.th.text.split()
    number, symbol = words[0], words[1]
    if int(number) <= 118:  # only elements up to 118 are needed here
        noble_gas_notation = ' '.join(words[4:])  # electron configuration
        orbital_model = tr2.text  # orbital model
        print(number, symbol, noble_gas_notation, orbital_model)

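This pulls out the information I need, but a few fixes were required:

The electron configuration has to be converted from the '[He] 2s2 2p4' form to superscript notation, '[He] 2s² 2p⁴'.
The orbital model has to be converted from the '1s2 2s2 2p4' form to the '2 2 211' form.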
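
For the first conversion, one possible approach (a sketch, not the code I ended up using) is a regular expression that touches only the digits after the orbital letter, combined with str.translate. The names SUPERSCRIPTS and to_superscript are made up for this example.

```python
import re

# Map ASCII digits to their Unicode superscript forms.
SUPERSCRIPTS = str.maketrans(
    '0123456789',
    '\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079',
)


def to_superscript(configuration):
    """'[He] 2s2 2p4' -> '[He] 2s² 2p⁴' (the shell number is left alone)."""
    return re.sub(
        r'(\d+[spdf])(\d+)',
        lambda m: m.group(1) + m.group(2).translate(SUPERSCRIPTS),
        configuration,
    )


print(to_superscript('[He] 2s2 2p4'))
print(to_superscript('[Rn] 5f14 6d10 7s2 7p4'))
```

Capturing the shell number and letter in one group keeps them untouched, so only the electron count becomes a superscript, even when it has two digits.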

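Also, for the orbital model, the order used on Wikipedia differs from the order the answer expects.
Wikipedia uses 1s, 2s, 2p, 3s, 3p, 3d, 4s, 4p, 4d, 4f, 5s, ..., whereas
this problem uses 1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, ...
I don't really understand it, but apparently this is a rule called the Aufbau principle.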
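
As far as I can tell, this filling order follows the Madelung (n + l) rule: subshells are sorted by n + l, and ties go to the smaller n. Here is a minimal sketch under that assumption; aufbau_order is a name I made up, and it limits itself to the s, p, d, f subshells.

```python
LETTERS = 'spdf'  # l = 0, 1, 2, 3


def aufbau_order(max_n=7):
    """List subshells sorted by n + l, then by n (the Madelung rule)."""
    subshells = [
        (n, l)
        for n in range(1, max_n + 1)
        for l in range(min(n, len(LETTERS)))  # l runs from 0 to n - 1, capped at f
    ]
    subshells.sort(key=lambda nl: (nl[0] + nl[1], nl[0]))
    return ['{}{}'.format(n, LETTERS[l]) for n, l in subshells]


print(aufbau_order()[:19])
```

The first 19 entries cover everything up to 7p, which is all that elements 1 to 118 need; the remaining entries (6f, 7d, 7f) are never filled for those elements.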

from bs4 import BeautifulSoup
from urllib.request import urlopen
import re

SUP = ['\u2070', '\u00B9', '\u00B2', '\u00B3', '\u2074', '\u2075', '\u2076', '\u2077', '\u2078', '\u2079']
ORDER = ['1s', '2s', '2p', '3s', '3p', '4s', '3d', '4p', '5s', '4d', '5p', '6s', '4f', '5d', '6p', '7s', '5f', '6d', '7p']
SIZE = {'s': 2, 'p': 6, 'd': 10, 'f': 14}

def split_sup(s):
    # '1s2' -> ('1s', '2')
    m = re.match(r'\d+[a-z]+', s)  # raw string avoids the invalid-escape warning
    return (m.group(), s[m.end():]) if m else (s, '')

def replace_sup(s):
    # '1s2' -> '1s²'
    o, sup = split_sup(s)
    return o + ''.join(SUP[int(c)] for c in sup)

def to_orbital_text(size, n):
    # (6, 4) -> '211'
    length = size // 2
    return '1' * n + '0' * (length - n) if n < length else '2' * (n - length) + '1' * (size - n)

def get_orbital_model(tokens):
    # '1s2 2s2 2p4' -> '2 2 211'
    orbitals = sorted([split_sup(t) for t in tokens], key=lambda p: ORDER.index(p[0]))
    return ' '.join(to_orbital_text(SIZE[o[-1]], int(n)) for o, n in orbitals)

URL = "https://en.wikipedia.org/wiki/Electron_configurations_of_the_elements_(data_page)"
soup = BeautifulSoup(urlopen(URL), 'html.parser')
tr_list = soup.find('table').find_all('tr')[2:]
for tr1, tr2 in zip(tr_list[::3], tr_list[1::3]):
    words = tr1.th.text.split()
    number, symbol = words[0], words[1]
    if int(number) <= 118:
        noble_gas_notation = ' '.join(replace_sup(t) for t in words[4:])
        orbital_model = get_orbital_model(tr2.text.split())
        # Single quotes inside so the output can be pasted as a dict literal
        print("    '{}': ['{}', '{}', '{}'],".format(symbol, number, noble_gas_notation, orbital_model))

Running it prints

    'H': ['1', '1s¹', '1'],
    'He': ['2', '1s²', '2'],
    'Li': ['3', '[He] 2s¹', '2 1'],
    (snip)
    'Lv': ['116', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁴', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 211'],
    'Uus': ['117', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁵', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 221'],
    'Uuo': ['118', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁶', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 222'],

so adding a little before and after gives the finished solution.

dic = {
    'H': ['1', '1s¹', '1'],
    'He': ['2', '1s²', '2'],
    'Li': ['3', '[He] 2s¹', '2 1'],
    # ... (entries omitted) ...
    'Lv': ['116', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁴', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 211'],
    'Uus': ['117', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁵', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 221'],
    'Uuo': ['118', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁶', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 222'],
}
checkio = dic.get

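Submitting this to CheckiO cleared the mission without trouble. Publishing it as is feels a bit off, though.

Still, it might be fun to publish it together with the part that fetches the data with BeautifulSoup. So I put together a version for publication.

Summary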

try:
    from bs4 import BeautifulSoup
    from urllib.request import urlopen
    import re
 
    SUP = ['\u2070', '\u00B9', '\u00B2', '\u00B3', '\u2074', '\u2075', '\u2076', '\u2077', '\u2078', '\u2079']
    ORDER = ['1s', '2s', '2p', '3s', '3p', '4s', '3d', '4p', '5s', '4d', '5p', '6s', '4f', '5d', '6p', '7s', '5f', '6d', '7p']
    SIZE = {'s': 2, 'p': 6, 'd': 10, 'f': 14}
 
    def split_sup(s):
        # '1s2' -> ('1s', '2')
        m = re.match(r'\d+[a-z]+', s)  # raw string avoids the invalid-escape warning
        return (m.group(), s[m.end():]) if m else (s, '')
 
    def replace_sup(s):
        # '1s2' -> '1s²'
        o, sup = split_sup(s)
        return o + ''.join(SUP[int(c)] for c in sup)
 
    def to_orbital_text(size, n):
        # (6, 4) -> '211'
        length = size // 2
        return '1' * n + '0' * (length - n) if n < length else '2' * (n - length) + '1' * (size - n)
 
    def get_orbital_model(tokens):
        # '1s2 2s2 2p4' -> '2 2 211'
        orbitals = sorted([split_sup(t) for t in tokens], key=lambda p: ORDER.index(p[0]))
        return ' '.join(to_orbital_text(SIZE[o[-1]], int(n)) for o, n in orbitals)
 
    # parse HTML table with BeautifulSoup
    URL = "https://en.wikipedia.org/wiki/Electron_configurations_of_the_elements_(data_page)"
    soup = BeautifulSoup(urlopen(URL), 'html.parser')
    tr_list = soup.find('table').find_all('tr')[2:]
    dic = {}
    for tr1, tr2 in zip(tr_list[::3], tr_list[1::3]):
        words = tr1.th.text.split()
        number, symbol = words[0], words[1]
        if int(number) <= 118:
            noble_gas_notation = ' '.join(replace_sup(t) for t in words[4:])
            orbital_model = get_orbital_model(tr2.text.split())
            dic[symbol] = [number, noble_gas_notation, orbital_model]
 
except ImportError:
    dic = {
        'H': ['1', '1s¹', '1'],
        'He': ['2', '1s²', '2'],
        'Li': ['3', '[He] 2s¹', '2 1'],
        # ... (entries omitted) ...
        'Lv': ['116', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁴', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 211'],
        'Uus': ['117', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁵', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 221'],
        'Uuo': ['118', '[Rn] 5f¹⁴ 6d¹⁰ 7s² 7p⁶', '2 2 222 2 222 2 22222 222 2 22222 222 2 2222222 22222 222 2 2222222 22222 222'],
    }

checkio = dic.get

http://www.checkio.org/mission/periodic-table/publications/natsuki/python-3/beautifulsoup/

If BeautifulSoup is available, dic is built from Wikipedia; if not, an ImportError is raised and the prepared dic is used instead. (Of course, CheckiO does not support BeautifulSoup.)
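
The fallback trick on its own looks roughly like this. This is a stripped-down sketch: load_table, the build_table() hook, and the module name are all hypothetical, and FALLBACK holds just two entries instead of the full table.

```python
import importlib

# Tiny stand-in for the prepared table (the real one has 118 entries).
FALLBACK = {
    'H': ['1', '1s\u00b9', '1'],
    'He': ['2', '1s\u00b2', '2'],
}


def load_table(module_name):
    """Use the named module if it can be imported, else the prepared table."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return dict(FALLBACK)
    return module.build_table()  # hypothetical hook, assumed for illustration


# This module name is assumed not to exist, so the fallback is used.
table = load_table('no_such_scraper_module_for_this_example')
checkio = table.get

print(checkio('H'))
print(checkio('Xx'))  # dict.get returns None for an unknown symbol
```

Note that only ImportError is caught here, as in my solution; a network failure while fetching the page would raise a different exception and is not handled.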

It is surely nothing like the answer the mission's author had in mind, but I think it turned out to be a fairly fun solution.

summer_tree_home

Mastering Python 3 with CheckiO
