04-06
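04. Element symbols
Split the sentence "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can." into words. From the 1st, 5th, 6th, 7th, 8th, 9th, 15th, 16th, and 19th words take the first character; from every other word take the first two characters. Then build an associative array (dictionary or map type) from each extracted string to the position of its word (counting from the first word).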
04.hs
import qualified Data.Map as M

main = print $ M.fromList $ zip (map f $ zip (words s) [1..]) [1..]
  where
    s = "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can."
    f (s,n) | elem n ns = take 1 s
            | otherwise = take 2 s
      where ns = [1, 5, 6, 7, 8, 9, 15, 16, 19]
> runghc 04.hs
fromList [("Al",13),("Ar",18),("B",5),("Be",4),("C",6),("Ca",20),("Cl",17),("F",9),("H",1),("He",2),("K",19),("Li",3),("Mi",12),("N",7),("Na",11),("Ne",10),("O",8),("P",15),("S",16),("Si",14)]
I couldn't think of a simple way to write this in ghci, so I put it in a file.
It had been a while since I last used a dictionary.
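One way this might fit on a single ghci line is a list comprehension over the numbered words. The sketch below is only an illustration. The names `sentence`, `oneCharPositions`, and `symbolMap` are made up here, and it assumes the `containers` package that ships with GHC.

```haskell
import qualified Data.Map as M

sentence :: String
sentence = "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can."

-- Positions (1-based) whose word contributes only its first character.
oneCharPositions :: [Int]
oneCharPositions = [1, 5, 6, 7, 8, 9, 15, 16, 19]

-- Map from the extracted string to the 1-based position of its word.
symbolMap :: M.Map String Int
symbolMap = M.fromList
  [ (take (if n `elem` oneCharPositions then 1 else 2) w, n)
  | (n, w) <- zip [1 ..] (words sentence) ]

main :: IO ()
main = do
  print (M.lookup "He" symbolMap)  -- Just 2
  print (M.lookup "Mi" symbolMap)  -- Just 12 ("Might" yields "Mi", not "Mg")
  print (M.lookup "Mg" symbolMap)  -- Nothing
```

In ghci, the body of `symbolMap` can be typed as one expression once `sentence` is bound with `let`.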
05. n-gram
Write a function that builds n-grams from a given sequence (a string, a list, and so on). Use it to obtain the word bi-grams and the character bi-grams of the sentence "I am an NLPer".
Prelude> import Data.List
Prelude Data.List> let f s = filter (\x -> length x == 2) $ subsequences s
Prelude Data.List> f $ words "I am an NLPer"
[["I","am"],["I","an"],["am","an"],["I","NLPer"],["am","NLPer"],["an","NLPer"]]
Prelude Data.List> f $ "I am an NLPer"
["I ","Ia"," a","Im"," m","am","I ","  ","a ","m ","Ia"," a","aa","ma"," a","In"," n","an","mn"," n","an","I ","  ","a ","m ","  ","a ","n ","IN"," N","aN","mN"," N","aN","nN"," N","IL"," L","aL","mL"," L","aL","nL"," L","NL","IP"," P","aP","mP"," P","aP","nP"," P","NP","LP","Ie"," e","ae","me"," e","ae","ne"," e","Ne","Le","Pe","Ir"," r","ar","mr"," r","ar","nr"," r","Nr","Lr","Pr","er"]
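Note that filtering `subsequences` by length keeps every pair of elements in order, including non-adjacent ones such as ["I","an"], whereas an n-gram is usually taken to be a contiguous run. If contiguous n-grams are what the exercise intends, a minimal sketch could look like the following. The name `ngrams` is hypothetical and the function assumes n >= 1.

```haskell
import Data.List (tails)

-- All contiguous runs of length n, in order of appearance.
-- Each suffix of the input contributes its first n elements;
-- suffixes shorter than n are dropped. Assumes n >= 1.
ngrams :: Int -> [a] -> [[a]]
ngrams n xs = [ g | t <- tails xs, let g = take n t, length g == n ]

main :: IO ()
main = do
  -- Word bi-grams: [["I","am"],["am","an"],["an","NLPer"]]
  print (ngrams 2 (words "I am an NLPer"))
  -- Character bi-grams:
  -- ["I "," a","am","m "," a","an","n "," N","NL","LP","Pe","er"]
  print (ngrams 2 "I am an NLPer")
```

Because the function is polymorphic in the element type, the same definition covers both the word list and the raw string.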
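06. Sets
Let X and Y be the sets of character bi-grams contained in "paraparaparadise" and "paragraph", respectively. Find the union, intersection, and difference of X and Y. Also check whether the bi-gram 'se' is contained in X and in Y.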
06.hs
import Data.List

main = do
  print $ union x y
  print $ intersect x y
  print $ x \\ y
  where
    x = nub $ bigram "paraparaparadise"
    y = nub $ bigram "paragraph"
    bigram = filter (\x -> length x == 2 ) . subsequences
> runghc 06.hs
["pa","pr","ar","aa","ra","pp","ap","rp","rr","pd","ad","rd","pi","ai","ri","di","ps","as","rs","ds","is","pe","ae","re","de","ie","se","pg","ag","rg","gr","ga","gp","ph","ah","rh","gh"]
["pa","pr","ar","aa","ra","pp","ap","rp","rr"]
["pd","ad","rd","pi","ai","ri","di","ps","as","rs","ds","is","pe","ae","re","de","ie","se"]
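The program above does not print the 'se' membership check, and because it filters `subsequences`, its sets also contain non-adjacent pairs such as "pp" and "aa". As a rough sketch of an alternative, assuming contiguous bi-grams are intended and that `Data.Set` from the `containers` package (shipped with GHC) is acceptable, the whole exercise might be written like this. The name `bigramSet` is made up for illustration.

```haskell
import Data.List (tails)
import qualified Data.Set as S

-- Contiguous character bi-grams of a string, collected into a set.
bigramSet :: String -> S.Set String
bigramSet s = S.fromList [ g | t <- tails s, let g = take 2 t, length g == 2 ]

main :: IO ()
main = do
  let x = bigramSet "paraparaparadise"
      y = bigramSet "paragraph"
  print (S.union x y)         -- fromList ["ad","ag","ap","ar","di","gr","is","pa","ph","ra","se"]
  print (S.intersection x y)  -- fromList ["ap","ar","pa","ra"]
  print (S.difference x y)    -- fromList ["ad","di","is","se"]
  print (S.member "se" x)     -- True
  print (S.member "se" y)     -- False
```

A `Set` removes duplicates on construction, so no `nub` is needed, and the printed elements come out in sorted order rather than in order of appearance. With the list-based `x` and `y` of 06.hs, the membership check could instead be written with `elem`.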