docker-ml-python-sandbox: Dockerfile for a machine learning environment (scikit-learn, chainer, gensim, tensorflow, jupyter)
GitHub: zuqqhi2/docker-ml-python-sandbox
    ### Docker Pull
    docker pull zuqqhi2/ml-python-sandbox:latest
    docker images
    #REPOSITORY                  TAG      IMAGE ID       CREATED       SIZE
    #zuqqhi2/ml-python-sandbox   latest   4402825ff756   2 hours ago   12.9 GB

    ### Run jupyter without logging in to the container
    sudo docker run -it -p 8888:8888 zuqqhi2/ml-python-sandbox
    # Open the host in a browser on port 8888, e.g. http://sample.com:8888

    ### Log in to the container
    sudo docker run -it -p 8888:8888 zuqqhi2/ml-python-sandbox /bin/bash
    source ~/.bash_profile
    mlact

    # Set Japanese env
    export LANG=ja_JP.UTF-8
    export LC_ALL=ja_JP.UTF-8
    export LC_CTYPE=ja_JP.UTF-8

    ### Run jupyter notebook after logging in to the container
    jupyter notebook --ip=0.0.0.0 --port=8888
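Once the container is running, it can be handy to confirm from Python that the published port actually accepts connections before opening a browser. The following is only a minimal sketch: the helper name `is_port_open` is hypothetical, and it assumes the container was started on the local machine with `-p 8888:8888` as in the commands above.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all as "not open".
        return False


if __name__ == "__main__":
    # Assumption: the container publishes jupyter on localhost:8888.
    print("jupyter reachable:", is_port_open("localhost", 8888))
```

This only checks that something is listening on the port; it does not verify that the listener is jupyter. For a remote host, replace `localhost` with the host name.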
The first check covers scikit-learn. The code below runs a grid search over decision tree depth on the iris dataset and plots the best tree.

    import numpy as np
    import pandas as pd
    # sklearn.model_selection replaces the sklearn.cross_validation and
    # sklearn.grid_search modules, which were removed in scikit-learn 0.20
    from sklearn.model_selection import GridSearchCV, ShuffleSplit, train_test_split
    from sklearn.tree import DecisionTreeClassifier, export_graphviz
    from sklearn.metrics import f1_score, make_scorer, accuracy_score
    from sklearn import datasets
    from pydotplus import graph_from_dot_data
    from IPython.display import Image

    # Load data
    iris = datasets.load_iris()
    features = iris.data
    categories = iris.target

    # Cross-validation setting: hold out 20% as a test set, then use
    # 10 shuffled 80/20 splits of the training data for the grid search
    X_train, X_test, y_train, y_test = train_test_split(features, categories, test_size=0.2, random_state=42)
    cv_sets = ShuffleSplit(n_splits=10, test_size=0.20, random_state=0)
    params = {'max_depth': np.arange(2, 11), 'min_samples_leaf': np.array([5])}

    # Learning
    def performance_metric(y_true, y_predict):
        # Micro-averaged F1 score
        score = f1_score(y_true, y_predict, average='micro')
        return score

    classifier = DecisionTreeClassifier()
    scoring_fnc = make_scorer(performance_metric)
    grid = GridSearchCV(classifier, params, cv=cv_sets, scoring=scoring_fnc)
    best_clf = grid.fit(X_train, y_train)

    # Plot decision tree
    dot_data = export_graphviz(best_clf.best_estimator_, out_file=None,
                               feature_names=iris.feature_names,
                               class_names=iris.target_names,
                               filled=True, rounded=True,
                               special_characters=True)
    graph = graph_from_dot_data(dot_data)
    Image(graph.create_png())

Running this code displays the fitted decision tree as an image, so scikit-learn is working.

Next are mecab and juman++. The following program has both analyzers segment the phrase "すもももももももものうち" into words.

    from MeCab import Tagger
    from pyknp import Juman

    target_text = u"すもももももももものうち"

    # mecab: -Owakati outputs the words separated by spaces
    m = Tagger("-Owakati")
    print("***** Mecab *****")
    print(m.parse(target_text))

    # juman++: join the surface form (midasi) of each morpheme
    juman = Juman()
    result = juman.analysis(target_text)
    print("***** Juman *****")
    print(' '.join([mrph.midasi for mrph in result.mrph_list()]))

Both analyzers print their segmentation of the phrase, so the Python bindings for mecab and juman++ are installed and usable as well.
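The checks above exercise each library by hand. As a quicker first pass, the sketch below only asks Python whether each package can be located for import. The helper `check_modules` and the `MODULES` list are hypothetical names introduced here, and the import names in the list are assumptions based on the libraries this post mentions.

```python
from importlib.util import find_spec

# Assumption: top-level import names for the libraries mentioned in this post.
MODULES = ["sklearn", "chainer", "gensim", "tensorflow", "MeCab", "pyknp"]


def check_modules(names):
    """Map each top-level module name to True if it can be located for import."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in sorted(check_modules(MODULES).items()):
        print("{:12s} {}".format(name, "ok" if found else "MISSING"))
```

`find_spec` locates a module without importing it, so this does not catch a package that is installed but fails at import time (for example, a broken native extension). Running the real examples above remains the stronger test.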