김성주

android implementation update & readme

......@@ -12,4 +12,63 @@
- 2017103967 김성주
- 2015104213 장수창
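### Prerequisites
As of 2020-06-01, the latest version of the tensorflow-android library is 1.13.1.
If you also plan to build the android implementation,
either build the library yourself so that it is compatible with a newer version, or run training and pb file generation with tensorflow v1.13.1 or lower.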
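
The version constraint above can be checked before training starts. The sketch below is only an illustration: `version_tuple` and `is_android_compatible` are hypothetical helpers, and it assumes plain numeric version strings of the form `major.minor.patch` (no `rc` or `dev` suffixes).

```python
# Hypothetical helpers, not part of this repository.
MAX_ANDROID_TF_VERSION = "1.13.1"  # latest tensorflow-android version per the note above


def version_tuple(version):
    """Turn a version string such as '1.13.1' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_android_compatible(training_version):
    """Return True if a model trained with this version stays within the limit."""
    return version_tuple(training_version) <= version_tuple(MAX_ANDROID_TF_VERSION)
```

In a training script this could be called with `tf.__version__`, which is the standard attribute holding the installed tensorflow version.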
The annotations were created as XML files with the labelImg tool.
Training uses files stored in TFRecord format.
Each record has the form {data index, image binary, image width, image height, boxes},
and boxes has the form {label1, xmin, ymin, xmax, ymax, label2, xmin, ...}.
To write the TFRecord files, see code/tfrecord_writer.py.
The txt file that tfrecord_writer.py takes as input
stores one entry per line in the form {data index, image path, image width, image height, boxes}.
To generate the txt file, see code/annotation_xml_parser.py.
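
As a rough illustration of the txt line format described above, the following sketch parses one line into its fields. It assumes the fields are separated by whitespace, that the image path contains no spaces, and that labels are integer class ids. `parse_annotation_line` is a hypothetical helper, not a function from code/tfrecord_writer.py.

```python
def parse_annotation_line(line):
    """Parse '<index> <path> <width> <height> <label xmin ymin xmax ymax>...'.

    Assumption: whitespace-separated fields; the real writer may differ.
    """
    fields = line.strip().split()
    if len(fields) < 4:
        raise ValueError("expected at least index, path, width and height")

    index = int(fields[0])
    image_path = fields[1]
    width, height = int(fields[2]), int(fields[3])

    box_fields = fields[4:]
    if len(box_fields) % 5 != 0:
        raise ValueError("each box needs 5 values: label xmin ymin xmax ymax")

    boxes = []
    for i in range(0, len(box_fields), 5):
        label = int(box_fields[i])
        xmin, ymin, xmax, ymax = (float(v) for v in box_fields[i + 1:i + 5])
        boxes.append((label, xmin, ymin, xmax, ymax))
    return index, image_path, width, height, boxes
```

For example, a line with two boxes of class 0 would yield a list of two 5-tuples in `boxes`.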
This training setup uses separate train/eval/test datasets.
To split the txt file into these datasets, see code/dataset_splitter.py.
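
One simple way to split the annotation lines is a seeded shuffle followed by slicing. The sketch below is an assumption about how such a split could look, not the logic of code/dataset_splitter.py; `split_dataset`, the 10% eval and test ratios, and the seed are all hypothetical.

```python
import random


def split_dataset(lines, eval_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle annotation lines and split them into train/eval/test lists.

    The ratios and the seed are illustrative defaults, not project settings.
    """
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n_eval = int(len(shuffled) * eval_ratio)
    n_test = int(len(shuffled) * test_ratio)

    eval_set = shuffled[:n_eval]
    test_set = shuffled[n_eval:n_eval + n_test]
    train_set = shuffled[n_eval + n_test:]
    return train_set, eval_set, test_set
```

If the downstream tools expect the data index in each line to be contiguous, the lines may need to be re-indexed after splitting.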
The XML files that annotation_xml_parser.py takes as input
are expected to be Pascal VOC format XML files generated with the labelImg tool.
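
For readers unfamiliar with the Pascal VOC layout, the sketch below reads the image size and the bounding boxes from one annotation using only the standard library. It is not the code in code/annotation_xml_parser.py; `parse_voc_xml` and the `class_ids` mapping are hypothetical, and the element names follow the usual VOC structure (`size`, `object`, `name`, `bndbox`).

```python
import xml.etree.ElementTree as ET


def parse_voc_xml(xml_text, class_ids):
    """Extract (width, height, boxes) from a Pascal VOC annotation string.

    class_ids maps a class name to an integer label (hypothetical mapping).
    """
    root = ET.fromstring(xml_text)

    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    boxes = []
    for obj in root.iter("object"):
        label = class_ids[obj.find("name").text]
        bndbox = obj.find("bndbox")
        # Coordinates are normally integers; float() tolerates values like "10.0".
        coords = [int(float(bndbox.find(tag).text))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return width, height, boxes
```

The returned tuples use the same `label, xmin, ymin, xmax, ymax` order as the boxes field of the txt file.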
Training requires an anchor file.
To generate the anchor file, see code/yolov3/get_kmeans.py.
Save the printed anchors at the location given by anchor_path in code/yolov3/args.py.
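
YOLO anchors are commonly obtained by clustering the (width, height) pairs of the training boxes with an IoU-based similarity. The sketch below illustrates that idea in plain Python. It is not the code in code/yolov3/get_kmeans.py: `iou_wh`, `kmeans_anchors`, the use of the cluster mean, and the seed are assumptions made for illustration.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) tuples into k anchors, sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct training boxes

    for _ in range(iterations):
        # Assign every box to the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)

        # Move each anchor to the mean of its cluster (keep it if the cluster is empty).
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:
                new_anchors.append(anchors[i])
                continue
            mean_w = sum(m[0] for m in members) / len(members)
            mean_h = sum(m[1] for m in members) / len(members)
            new_anchors.append((mean_w, mean_h))

        if new_anchors == anchors:  # converged
            break
        anchors = new_anchors

    return sorted(anchors, key=lambda a: a[0] * a[1])
```

YOLOv3 uses nine anchors (three per output scale), so `k=9` would be the matching choice for real data.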
Training loads a pretrained model and fine-tunes it.
You therefore need to prepare the pretrained model file.
The pretrained model can be downloaded from this [link](https://pjreddie.com/media/files/yolov3.weights).
This file is a darknet weights file; to convert it to a tensorflow model, see code/yolov3/convert_weights.py.
(Only the already converted yolov3.ckpt is uploaded to git. To train on a different dataset or for a different purpose, you need to generate it again.)
Use train.py (train/eval dataset) for training and eval.py (test dataset) for evaluation.
For the file paths and hyperparameters used in training, see args.py.
The paths used for evaluation can be set in eval.py.
A trained model file for temporary testing is uploaded in data/trained.
For the android implementation, you need to generate a pb file from the trained model.
See code/pb/pbCreator.py. (It is a slightly modified version of code/yolov3/test_single_image.py.)
Only a frozen model can be used on android.
See code/pb/freeze_pb.py.
Save the pb file in android_App/assets, then set YOLO_MODEL_FILE in DetectorActivity.java accordingly.
The input and output node names of a model produced by this training code are
input_data and {yolov3/yolov3_head/feature_map_1,yolov3/yolov3_head/feature_map_2,yolov3/yolov3_head/feature_map_3}, respectively.
The node names were looked up with the Netron program.
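
When wiring the three output nodes into other code, it helps to know the tensor shape to expect from each. The sketch below computes them under the standard YOLOv3 layout: strides of 32, 16, and 8, three anchors per scale, and `5 + num_classes` values per anchor. That feature_map_1 to feature_map_3 follow this coarse-to-fine order is an assumption, and `yolo_output_shapes` is a hypothetical helper.

```python
def yolo_output_shapes(input_size=416, num_classes=1, anchors_per_scale=3):
    """Return (grid_h, grid_w, channels) for the three YOLOv3 output scales.

    Assumes the standard YOLOv3 strides; the order is coarse to fine.
    """
    strides = (32, 16, 8)
    # Per anchor: 4 box coordinates + 1 objectness score + class scores.
    channels = anchors_per_scale * (5 + num_classes)
    return [(input_size // s, input_size // s, channels) for s in strides]
```

With a 416x416 input and a single class, this gives grids of 13, 26, and 52 cells with 18 channels each.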
#### Reference
The training code is based on this [link](https://github.com/wizyoung/YOLOv3_TensorFlow).
For the changes, see code/yolov3/changes.txt.
The android code is based on this [link](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android).
......
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="RemoteRepositoriesConfiguration">
<remote-repository>
<option name="id" value="central" />
<option name="name" value="Maven Central repository" />
<option name="url" value="https://repo1.maven.org/maven2" />
</remote-repository>
<remote-repository>
<option name="id" value="jboss.community" />
<option name="name" value="JBoss Community repository" />
<option name="url" value="https://repository.jboss.org/nexus/content/repositories/public/" />
</remote-repository>
<remote-repository>
<option name="id" value="C:\Users\Kareus\AppData\Local\Android\Sdk\extras\android\m2repository" />
<option name="name" value="C:\Users\Kareus\AppData\Local\Android\Sdk\extras\android\m2repository" />
<option name="url" value="file:/$USER_HOME$/AppData/Local/Android/Sdk/extras/android/m2repository" />
</remote-repository>
<remote-repository>
<option name="id" value="C:\Users\Kareus\AppData\Local\Android\Sdk\extras\m2repository" />
<option name="name" value="C:\Users\Kareus\AppData\Local\Android\Sdk\extras\m2repository" />
<option name="url" value="file:/$USER_HOME$/AppData/Local/Android/Sdk/extras/m2repository" />
</remote-repository>
<remote-repository>
<option name="id" value="C:\Users\Kareus\AppData\Local\Android\Sdk\extras\google\m2repository" />
<option name="name" value="C:\Users\Kareus\AppData\Local\Android\Sdk\extras\google\m2repository" />
<option name="url" value="file:/$USER_HOME$/AppData/Local/Android/Sdk/extras/google/m2repository" />
</remote-repository>
<remote-repository>
<option name="id" value="BintrayJCenter" />
<option name="name" value="BintrayJCenter" />
<option name="url" value="https://jcenter.bintray.com/" />
</remote-repository>
<remote-repository>
<option name="id" value="Google" />
<option name="name" value="Google" />
<option name="url" value="https://dl.google.com/dl/android/maven2/" />
</remote-repository>
</component>
</project>
\ No newline at end of file
......@@ -25,21 +25,10 @@
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<application android:allowBackup="true"
android:debuggable="true"
android:label="@string/app_name"
android:icon="@drawable/ic_launcher"
android:theme="@style/MaterialTheme">
-<!-- <activity android:name="org.tensorflow.demo.ClassifierActivity"-->
-<!-- android:screenOrientation="portrait"-->
-<!-- android:label="@string/activity_name_classification">-->
-<!-- <intent-filter>-->
-<!-- <action android:name="android.intent.action.MAIN" />-->
-<!-- <category android:name="android.intent.category.LAUNCHER" />-->
-<!-- <category android:name="android.intent.category.LEANBACK_LAUNCHER" />-->
-<!-- </intent-filter>-->
-<!-- </activity>-->
<activity android:name="org.tensorflow.demo.DetectorActivity"
android:screenOrientation="portrait"
android:label="@string/activity_name_detection">
......@@ -50,25 +39,38 @@
</intent-filter>
</activity>
-<!-- <activity android:name="org.tensorflow.demo.StylizeActivity"-->
-<!-- android:screenOrientation="portrait"-->
-<!-- android:label="@string/activity_name_stylize">-->
-<!-- <intent-filter>-->
-<!-- <action android:name="android.intent.action.MAIN" />-->
-<!-- <category android:name="android.intent.category.LAUNCHER" />-->
-<!-- <category android:name="android.intent.category.LEANBACK_LAUNCHER" />-->
-<!-- </intent-filter>-->
-<!-- </activity>-->
+<!--
+<activity android:name="org.tensorflow.demo.ClassifierActivity"
+android:screenOrientation="portrait"
+android:label="@string/activity_name_classification">
+<intent-filter>
+<action android:name="android.intent.action.MAIN" />
+<category android:name="android.intent.category.LAUNCHER" />
+<category android:name="android.intent.category.LEANBACK_LAUNCHER" />
+</intent-filter>
+</activity>
+<activity android:name="org.tensorflow.demo.StylizeActivity"
+android:screenOrientation="portrait"
+android:label="@string/activity_name_stylize">
+<intent-filter>
+<action android:name="android.intent.action.MAIN" />
+<category android:name="android.intent.category.LAUNCHER" />
+<category android:name="android.intent.category.LEANBACK_LAUNCHER" />
+</intent-filter>
+</activity>
+<activity android:name="org.tensorflow.demo.SpeechActivity"
+android:screenOrientation="portrait"
+android:label="@string/activity_name_speech">
+<intent-filter>
+<action android:name="android.intent.action.MAIN" />
+<category android:name="android.intent.category.LAUNCHER" />
+<category android:name="android.intent.category.LEANBACK_LAUNCHER" />
+</intent-filter>
+</activity>
+-->
-<!-- <activity android:name="org.tensorflow.demo.SpeechActivity"-->
-<!-- android:screenOrientation="portrait"-->
-<!-- android:label="@string/activity_name_speech">-->
-<!-- <intent-filter>-->
-<!-- <action android:name="android.intent.action.MAIN" />-->
-<!-- <category android:name="android.intent.category.LAUNCHER" />-->
-<!-- <category android:name="android.intent.category.LEANBACK_LAUNCHER" />-->
-<!-- </intent-filter>-->
-<!-- </activity>-->
</application>
</manifest>
......
package(
default_visibility = ["//visibility:public"],
licenses = ["notice"], # Apache 2.0
)
# It is necessary to use this filegroup rather than globbing the files in this
# folder directly the examples/android:tensorflow_demo target due to the fact
# that assets_dir is necessarily set to "" there (to allow using other
# arbitrary targets as assets).
filegroup(
name = "asset_files",
srcs = glob(
["**/*"],
exclude = ["BUILD"],
),
)
......@@ -42,7 +42,7 @@ allprojects {
}
// set to 'bazel', 'cmake', 'makefile', 'none'
-def nativeBuildSystem = 'none'
+def nativeBuildSystem = 'cmake'
// Controls output directory in APK and CPU type for Bazel builds.
// NOTE: Does not affect the Makefile build target API (yet), which currently
......
org.gradle.jvmargs=-Xmx2048m
\ No newline at end of file
-#Sat Nov 18 15:06:47 CET 2017
+#Sat May 30 18:49:07 KST 2020
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
-distributionUrl=https\://services.gradle.org/distributions/gradle-4.1-all.zip
+distributionUrl=https\://services.gradle.org/distributions/gradle-4.10.1-all.zip
......
......@@ -71,11 +71,11 @@ public class DetectorActivity extends CameraActivity implements OnImageAvailable
// Graphs and models downloaded from http://pjreddie.com/darknet/yolo/ may be converted e.g. via
// DarkFlow (https://github.com/thtrieu/darkflow). Sample command:
// ./flow --model cfg/tiny-yolo-voc.cfg --load bin/tiny-yolo-voc.weights --savepb --verbalise
-private static final String YOLO_MODEL_FILE = "file:///android_asset/yolov3.pb";
+private static final String YOLO_MODEL_FILE = "file:///android_asset/test_freeze_13.pb";
private static final int YOLO_INPUT_SIZE = 416;
-private static final String YOLO_INPUT_NAME = "input";
-private static final String YOLO_OUTPUT_NAMES = "output";
-private static final int YOLO_BLOCK_SIZE = 32;
+private static final String YOLO_INPUT_NAME = "input_data";
+private static final String YOLO_OUTPUT_NAMES = "yolov3/yolov3_head/feature_map_1,yolov3/yolov3_head/feature_map_2,yolov3/yolov3_head/feature_map_3";
+private static final int YOLO_BLOCK_SIZE = 16;
// Which detection model to use: by default uses Tensorflow Object Detection API frozen
// checkpoints. Optionally use legacy Multibox (trained using an older version of the API)
......@@ -131,6 +131,7 @@ public class DetectorActivity extends CameraActivity implements OnImageAvailable
int cropSize = TF_OD_API_INPUT_SIZE;
if (MODE == DetectorMode.YOLO) {
detector =
TensorFlowYoloDetector.create(
getAssets(),
YOLO_MODEL_FILE,
......
......@@ -32,7 +32,7 @@ public class TensorFlowYoloDetector implements Classifier {
private static final Logger LOGGER = new Logger();
// Only return this many results with at least this confidence.
-private static final int MAX_RESULTS = 5;
+private static final int MAX_RESULTS = 10;
private static final int NUM_CLASSES = 1;
......@@ -41,17 +41,14 @@ public class TensorFlowYoloDetector implements Classifier {
// TODO(andrewharp): allow loading anchors and classes
// from files.
private static final double[] ANCHORS = {
-1.08, 1.19,
-3.42, 4.41,
-6.63, 11.38,
-9.42, 5.11,
-16.62, 10.52
+35,37, 75,48, 57,87, 116,73, 83,138, 119,110, 154,184, 250,216, 317,362
};
private static final String[] LABELS = {
-"dog"
+"dog"
};
// Config values.
private String inputName;
private int inputSize;
......
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Converts checkpoint variables into Const ops in a standalone GraphDef file.
This script is designed to take a GraphDef proto, a SaverDef proto, and a set of
variable values stored in a checkpoint file, and output a GraphDef with all of
the variable ops converted into const ops containing the values of the
variables.
It's useful to do this when we need to load a single file in C++, especially in
environments like mobile or embedded where we may not have access to the
RestoreTensor ops and file loading calls that they rely on.
An example of command-line usage is:
bazel build tensorflow/python/tools:freeze_graph && \
bazel-bin/tensorflow/python/tools/freeze_graph \
--input_graph=some_graph_def.pb \
--input_checkpoint=model.ckpt-8361242 \
--output_graph=/tmp/frozen_graph.pb --output_node_names=softmax
You can also look at freeze_graph_test.py for an example of how to use it.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.python.framework import graph_util
FLAGS = tf.app.flags.FLAGS
tf.app.flags.DEFINE_string("input_graph", "",
"""TensorFlow 'GraphDef' file to load.""")
tf.app.flags.DEFINE_string("input_saver", "",
"""TensorFlow saver file to load.""")
tf.app.flags.DEFINE_string("input_checkpoint", "",
"""TensorFlow variables file to load.""")
tf.app.flags.DEFINE_string("output_graph", "",
"""Output 'GraphDef' file name.""")
tf.app.flags.DEFINE_boolean("input_binary", False,
"""Whether the input files are in binary format.""")
tf.app.flags.DEFINE_string("output_node_names", "",
"""The name of the output nodes, comma separated.""")
tf.app.flags.DEFINE_string("restore_op_name", "save/restore_all",
"""The name of the master restore operator.""")
tf.app.flags.DEFINE_string("filename_tensor_name", "save/Const:0",
"""The name of the tensor holding the save path.""")
tf.app.flags.DEFINE_boolean("clear_devices", True,
"""Whether to remove device specifications.""")
tf.app.flags.DEFINE_string("initializer_nodes", "", "comma separated list of "
"initializer nodes to run before freezing.")
def freeze_graph(input_graph, input_saver, input_binary, input_checkpoint,
                 output_node_names, restore_op_name, filename_tensor_name,
                 output_graph, clear_devices, initializer_nodes):
  """Converts all variables in a graph and checkpoint into constants."""
  if not tf.gfile.Exists(input_graph):
    print("Input graph file '" + input_graph + "' does not exist!")
    return -1
  if input_saver and not tf.gfile.Exists(input_saver):
    print("Input saver file '" + input_saver + "' does not exist!")
    return -1
  if not tf.gfile.Glob(input_checkpoint):
    print("Input checkpoint '" + input_checkpoint + "' doesn't exist!")
    return -1
  if not output_node_names:
    print("You need to supply the name of a node to --output_node_names.")
    return -1
  input_graph_def = tf.GraphDef()
  mode = "rb" if input_binary else "r"
  with tf.gfile.FastGFile(input_graph, mode) as f:
    if input_binary:
      input_graph_def.ParseFromString(f.read())
    else:
      text_format.Merge(f.read(), input_graph_def)
  # Remove all the explicit device specifications for this node. This helps to
  # make the graph more portable.
  if clear_devices:
    for node in input_graph_def.node:
      node.device = ""
  _ = tf.import_graph_def(input_graph_def, name="")
  with tf.Session() as sess:
    if input_saver:
      with tf.gfile.FastGFile(input_saver, mode) as f:
        saver_def = tf.train.SaverDef()
        if input_binary:
          saver_def.ParseFromString(f.read())
        else:
          text_format.Merge(f.read(), saver_def)
        saver = tf.train.Saver(saver_def=saver_def)
        saver.restore(sess, input_checkpoint)
    else:
      sess.run([restore_op_name], {filename_tensor_name: input_checkpoint})
      if initializer_nodes:
        sess.run(initializer_nodes)
    output_graph_def = graph_util.convert_variables_to_constants(
        sess, input_graph_def, output_node_names.split(","))
  with tf.gfile.GFile(output_graph, "wb") as f:
    f.write(output_graph_def.SerializeToString())
  print("%d ops in the final graph." % len(output_graph_def.node))


def main(unused_args):
  freeze_graph(FLAGS.input_graph, FLAGS.input_saver, FLAGS.input_binary,
               FLAGS.input_checkpoint, FLAGS.output_node_names,
               FLAGS.restore_op_name, FLAGS.filename_tensor_name,
               FLAGS.output_graph, FLAGS.clear_devices, FLAGS.initializer_nodes)


if __name__ == "__main__":
  tf.app.run()
\ No newline at end of file
from tensorflow.python.tools import freeze_graph
ckpt_filepath = '../../data/pb/pb.ckpt'
pbtxt_filename = 'model.pbtxt'
pbtxt_filepath = '../../data/pb/model.pbtxt'
pb_filepath = '../../data/pb/freeze.pb'
freeze_graph.freeze_graph(input_graph=pbtxt_filepath, input_saver='', input_binary=False, input_checkpoint=ckpt_filepath, output_node_names='yolov3/yolov3_head/feature_map_1,yolov3/yolov3_head/feature_map_2,yolov3/yolov3_head/feature_map_3', restore_op_name='save/restore_all', filename_tensor_name='save/Const:0', output_graph=pb_filepath, clear_devices=True, initializer_nodes='')
from __future__ import division, print_function
import tensorflow as tf
import numpy as np
import argparse
import cv2
from misc_utils import parse_anchors, read_class_names
from nms_utils import gpu_nms
from plot_utils import get_color_table, plot_one_box
from data_utils import letterbox_resize
from model import yolov3
parser = argparse.ArgumentParser(description="YOLO-V3 test single image test procedure.")
parser.add_argument("input_image", type=str,
help="The path of the input image.")
parser.add_argument("--anchor_path", type=str, default="../../data/yolo_anchors.txt",
help="The path of the anchor txt file.")
parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
help="Resize the input image with `new_size`, size format: [width, height]")
parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True,
help="Whether to use the letterbox resize.")
parser.add_argument("--class_name_path", type=str, default="../../data/classes.txt",
help="The path of the class names.")
parser.add_argument("--restore_path", type=str, default="../../data/darknet_weights/yolov3.ckpt",
help="The path of the weights to restore.")
parser.add_argument("--pb_path", type=str, default="../../data/pb",
help="The directory of pb files")
args = parser.parse_args()
args.anchors = parse_anchors(args.anchor_path)
args.classes = read_class_names(args.class_name_path)
args.num_class = len(args.classes)
color_table = get_color_table(args.num_class)
img_ori = cv2.imread(args.input_image)
if args.letterbox_resize:
    img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
else:
    height_ori, width_ori = img_ori.shape[:2]
    img = cv2.resize(img_ori, tuple(args.new_size))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = np.asarray(img, np.float32)
img = img[np.newaxis, :] / 255.
graph = tf.Graph()
with tf.Session(graph=graph) as sess:
    input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data')
    yolo_model = yolov3(args.num_class, args.anchors)
    with tf.variable_scope('yolov3'):
        pred_feature_maps = yolo_model.forward(input_data, False)
    pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps)
    pred_scores = pred_confs * pred_probs
    boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45)
    saver = tf.train.Saver()
    saver.restore(sess, args.restore_path)
    boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
    if args.letterbox_resize:
        boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
        boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
    else:
        boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0]))
        boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1]))
    print("box coords:")
    print(boxes_)
    print('*' * 30)
    print("scores:")
    print(scores_)
    print('*' * 30)
    print("labels:")
    print(labels_)
    for i in range(len(boxes_)):
        x0, y0, x1, y1 = boxes_[i]
        plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]])
    cv2.imshow('Detection result', img_ori)
    cv2.imwrite('detection_result.jpg', img_ori)
    cv2.waitKey(0)
    saver.save(sess, args.pb_path+'/pb.ckpt')
    tf.io.write_graph(sess.graph_def, args.pb_path, 'model.pb', as_text=False)
    tf.io.write_graph(sess.graph_def, args.pb_path, 'model.pbtxt', as_text=True)
\ No newline at end of file
......@@ -15,15 +15,15 @@ from model import yolov3
parser = argparse.ArgumentParser(description="YOLO-V3 test single image test procedure.")
parser.add_argument("input_image", type=str,
help="The path of the input image.")
-parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
+parser.add_argument("--anchor_path", type=str, default="../../data/yolo_anchors.txt",
help="The path of the anchor txt file.")
parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
help="Resize the input image with `new_size`, size format: [width, height]")
parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True,
help="Whether to use the letterbox resize.")
-parser.add_argument("--class_name_path", type=str, default="./data/coco.names",
+parser.add_argument("--class_name_path", type=str, default="../../data/classes.txt",
help="The path of the class names.")
-parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt",
+parser.add_argument("--restore_path", type=str, default="../../data/darknet_weights/yolov3.ckpt",
help="The path of the weights to restore.")
args = parser.parse_args()
......