Week 4

Deep Learning Diary

함안조씨 2018. 2. 6. 17:20

<2018.02.06. Tue>

Model Output Check
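To inspect l5, the final output of the model, I fetched l5 as a numpy.array, but after a few iterations every pixel value came out as 0. To find the cause, I used TensorBoard to look at the weight values (the kernels), as described in the next paragraph.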

Tensorboard: Weight Histogram
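I noticed that the first layer's weights changed drastically after a single training step, and I wanted to plot them in TensorBoard to see how they behave over a longer run. To send the weights to TensorBoard I recorded them with tf.summary.histogram("w1_summ", w1), but running this produced the error message InvalidArgumentError: Nan in summary histogram for: . NaN stands for "Not a Number", a special floating-point value. The weights had already become NaN, and a NaN cannot be placed into a histogram bin, so the summary failed. Searching for the error (https://github.com/tflearn/tflearn/issues/304) suggested that lowering the learning rate fixes it, presumably because an overly large learning rate makes the updates overshoot until the weights diverge. Lowering step_size from 0.5 to 0.01 resolved the problem.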
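The link between a large learning rate and NaN weights can be illustrated on a toy problem. The sketch below is not the network from this post: it runs plain gradient descent on the hypothetical loss f(w) = 5 * w**2, with the function name `run_gradient_descent` made up for illustration. With a step size of 0.5 each update overshoots, the weight grows until it overflows to infinity, and the next update computes inf - inf, which is NaN. With 0.01 the same loop converges toward 0.

```python
import math


def run_gradient_descent(learning_rate, steps=2000, w0=1.0):
    """Minimise the toy loss f(w) = 5 * w**2; return (final w, steps taken)."""
    w = w0
    for step in range(steps):
        grad = 10.0 * w                  # derivative of 5 * w**2
        w = w - learning_rate * grad     # plain gradient descent update
        if math.isnan(w):
            return w, step + 1           # stop as soon as the weight is NaN
    return w, steps


# Step size 0.5: every update multiplies w by -4, so |w| explodes.
w_large, steps_large = run_gradient_descent(0.5)
# Step size 0.01: every update multiplies w by 0.9, so w shrinks toward 0.
w_small, steps_small = run_gradient_descent(0.01)

print("lr=0.5 :", w_large, "after", steps_large, "steps")
print("lr=0.01:", w_small, "after", steps_small, "steps")
```

A check like `math.isnan` (or `np.isnan` on a fetched weight array) before writing a histogram summary is one way to catch the divergence earlier than the TensorBoard error does.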

~~Tensorboard: checking hidden layer output~~

    def model(X, w, w2, w3, w4, w_o, p_keep_conv, p_keep_hidden):
        l1a = tf.nn.relu(tf.nn.conv2d(X, w,                       # l1a shape=(?, 28, 28, 32)
                                      strides=[1, 1, 1, 1], padding='SAME'))
        l1 = tf.nn.max_pool(l1a, ksize=[1, 2, 2, 1],              # l1 shape=(?, 14, 14, 32)
                            strides=[1, 2, 2, 1], padding='SAME')
        l1 = tf.nn.dropout(l1, p_keep_conv)

        l2a = tf.nn.relu(tf.nn.conv2d(l1, w2,                     # l2a shape=(?, 14, 14, 64)
                                      strides=[1, 1, 1, 1], padding='SAME'))
        l2 = tf.nn.max_pool(l2a, ksize=[1, 2, 2, 1],              # l2 shape=(?, 7, 7, 64)
                            strides=[1, 2, 2, 1], padding='SAME')
        l2 = tf.nn.dropout(l2, p_keep_conv)

        return l2
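Running _, l1_out = sess.run([train_op, l1]) to look at the hidden layer's output raises an error, because l1 exists only as a local name inside model(), which returns just l2. I tried declaring l1 = tf.placeholder('float', [None, 14, 14, 32]) so that sess.run([train_op, l1]) would have something to fetch, but a placeholder declared this way is a new, unconnected input node rather than the layer built inside model(), so it does not hold the hidden layer's result (and fetching it without feeding a value fails). To get the hidden layer output into l1_out as a numpy.array, the l1 tensor built by the model has to be visible outside the function, for example by returning it from model(), and then fetched with sess.run.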
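One way to expose a hidden layer is to have the model function return it alongside the final output. The sketch below shows that pattern without TensorFlow, as an assumption-laden stand-in: `max_pool_2x2` and this simplified `model` are hypothetical NumPy helpers, and the ReLU-only "layers" replace the real convolutions. In TensorFlow the same idea would mean returning the l1 tensor from model() and listing it in the fetches of sess.run.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an array of shape (N, H, W, C).

    H and W must be even.
    """
    n, h, w, c = x.shape
    # Split each spatial axis into (blocks, 2) and take the max inside each block.
    return x.reshape(n, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))


def model(x):
    """Toy two-stage model that returns the hidden output as well as the final one."""
    l1a = np.maximum(x, 0.0)          # stand-in for conv + ReLU
    l1 = max_pool_2x2(l1a)            # (N, 28, 28, C) -> (N, 14, 14, C)
    l2a = np.maximum(l1 - 0.5, 0.0)   # stand-in for a second conv + ReLU
    l2 = max_pool_2x2(l2a)            # (N, 14, 14, C) -> (N, 7, 7, C)
    return l1, l2                     # l1 is returned, so the caller can inspect it


rng = np.random.default_rng(0)
batch = rng.standard_normal((2, 28, 28, 32))
l1_out, l2_out = model(batch)
print(l1_out.shape, l2_out.shape)
```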

Python dimension switch
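While using TensorFlow I had a weight whose dimensions were 3 (rows) x 3 (columns) x 1 (grayscale channel) x 32 (number of kernels). When I entered debug mode to look at it, the debugger only showed values in the form [0][0][:][:]. I wanted to see it as the 3x3 image [:][:][0][0], so I tried w[:][:][0][0], but nothing changed: each [:] returns the whole array, so this expression is the same as w[0][0] (NumPy's comma-separated form w[:, :, 0, 0] is what selects a 3x3 slice). I therefore used np.swapaxes().
After w = np.swapaxes(w, 0, 2) the dimensions of w are 1 x 3 (columns) x 3 (rows) x 32 (number of kernels).
After w = np.swapaxes(w, 1, 3) the dimensions of w are 1 x 32 (number of kernels) x 3 (rows) x 3 (columns).
With this ordering the 3x3 matrix values could be inspected directly in debug mode.
※ In TensorFlow, tf.transpose() with a suitable perm argument performs the same axis reordering as these np.swapaxes() calls.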
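The indexing and axis reordering described above can be checked with a small NumPy sketch. The array contents are made up (a simple `arange`), and the variable names are only for illustration. It shows that chained `[:]` indexing does not select a 3x3 slice, that the two `np.swapaxes` calls move the 3x3 axes to the end, and that a single `np.transpose` with the permutation (2, 3, 0, 1) gives the same result.

```python
import numpy as np

# Weight laid out as (rows, columns, input channels, kernels) = (3, 3, 1, 32).
w = np.arange(3 * 3 * 1 * 32).reshape(3, 3, 1, 32)

# w[:] is the whole array, so chained [:] changes nothing:
chained = w[:][:][0][0]        # same as w[0][0], shape (1, 32)
kernel0 = w[:, :, 0, 0]        # comma-separated indexing, shape (3, 3)

# The two swaps: (3, 3, 1, 32) -> (1, 3, 3, 32) -> (1, 32, 3, 3).
swapped = np.swapaxes(w, 0, 2)
swapped = np.swapaxes(swapped, 1, 3)

# The same reordering as one transpose with an explicit permutation.
transposed = np.transpose(w, (2, 3, 0, 1))

print(chained.shape, kernel0.shape, swapped.shape)
print(np.array_equal(swapped, transposed))
print(np.array_equal(swapped[0][0], kernel0))
```

After the reordering, `swapped[0][0]` is the 3x3 matrix of the first kernel, which is the form a debugger displays conveniently.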

Python code so far

    # -*- coding: utf-8 -*-
    # Typical setup to include TensorFlow.
    import tensorflow as tf
    import os
    import time
    import matplotlib.pyplot as plt
    import numpy as np

    t = time.time()  # start timer (tic)

    # BatchSize
    numOflatent = 64
    numOfkernel = 1
    stepSize = 1  # learning_rate

    cwd = os.getcwd()
    kernel_path = cwd + '\\kernel31\\'
    data_path = cwd + '\\latent\\'

    # Make a queue of file names including all the JPEG images files in the relative
    # image directory.
    latent_filename_queue = tf.train.string_input_producer(
        tf.train.match_filenames_once(data_path + "*.jpg"), shuffle=True)
    kernel_filename_queue = tf.train.string_input_producer(
        tf.train.match_filenames_once(kernel_path + "*.png"), shuffle=True)

    # Read an entire image file which is required since they're JPEGs, if the images
    # are too large they could be split in advance to smaller files or use the Fixed
    # reader to split up the file.
    image_reader = tf.WholeFileReader()

    # Read a whole file from the queue, the first returned value in the tuple is the
    # filename which we are ignoring.
    _, latent_file = image_reader.read(latent_filename_queue)
    _, kernel_file = image_reader.read(kernel_filename_queue)

    latent = tf.image.decode_jpeg(latent_file, channels=1)
    latent = tf.cast(latent, tf.float32)
    cropped_image = tf.random_crop(latent, [51, 51, 1])
    cropped_center = tf.image.central_crop(cropped_image, 1/51)
    cropped_center.set_shape([1, 1, 1])
    latent_batch = tf.train.batch([cropped_image], batch_size=numOflatent)
    cropped_batch = tf.train.batch([cropped_center], batch_size=numOflatent)

    kernel = tf.image.decode_jpeg(kernel_file, channels=0)
    kernel = tf.cast(kernel, tf.float32)
    kernel = tf.reshape(kernel, [31, 31, 1])
    kernel_batch = tf.train.batch([kernel], batch_size=numOfkernel)
    kernel_batch = tf.reshape(kernel_batch, [31, 31, -1, 1])


    def init_weights(shape):
        return tf.Variable(tf.random_normal(shape, stddev=0.01))


    #def model(Blurred_data, w1, w2, w3, w4, w_o, p_keep_conv, p_keep_hidden):
    #    return l5

    w1 = tf.get_variable("w1", shape=[5, 5, 1, 32], initializer=tf.contrib.layers.xavier_initializer())
    w2 = tf.get_variable("w2", shape=[5, 5, 32, 64], initializer=tf.contrib.layers.xavier_initializer())
    w3 = tf.get_variable("w3", shape=[7, 7, 64, 128], initializer=tf.contrib.layers.xavier_initializer())
    w4 = tf.get_variable("w4", shape=[5, 5, 128, 256], initializer=tf.contrib.layers.xavier_initializer())
    w5 = tf.get_variable("w5", shape=[3, 3, 256, 1], initializer=tf.contrib.layers.xavier_initializer())

    # Blur the latent images with the kernel, then rescale to integers in 0..255.
    Blurred_data = tf.nn.conv2d(latent_batch, kernel_batch, strides=[1, 1, 1, 1], padding='VALID')
    Blurred_data = Blurred_data//(tf.reduce_max(Blurred_data)/255)  # Blurred_data shape=(?, 21, 21, 1)

    p_keep_conv = tf.placeholder("float")
    p_keep_hidden = tf.placeholder("float")
    learning_rate = tf.placeholder("float")

    l1a = tf.nn.conv2d(Blurred_data, w1,                # l1a shape=(?, 17, 17, 32)
                       strides=[1, 1, 1, 1], padding='VALID')
    l1 = tf.nn.relu(l1a)

    l2a = tf.nn.conv2d(l1, w2,                          # l2a shape=(?, 13, 13, 64)
                       strides=[1, 1, 1, 1], padding='VALID')
    l2 = tf.nn.relu(l2a)

    l3a = tf.nn.conv2d(l2, w3,                          # l3a shape=(?, 7, 7, 128)
                       strides=[1, 1, 1, 1], padding='VALID')
    l3 = tf.nn.relu(l3a)

    l4a = tf.nn.conv2d(l3, w4,                          # l4a shape=(?, 3, 3, 256)
                       strides=[1, 1, 1, 1], padding='VALID')
    l4 = tf.nn.relu(l4a)

    l5a = tf.nn.conv2d(l4, w5,                          # l5a shape=(?, 1, 1, 1)
                       strides=[1, 1, 1, 1], padding='VALID')
    l5b = tf.nn.relu(l5a)
    l5 = tf.nn.dropout(l5b, p_keep_hidden)

    cost = tf.contrib.losses.mean_squared_error(l5, cropped_batch)
    train_op = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.01,
                                          use_nesterov=True).minimize(cost)
    predict_op = tf.contrib.losses.mean_squared_error(l5, cropped_batch)

    # Start a new session to show example output.
    with tf.Session() as sess:
        # Required to get the filename matching to run.
        sess.run(tf.local_variables_initializer())

        # Coordinate the loading of image files.
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)

        # Get an image tensor and print its value.
        tf.global_variables_initializer().run()

        k = 0
        for j in range(100):
            for i in range(10):
                # Run one training step and fetch every layer output and weight
                # as a numpy.array.
                _, l1aarr, l1arr, l2aarr, l2arr, l3aarr, l3arr, l4aarr, l4arr, l5aarr, l5barr, l5arr, w1o, w2o, w3o, w4o, w5o, B = \
                    sess.run([train_op, l1a, l1, l2a, l2, l3a, l3, l4a, l4, l5a, l5b, l5, w1, w2, w3, w4, w5, Blurred_data],
                             feed_dict={learning_rate: stepSize, p_keep_conv: 0.8, p_keep_hidden: 0.5})

                # Layer outputs: (batch, row, column, channel) -> (batch, channel, row, column)
                l1aarr = np.swapaxes(l1aarr, 1, 2)
                l1aarr = np.swapaxes(l1aarr, 1, 3)
                l1arr = np.swapaxes(l1arr, 1, 2)
                l1arr = np.swapaxes(l1arr, 1, 3)
                l2aarr = np.swapaxes(l2aarr, 1, 2)
                l2aarr = np.swapaxes(l2aarr, 1, 3)
                l2arr = np.swapaxes(l2arr, 1, 2)
                l2arr = np.swapaxes(l2arr, 1, 3)
                l3aarr = np.swapaxes(l3aarr, 1, 2)
                l3aarr = np.swapaxes(l3aarr, 1, 3)
                l3arr = np.swapaxes(l3arr, 1, 2)
                l3arr = np.swapaxes(l3arr, 1, 3)
                l4aarr = np.swapaxes(l4aarr, 1, 2)
                l4aarr = np.swapaxes(l4aarr, 1, 3)
                l4arr = np.swapaxes(l4arr, 1, 2)
                l4arr = np.swapaxes(l4arr, 1, 3)
                l5aarr = np.swapaxes(l5aarr, 1, 2)
                l5aarr = np.swapaxes(l5aarr, 1, 3)
                l5barr = np.swapaxes(l5barr, 1, 2)
                l5barr = np.swapaxes(l5barr, 1, 3)
                l5arr = np.swapaxes(l5arr, 1, 2)
                l5arr = np.swapaxes(l5arr, 1, 3)

                # Weights: (row, column, in channel, kernel) -> (in channel, kernel, row, column)
                w1o = np.swapaxes(w1o, 0, 2)
                w1o = np.swapaxes(w1o, 1, 3)
                w2o = np.swapaxes(w2o, 0, 2)
                w2o = np.swapaxes(w2o, 1, 3)
                w3o = np.swapaxes(w3o, 0, 2)
                w3o = np.swapaxes(w3o, 1, 3)
                w4o = np.swapaxes(w4o, 0, 2)
                w4o = np.swapaxes(w4o, 1, 3)
                w5o = np.swapaxes(w5o, 0, 2)
                w5o = np.swapaxes(w5o, 1, 3)

                B = np.swapaxes(B, 1, 2)
                B = np.swapaxes(B, 1, 3)

                k += 1

                # Hand-compute one convolution output to check the first layer.
                # The inner loops use r and c (not i and j) so that the outer loop
                # counters are not overwritten; the accumulator is named total so
                # that the built-in sum is not shadowed.
                total = 0
                for n in range(1):
                    for r in range(5):
                        for c in range(5):
                            result = B[0][n][r][c] * w1o[n][0][r][c]
                            total += result
                print(total)

                # Reduce the learning rate by 10x at 50% and 75% of the iterations.
                if j*10+i == (1000/2)-1:
                    stepSize = 0.1 * stepSize
                elif j*10+i == (1000*0.75)-1:
                    stepSize = 0.1 * stepSize

                cost_val = sess.run(predict_op, feed_dict={learning_rate: stepSize, p_keep_conv: 1, p_keep_hidden: 1})
                print("iter: ", j*10+i, "cost_val: ", cost_val)

        # Finish off the filename queue coordinator.
        coord.request_stop()
        coord.join(threads)

    elapsed = time.time() - t  # stop timer (toc)
    print("running time:", elapsed)

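The lines that build Blurred_data (the tf.nn.conv2d call followed by Blurred_data = Blurred_data//(tf.reduce_max(Blurred_data)/255)) exist because the convolution that produces the blurred image pushes pixel values, originally between 0 and 255, outside that range. The second line normalizes them back into 0 to 255. It divides by the maximum of the blurred image scaled by 255 and uses floor division, so the result is an integer between 0 and 255 (the fractional part is discarded). Only the maximum is used in the code, so the minimum is implicitly treated as 0.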
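The rescaling can be tried on a small array with NumPy. This is a sketch under assumptions: `scale_by_max` mirrors the one-line floor division from the code above, while `scale_min_max` is a hypothetical variant that also subtracts the minimum. Both function names and the sample values are made up for illustration.

```python
import numpy as np


def scale_by_max(x):
    """Floor-divide by (max / 255), as the Blurred_data line does."""
    return x // (x.max() / 255.0)


def scale_min_max(x):
    """Map [min, max] onto [0, 255] and discard the fractional part."""
    lo, hi = x.min(), x.max()
    return np.floor((x - lo) / (hi - lo) * 255.0)


# Values that left the 0..255 range, as they would after a blur convolution.
blurred = np.array([[100.0, 300.0],
                    [500.0, 1020.0]])

by_max = scale_by_max(blurred)
by_min_max = scale_min_max(blurred)
print(by_max)
print(by_min_max)
```

With max-only scaling the smallest value stays above 0 unless the input already contains 0, whereas the min-max variant always uses the full 0 to 255 range.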
Another way to normalize is to use a function from the scikit-learn library, which makes it easy. The code below shows normalization along with other convenient preprocessing functions as an example.
Source: http://iostream.tistory.com/111

    # MNIST data imports
    from keras.datasets import mnist  # MNIST data loader
    from keras.utils.np_utils import to_categorical  # one-hot conversion
    import numpy as np  # float type casting

    # Feature scaling imports
    from sklearn.preprocessing import minmax_scale  # [0-1] scaling

    # Model building imports
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation
    from keras.optimizers import adam

    # Data loading and preprocessing
    # Load the train and test data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # Convert the train data format
    # 60000 (train samples) * 28 (width) * 28 (height) is changed to
    # 60000 (train samples) * 784 (= 28 * 28)
    num_of_train_samples = X_train.shape[0]  # number of train samples
    width = X_train.shape[1]  # width
    height = X_train.shape[2]  # height
    X_train = X_train.reshape(num_of_train_samples, width * height)

    # Convert the test data format
    # width and height are the same as for the train data, so they are reused
    # 10000 (test samples) * 28 (width) * 28 (height) is changed to
    # 10000 (test samples) * 784 (= 28 * 28)
    num_of_test_samples = X_test.shape[0]  # number of samples
    X_test = X_test.reshape(num_of_test_samples, width * height)

    # Feature scaling
    # Each element of X_train has a value between 0 and 255.
    # Feature scaling is applied to prevent overfitting and to make the
    # cost function converge faster.
    # In this example the range 0-255 is scaled to 0-1.
    # Reference: https://en.wikipedia.org/wiki/Feature_scaling

    # A division is involved, so convert uint8 to float64
    X_train = X_train.astype(np.float64)
    X_test = X_test.astype(np.float64)

    # The simple way: since MNIST is known to contain only values in 0-255,
    # dividing by 255 is enough for feature scaling.
    # X_train = X_train / 255.0
    # X_test = X_test / 255.0

    # The method below is a little more involved but applies equally to other data.
    # The data is a sample-by-feature matrix, so axis=0 is used.
    # axis=1 scales along the other axis; see the scikit-learn documentation for details.
    X_train = minmax_scale(X_train, feature_range=(0, 1), axis=0)
    X_test = minmax_scale(X_test, feature_range=(0, 1), axis=0)

    # Convert the categorical label values to one-hot form.
    # For example, converting [1, 3, 2, 0] to
    # [[ 0., 1., 0., 0.],
    #  [ 0., 0., 0., 1.],
    #  [ 0., 0., 1., 0.],
    #  [ 1., 0., 0., 0.]]
    # is called one-hot encoding.
    # The MNIST labels, 10 values from 0 to 9, are converted.
    y_train = to_categorical(y_train)
    y_test = to_categorical(y_test)

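In the TensorFlow code above, the sess.run call at the top of the training loop runs the graph and receives the resulting tensors as numpy.array values.
The block of np.swapaxes calls that follows reorders the dimensions of each numpy.array so that, when debugging, the images can be read directly as pixel values. The original order is (batchSize, Row, Column, Channel), but a debugger displays only the last two axes (Column, Channel), so the arrays are reordered to (batchSize, Channel, Row, Column) to inspect the values pixel by pixel.
The nested loops after that check whether the layer's convolution is working properly. They multiply the blurred image B with w1o element by element over one 5x5 window and print the accumulated sum, which is the convolution result for that position.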
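The hand check described above can be reproduced with NumPy alone. The sketch below assumes a single-channel 21x21 image and one 5x5 kernel filled with random values, and `conv2d_valid` is a hypothetical helper rather than a TensorFlow function. Like tf.nn.conv2d, it slides the kernel without flipping it (cross-correlation), so the top-left output should equal the loop sum over the first 5x5 window.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """VALID cross-correlation of a 2-D image with a 2-D kernel (no kernel flip)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Element-wise product of the window and the kernel, then sum.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.random((21, 21))    # stand-in for one blurred image
kernel = rng.random((5, 5))     # stand-in for one first-layer kernel

output = conv2d_valid(image, kernel)

# Same computation as the nested loops in the training script, for position (0, 0).
total = 0.0
for r in range(5):
    for c in range(5):
        total += image[r, c] * kernel[r, c]

print(output.shape)
print(np.isclose(output[0, 0], total))
```

The output shape of 17x17 matches the l1a shape comment in the script (21 - 5 + 1 = 17).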

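Analysis of the convolution results
To check whether the convolutions were working, I printed every layer's output as a numpy.array and confirmed that the convolutions compute correctly. However, the weights were being trained in a direction that drove the convolution results to 0, so after passing through ReLU every pixel value converged to 0. Once a ReLU outputs 0 its gradient is also 0, so the weights feeding it stop being updated. I therefore concluded that the network does not learn (even though the code itself has no problem) and decided to change the architecture from a Fully Convolutional Network to VGGNet, on the expectation that VGGNet performs well on regression problems.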
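Why an all-zero ReLU output stalls training can be seen on a single unit. The sketch below is a toy with assumed numbers, not the network from this post: `relu_unit_gradients` is a hypothetical helper that computes the MSE gradients of one ReLU unit by hand. When every pre-activation is negative, the output is 0 for all samples and the gradients with respect to the weights and bias are exactly 0, so gradient descent cannot move the weights.

```python
import numpy as np


def relu_unit_gradients(w, b, x, target):
    """Forward pass and MSE gradients for y = relu(x @ w + b)."""
    z = x @ w + b                         # pre-activation, shape (N,)
    y = np.maximum(z, 0.0)                # ReLU output
    dy = 2.0 * (y - target) / len(x)      # d(MSE)/dy
    dz = dy * (z > 0)                     # ReLU passes gradient only where z > 0
    grad_w = x.T @ dz
    grad_b = dz.sum()
    return y, grad_w, grad_b


# Non-negative inputs (like pixel values) and a target that is not zero.
x = np.array([[0.1, 0.5, 0.9],
              [0.3, 0.2, 0.7],
              [0.8, 0.4, 0.6],
              [0.5, 0.5, 0.5]])
target = np.full(4, 0.5)

# Negative weights push every pre-activation below zero: the unit is "dead".
dead_y, dead_gw, dead_gb = relu_unit_gradients(np.array([-1.0, -2.0, -0.5]), -0.1, x, target)
# Positive weights keep the unit active, so it still receives gradient.
live_y, live_gw, live_gb = relu_unit_gradients(np.array([0.2, 0.1, 0.3]), 0.0, x, target)

print(dead_y, dead_gw, dead_gb)
print(live_y, live_gw, live_gb)
```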
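In my personal view, the current code produces each pixel value by training with MSE as the loss rather than through a softmax, so I think there is a chance it would work if the output were produced through a softmax instead.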
