

[๋”ฅ๋Ÿฌ๋‹] 3. ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ ๊ตฌํ˜„ (์„ ํ˜• ํšŒ๊ท€, ๋น„์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ ๊ตฌํ˜„) ๋ณธ๋ฌธ


[๋”ฅ๋Ÿฌ๋‹] 3. ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ ๊ตฌํ˜„ (์„ ํ˜• ํšŒ๊ท€, ๋น„์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ ๊ตฌํ˜„)

jimingee 2024. 6. 8. 23:29

 

The overall workflow for implementing a deep learning model is as follows:

1. Prepare the dataset

2. Build the deep learning model

3. Train the model

4. Evaluate and predict

Below, we'll go through the concepts needed at each step, then implement linear regression and nonlinear regression in TensorFlow code.

1. Prepare the Dataset

  • epoch : one epoch means one complete pass of training over the entire dataset
  • batch : a subset produced by splitting the dataset (usually called a mini-batch)
    • iteration : the number of batches processed to complete one epoch, i.e. how many parameter updates happen per epoch (see the sketch below)
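
For example, here is a minimal sketch of how the three terms relate in tf.data (the sample counts and batch size are made up for illustration):

import numpy as np
import tensorflow as tf

# Hypothetical data: 1,000 samples with 2 features each
data = np.random.randn(1000, 2).astype(np.float32)
labels = np.random.randn(1000, 1).astype(np.float32)

# Split the dataset into mini-batches of 32 samples
dataset = tf.data.Dataset.from_tensor_slices((data, labels)).batch(32)

# One full pass over `dataset` is one epoch; each batch is one iteration,
# so one epoch here takes ceil(1000 / 32) = 32 iterations
print(dataset.cardinality().numpy())  # 32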


2. Build the Deep Learning Model

 

[ Keras example code - the two snippets below are equivalent ]

# Style 1: pass a list of layers to the Sequential constructor
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_dim=2, activation='sigmoid'),
    tf.keras.layers.Dense(10, activation='sigmoid'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Style 2: create an empty Sequential model and add layers one at a time
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim=2, activation='sigmoid'))
model.add(tf.keras.layers.Dense(10, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
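
After building the model, model.summary() prints each layer's output shape and parameter count, which is a quick way to sanity-check the architecture (the counts below are computed from the layer sizes above):

model.summary()
# Dense(10, input_dim=2) -> 2*10 + 10 = 30 parameters
# Dense(10)              -> 10*10 + 10 = 110 parameters
# Dense(1)               -> 10*1 + 1 = 11 parameters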


3. ๋ชจ๋ธ ํ•™์Šต ์‹œํ‚ค๊ธฐ

 

[model].compile(optimizer, loss) : configures how the model will be trained

  • optimizer : the optimization method used to update the model's parameters
  • loss : the loss function to minimize

[model].fit(x,y) : trains the model

  • x : training data
  • y : labels for the training data

[ Example code ]

model.compile(loss='mean_squared_error', optimizer='SGD')
model.fit(dataset, epochs=100)
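
The dataset passed to fit() above is assumed to already be a tf.data.Dataset of (x, y) pairs; a minimal sketch of preparing it (with made-up data) might look like this:

import numpy as np
import tensorflow as tf

# Hypothetical training data: 100 samples with 2 features, matching input_dim=2 above
data = np.random.randn(100, 2).astype(np.float32)
labels = np.random.randn(100, 1).astype(np.float32)

# Bundle features and labels together, then batch; fit() accepts the dataset directly
dataset = tf.data.Dataset.from_tensor_slices((data, labels)).batch(32)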


4. Evaluate and Predict

 

[model].evaluate(x,y) : evaluates the trained model

  • x : test data
  • y : labels for the test data

[model].predict(x) : generates predictions with the model

  • x : data to predict on

[ Example code ]

# Prepare the test data
dataset_test = tf.data.Dataset.from_tensor_slices((data_test, labels_test))
dataset_test = dataset_test.batch(32)

# Evaluate the model and make predictions
model.evaluate(dataset_test)
predicted_labels_test = model.predict(data_test)
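
For reference, evaluate() returns the loss value computed with the loss function set in compile() (plus any metrics, if they were configured), while predict() returns a NumPy array with one output row per input sample:

test_loss = model.evaluate(dataset_test)          # scalar loss over the test set
predicted_labels_test = model.predict(data_test)
print(predicted_labels_test.shape)                # (num_test_samples, 1) for a 1-unit output layer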

๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ ๊ตฌํ˜„

tensorflow ์ฝ”๋“œ๋กœ ์„ ํ˜• ํšŒ๊ท€์™€ ๋น„์„ ํ˜• ํšŒ๊ท€๋ฅผ ๊ตฌํ˜„


1. Implementing a Linear Regression Model

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(100)

'''
1. Implement the linear regression model class
   Step01. Set the initial weight as a variable tensor with value 1.5
   Step02. Set the initial bias as a variable tensor with value 1.5
   Step03. Implement the linear model using W, X, and b
'''
class LinearModel:
    
    def __init__(self):
        self.W = tf.Variable(initial_value=1.5)
        self.b = tf.Variable(initial_value=1.5)
        
    def __call__(self, X):
        return X * self.W + self.b

''' 2. Use MSE as the loss function '''
def loss(y, pred): 
    return tf.reduce_mean(tf.square(y - pred))

'''
3. Define the train function, which learns via gradient descent - computes the updates for W (weight) and b (bias)
'''

def train(linear_model, x, y):
    
    with tf.GradientTape() as t:
        current_loss = loss(y, linear_model(x))
    
    learning_rate = 0.001
    
    # Compute the gradients of the loss with respect to W and b
    delta_W, delta_b = t.gradient(current_loss, [linear_model.W, linear_model.b])
    
    # Scale the gradients by the learning rate to get the parameter updates
    W_update = learning_rate * delta_W
    b_update = learning_rate * delta_b
    
    return W_update, b_update
 
def main():
    
    # Generate the data
    x_data = np.linspace(0, 10, 50)
    y_data = 4 * x_data + np.random.randn(*x_data.shape) * 4 + 3
    
    # Plot the data
    plt.scatter(x_data, y_data)
    plt.show()
    
    # Instantiate the linear model
    linear_model = LinearModel()
    
    epochs = 100
    for epoch_count in range(epochs):  # train the model for `epochs` passes over the data
        
        # Store the linear model's predictions
        y_pred_data = linear_model(x_data)
        
        # Compute the loss between the predictions and the actual data
        real_loss = loss(y_data, y_pred_data)
        
        # Using the current model, compute the parameter updates that reduce the loss
        update_W, update_b = train(linear_model, x_data, y_data)
        
        # Update the linear model's weight and bias
        linear_model.W.assign_sub(update_W)
        linear_model.b.assign_sub(update_b)
        
        if epoch_count % 20 == 0:
            print(f"Epoch count {epoch_count}: Loss value: {real_loss.numpy()}")
            print(f"W: {linear_model.W.numpy()}, b: {linear_model.b.numpy()}")
            
            fig = plt.figure()
            ax1 = fig.add_subplot(111)
            ax1.scatter(x_data, y_data)
            ax1.plot(x_data, y_pred_data, color='red')
            plt.savefig('prediction.png')
            plt.show()

if __name__ == "__main__":
    main()

 

[ Code execution result ]

  • The loss value drops as the epochs progress, so training is proceeding well
  • The weight and bias are updated toward the values used to generate the data (slope 4, intercept 3)

## output ##

Epoch count 0: Loss value: 250.49554443359375
W: 1.677997350692749, b: 1.527673602104187
Epoch count 20: Loss value: 28.3988094329834
W: 3.5059425830841064, b: 1.8219205141067505
Epoch count 40: Loss value: 15.571966171264648
W: 3.942619800567627, b: 1.907819151878357
Epoch count 60: Loss value: 14.813202857971191
W: 4.045225143432617, b: 1.9435327053070068
Epoch count 80: Loss value: 14.750727653503418
W: 4.067629814147949, b: 1.9670435190200806
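
As an aside, the manual update in train() (parameter = parameter - learning_rate * gradient) is exactly plain gradient descent; the same step could be written with tf.keras.optimizers.SGD (a sketch of my own, not part of the original code), reusing loss() and LinearModel from above:

optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)

def train_step(linear_model, x, y):
    # Record the forward pass so gradients of the loss w.r.t. W and b can be taken
    with tf.GradientTape() as t:
        current_loss = loss(y, linear_model(x))
    grads = t.gradient(current_loss, [linear_model.W, linear_model.b])
    # apply_gradients performs the same W.assign_sub(...) / b.assign_sub(...) updates
    optimizer.apply_gradients(zip(grads, [linear_model.W, linear_model.b]))
    return current_loss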


2. Implementing a Nonlinear Regression Model

import tensorflow as tf
import numpy as np
from visual import *  # helper module bundled with the exercise; provides Visualize()

np.random.seed(100)
tf.random.set_seed(100)

def main():
    
    # Generate nonlinear data
    x_data = np.linspace(0, 10, 100)
    y_data = 1.5 * x_data**2 - 12 * x_data + np.random.randn(*x_data.shape) * 2 + 0.5
    
    
    ''' 1. Build a multilayer perceptron model '''
    # units : number of nodes in the layer
    # activation : activation function to apply
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(units=20, input_dim=1, activation='relu'),
        tf.keras.layers.Dense(units=20, activation='relu'),
        tf.keras.layers.Dense(units=1)
    ])
    
    ''' 2. Configure how the model is trained '''
    # Set the loss function to minimize and the optimization method
    model.compile(loss='mean_squared_error', optimizer='adam')
    
    
    ''' 3. Train the model '''
    # Train the model for 500 epochs; verbose controls how much training progress is printed
    history = model.fit(x_data, y_data, epochs=500, verbose=2)
    
    ''' 4. Generate and save predictions with the trained model '''
    # Use the trained model to generate predictions for x_data
    
    predictions = model.predict(x_data)
    Visualize(x_data, y_data, predictions)
    
    return history, model

if __name__ == '__main__':
    main()

 

[ Code execution result ]

  • The loss value drops as the epochs progress, so training is proceeding well
  • The model's weights and biases are being updated

Epoch 1/500
100/100 - 0s - loss: 290.6330
Epoch 250/500
100/100 - 0s - loss: 86.8522
Epoch 500/500
100/100 - 0s - loss: 15.2892
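
Visualize here comes from the visual helper module provided with the exercise environment; if it isn't available, a rough stand-in using matplotlib (my own sketch, not the original helper) could be:

import matplotlib.pyplot as plt

def Visualize(x_data, y_data, predictions):
    # Scatter the raw data and overlay the model's predictions as a line
    plt.scatter(x_data, y_data, label='data')
    plt.plot(x_data, predictions, color='red', label='prediction')
    plt.legend()
    plt.savefig('nonlinear_prediction.png')
    plt.show()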

 
